Comparison of different Named Entity Recognition (NER) tools and APIs.

  • NLTK (Python)
  • SpaCy (Python)
  • Stanford NER (Java)
  • Alchemy by IBM (API)
  • Indico.io (API)
  • Intellexer (API)
  • Cogito (API)
  • Saplo (API)
  • TextRazor (API)

To test all of these tools I have selected three pieces of text:

  • “Duchy Originals forced to buy back shares from Prince Charles’s charitable foundation”
  • “Tories in civil war as Duncan Smith attacks austerity programme”
  • “The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.”

The aim of this post is twofold. First, it compares the libraries and APIs on NER. Second, I was curious whether the extraction gets ‘better’ when the tools are given a larger body of text (i.e., a paragraph), although I could probably have selected a better paragraph to explore this properly.
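One way to make ‘better’ measurable is to score each tool's entities against a hand-labelled reference using precision, recall and F1. The snippet below is only a minimal sketch of that idea: the `score` function, the reference labels and the "predicted" entities are all made up for illustration and are not the output of any tool in this comparison.

```python
def score(predicted, gold):
    """Return (precision, recall, f1) for two collections of (text, label) pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)  # exact match on both text and label
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hand-labelled reference for the Duchy headline (an assumption for this sketch).
gold = {("Duchy Originals", "ORG"), ("Prince Charles", "PERSON")}

# Made-up output of an imaginary extractor, NOT a real result.
predicted = {("Duchy Originals", "ORG"), ("Charles", "PERSON"), ("Prince", "ORG")}

precision, recall, f1 = score(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: precision=0.33 recall=0.50 f1=0.40
```

Exact matching is strict: a tool that returns “Charles” instead of “Prince Charles” gets no credit, so a more forgiving comparison might count partial overlaps as well.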

To start, we use two single sentences to get an idea of how well the tools perform on news headlines. The first one we run through all the extractors is

“Duchy Originals forced to buy back shares from Prince Charles’s charitable foundation”

NLTK Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Stanford NER Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
SpaCy Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Alchemy (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Indico.io (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Saplo (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Intellexer (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
TextRazor (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
Cogito (API) Duchy Originals forced to buy back shares from Prince Charles's charitable foundation
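A side-by-side listing like the one above could be produced with a small harness that runs every extractor on the same text and marks the entities it found inline. The following is a sketch under stated assumptions: `capitalised_runs` is a toy stand-in (it is not NER and not one of the tools listed), and `annotate` and `compare` are hypothetical helper names. A real library or API client would be plugged in as another entry in the `extractors` dictionary, provided it returns character spans.

```python
import re


def capitalised_runs(text):
    """Toy baseline: treat each run of capitalised words as one entity.

    Returns (start, end, label) character spans. A stand-in, not real NER.
    """
    pattern = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")
    return [(m.start(), m.end(), "ENTITY") for m in pattern.finditer(text)]


def annotate(text, spans):
    """Wrap each (start, end, label) span as [entity|LABEL].

    Assumes the spans do not overlap.
    """
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])            # untouched text before the span
        out.append(f"[{text[start:end]}|{label}]")
        cursor = end
    out.append(text[cursor:])                     # remainder after the last span
    return "".join(out)


def compare(text, extractors):
    """Run every extractor on the same text and return {name: annotated text}."""
    return {name: annotate(text, extract(text)) for name, extract in extractors.items()}


headline = ("Duchy Originals forced to buy back shares from "
            "Prince Charles's charitable foundation")
extractors = {"capitalised-runs baseline": capitalised_runs}

for name, rendered in compare(headline, extractors).items():
    print(f"{name}: {rendered}")
```

Keeping the marking step separate from the extraction step means every tool is rendered the same way, which makes differences between tools easier to spot.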

The second sentence we’ll try is

“Tories in civil war as Duncan Smith attacks austerity programme”

NLTK Tories in civil war as Duncan Smith attacks austerity programme
Stanford NER Tories in civil war as Duncan Smith attacks austerity programme
SpaCy Tories in civil war as Duncan Smith attacks austerity programme
Alchemy (API) Tories in civil war as Duncan Smith attacks austerity programme
Indico.io (API) Tories in civil war as Duncan Smith attacks austerity programme
Saplo (API) Tories in civil war as Duncan Smith attacks austerity programme
Intellexer (API) Tories in civil war as Duncan Smith attacks austerity programme
TextRazor (API) Tories in civil war as Duncan Smith attacks austerity programme
Cogito (API) Tories in civil war as Duncan Smith attacks austerity programme

In addition to the sentences explored above, I’ll now see how well the tools deal with a paragraph of text. The paragraph used for this is:

“The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.”

NLTK The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
Stanford NER The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
SpaCy
Alchemy (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
Indico.io (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
Saplo (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
Intellexer (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
TextRazor (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
Cogito (API) The king was returning that day to his Versailles, a 118-room snowbird’s paradise that will become a winter White House if he is elected president. Mar-a-Lago is where Mr. Trump comes to escape, entertain and luxuriate in a Mediterranean-style manse, built 90 years ago by the cereal heiress Marjorie Merriweather Post.
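With nine tools and a longer text, it can help to count how many tools agree on each entity instead of reading every output line by line. The sketch below shows one possible way to do that. The `agreement` function is a hypothetical helper, and the three tool outputs are placeholders invented for illustration, not results from any of the tools above.

```python
from collections import Counter


def agreement(outputs):
    """Count how many tools reported each entity string.

    `outputs` maps a tool name to the collection of entity strings it returned.
    """
    votes = Counter()
    for entities in outputs.values():
        votes.update(set(entities))  # a set, so each tool votes once per entity
    return votes


# Placeholder outputs for three imaginary tools, NOT real results.
outputs = {
    "tool_a": ["Versailles", "White House", "Trump", "Marjorie Merriweather Post"],
    "tool_b": ["Versailles", "Mar-a-Lago", "Trump"],
    "tool_c": ["Trump", "Marjorie Merriweather Post", "Trump"],
}

for entity, count in agreement(outputs).most_common():
    print(f"{count}/{len(outputs)}  {entity}")
```

Entities reported by only one tool are the interesting ones to inspect by hand: they are either something the other tools missed or a false positive.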

Conclusion

I don’t really have a conclusion; I merely wanted to see how different tools and APIs perform on text.