
The importance of fine-grained named entity recognition

Named entity recognition (NER) is usually viewed as a low-level NLP task, but it can be crucial to other tasks such as named entity disambiguation and linking. It is also relevant for information retrieval and question answering applications. Standard named entity recognition classes are usually person, organisation, location and miscellaneous. Previous research led to the identification of three core classes: person, location and organisation. During the Computational Natural Language Learning conference of 2003 (CoNLL 2003), a miscellaneous type was added to the mix. I used the AllenNLP demo application to run a quick NER test on the Hacksaw Ridge storyline. The text was extracted from the IMDb website, and the image below indicates the entities.

The output below shows the four main, non-fine-grained entity classes: all four entity tags (person, organisation, location and miscellaneous) are highlighted. Desmond T. Doss is the main character of the story, and the model accurately identifies him as a person. When only his surname is mentioned (Doss’s), it also receives the correct person tag. The miscellaneous tag was used for events such as the ‘Battle of Okinawa’ and for things such as the ‘Congressional Medal of Honor’.

Further research also introduced additional entity types such as geopolitical entities, weapons, vehicles and facilities. These are all discussed in the article “An empirical study on fine-grained named entity recognition”. Its authors further note that the main challenges in developing a fine-grained entity recognizer are the selection of the tag set, the creation of training data and the creation of a fast and accurate multi-class labelling algorithm.

With the help of AllenNLP, a fine-grained entity recognition task was also run. The phrase ‘the Congressional Medal of Honor’, which receives the miscellaneous tag in a standard NER task, is tagged differently in fine-grained NER: it is labelled ‘Work of art’, which adds more meaning than a miscellaneous tag.

Previous research on fine-grained named entity recognition has led to more in-depth tags. In these works, the main tags are divided into sub-tags to give more meaning to the entities. For example, the ‘Person’ entity is broken down into sub-categories such as actor, architect, artist, athlete, author, coach, director, doctor, engineer, monarch, politician, religious leader, soldier and terrorist. The popular Python NLP library spaCy also has a named entity recognition feature, and some of the tags it supports are PERSON, NORP (nationalities or religious or political groups), FAC (buildings, airports, highways, bridges, etc.) and GPE (countries, cities, states), among many others. One can reasonably argue that a fine-grained named entity recognition application or library could be instrumental in narrative intelligence and relational reasoning. The more detailed or fine-grained the meaning of the entities in a narrative, the richer the story becomes and the stronger its ability to support relational reasoning.
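
To make the spaCy part concrete, here is a minimal sketch of how entity tags could be pulled out of a piece of text and grouped by label. It assumes spaCy and its small English model `en_core_web_sm` are installed; the helper names `spacy_entities` and `group_by_label` are made up for this illustration and are not part of spaCy. The sample pairs at the bottom are written by hand from the examples discussed in this post, they are not model output.

```python
from collections import defaultdict


def spacy_entities(text, model="en_core_web_sm"):
    """Return (entity text, label) pairs found by a spaCy pipeline.

    Assumption: spaCy and the named model are installed locally.
    The import is done lazily so the grouping helper below can be
    used without spaCy.
    """
    import spacy

    nlp = spacy.load(model)
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def group_by_label(entities):
    """Group (entity text, label) pairs by label, keeping first-seen order."""
    grouped = defaultdict(list)
    for entity_text, label in entities:
        grouped[label].append(entity_text)
    return dict(grouped)


# Hand-written pairs taken from the examples in this post (not model output).
sample_pairs = [
    ("Desmond T. Doss", "PERSON"),
    ("Doss", "PERSON"),
    ("Congressional Medal of Honor", "WORK_OF_ART"),
]
grouped_sample = group_by_label(sample_pairs)
```

With the model installed, `group_by_label(spacy_entities(storyline))` should give the same kind of dictionary for any storyline text, although the exact entities and labels depend on the model version. `spacy.explain("NORP")` returns a short description of a label, which is handy when a tag is not self-explanatory.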
