Name entity recognition is usually viewed as a low level NLP task but could be crucial to other tasks such as named entity disambiguation and linking. It is also relevant for information retrieval and question and answering applications. Standard named entity recognition classes are usually person, location and miscellaneous. I used the AllenNLP demo application to run a quick NER test for the Hacksaw ridge storyline. The text was extracted from the IMDB website and the below image indicates the entities. Previous research led to the identification of three core classes – person, location and organisation. During the Computational Natural Language Learning conference of 2003, a miscellaneous type was then added to the mix
The below reveals the four main entity classes or the non-fine grained, All four (person, organisation, location and miscellaneous) entity tags are highlighted. Desmond T. Doss is the name of the star character in the story and it is accurately identifies him as a person. When his surname was mentioned (Doss’s), it also has the accurate personal tag. The miscellaneous tag was used for events like the ‘Battle of Okinawa’ and a thing ‘Congressional medal of honor.’
Whilst the stas Further research also introduced geopolitical entities such as weapons vehicles and facilities. These were all contained in the article, “An empirical study on fine-grained named entity recognition”, and the authors further revealed that the apparent challenges of developing a fine-grained entity recognizer are because of the selection of the tag set, creation of training data and the creation of a fast and accurate multi-class labelling algorithm.
Continue reading