The role of the betweenness centrality measure in networks 

Ever wondered how to detect the most influential individual, station, motorway or node in a network? Betweenness centrality is not a simple popularity test but a mathematical way of determining which node has the most impact on the flow of information within a network. It is a very good way of identifying nodes that act as connectors when moving from one part of a graph to another. In a real-world situation, when these nodes are removed, reaching other nodes in the graph becomes quite challenging. Betweenness centrality also reveals the number of shortest paths a node is a part of. In a connected graph, the Betweenness Centrality algorithm calculates the shortest paths between every pair of nodes in the network and scores each node by how many of those paths pass through it. The weights on the connections between nodes are important in determining the shortest paths, as factors such as frequency, capacity, time, flow and influence determine these weights.
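To make the idea concrete, here is a minimal sketch in plain Python, assuming an undirected and unweighted graph stored as a dictionary of neighbour lists. It follows the approach usually attributed to Brandes: a breadth-first search from every node, followed by a backward pass that credits each node for the shortest paths running through it. The function name `betweenness_centrality` and the toy station network are made up for illustration. A weighted network would need a shortest-path search that respects the weights (for example Dijkstra's algorithm) in place of the breadth-first search.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalised betweenness for an undirected, unweighted graph.

    `graph` maps each node to a list of its neighbours (hypothetical format).
    """
    centrality = {node: 0.0 for node in graph}

    for source in graph:
        # Breadth-first search from `source`, counting shortest paths.
        visit_order = []
        predecessors = {node: [] for node in graph}
        num_paths = {node: 0 for node in graph}
        distance = {node: -1 for node in graph}
        num_paths[source] = 1
        distance[source] = 0
        queue = deque([source])

        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph[v]:
                if distance[w] < 0:  # first time we reach w
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:  # v lies on a shortest path to w
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)

        # Walk back from the farthest nodes, passing credit to predecessors.
        dependency = {node: 0.0 for node in graph}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                share = num_paths[v] / num_paths[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]

    # Each pair of nodes was counted from both ends, so halve the totals.
    return {node: score / 2 for node, score in centrality.items()}


# Toy network: A - B - C, with C also linked to D and E, and D linked to E.
stations = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C", "E"],
    "E": ["C", "D"],
}

scores = betweenness_centrality(stations)
for station in sorted(scores, key=scores.get, reverse=True):
    print(station, scores[station])
```

In this toy network, C scores highest because every route between the A-B side and the D-E side has to pass through it, which is the "great connector" behaviour described above.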


The importance of fine-grained named entity recognition

Named entity recognition (NER) is usually viewed as a low-level NLP task, but it can be crucial to other tasks such as named entity disambiguation and linking. It is also relevant for information retrieval and question answering applications. The standard named entity recognition classes are usually person, location, organisation and miscellaneous. I used the AllenNLP demo application to run a quick NER test on the Hacksaw Ridge storyline. The text was extracted from the IMDb website and the image below shows the entities. Previous research led to the identification of three core classes: person, location and organisation. At the Conference on Computational Natural Language Learning in 2003 (CoNLL-2003), a miscellaneous type was added to the mix.

The output below shows the four main, non-fine-grained entity classes: all four entity tags (person, organisation, location and miscellaneous) are highlighted. Desmond T. Doss is the name of the main character in the story, and the model accurately identifies him as a person. When only his surname is mentioned (Doss’s), it also receives the correct person tag. The miscellaneous tag was used for an event, the ‘Battle of Okinawa’, and for a thing, the ‘Congressional Medal of Honor’.

Further research also introduced additional types such as geopolitical entities, weapons, vehicles and facilities. These are all covered in the article “An empirical study on fine-grained named entity recognition”, whose authors further note that the main challenges in developing a fine-grained entity recognizer are the selection of the tag set, the creation of training data and the creation of a fast and accurate multi-class labelling algorithm.
