Ever wondered how to detect the most influential individual, station, motorway or node in a network? It is not a normal popularity test but a mathematical way for determining a node with the most impact in the flow of information within a network. A very good way of determining nodes that are great connectors for moving from one point of a graph to another. In a real-world situation, when these nodes are removed, the movement to other nodes in the graph becomes quite challenging. With betweenness centrality, the number of paths a node is a part of is also revealed. In a connected graph, the Betweenness Centrality algorithm calculates the shortest path between nodes in the given network. The weight between nodes is quite important in determining the shortest path as factors such as frequency, capacity, time, flow and influence determine these weights.

Continue reading# Exploring Dijkstra’s shortest path algorithm

Several graph algorithms can help reveal hidden patterns in connected data. These algorithms can be classified into several categories such as approximations (e.g clustering), assortativity (e,g average neighbour degree), communities (e.g K-Clique) and centrality (e.g shortest path). In this blog, we will be looking at one of the most popular shortest path algorithms known as the Dijkstra’s algorithm. Exploring an example table and code implementation for this algorithm. Shortest path algorithm can be relevant in a traffic network situation a user desires to discover the fastest way to move from a source to a destination. It is an iterative algorithm that provides us with the shortest path from an origin node to all other nodes in the graph. This algorithm can work in weighted and unweighted graph scenarios.

Continue reading# Different ways of representing Graphs

Graph data can be represented in different formats for onward computation. The choice of the graph representation hugely relies on the density of the graph, space required, speed and weight of edges. The main ways a graph can be represented are as an adjacency matrix, incidence matrix, adjacency list and incidence list.

**Adjacency Matrix:** This is one of the most popular ways a graph is represented. One of the core aims of this matrix is to assess if the pairs of vertices in a given graph are adjacent or not. In an adjacency matrix, row and columns represent vertices. The row sum equals the degree of the vertex that the row represents. It is advisable to use the adjacency matrix for weighted edges. You replace the standard ‘1’ with the respective weight. It is easier to represent directed graphs with edge weights through an adjacency matrix. Adjacency matrix works best with dense graphs. As dense graphs usually have twice the size of the edges to the given nodes.

# Introduction to a graph data structure in Python

Graphs play an important part in intelligent technologies such as recommendation engines, search engines, fraud detection applications, network mapping, customer journey applications, latency evaluation and dependency management. Graphs data structures are also powerful in social networks such as Twitter, Facebook, LinkedIn and a few others. Graphs in the context of the social graph are used to recommend friends, events, company pages, jobs and personalise the ad experience. These are some of the use cases of graph data structures and applications.

A quick reminder of the definition, graphs are considered to be the composition of vertices (nodes) with respective pairs of edges(also known as links or arcs). In a nutshell, they are viewed to comprise a determined set of nodes and edges which connect these given nodes.

Continue reading# The rise of eight different types of graph

We are witnessing the rise and adoption of graph databases across different verticals. Gartner acknowledged the five different types of graphs as social, intent, consumption, mobile and interest. In a presentation titled: Graph All the things! Introduction to graph databases, the team from Neo4j captured Gartner’s graph classifications in the illustration below. There is a slight difference in how one of the types of a graph is named by Neo4j in comparison to Gartner. To Neo4j it is a payment graph while to Gartner it is the consumption graph. There is the argument that a consumption graph is a better name as we do not necessarily pay for every consumption. We will now look at each graph and add some additional types

Continue reading# The world’s leading graph databases

The increase in data has led to a growing need for graph databases or technologies. With a graph database, the relationships that exist within the data can be stored, refined and queried properly. A graph database, therefore, is a database created to store data without restricting it to a pre-set model. The data in these graph-based technology expresses how each entity is related to others.

Nodes and edges are quite important when looking at graph databases as the later represents the relationship with the former. This nodes and edges setup, makes the retrieval and querying of relationships easier. Retrieving complex hierarchical structures is an advantage that these graph technologies have over relational databases. The software review forum G2 has a list of the top-rated graph databases in the market. The leading graph database technologies on G2 have had more reviews, a higher percentage of positive feedback, more data generated from other online networks and social platforms.

Continue reading# A domain specific knowledge base for experiences and events

I have been fascinated by the knowledge representation and reasoning field. My presentation on commonsense knowledge graphs instilled a greater desire for knowledge bases in the form of ontologies and knowledge graph. It became clear to me that it would be difficult to create a deeper knowledge base that embodies commonsense knowledge A knowledge base that will extend the meaning of entities by capturing analogical reasoning elements such as metaphors will have to be domain specific.

The Hasso Plattner Institut has a great lecture on Knowledge and Engineering with Semantic Web Technologies by Dr Harald Sack on YouTube is a great resource that has inspired me to look at domain-specific knowledge base. There are four types and categories of ontologies. These are viewed according to their level of generality and they are the top-level ontology, domain ontology, task ontology and application ontology.

A commonsense or distilled knowledge base for the experiences and events will be viewed as a domain ontology with a bit of task ontology. This is because some experiences can be an attraction within the tourism and travel domain (e.g Buckingham Palace) and also tasks like Buckingham Palace tours, Buckingham Palace visit, Buckingham Palace changing of the guards. These Buckingham Palace related terms can be linked to top-level ontology terms such as historic sites or royal palaces.

Continue reading# An introduction to comparable entity mining at a structural level.

The comparing of entities is usually important in human decision making.

We are constantly comparing entities daily from holiday destinations, new mobile phone and next family car. Comparing entities look at the relatedness of these objects or concepts. Relatedness does not look at only similarity (analogy) but other relational structures such as resistance (antithesis), discrepancy (anomaly) and incompatibility (antinomy).

Comparative entity mining is crucial to any keyword, concept, content, product and marketing strategy. This is because no product exists in isolation. It is therefore important for businesses to place themselves in the shoes of potential customers and explore the alternative products that are vying for the same attention and mind space. Conducting this exercise will help brands position their products in a more compellingly through through engaging branding and compelling storytelling.

Continue reading# From keyword research to topic modelling and then concept modelling

Keywords are very powerful in today’s digital landscape. We all love us some keywords, right? The monthly search volume and competitiveness of keywords guide our product, brand, content and advertising strategy. Several blogs have been written on keyword research strategy. In connection with these strategies are tools such as Google Keyword Planner, Ahrefs, Answer the public, Moz Keyword Explorer, Google Keyword Trends and other leading tools.

Google Trends is a great tool from Google to help gauge the search volume of keywords, topics and entities over time. A quick search for London Marathon reveals London Marathon as a search term and a topic. Whilst the search term focuses more on the ‘London Marathon’ searches carried out by users. London marathon as a topic is a collection of London Marathon related search terms and engagement with London Marathon related publications. Whilst Google Trends fails to provide an exact or bucketed search volume data, its scaling system of 1-10 indicates the popularity of a search term over a time range.

Continue reading# A Count-based and Predictive vector models in the Semantic Age

Computer applications such as search engines and dialogue systems usually process a large volume of text on a regular basis. These systems have progressed over the years due to advanced word embeddings, progressive research and modern machine learning libraries. It is believed that audio and visual datasets have dense high-dimensional datasets encoded as vectors of a separate raw pixel. Text generated from natural language is usually understood to contain vectors that are sparse when compared to video and visuals.

Continue reading