in Graph

Introduction to a graph data structure in Python

Graphs play an important part in intelligent technologies such as recommendation engines, search engines, fraud detection applications, network mapping, customer journey applications, latency evaluation and dependency management. Graphs data structures are also powerful in social networks such as Twitter, Facebook, LinkedIn and a few others. Graphs in the context of the social graph are used to recommend friends, events, company pages, jobs and personalise the ad experience. These are some of the use cases of graph data structures and applications. 

A quick reminder of the definition, graphs are considered to be the composition of vertices (nodes) with respective pairs of edges(also known as links or arcs). In a nutshell, they are viewed to comprise a determined set of nodes and edges which connect these given nodes. 

The example below adapted from GeekforGeeks indicates 5 nodes {0,1,2,3,4} and 7 edges {01,12,23,34,04,31,13,14}


Graphs can be plotted from scratch using a combination of NetworkX and Matplotlib in Python. 

We will create a simple graph from scratch for a better understanding of graph data structures. I will be using Google’s Colab Notebook to run this code. 

I start by importing the NetworkX graph library and Matplotlib, a popular plotting module from Python. The first node is added on line 11 and further nodes from 2 to 10 are added on line 13. One can easily check the number of nodes as the result generated is 10 in this instance. 

import networkx as nx
import matplotlib.pyplot as plt
G = nx.Graph()
G.add_nodes_from([2, 3, 4,5,6,7,8,9,10])
G.number_of_nodes()
10
G.add_edges_from([(1, 2), (1, 3),(2,3),(2,4),(1,7),(1,10),(4,5),(5,6),(6,7),(7,8),(8,9),(9,10)])
G.number_of_edges()
12

It now clearly shows we have 10 vertices or nodes with no edges. The next step is to add edges from the source node (1) to the subsequent nodes. You can easily check the number of edges after adding these to your graph. In this instance, the number of edges is 12 and the networkx.drawing module in conjunction with Matplotlib can be used to draw the graph G. 

nx.draw(G)
plt.show()

This is how the graph looks like with the combination of 10 nodes and 12 edges. It will be important to access the nearest neighbours to some of the nodes. 

Accessing graph edges and neighbours

The first node of this graph is 1 and the code snippet below highlights the nearest neighbours or the nodes that are directly connected to node 1. 

Node 1, is connected to 4 nodes while node 2 and 3 are connected to 3 and 2 nodes respectively. It clearly indicates node 1 has more connections than the other nodes in the example code below. The subscript notation of G[1]  or G.adj[1] can both be used to access the nearest edges.

G[1]
AtlasView({2: {}, 3: {}, 7: {}, 10: {}})
G[2]
AtlasView({1: {}, 3: {}, 4: {}})
G[3]
AtlasView({1: {}, 2: {}})
G.adj[1]
AtlasView({2: {}, 3: {}, 7: {}, 10: {}})
G.adj[2]
AtlasView({1: {}, 3: {}, 4: {}})
G.adj[3]
AtlasView({1: {}, 2: {}})

Fundamental Graph types 

There are basic graph types that are worth highlighting in this article and relating their respective class properties in NetworkX.

Directed Graph:  In graph theory, a directed graph (DiGraph) is made up of constituents of vertices connected by edges and there is a clear direction assigned to these edges. These arrows indicate which direction to move between the two nodes. The direction can either be unidirectional (single direction) or bidirectional (two-sided direction). In NetworkX, the DiGraph class provides additional properties relevant to directed edges. Some of these properties include DiGraph.out_edges(), DiGraph.in_degree(), DiGraph.predecessors(), DiGraph.successors() etc.

For clarity, a new directed graph was created with edges (1,2),(2,1) and (3,1). All of the three edges had assigned weighs of 0.65, 0.5 and 0.85 respectively. Our core focus was on node (1) and we determined its out_degree of 0.65 (1,2). The next thing was to determine its in_degree of 1.35 ((2,1) + (3,1)). The third step was to find out the total weight as the sum of in_degree and out_degree of the node (1). In addition, we checked the number of successors or neighbours of the node 1. Neighbours and successors are viewed to be the same properties and in this instance the count is 2 (nodes 2 and 3). The predecessor property identifies the actual connected nodes (2 and 3).

T = nx.DiGraph()
T.add_weighted_edges_from([(1, 2, 0.65),(2,1,0.5), (3, 1, 0.85)])
T.out_degree(1, weight='weight')
0.65
T.in_degree(1, weight='weight')
1.35
T.degree(1, weight='weight')
2.0
list(T.successors(1))
[2]
list(T.neighbors(1))
[2]
list(T.predecessors(1))
[3, 2]

Write a Comment

Comment