The post A simplified analysis of the 0-1 knapsack problem appeared first on Dateme Tubotamuno - Semantic Geek.

**Diving into the Knapsack Problem**

Let’s now take a deeper look into the logic of the Knapsack Problem. The decision version of the Knapsack problem is NP-Complete. Firstly, we will briefly examine the problem statement. Given a set of objects, each with a value and a weight (Vi, Wi), what is the maximum total value that can be attained by selecting a subset of these objects whose combined weight is within the knapsack capacity? Put more simply, using the image below, the maximum capacity of the knapsack is 40kg and we have a variety of items with their respective values and weights. The idea is to pick items whose total weight is less than 40kg, with special consideration to their value. In essence, we want to carry the most valuable combination of items that fits. It is similar to going shopping when you cannot buy everything you wanted because there is no carrier bag and all you have is a laptop backpack. The number of items you can purchase in this context is limited to the capacity of your bag.

After selecting the appropriate items that can fit into your knapsack, you can then check your receipt to determine the overall value of the items. This grocery shopping experience is quite similar to the knapsack problem. The data entered in the matrix of the Knapsack problem is the actual value (i.e. the price, from the image above); the weights of the individual items are used to determine suitability during the compute stage, and the respective value is entered for the relevant items. We will look at a matrix representation of a knapsack problem scenario.

**Matrix representation of the knapsack problem**

We will represent the matrix of a knapsack problem below. To keep things simple, we will use a different knapsack example with a maximum capacity of 8kg.

Each row of the above matrix represents an object or item with a relevant value and weight. The columns symbolise the sequential capacity of the knapsack up to a maximum of 8. The idea is to input the relevant value at each intersection between a column and a row; the maximum capacity of the knapsack is realised in the cells of the 8kg column. As stated earlier, the weight and value of an object determine its suitability to be added to the knapsack. It is a combinatorial optimisation problem geared at discovering the optimal choice of objects.
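The matrix logic described above can be sketched in a few lines of Python. The item values and weights below are illustrative assumptions, since the original diagram is not reproduced here; only the knapsack capacity of 8kg comes from the example.

```python
# Build the DP matrix for a knapsack of capacity 8.
# Item values and weights are assumed for illustration.
values = [10, 40, 30, 50]
weights = [5, 4, 6, 3]
W = 8
n = len(values)

# K[i][w] = best value using the first i items within capacity w
K = [[0] * (W + 1) for _ in range(n + 1)]
for i in range(1, n + 1):
    for w in range(1, W + 1):
        if weights[i - 1] <= w:
            # Either take item i-1 or leave it, whichever is better
            K[i][w] = max(values[i - 1] + K[i - 1][w - weights[i - 1]],
                          K[i - 1][w])
        else:
            K[i][w] = K[i - 1][w]

for row in K:
    print(row)
print(K[n][W])  # the bottom-right cell holds the optimal value
```

Each printed row corresponds to an item and each column to a capacity from 0 to 8, mirroring the matrix described above.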

**The implementation of the knapsack problem in Python**

Here is a Python implementation of the knapsack problem inspired by this code.

```
# A Dynamic Programming implementation
# of the 0-1 Knapsack problem.
# Returns the maximum value that can
# be included in a knapsack of capacity W
def knapSack(W, wt, val, n):
    K = [[0 for x in range(W + 1)] for x in range(n + 1)]
    # Build table K[][] in bottom-up manner
    for i in range(n + 1):
        for w in range(W + 1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif wt[i-1] <= w:
                K[i][w] = max(val[i-1]
                              + K[i-1][w-wt[i-1]],
                              K[i-1][w])
            else:
                K[i][w] = K[i-1][w]
    return K[n][W]

# The Driver code
val = [3, 13, 9, 7, 6]
wt = [14, 11, 10, 9, 5]
W = 40
n = len(val)
print(knapSack(W, wt, val, n))
# Output:
# 35
```


The post Exploring the Traveling Salesman Problem (TSP) appeared first on Dateme Tubotamuno - Semantic Geek.

**Combinatorial Optimisation Problem**

TSP belongs to a large class known as combinatorial optimisation problems. Combinatorial optimisation is aimed at discovering the best object (or city, in a map situation) from a finite set of objects (or list of cities). The solution space for selecting the route between objects or cities is discrete, or can at least be reduced to a discrete one. Many such problems, TSP included, are NP-hard, as the ultimate goal of most combinatorial optimisation problems is to find an efficient way of allocating resources such as time or money, depending on the scenario.

**The difficulty of the Traveling Salesman problem in Artificial Intelligence**

Solving TSP is considered computationally challenging even in modern times. It becomes quite challenging when a salesman desires to find the shortest route through several cities before returning home. Regardless of the challenges, some algorithms and methods have been modified or designed to tackle this problem. The popular Depth First Search (DFS) and Breadth-First Search (BFS) algorithms are two possible approaches for tackling TSP.

**Common applications of the Travelling Salesman Problem**

**Minimisation of travel cost**: TSP can be useful in reducing the cost involved in travelling across several service stations. In this example, TSP is utilised to ascertain the most optimal route for the service workers of ABC appliances limited. These service workers are expected to visit customers once a month to work on their air conditioners. The task is to find the best possible means of visiting all the customer locations, taking the cost and distance between each pair of customer locations into consideration. This ensures the most efficient route is chosen, with the head office as the departure and arrival point.

TSP as an optimisation method is believed to be relevant in applications such as logistics, planning and the manufacturing of microchips.

**Code implementation of TSP**

We will be using the air conditioning servicing example above to implement TSP in Python below. Let’s assume node 1 is the headquarters of ABC appliances limited and the service workers are scheduled to visit all the customers in locations 2, 3 and 4 before returning to the origin (the code below labels these four nodes 0 to 3). What is the most cost-effective route to take? We will ascertain this after the code implementation.

**Python implementation of the Travelling Salesman Problem (TSP)**

The code implementation is inspired by this source.

```
from sys import maxsize
from itertools import permutations

V = 4

# executing the Travelling Salesman Problem (TSP)
def travellingSalesmanProblem(graph, z):
    # all vertices are stored excluding the source node
    vertex = []
    for i in range(V):
        if i != z:
            vertex.append(i)
    min_path = maxsize
    next_permutation = permutations(vertex)
    for i in next_permutation:
        # the current path weight (cost)
        current_pathweight = 0
        # compute the current path weight
        k = z
        for j in i:
            current_pathweight += graph[k][j]
            k = j
        current_pathweight += graph[k][z]
        # update minimum
        min_path = min(min_path, current_pathweight)
    return min_path

# the weights of the edges in a matrix
if __name__ == "__main__":
    graph = [[0, 12, 30, 14], [12, 0, 17, 25],
             [14, 25, 0, 11], [30, 17, 11, 0]]
    z = 0
    print(travellingSalesmanProblem(graph, z))
```

The TSP path weight is 62, which is the most cost-efficient route for the after-sales team.


The post Understanding Articulation Points in a graph appeared first on Dateme Tubotamuno - Semantic Geek.

**A simple illustration of articulation points**

The undirected graph below contains seven nodes and two articulation or critical points. Node B is very important to the network as it directly connects to five nodes. Removing node B will break the graph into three disconnected components: (A), (C and D) and (E, F and G). The second articulation point in this graph is node C. Removing node C leads to two disconnected components: (A, B, E, F, G) and (D). This shows that nodes B and C are the two articulation points, with B being slightly more critical: removing it leaves three disconnected components, whereas removing vertex C splits the graph into only two.

**Some steps in discovering articulation points**

A standard way of finding the articulation points in a given graph is by removing each node one after the other. The goal is to determine whether the elimination of a specific node leads to the disconnection of the graph. Two common techniques for checking the connectivity of a graph, and hence for finding articulation points, are the DFS (Depth First Search) and BFS (Breadth First Search) approaches.

Step 1: Remove node v from the graph

Step 2: Determine the connectivity of the graph using BFS or DFS

Step 3: Restore node v to the graph.
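The three steps above can be sketched directly, using BFS as the connectivity check. The adjacency list is reconstructed from the seven-node example described earlier (node B adjacent to A, C, E, F and G), so treat the exact edges as an assumption.

```python
from collections import deque

def is_connected(adj, skipped):
    # Step 2: BFS over the graph with one vertex removed
    nodes = [v for v in adj if v != skipped]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w != skipped and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

def articulation_points(adj):
    # Steps 1 and 3: each vertex is "removed" by skipping it
    # during the check, then implicitly restored afterwards
    return [v for v in adj if not is_connected(adj, v)]

# Edges assumed from the seven-node example above
adj = {
    'A': ['B'],
    'B': ['A', 'C', 'E', 'F', 'G'],
    'C': ['B', 'D'],
    'D': ['C'],
    'E': ['B', 'F'],
    'F': ['B', 'E', 'G'],
    'G': ['B', 'F'],
}
print(articulation_points(adj))  # → ['B', 'C']
```

This naive approach runs one full BFS per vertex; the DFS low-link method shown later in this post finds the same points in a single traversal.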

**DFS and BFS in articulation points**

I’ve written an article on DFS and BFS, but in this section we will briefly look at how these two techniques are relevant to discovering articulation points. DFS uses an edge-based technique to traverse a graph via a LIFO (Last In, First Out) approach: the visited nodes are added to a stack, and the most recently added vertex is removed to continue the traversal. In contrast, BFS adopts a node-based technique for finding the shortest path. When a vertex is visited, it is marked and added to a queue, which follows a FIFO (First In, First Out) strategy. Usually, nodes at the same horizontal level are visited first before progressing to the next tier. BFS is considered slower than DFS in this setting.

**DFS as a better option for Articulation points:**

DFS can be viewed as the better traversal for preventing the disconnection of a graph. Compared to BFS, DFS better preserves parent nodes, which are sometimes the articulation points in a given graph.

The output of a BFS traversal for the above graph will be: R, A, B, C, D, E. With a queue-based system, the source node R will be removed from the node list, and that will result in the disconnection of the graph, as node R is an articulation point.

On the other hand, with DFS, the source or origin node remains in the stack due to the LIFO method. The outcome in this case is R, A, E, B, C, D. The network remains connected as the source node R is maintained, unlike in BFS, where the queue-based system implies the removal of the articulation node R. In the above example, there are two articulation vertices, R and E.

From a business perspective, articulation points could be applicable to a telephone network where a critical connection between other units is disconnected, or to a group of friends who are usually brought together by one person in the group. If the person linking these friends goes travelling or is out of town, the social connection between these folks will go cold.

```
// A Java program to find articulation points in an undirected graph
import java.io.*;
import java.util.*;
import java.util.LinkedList;

// Class represents an undirected graph using adjacency lists
class Graph
{
    private int V;   // No. of vertices

    // Array of lists for the adjacency list representation
    private LinkedList<Integer> adj[];
    int time = 0;
    static final int NIL = -1;

    // Constructor
    Graph(int v)
    {
        V = v;
        adj = new LinkedList[v];
        for (int i = 0; i < v; ++i)
            adj[i] = new LinkedList();
    }

    // Function to add an edge into the graph
    void addEdge(int v, int w)
    {
        adj[v].add(w); // Add w to v's list.
        adj[w].add(v); // Add v to w's list.
    }

    // A recursive function that finds articulation points using DFS
    // u --> the vertex to be visited next
    // visited[] --> keeps track of visited vertices
    // disc[] --> stores discovery times of visited vertices
    // parent[] --> stores parent vertices in the DFS tree
    // ap[] --> stores articulation points
    void APUtil(int u, boolean visited[], int disc[],
                int low[], int parent[], boolean ap[])
    {
        // Actual count of children in the DFS tree
        int children = 0;

        // Mark the current node as visited
        visited[u] = true;

        // Initialise discovery time and low value
        disc[u] = low[u] = ++time;

        // Run through all nodes adjacent to this one
        Iterator<Integer> i = adj[u].iterator();
        while (i.hasNext())
        {
            int v = i.next(); // v is the current adjacent of u

            // When v is not visited yet, make it a child of u
            // in the DFS tree and recur for it as appropriate
            if (!visited[v])
            {
                children++;
                parent[v] = u;
                APUtil(v, visited, disc, low, parent, ap);
                low[u] = Math.min(low[u], low[v]);

                // u is an articulation point in the following cases:
                // (1) u is the root of the DFS tree and has two or more children
                if (parent[u] == NIL && children > 1)
                    ap[u] = true;
                // (2) u is not the root, and the low value of one of
                // its children is at least u's discovery value
                if (parent[u] != NIL && low[v] >= disc[u])
                    ap[u] = true;
            }
            // Update the low value of u for parent function calls
            else if (v != parent[u])
                low[u] = Math.min(low[u], disc[v]);
        }
    }

    // The function to do DFS traversal. It uses the recursive function APUtil()
    void AP()
    {
        // Mark all the vertices as not visited
        boolean visited[] = new boolean[V];
        int disc[] = new int[V];
        int low[] = new int[V];
        int parent[] = new int[V];
        boolean ap[] = new boolean[V]; // To store articulation points

        // Initialise the parent, visited and ap (articulation point) arrays
        for (int i = 0; i < V; i++)
        {
            parent[i] = NIL;
            visited[i] = false;
            ap[i] = false;
        }

        // Use the recursive helper function to find articulation
        // points in the DFS tree rooted at vertex 'i'
        for (int i = 0; i < V; i++)
            if (visited[i] == false)
                APUtil(i, visited, disc, low, parent, ap);

        // Now ap[] contains articulation points; print them
        for (int i = 0; i < V; i++)
            if (ap[i] == true)
                System.out.print(i + " ");
    }

    // The driver method
    public static void main(String args[])
    {
        // Alphabetical graph labels are replaced by numbers, i.e. A is 0.
        System.out.println("Articulation points in graph ");
        Graph g1 = new Graph(7);
        g1.addEdge(0, 1);
        g1.addEdge(1, 6);
        g1.addEdge(0, 2);
        g1.addEdge(0, 3);
        g1.addEdge(0, 4);
        g1.addEdge(4, 6);
        g1.AP();
        System.out.println();
    }
}
// Output:
// Articulation points in graph
// 0
```


The post A practical understanding of topological sorting and ordering appeared first on Dateme Tubotamuno - Semantic Geek.

There are different algorithms designed to address the shortest path problem; we’ve covered one of them in Dijkstra’s algorithm. Topological sorting is an important algorithm, often used in shortest-path computations, that can be applied to projects requiring the completion of a prior task before a new one can be commenced. With this algorithm, there is a series of nodes in a Directed Acyclic Graph (DAG); a directed acyclic graph is a directed graph that contains no cycles. The ordering of nodes using this algorithm is referred to as topological ordering. It is a linear, not necessarily unique, arrangement of nodes that stipulates that for every edge from x to y, x must appear before y in the topological sort order.

The source node is quite important with this algorithm and it is usually one with an in-degree of 0 or no incoming edges. Topological sort also works best when a graph consists of positive weights.

**Topological sorting on a simple directed acyclic graph**

As we’ve explained above, a DAG (Directed Acyclic Graph) contains no cycle or loop. From the graph below, it is quite clear that the edge connections end at vertex A. A topological ordering should naturally commence from node D or E, because neither node has an incoming edge.

There can be two topological sortings of the above graph, and the sorting would always be initiated from the two source nodes, E and D, since neither has an in-degree or incoming edges. With that in mind, the first sort is D, E, C, F, B, A and the second is E, D, C, F, B, A.

**The practical application and theories of topological sorting**

A practical application of topological sorting is on tasks that involve a sequence of jobs. In a nutshell, when dependencies must be completed before a given job, topological sorting may be required. Kahn’s algorithm and parallel algorithms are two common approaches to topological sorting.
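For completeness, here is a brief sketch of Kahn's algorithm on the same DAG (edges D→B, D→A, E→A, E→C, C→F, F→B). It repeatedly removes nodes whose in-degree has dropped to 0; the sorted() call on the initial sources is only there to make the run deterministic.

```python
from collections import deque

edges = [("D", "B"), ("D", "A"), ("E", "A"),
         ("E", "C"), ("C", "F"), ("F", "B")]
nodes = {"A", "B", "C", "D", "E", "F"}

# Compute in-degrees and adjacency lists
indeg = {v: 0 for v in nodes}
adj = {v: [] for v in nodes}
for u, v in edges:
    adj[u].append(v)
    indeg[v] += 1

# Start from the sources: nodes with no incoming edges (D and E)
queue = deque(sorted(v for v in nodes if indeg[v] == 0))
order = []
while queue:
    u = queue.popleft()
    order.append(u)
    for w in adj[u]:
        indeg[w] -= 1
        if indeg[w] == 0:
            queue.append(w)

print(order)  # → ['D', 'E', 'A', 'C', 'F', 'B']
```

Note that this is a valid topological order even though it differs from the two orders listed above; ties between ready nodes can be broken in any order.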

Below is a Python implementation of the topological sorting algorithm highlighted in the diagram above. The code is inspired by Neelam Yardev from GeeksforGeeks.

```
from collections import defaultdict

class Graph:
    def __init__(self, vertices):
        self.graph = defaultdict(list)
        self.V = vertices  # the list of vertex labels

    def addEdge(self, u, v):
        self.graph[u].append(v)

    def topologicalSortUtil(self, v, visited, stack):
        # Mark the current vertex as visited.
        visited.add(v)
        for i in self.graph[v]:
            if i not in visited:
                self.topologicalSortUtil(i, visited, stack)
        stack.append(v)

    def topologicalSort(self):
        # Mark all the nodes as not visited
        visited = set()
        stack = []
        for v in self.V:
            if v not in visited:
                self.topologicalSortUtil(v, visited, stack)
        print(stack[::-1])  # print the stack in reverse order

# Source nodes and edges for the topological sorting
g = Graph(["A", "B", "C", "D", "E", "F"])
g.addEdge("D", "B")
g.addEdge("D", "A")
g.addEdge("E", "A")
g.addEdge("E", "C")
g.addEdge("C", "F")
g.addEdge("F", "B")
print("This is a Topological Sort of the graph")
g.topologicalSort()
```


The post Exploring Breadth First Search and Depth First Search in a graph appeared first on Dateme Tubotamuno - Semantic Geek.

**Exploring Breadth First Search or Breadth First Traversal**

BFS is an algorithm designed to search a graph or tree data structure. It travels in a breadthward motion and utilises a queue to identify the next vertex at which to commence a traversal; if a roadblock is encountered or no adjacent unvisited node is found, the tree root or source node is removed from the queue. The traversal of the graph usually begins with a ‘search key’, or initialising node. Imagine a hotel with many floors and rooms as nodes: a breadth-first traversal algorithm is like cleaning staff who clean the rooms floor by floor. All neighbouring nodes at the current depth (the current floor, in this example) are visited and cleaned before moving to the vertices, or rooms, on the next floor. No node is revisited, just as one would not expect hotel staff to clean the same room twice in the same period. Once a room is cleaned it is ticked off on a sheet as visited, while with BFS the neighbouring traversed node is enqueued and marked as visited.

Rules are important to ensure procedures are followed or order is maintained. There are relevant rules for a BFS algorithm.

**Rule Number One**: From the tree root or ‘search key’, visit an adjacent unvisited vertex. If there are several neighbouring nodes, an alphabetic or serial logic can be applied to choose the next node. Once visited, the node is displayed and enqueued.

**Rule Number Two:** In the event that a neighbouring or adjacent unvisited node is not available, the tree root or initial node is removed from the queue and a previously visited node plays the role of the search key (tree root or source node).

**Rule Number Three:** The final rule is simply a repetition of the above two rules until all nodes are visited.

Below is a graphical example of how breadth-first search unfolds

**Step 1:** The queue is initiated with node R as the tree root.

Queue: [ ]

**Step 2**: Graph traversal commences with the starting node R. The root node is now marked as visited.

Queue: [ ]

**Step 3:** The neighbouring node of A is now next in line and subsequently marked as visited.

Queue: [ A ]

**Step 4:** There are a couple of neighbouring nodes to A that can be enqueued and marked. On this occasion, alphabetically, node B is the next logical neighbouring node that should be added to the queue and marked accordingly, and that is what has been implemented in this case.

Queue: [B, A ]

**Step 5**: Node C is next in line and is visited and marked as well.

Queue: [C, B, A ]

**Step 6:** Node D is the adjacent node to C and alphabetically ticks the box.

Queue: [D, C, B, A ]

**Step 7**: All adjacent nodes to R have been visited. In this case, R is dequeued and A becomes the root node, leading to the enqueuing and marking of E.

Queue: [ E, D, C, B]

Here is a Python code implementation of BFS, with nodes assigned numerical labels as opposed to the strings in the diagram representation above.

```
from collections import defaultdict

# This class represents a directed graph
# using adjacency list representation
class Graph:
    # Establishing the constructor
    def __init__(self):
        self.graph = defaultdict(list)

    def addEdge(self, n, r):
        self.graph[n].append(r)

    def BFS(self, t):
        visited = [False] * (max(self.graph) + 1)
        queue = []
        queue.append(t)
        visited[t] = True
        while queue:
            # Dequeue a vertex from
            # the queue and print it
            t = queue.pop(0)
            print(t, end=" ")
            for i in self.graph[t]:
                if visited[i] == False:
                    queue.append(i)
                    visited[i] = True

# Create the graph given in
# the above diagram
g = Graph()
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 2)
g.addEdge(1, 0)
g.addEdge(2, 0)
g.addEdge(2, 3)
g.addEdge(3, 3)
print("Executing BFS (initialising from node 0)")
g.BFS(0)
# Output:
# Executing BFS (initialising from node 0)
# 0 1 2 3
```

**Examining Depth First Search or Depth First Traversal**

Depth-first search traversal differs from breadth-first in the direction of the search carried out. With depth-first search, the movement from a node to an adjacent one is depthward in nature. Depth-first search uses a stack to recall the neighbouring unvisited vertex to move to, while breadth-first uses a queue. A stack is the opposite of a queue: insertions and removals are implemented at the same end, whereas a queue has two ends, one used for inserting visited nodes and the other for removing them. The queue is usually described by the FIFO (first-in, first-out) rule and the stack by the LIFO (last-in, first-out) rule.
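The FIFO/LIFO contrast can be seen in a few lines of Python, using collections.deque as the queue and a plain list as the stack:

```python
from collections import deque

# Queue (FIFO): the first node enqueued is the first dequeued
queue = deque()
for v in ("A", "B", "C"):
    queue.append(v)
print(queue.popleft())  # → A

# Stack (LIFO): the most recently pushed node is popped first
stack = []
for v in ("A", "B", "C"):
    stack.append(v)
print(stack.pop())  # → C
```

This is exactly the difference that drives the two traversal orders: swapping the queue for a stack in a BFS loop turns it into an (iterative) DFS.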

The below graph is an expression of how a depth-first traversal works. It does not search based on alphabetical order. The next node in the depthward motion is marked as visited and added to the stack. Node S is the tree root or starting node, and the traversal moves deep by visiting and adding nodes E, H and K to the stack. The edge labels indicate the order of the traversal, with node G becoming the last node in the traversal.

To implement the above traversal, certain rules were pertinent. You could refer to them as the rules of depth-first search.

Below are the rules that should guide the implementation of DFS

**Rule Number One**: The adjacent unvisited node should be visited. It needs to be marked as visited and added to the DFS stack. The traversal highlights a depthward motion.

**Rule Number Two**: When all adjacent vertices from a given node are visited, the most recently added node is deleted from the stack to make way for a new insertion.

**Rule Number Three**: Rules one and two are repeated until all nodes have been visited and added to the stack.

**Sequential implementation of DFS**

**Step 1:** This is a network with an empty stack. Vertex R is the root node and the stack is being initialised.

**Step 2:** Node R is visited and added to the stack. It is now the top node in the stack and the adjacent node from a depthward motion will be executed in the next step.

**Step 3:** Node A is adjacent to R and is visited next in that order. It is added to the stack and becomes the top node.

**Step 4:** Node E, is visited and also added to the stack. In a breadth-first search, node B would have been the adjacent vertex. In this case, vertex E becomes the top as it is the most recently visited one.

**Step 5:** Node B is unvisited and adjacent to E. In this instance, B is visited and added to the stack as the top node.

**Step 6:** There are no unvisited adjacent nodes to B. Vertex B is now removed from the stack and E assumes the role of the top node to aid a smooth continuation of the traversal process.

**Step 7:** Node C is now visited and added to the stack to become the top vertex.

**Step 8:** There are no unvisited neighbouring nodes to C. Node C is now removed from the stack and the previously visited E becomes the top node to allow for a clear traversal to an unvisited vertex.

**Step 9:** Node D was the only node not visited in the previous steps. It is now visited and added to the stack as the top node.

Here is the Python code implementation, using the node labels above to print the DFS order.

```
from collections import defaultdict

class Graph:
    def __init__(self):
        # The default dictionary for the graph
        self.graph = defaultdict(list)

    # function to add an edge to the graph
    def addEdge(self, u, v):
        self.graph[u].append(v)

    # A DFS helper function
    def DFSUtil(self, v, visited):
        # Mark the current vertex as visited
        # and print it
        visited.add(v)
        print(v, end=' ')
        # Recur for all the vertices
        # adjacent to this vertex
        for neighbour in self.graph[v]:
            if neighbour not in visited:
                self.DFSUtil(neighbour, visited)

    # Calls the recursive DFSUtil() for the DFS traversal
    def DFS(self, v):
        # Create a set to store visited vertices
        visited = set()
        # Call the recursive helper function
        # to print the DFS traversal graph
        self.DFSUtil(v, visited)

# Creating a graph and connecting the nodes
g = Graph()
g.addEdge('S', 'E')
g.addEdge('E', 'H')
g.addEdge('H', 'K')
g.addEdge('K', 'I')
g.addEdge('I', 'F')
g.addEdge('K', 'J')
g.addEdge('J', 'G')
print("Executing DFS (initialising from node E)")
g.DFS('E')
```


The post Exploring the different types of weights in a connected graph appeared first on Dateme Tubotamuno - Semantic Geek.

In mathematics, a weight is used to express a set of multiplicative constants allocated in front of the terms in an expression. In a tree, for example, the weight at a point [n] is the maximum number of edges in any branch at [n]. The weights of a graph are computed and analysed in light of the problem investigated. Weighted graphs appear in many graph scenarios, such as shortest path, centrality and travelling salesman problems.

The weights of a graph assign values to either the nodes or the relationships between nodes. These values determine, for instance, the shortest path or the most central node in a given network. The appropriate type of weight depends on the type of network: for example, a flow-type weight may be more applicable to traffic in a computer network, fluids in a water pipe, currents in an electrical circuit or demand movements. We will look into the flow network as it pertains to weight in the subsequent section.

**Different types of graph weights**

Weighted graphs are labelled graphs with positive numbers as labels. Certain types of weight are more relevant to certain kinds of network than others.

**Flow weights:** Graphs containing this type of weight are usually referred to as flow networks and are directed in nature. The amount of flow that travels through an edge should never exceed the capacity of that edge. It is expected that the amount of flow into a node equals the amount of flow out of it. The only occasions when this does not hold are at a source, which only demonstrates outgoing flow, and at a sink, a node only designed to receive flow; these roles are important in understanding the importance of nodes in this type of labelled graph. Earlier, I’ve mentioned different scenarios where a flow network can be most relevant. A good example is a tube or train network: each station can be viewed as a node with a respective capacity for trains or passengers to travel through, and in this example no station is a source or sink, as they all experience incoming and outgoing flows of trains and passengers. Mathematically, this weight starts from a graph *G* = (*V*, *E*), with *V* being a set of nodes and *E* the set of edges that connect those nodes. The capacity function is a map *c*: *V* × *V* → ℝ_{∞} over pairs of nodes; whenever a pair of nodes (*v*, *u*) is not connected by an edge in *E*, its capacity is *c*(*v*, *u*) = 0. A flow network with a source and sink is represented as (*G*, *c*, *s*, *t*), where *G* is the graph, *c* the capacity, *s* the source and *t* the sink or target.

**Capacity weights:** Flow and capacity tend to work together in a weighted directed graph model. The capacity weight is quite applicable to a traffic network. A good example is a transportation network where the nodes are interchanges or junctions and the edges are the respective highway segments. The edge weights are capacities related to the maximum flow of traffic a highway segment can carry; in other words, the capacity weights of these edges determine how many cars can travel through.
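The conservation rule for flow networks, that flow into a node equals flow out of it for every node other than the source and sink, can be checked with a short sketch. The edge flows below are illustrative assumptions:

```python
# Flows on the edges of a small assumed network s → {a, b} → t
flow = {("s", "a"): 3, ("s", "b"): 2,
        ("a", "t"): 3, ("b", "t"): 2}

def conserved(flow, node):
    # Total flow into and out of a given node
    inflow = sum(f for (u, v), f in flow.items() if v == node)
    outflow = sum(f for (u, v), f in flow.items() if u == node)
    return inflow == outflow

print(conserved(flow, "a"), conserved(flow, "b"))  # → True True
print(conserved(flow, "s"))  # → False (a source only emits flow)
```

The same check fails for the sink t as well, which only receives flow; every interior node of a valid flow must pass it.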

The capacity-weighted graph below indicates the flow from each node. The node *s* is the source and *t* is the sink. The largest capacity is on edge [*c, b*] while the smallest is on edge [*c, a*].

**Frequency weights:** Traditionally, frequency weighting has been applied in communications engineering for sound level measurement, more specifically the measurement of how strongly sounds (low or high) register to the human ear. Its association with spectral analysis has led to the conversion of time series into a range of frequencies. In recent times, frequency weights are also utilised in social network or human interaction settings. In this scenario, weights are applied between two friends, contacts or persons in line with their interpersonal communication: when two individuals, or nodes in a graph sense, communicate regularly, the frequency weight of the edge that connects them will be stronger and can be viewed as the minimal path.

**Distance weights:** In normal terms, the distance between two nodes in a graph is the number of edges in the shortest path. Distance as a form of weight in a weighted graph looks at the shortest walk. For a given graph G(V, E), w(a, b) represents the weight of edge e = (a, b). If G(V, E, w) is an undirected weighted graph, it is expected that w(a, b) = w(b, a) for (a, b) ∈ E. This type of weight can be applied to places or a transport network, where the distance between two nodes (a, b) is depicted by the weight function w. For an edge e = (a, b) ∈ E, the given weight of the edge determines the distance from a to b. There is a possibility that the effort required to travel from a to b is not the same as from b to a, resulting in w(a, b) ≠ w(b, a). Below are two graphs where we can estimate the distance between nodes a and b. The first network is a disconnected graph, which indicates d(a, b) = ∞ (infinity): there is no weight distribution between the two nodes due to a disconnected edge. The second graph is connected, but the edge distribution of (x, y, z, x) highlights that there is no minimum weight path from a to b, due to the presence of a negative weight between x and z.

Distance as a weight can work closely with time, which will be touched on in the next section.

- Time weights: The time required to travel from two nodes in a weighted graph can determine the shortest path. As earlier stated, for a vertex e = (a,b) ∈ E, the time required to travel from a to b, may not necessarily be the same as b to a. They can both have the same edge distance but have varying time weights. For example in transport networks, one travel path may require more time due to bad road traffic, an accident or too many truck drivers due to a factory nearby. These are some of the factors that can determine the time weight between two nodes in a connected graph.
**Resistance weights:** In a connected graph, the resistance matrix is represented by R, conveyed as R = (rij), where rij is the resistance distance between nodes i and j of G. In a friendship or social network, resistance in weight form could be a profile that consistently ignores messages from a given user or, in the case of LinkedIn, ignores a connection request. The path of least resistance between two nodes could be the shortest path in a connected graph.

**Cost weights:** This looks at the amount of effort required to travel from one place to another in a connected network. The cost could be financial, material or human capital in nature. One could compare the cost of travelling from London to New York across different airlines: the airlines form different routes, while the nodes are the source, transit and destination airports.

For example, in the graph below, London is represented by node (0) and New York by node (4). There are two airline routes, (0, 8, 4) and (0, 3, 4), where nodes (8) and (3) are the transit airports en route to New York. This example reveals that the most cost-effective route is (0, 3, 4), with a combined cost of 3.
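The original figure is not reproduced here, so the sketch below assumes illustrative edge costs chosen so that the route via node 3 totals 3, and lets NetworkX confirm which route is cheapest; the figure's actual values may differ:

```python
import networkx as nx

# Hypothetical costs: the figure's actual values may differ.
G = nx.Graph()
G.add_edge(0, 8, weight=2)  # London -> transit airport 8
G.add_edge(8, 4, weight=3)  # transit 8 -> New York (total 5)
G.add_edge(0, 3, weight=1)  # London -> transit airport 3
G.add_edge(3, 4, weight=2)  # transit 3 -> New York (total 3)

print(nx.dijkstra_path(G, 0, 4))         # [0, 3, 4]
print(nx.dijkstra_path_length(G, 0, 4))  # 3
```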

The above are seven examples of how weights could be represented in a connected weighted network.

The post Exploring the different types of weights in a connected graph appeared first on Dateme Tubotamuno - Semantic Geek.


**What is the meaning of the mother vertex in a given graph?**

In a given graph G = (V, E), a mother vertex v is a vertex from which there is a path to every other vertex in the graph. In other words, all other vertices in the graph can be reached from the mother vertex. A mother vertex is most commonly discussed for directed graphs, but the idea also applies to undirected networks. We will briefly explore mother vertices in different network examples.

**Directed graph:** Directed graphs, or DiGraphs, hold directed edges; a pair of nodes can be connected in one direction or in both. For DiGraphs, self-loops are permissible but parallel edges are not. A mother vertex can exist in a directed graph, and there can be multiple of them, as shown below. Based on the directed graph below, nodes [4] and [6] are the mother vertices.

**Undirected connected graph:** A mother vertex can be present in undirected connected graphs. In undirected graphs, self-loops are allowed while parallel edges are not permitted. The undirected connected graph below displays nodes 11 to 16 connected by seven non-directional edges. In a connected undirected graph, every node is a mother vertex, since every node can reach every other.

**Undirected disconnected graph:** There is no mother vertex in the example graph below, because no edge connects node [12] to node [A]. It is safe to say that undirected disconnected graphs do not have a mother vertex.

**Python implementation of finding a mother vertex**

We will run through a mother vertex implementation in Python. The first code example returns no mother vertex, as no single vertex in its graph can reach every other.

```
from collections import defaultdict

# A class of a directed graph using an adjacency list
class Graph:

    def __init__(self, vertices):
        self.V = vertices               # No. of vertices
        self.graph = defaultdict(list)  # default dictionary

    # DFS traversal from v
    def DFSUtil(self, v, visited):
        # Mark the current node as visited
        visited[v] = True
        # Recur for all the vertices adjacent to this vertex
        for i in self.graph[v]:
            if visited[i] == False:
                self.DFSUtil(i, visited)

    # Add w to the adjacency list of v
    def addEdge(self, v, w):
        self.graph[v].append(w)

    # Return a mother vertex if one exists, otherwise "Not Present"
    def findMother(self):
        visited = [False] * (self.V)
        # Do a DFS traversal and record the last started vertex
        v = 0
        for i in range(self.V):
            if visited[i] == False:
                self.DFSUtil(i, visited)
                v = i
        # If a mother vertex (or vertices) exists in the given graph,
        # then v must be one of them. Check whether v actually is one:
        # reset all values in visited[] to False and do a DFS from v
        # to see whether every vertex is reachable from it.
        visited = [False] * (self.V)
        self.DFSUtil(v, visited)
        if any(i == False for i in visited):
            return "Not Present"
        else:
            return v

# Create the graph given in the above diagram (vertices 0 to 12)
g = Graph(13)
g.addEdge(0, 1)
g.addEdge(0, 2)
g.addEdge(1, 3)
g.addEdge(4, 1)
g.addEdge(6, 4)
g.addEdge(5, 6)
g.addEdge(6, 7)
g.addEdge(8, 7)
g.addEdge(9, 10)
g.addEdge(9, 11)
g.addEdge(11, 2)
g.addEdge(12, 11)
print("A mother vertex is " + str(g.findMother()))
# A mother vertex is Not Present
```

On the other hand, the example below identifies node [0] as the mother vertex of its graph.

```
from collections import defaultdict

# A class of a directed graph using an adjacency list
class Graph:

    def __init__(self, vertices):
        self.V = vertices               # No. of vertices
        self.graph = defaultdict(list)  # default dictionary

    # DFS traversal from v
    def DFSUtil(self, v, visited):
        # Mark the current node as visited
        visited[v] = True
        # Recur for all the vertices adjacent to this vertex
        for i in self.graph[v]:
            if visited[i] == False:
                self.DFSUtil(i, visited)

    # Add w to the adjacency list of v
    def addEdge(self, v, w):
        self.graph[v].append(w)

    # Return a mother vertex if one exists, otherwise "Not Present"
    def findMother(self):
        visited = [False] * (self.V)
        # Do a DFS traversal and record the last started vertex
        v = 0
        for i in range(self.V):
            if visited[i] == False:
                self.DFSUtil(i, visited)
                v = i
        # If a mother vertex (or vertices) exists in the given graph,
        # then v must be one of them. Check whether v actually is one:
        # reset all values in visited[] to False and do a DFS from v
        # to see whether every vertex is reachable from it.
        visited = [False] * (self.V)
        self.DFSUtil(v, visited)
        if any(i == False for i in visited):
            return "Not Present"
        else:
            return v

# Create the graph given in the above diagram (vertices 0 to 11)
g = Graph(12)
g.addEdge(0, 1)
g.addEdge(0, 11)
g.addEdge(1, 2)
g.addEdge(2, 3)
g.addEdge(3, 4)
g.addEdge(4, 5)
g.addEdge(5, 6)
g.addEdge(6, 7)
g.addEdge(7, 8)
g.addEdge(8, 9)
g.addEdge(9, 10)
g.addEdge(10, 11)
print("A mother vertex is " + str(g.findMother()))
# A mother vertex is 0
```

The code implementation was inspired by this article.

The post Finding the mother vertex in a graph appeared first on Dateme Tubotamuno - Semantic Geek.


From the graph below, Pat is pivotal to the connections. If Pat were excluded, all other nodes in the graph would be disconnected. Pat is very influential and impacts the flow of information, traffic or connections between all nodes. The shortest, and only, way to move from one node to another is via Pat, who can also be viewed as the bridge in this example. Tim cannot visit Mia without going through Pat. In a transport network, if Pat were a train station, it would be quite a busy one, as every other node in the network would have to travel via this central station.

Betweenness centrality can be applied to both weighted and unweighted graphs. For an unweighted graph, the shortest path between a pair of nodes is the one with the fewest edges; for a weighted graph, it is the path whose sum of edge weights is the minimum. The nodes considered to have a high betweenness centrality are the ones that lie on many of these shortest paths (weighted or unweighted) between other vertices. In the case of the above figure, Pat has the highest betweenness centrality, as Pat lies on the shortest path between every pair of the remaining vertices.

The below is an expression of the betweenness centrality formula.

The betweenness centrality of a node v is given by g(v) = Σ(s ≠ v ≠ t) σst(v) / σst, where σst is the total number of shortest paths from node **s** to node **t**, and σst(v) is the number of those paths that pass through **v**. The number of shortest paths that travel through v determines its betweenness centrality. Several vertices may serve as a pathway between a pair of nodes, but it is the total fraction of shortest paths passing through a given node **v** in the formula above that determines the node with the highest betweenness centrality, i.e. the most powerful node in a given network.
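To make the formula concrete, here is a small brute-force sketch (an illustration, not an efficient algorithm) that counts shortest paths with BFS on a star graph like the Pat example above; the node names are hypothetical:

```python
from collections import deque, defaultdict

# A hypothetical star graph: Pat sits between everyone else.
graph = {
    "Pat": ["Tim", "Mia", "Sam", "Ada"],
    "Tim": ["Pat"], "Mia": ["Pat"], "Sam": ["Pat"], "Ada": ["Pat"],
}

def counts_from(s):
    """BFS from s: return (distance, number of shortest paths) per node."""
    dist = {s: 0}
    sigma = defaultdict(int)
    sigma[s] = 1
    q = deque([s])
    while q:
        u = q.popleft()
        for w in graph[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                q.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def betweenness(v):
    """g(v) = sum over pairs s != v != t of sigma_st(v) / sigma_st."""
    nodes = [n for n in graph if n != v]
    total = 0.0
    for i, s in enumerate(nodes):
        dist_s, sig_s = counts_from(s)
        for t in nodes[i + 1:]:  # each unordered pair once
            dist_t, sig_t = counts_from(t)
            # v lies on a shortest s-t path only if the distances add up;
            # the paths through v then number sigma(s,v) * sigma(v,t)
            if dist_s[v] + dist_t[v] == dist_s[t]:
                total += (sig_s[v] * sig_t[v]) / sig_s[t]
    return total

print(betweenness("Pat"))  # 6.0 -> all C(4,2) = 6 pairs route via Pat
print(betweenness("Tim"))  # 0.0 -> no shortest path passes through Tim
```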

**Computing the shortest path of betweenness centrality of a node**

The below formula is put forward on NetworkX in understanding the shortest path in relation to the betweenness centrality of a node.

The starting point is understanding the betweenness centrality of node **v**. It sums the fraction of all-pairs shortest paths that travel through node **v**: cB(v) = Σ(s, t ∈ V) σ(s, t|v) / σ(s, t). In this case, V is the set of nodes, σ(s, t) is the number of shortest (s, t)-paths, and σ(s, t|v) is the number of those paths that pass through some node v other than s and t.

Based on this expression, if s = t, then σ(s, t) = 1: the only shortest path from a node to itself is the trivial one. On the other hand, if v ∈ {s, t}, then σ(s, t|v) = 0: when v is itself one of the endpoints, no shortest path is counted as passing through it.

**Betweenness centrality measure for weighted graphs**

In the case of weighted graphs, the weights of the edges are expected to be greater than zero. If any edge weight falls to zero or below, there can be an infinite number of equal-length paths between a pair of nodes, which breaks the computation.

**Determining the betweenness centrality in Networkx**

NetworkX needs only a couple of lines of Python to determine the betweenness centrality of nodes in a given graph. The two lines of code below were added to an existing weighted graph (G) comprising 10 nodes.

```
b=nx.betweenness_centrality(G)
print(b)
```

The betweenness centrality score is returned as a dictionary, giving the betweenness centrality of each node. Note that the call above treats the graph as unweighted; to take the edge weights into consideration, pass the weight parameter, e.g. nx.betweenness_centrality(G, weight='weight').

```
{1: 0.0, 2: 0.013888888888888888, 3: 0.041666666666666664, 4: 0.05555555555555555, 7: 0.15277777777777776, 10: 0.0, 5: 0.1111111111111111, 6: 0.1388888888888889, 8: 0.05555555555555555, 9: 0.0}
```

A printout of the graph with the above betweenness centrality dictionary looks like the below.

By way of recap, the betweenness centrality algorithm captures how central a node is in a given network: the betweenness centrality of a node is the sum, over all pairs of other nodes, of the fraction of shortest (or minimum-weight) paths that travel through that node.

The post The role of the betweenness centrality measure in networks appeared first on Dateme Tubotamuno - Semantic Geek.


Above is an unweighted graph with ten vertices. In this scenario, the shortest path from one node to another is the one that traverses the fewest edges between the source and the destination.
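For an unweighted graph, a breadth-first search is enough to find such a path. Here is a small standard-library sketch (the adjacency list is hypothetical, not the figure's exact graph):

```python
from collections import deque

def bfs_shortest_path(graph, source, target):
    """Return a fewest-edges path from source to target, or None."""
    parents = {source: None}
    q = deque([source])
    while q:
        u = q.popleft()
        if u == target:
            # Walk the parent links back to the source
            path = []
            while u is not None:
                path.append(u)
                u = parents[u]
            return path[::-1]
        for w in graph[u]:
            if w not in parents:
                parents[w] = u
                q.append(w)
    return None  # target unreachable

# A hypothetical adjacency list
graph = {1: [2, 5], 2: [1, 4], 4: [2, 5], 5: [1, 4, 6], 6: [5]}
print(bfs_shortest_path(graph, 1, 4))  # [1, 2, 4]
print(bfs_shortest_path(graph, 1, 6))  # [1, 5, 6]
```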

Using the NetworkX library in Python, I was able to check the shortest path from node 1 to node 4, and it reveals [1, 2, 4] as the shortest route. You might be wondering why [1, 5, 4] was not chosen, as that is also a two-edge movement.

```
print(nx.dijkstra_path(G,1,4))
[1, 2, 4]
```

I am now going to check the shortest path from node 1 to node 8. As you can see below, the route [1, 7, 8] was chosen even though [1, 10, 8] contains the same number of edges. When two routes tie on length, the algorithm simply returns the first one it settles, which here was [1, 7, 8] rather than [1, 10, 8].

```
print(nx.dijkstra_path(G,1,8))
[1, 7, 8]
```

After adding weights to all edges, the shortest path from node [1] to node [8] was rechecked, and the route [1, 6, 7, 8] was revealed. This was confirmed with both Dijkstra’s and Johnson’s algorithms.

```
paths = nx.johnson(G, weight='weight') #Johnson's Algorithm
paths[1][8]
[1, 6, 7, 8]
print(nx.dijkstra_path(G,1,8))
[1, 6, 7, 8]
```

We can also use Dijkstra’s algorithm to obtain the shortest weighted paths and return dictionaries of predecessors and distances for each node from the source node. In this example, we adopt node [2] as our source and populate a dictionary of the nodes on the path from the source to each destination. It indicates that node [8] can be reached from node [2] by three equally short routes, whose final hop comes from node 7, 10 or 9.

```
pred, dist = nx.dijkstra_predecessor_and_distance(G, 2)
sorted(pred.items())
sorted(dist.items())
```

**Path Length**

NetworkX also allows you to determine the path length from a source to a destination node, giving you the total weight of the shortest path. The first example below reveals the shortest weighted length from node [1] to node [4], while the second reports a weighted length of 5.7 as the shortest distance from node [4] to node [9].

```
length = nx.single_source_dijkstra_path_length(G, 1)
length[4]
length = nx.single_source_dijkstra_path_length(G, 4)
length[9]
5.7
```

**Non-NetworkX implementation of Dijkstra’s algorithm**

We will now look at the Python implementation of Dijkstra’s algorithm without the NetworkX library.

As a reminder, Dijkstra’s is a path-finding algorithm common in routing and navigation applications. The weights of the edges determine the shortest path, and they can represent cost, time, distance, rate of flow or frequency.

The first step is to create a Graph class and initialise the edge and weight dictionaries. The add_edge function records each connection in both directions: the from_node and to_node arguments are stored both ways with the same weight, so the graph behaves as an undirected one. For example, adding the edge [A, B] with weight 2 means travelling from A to B and from B to A both cost 2.

```
from collections import defaultdict

class Graph():
    def __init__(self):
        """
        self.edges is a dict of all possible next nodes
        e.g. {'Z': ['A', 'B', 'C'], ...}
        self.weights contains the weights between two nodes,
        with the two nodes serving as the tuple key
        e.g. {('Z', 'A'): 11, ('Z', 'C'): 2.4, ...}
        """
        self.edges = defaultdict(list)
        self.weights = {}

    def add_edge(self, from_node, to_node, weight):
        # connect the nodes from both sides
        self.edges[from_node].append(to_node)
        self.edges[to_node].append(from_node)
        # record the same weight for both directions of travel
        self.weights[(from_node, to_node)] = weight
        self.weights[(to_node, from_node)] = weight
```

The next step is to create edges between nodes and assign specific weights to these connections. These assigned weights will determine the shortest path.

```
graph = Graph()
# nodes are created as edge connections and weights are assigned
edges = [
    ('Z', 'A', 11),
    ('Z', 'B', 2.8),
    ('Z', 'C', 2.4),
    ('A', 'B', 3.9),
    ('A', 'D', 14),
    ('A', 'F', 1),
    ('B', 'D', 4),
    ('B', 'H', 5),
    ('C', 'L', 2),
    ('D', 'F', 1),
    ('F', 'H', 3),
    ('G', 'H', 2),
    ('G', 'Y', 2),
    ('I', 'J', 60),
    ('I', 'K', 41),
    ('I', 'L', 48),
    ('J', 'L', 12),
    ('K', 'Y', 50),
]

for edge in edges:
    graph.add_edge(*edge)
```

The next step is to apply Dijkstra’s algorithm to find the shortest path. Starting from the current_node, the weight to reach each neighbouring node is accumulated, and the route with the lowest total weight becomes the shortest path.

```
def dijsktra(graph, initial, end):
    # shortest_paths is a dict of nodes
    # whose value is a tuple of (previous node, weight)
    shortest_paths = {initial: (None, 0)}
    current_node = initial
    visited = set()

    while current_node != end:
        visited.add(current_node)
        destinations = graph.edges[current_node]
        weight_to_current_node = shortest_paths[current_node][1]

        for next_node in destinations:
            weight = graph.weights[(current_node, next_node)] + weight_to_current_node
            if next_node not in shortest_paths:
                shortest_paths[next_node] = (current_node, weight)
            else:
                current_shortest_weight = shortest_paths[next_node][1]
                if current_shortest_weight > weight:
                    shortest_paths[next_node] = (current_node, weight)

        next_destinations = {node: shortest_paths[node] for node in shortest_paths if node not in visited}
        if not next_destinations:
            return "Route Not Possible"
        # the next node is the destination with the lowest weight
        current_node = min(next_destinations, key=lambda k: next_destinations[k][1])

    # determining the shortest path
    path = []
    while current_node is not None:
        path.append(current_node)
        next_node = shortest_paths[current_node][0]
        current_node = next_node
    # reverse the path so it runs from source to destination
    path = path[::-1]
    return path
```

A quick test was run to determine the shortest path from [Z] to [D]. It returned [Z, B, D] as the shortest path, which is correct, as the weight of [Z, B, D] is 6.8 while that of [Z, A, D] is 25.

For the second example, the goal was to discover the shortest path from A to H. The result was correct, identifying [A, F, H] as the shortest path with a total weight of 4, compared to the longer route [A, B, H] with a weight of 8.9.

```
dijsktra(graph, 'Z', 'D')
['Z', 'B', 'D']
dijsktra(graph, 'A', 'H')
['A', 'F', 'H']
```

The post Exploring Dijkstra’s shortest path algorithm appeared first on Dateme Tubotamuno - Semantic Geek.


**Adjacency Matrix:** This is one of the most popular ways a graph is represented. One of the core aims of this matrix is to assess whether pairs of vertices in a given graph are adjacent or not. In an adjacency matrix, rows and columns represent vertices, and the row sum equals the degree of the vertex that the row represents. The adjacency matrix is also convenient for weighted edges: you replace the standard ‘1’ with the respective weight, which makes it easy to represent directed graphs with edge weights. The adjacency matrix works best with dense graphs, where the number of edges approaches the square of the number of nodes.
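As a quick illustration of the row-sum property, here is a NumPy sketch on a hypothetical triangle graph (not the figure below):

```python
import numpy as np

# Adjacency matrix of a triangle: vertices 0, 1, 2, every pair connected
A = np.array([
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
])

degrees = A.sum(axis=1)  # row sums
print(degrees)           # [2 2 2] -> each vertex has degree 2
```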

Dense graph: |E| ≈ |V|²

Sparse graph: |E| ≈ |V|

The above highlights that a sparse graph has roughly one edge per vertex, while a dense graph has a number of edges on the order of the square of the number of vertices.

An adjacency matrix takes up |V|² space, regardless of the graph density. The matrix for a graph with 10,000 vertices will therefore hold 10,000² = 100,000,000 entries: at least 100 MB at one byte per entry, or around 12.5 MB at one bit per entry.
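The quadratic growth is easy to check with a quick arithmetic sketch:

```python
vertices = 10_000
entries = vertices ** 2                   # one cell per ordered pair of vertices
bytes_at_one_byte_per_cell = entries      # 100 MB
bytes_at_one_bit_per_cell = entries // 8  # 12.5 MB

print(entries)                    # 100000000
print(bytes_at_one_bit_per_cell)  # 12500000
```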

Note that in graph theory, the order of a graph is the number of vertices and its size is the number of edges, so a dense graph is one whose size approaches the square of its order.

|V| = Order of the graph G (number of vertices)

|E| = Size of the graph G (number of edges)

Overall, the adjacency matrix is considered faster for dense graphs and simpler for weighted edges. On the downside, it uses more space. We will have a look at an adjacency matrix example in NetworkX, a Python library for networks. As expected, the adjacency matrix is a 2D array of size V x V, where V is the number of vertices in the graph. For an unweighted graph, the 2D array adj[][] holds adj[m][n] = 1 wherever vertex m connects to vertex n.

This expresses that an edge connecting vertex m to vertex n produces an entry of 1. When the edges of the graph carry weights, adj[m][n] = w indicates an edge from vertex m to vertex n with weight w. In the case of an undirected graph (i.e. all edges are bidirectional), the adjacency matrix is symmetric.

In the example matrix below, the rows and columns both represent the vertices, and an entry of 1 indicates an edge between the corresponding nodes.

Here is a quick code implementation of an adjacency matrix created via NumPy and converted into a network. The matrix uses the standard 0 and 1 values (0 when no edge is present and 1 when an edge is present between nodes). It is a square matrix, with equal numbers of rows and columns labelled 1 to 7.

```
import networkx as nx
import numpy as np
import matplotlib.pyplot as plt
import scipy.sparse as sp
import pandas as pd
%matplotlib inline

# Adjacency matrix
adj_matrix = np.array([[0,1,0,0,1,0,1],
                       [1,0,1,0,1,0,1],
                       [0,1,0,1,0,0,1],
                       [0,0,1,0,1,1,1],
                       [1,1,0,1,0,0,1],
                       [0,0,0,1,0,0,1],
                       [0,0,0,1,0,0,1]])
adj_sparse = sp.coo_matrix(adj_matrix, dtype=np.int8)
labels = range(1, 8)
DF_adj = pd.DataFrame(adj_sparse.toarray(), index=labels, columns=labels)
print(DF_adj)
1 2 3 4 5 6 7
1 0 1 0 0 1 0 1
2 1 0 1 0 1 0 1
3 0 1 0 1 0 0 1
4 0 0 1 0 1 1 1
5 1 1 0 1 0 0 1
6 0 0 0 1 0 0 1
7 0 0 0 1 0 0 1
```

The next step is converting the adjacency matrix into a graph via the NetworkX Python library. Firstly, the row and column labels are converted into vertices or nodes, generating a total of 7 nodes. The loop then reads the entry at column label ‘i’ and row label ‘j’, and an add_edge call establishes a connection wherever that entry is 1.

```
# Utilising a NetworkX graph
G = nx.Graph()
G.add_nodes_from(labels)

# Connecting the nodes
for i in range(DF_adj.shape[0]):
    col_label = DF_adj.columns[i]
    for j in range(DF_adj.shape[1]):
        row_label = DF_adj.index[j]
        node = DF_adj.iloc[i, j]
        if node == 1:
            G.add_edge(col_label, row_label)

# Draw the graph
nx.draw(G, with_labels=True)
```

The graph is now drawn to represent the nodes and edges initialised in the previous section

**Incidence Matrix:** This is a matrix that highlights the relationship between two classes of objects, in this instance nodes and edges. The matrix has one row for each node and one column for each edge. The entry at the intersection of row (X) and column (Y) is 1 when node X is incident to edge Y. In the incidence matrix, loops count twice.

Here is a simple representation of weights in an incidence matrix. A weight of 23 is allocated to entries where there is an incidence.

```
import networkx as nx
from math import sqrt

T = nx.grid_2d_graph(3, 3)
for m, n in T.edges():
    x1, y1 = m
    x2, y2 = n
    # Euclidean distance between grid points, scaled by 23
    T[m][n]['weight'] = sqrt((x2 - x1)**2 + (y2 - y1)**2) * 23

print(nx.incidence_matrix(T, weight='weight').todense())
[[23. 23. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 23. 23. 23. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 23. 23. 0. 0. 0. 0. 0. 0. 0.]
[23. 0. 0. 0. 0. 23. 23. 0. 0. 0. 0. 0.]
[ 0. 0. 23. 0. 0. 0. 23. 23. 23. 0. 0. 0.]
[ 0. 0. 0. 0. 23. 0. 0. 0. 23. 23. 0. 0.]
[ 0. 0. 0. 0. 0. 23. 0. 0. 0. 0. 23. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 23. 0. 0. 23. 23.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 23. 0. 23.]]
```

**Adjacency list:** An adjacency list records, for each node, its neighbouring (connected) vertices. One of the benefits of adjacency lists is that they are faster and use less space for sparse graphs. Unfortunately, they are slower for dense graphs.
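A minimal sketch of an adjacency list in plain Python (hypothetical node labels), using a dictionary that maps each node to the list of its neighbours:

```python
from collections import defaultdict

# Build an adjacency list for an undirected graph
adj = defaultdict(list)

def add_edge(u, v):
    """Record the edge in both directions (undirected graph)."""
    adj[u].append(v)
    adj[v].append(u)

add_edge(1, 2)
add_edge(1, 5)
add_edge(2, 4)
add_edge(4, 5)

print(adj[1])       # [2, 5] -> neighbours of node 1
print(len(adj[4]))  # 2      -> degree of node 4
```

Only the edges that actually exist are stored, which is why the structure stays small for sparse graphs; checking whether two specific nodes are adjacent, however, requires scanning a neighbour list rather than a single matrix lookup.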

These are some of the ways graphs can be represented.

The post Different ways of representing Graphs appeared first on Dateme Tubotamuno - Semantic Geek.
