Graphs are one of my favorite data structures. They have many exciting applications like optimizing routes and social network analysis, to name a few. You are probably using apps that use graphs every day. First, let’s start with the basics.
Tip
|
A graph is a non-linear data structure where a node can have zero or more connected nodes. |
You can think of a Graph as an extension of a Linked List. Instead of having a next
or previous
reference, you can have as many as you want. You can implement a graph node as an array of associated nodes.
link:../../../src/data-structures/graphs/node.js[role=include]
As you can see, it’s pretty similar to the Linked List node. The only difference is that it uses an array of adjacent nodes instead of just one or two.
Another difference between a linked list and a Graph is that a linked list always has a root node (or first element), while the Graph doesn’t. You can start traversing a graph from anywhere. Let’s examine these graph properties!
The connection between two nodes is called edge. Also, nodes might be called vertex.
A graph can be either directed or undirected.
An undirected graph has edges that are two-way street. E.g., On the undirected example, you can traverse from the green Node to the orange and vice versa.
A directed graph (digraph) has edges that are one-way street. E.g., On the directed example, you can only go from green Node to orange and not the other way around. When one Node has an edge to itself is called a self-loop.
A graph can have cycles or not.
A cyclic graph is the one that you can pass through a node more than once. E.g., On the cyclic illustration, if you start in the green Node, go the orange and purple; finally, you could come back to green again. Thus, it has a cycle. An acyclic graph is the one that you can’t pass through a node more than once. E.g., in the acyclic illustration, can you find a path where you can pass through the same vertex more than one?
The Directed Acyclic Graph (DAG) is unique. It has many applications like scheduling tasks, spreadsheets' change propagation, and so forth. DAG is also called Tree data structure only when each Node has only one parent.
A disconnected graph is one that has one or more subgraphs. In other words, a graph is disconnected if two nodes don’t have a path between them.
A connected graph is the opposite of disconnected; there’s a path between every Node. No one is left stranded.
A complete graph is where every Node is adjacent to all the other nodes in the Graph. E.g., If there are seven nodes, every Node has six edges.
Weighted graphs have labels in the edges (a.k.a weight or cost). The link weight can represent many things like distance, travel time, or anything else.
For instance, a weighted graph can have a distance between nodes. So, algorithms can use the weight and optimize the path between them.
Now that we know what graphs are and some of their properties. Let’s discuss some real-life usages of graphs.
Graphs become a metaphor where nodes and edges model something from our physical world. To name a few:
-
Optimizing Plane traveling
-
Nodes = Airport
-
Edges = Direct flights between two airports
-
Weight = miles between airports | cost | time
-
-
GPS Navigation System
-
Node = road intersection
-
Edge = road
-
Weight = time between intersections
-
-
Network routing
-
Node = server
-
Edge = data link
-
Weight = connection speed
-
There are endless applications for graphs in electronics, social networks, recommendation systems, and many more. That’s cool and all, but how do we represent graphs in code? Let’s see that in the next section.
There are two main ways to graphs one is:
-
Adjacency Matrix
-
Adjacency List
Representing graphs as adjacency matrix is done using a two-dimensional array. For instance, let’s say we have the following Graph:
The number of vertices, |V|, defines the size of the matrix. In the example, we have five vertices, so we have a 5x5 matrix.
We fill up the matrix row by row. Mark with 1 (or any other weight) when you find an edge. E.g.
-
Row 0: It has a self-loop, so it has a
1
in the coordinate 0,0. Node 0 also has an edge to 1 and 4, so we mark it. -
Row 1: node 1 has one edge to 3, so we check it.
-
Row 2: Node 2 goes to Node 4, so we note the insertion with 1.
-
etc.
The example graph above is a directed graph (digraph). In the case of an undirected graph, the matrix would be symmetrical by the diagonal.
If we represent the example graph in code, it would be something like this:
const digraph = [
[1, 1, 0, 0, 1],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[0, 1, 0, 0, 0],
];
It would be very easy to tell if two nodes are connected. Let’s query if node 2 is connected to 3:
digraph[2][3]; //=> 0
digraph[3][2]; //=> 1
As you can see, we don’t have a link from node 2 to 3, but we do in the opposite direction. Querying arrays is constant time O(1), so not bad at all.
The issue with the adjacency matrix is the space it takes. Let’s say you want to represent the entire Facebook network on a digraph. You would have a massive matrix of 1.2 billion x 1.2 billion. The worst part is that most of it would be empty (zeros) since people are friends to at most a few thousands.
Tip
|
When the Graph has few connections compared to the number of nodes, we say that we have a sparse graph. Conversely, if we have almost complete maps, we say we have a dense graph. |
The adjacency matrix’s space complexity is O(|V|2), where |V| is the number of vertices/nodes.
Another way to represent a graph is by using an adjacency list. This time instead of using an array (matrix), we use a list.
If we want to add a new node to the list, we can do it by adding one element to the end of the array of nodes O(1). In the next section, we will explore the running times of all operations in an adjacency list.
Since adjacency lists are more efficient (than adjacency matrix), we will use to implement a graph data structure.
Let’s start by creating the constructor of the Graph class.
link:../../../src/data-structures/graphs/graph.js[role=include]
Notice that the constructor takes a parameter. The edgeDirection
allows us to use one class for both undirected and directed graphs.
For adding a vertex, we first need to check if the Node already exists. If so, we return the Node.
addVertex
methodlink:../../../src/data-structures/graphs/graph.js[role=include]
-
Check if value is already on the graph. If it is, then return it.
-
Create new
Node
with the given value. -
Set
hashMap
with value and node pair.
If the Node doesn’t exist, we create the new Node and add it to a HashMap
.
Tip
|
[tree-map-chap] stores key/pair value very efficiently. Lookup is O(1) .
|
The key
is the Node’s value, while the value
is the newly created Node.
The Node
class is constructed as follows:
link:../../../src/data-structures/graphs/node.js[role=include]
removeVertex
methodlink:../../../src/data-structures/graphs/graph.js[role=include]
-
Try to find if node exists.
-
Remove related edges. See
removeAdjacent
below. -
Remove Node with the given value.
Notice on callout 2 that we visit every edge on the Graph and remove the ones that contain the Node to remove.
For removing adjacent nodes, we use Node’s method called removeAdjacent
that can be implemented as follows:
removeAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]
All adjacencies are stored as a HashSet to provide constant-time deletion.
An edge is a connection between two nodes (vertices). If the Graph is undirected means that every link is a two-way street. When we create the edge from node 1 to node 2, we also need to establish a connection between nodes 2 and 1 for undirected graphs.
If we are dealing with a digraph (directed Graph), then we create one edge.
addEdge
methodlink:../../../src/data-structures/graphs/graph.js[role=include]
-
Find or create nodes if they don’t exists yet.
-
Create edge from source to destination.
-
If it’s an undirected graph, create the edge in the other direction.
We can add adjacencies using the addAdjacent
method from the Node class.
addAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]
areAdjacents
methodlink:../../../src/data-structures/graphs/graph.js[role=include]
isAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]
removeEdge
methodlink:../../../src/data-structures/graphs/graph.js[role=include]
removeAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]
Data Structure |
Vertices |
Edges |
Space Complexity |
||
Add |
Remove |
Add |
Remove |
||
Graph (adj. matrix) |
O(|V|2) |
O(|V|2) |
O(1) |
O(1) |
O(|V|2) |
Graph (adj. list w/array) |
O(1) |
O(|V| + |E|)) |
O(1) |
O(|E|) |
O(|V| + |E|) |
Graph (adj. list w/HashSet) |
O(1) |
O(|V|)) |
O(1) |
O(1) |
O(|V| + |E|) |
As you can see, using a HashSet
on for the adjacency list make a performance improvement if you need to query for connectivity. However, this is rarely required. Most graph algorithms visit all adjacent nodes one by one.