Skip to content

Latest commit

 

History

History
305 lines (215 loc) · 11.3 KB

File metadata and controls

305 lines (215 loc) · 11.3 KB

Graph

Graphs are one of my favorite data structures. They have many exciting applications like optimizing routes and social network analysis, to name a few. You are probably using apps that use graphs every day. First, let’s start with the basics.

Tip
A graph is a non-linear data structure where a node can have zero or more connected nodes.

You can think of a Graph as an extension of a Linked List. Instead of having a next or previous reference, you can have as many as you want. You can implement a graph node as an array of associated nodes.

Node’s constructor
link:../../../src/data-structures/graphs/node.js[role=include]

As you can see, it’s pretty similar to the Linked List node. The only difference is that it uses an array of adjacent nodes instead of just one or two.

Another difference between a linked list and a Graph is that a linked list always has a root node (or first element), while the Graph doesn’t. You can start traversing a graph from anywhere. Let’s examine these graph properties!

Graph Properties

The connection between two nodes is called edge. Also, nodes might be called vertex.

image
Figure 1. Graph is composed of vertices/nodes and edges
Directed Graph vs Undirected

A graph can be either directed or undirected.

image
Figure 2. Graph: directed vs undirected

An undirected graph has edges that are two-way street. E.g., On the undirected example, you can traverse from the green Node to the orange and vice versa.

A directed graph (digraph) has edges that are one-way street. E.g., On the directed example, you can only go from green Node to orange and not the other way around. When one Node has an edge to itself is called a self-loop.

Graph Cycles

A graph can have cycles or not.

image
Figure 3. Cyclic vs. Acyclic Graphs.

A cyclic graph is the one that you can pass through a node more than once. E.g., On the cyclic illustration, if you start in the green Node, go the orange and purple; finally, you could come back to green again. Thus, it has a cycle. An acyclic graph is the one that you can’t pass through a node more than once. E.g., in the acyclic illustration, can you find a path where you can pass through the same vertex more than one?

The Directed Acyclic Graph (DAG) is unique. It has many applications like scheduling tasks, spreadsheets' change propagation, and so forth. DAG is also called Tree data structure only when each Node has only one parent.

Connected vs Disconnected vs Complete Graphs
image
Figure 4. Different kinds of graphs: disconnected, connected, and complete.

A disconnected graph is one that has one or more subgraphs. In other words, a graph is disconnected if two nodes don’t have a path between them.

A connected graph is the opposite of disconnected; there’s a path between every Node. No one is left stranded.

A complete graph is where every Node is adjacent to all the other nodes in the Graph. E.g., If there are seven nodes, every Node has six edges.

Weighted Graphs

Weighted graphs have labels in the edges (a.k.a weight or cost). The link weight can represent many things like distance, travel time, or anything else.

image
Figure 5. Weighted Graph representing USA airports distance in miles.

For instance, a weighted graph can have a distance between nodes. So, algorithms can use the weight and optimize the path between them.

Exciting Graph applications in real-world

Now that we know what graphs are and some of their properties. Let’s discuss some real-life usages of graphs.

Graphs become a metaphor where nodes and edges model something from our physical world. To name a few:

  • Optimizing Plane traveling

    • Nodes = Airport

    • Edges = Direct flights between two airports

    • Weight = miles between airports | cost | time

  • GPS Navigation System

    • Node = road intersection

    • Edge = road

    • Weight = time between intersections

  • Network routing

    • Node = server

    • Edge = data link

    • Weight = connection speed

There are endless applications for graphs in electronics, social networks, recommendation systems, and many more. That’s cool and all, but how do we represent graphs in code? Let’s see that in the next section.

Representing Graphs

There are two main ways to graphs one is:

  • Adjacency Matrix

  • Adjacency List

Adjacency Matrix

Representing graphs as adjacency matrix is done using a two-dimensional array. For instance, let’s say we have the following Graph:

image
Figure 6. Graph and its adjacency matrix.

The number of vertices, |V|, defines the size of the matrix. In the example, we have five vertices, so we have a 5x5 matrix.

We fill up the matrix row by row. Mark with 1 (or any other weight) when you find an edge. E.g.

  • Row 0: It has a self-loop, so it has a 1 in the coordinate 0,0. Node 0 also has an edge to 1 and 4, so we mark it.

  • Row 1: node 1 has one edge to 3, so we check it.

  • Row 2: Node 2 goes to Node 4, so we note the insertion with 1.

  • etc.

The example graph above is a directed graph (digraph). In the case of an undirected graph, the matrix would be symmetrical by the diagonal.

If we represent the example graph in code, it would be something like this:

const digraph = [
  [1, 1, 0, 0, 1],
  [0, 0, 0, 1, 0],
  [0, 0, 0, 0, 1],
  [0, 0, 1, 0, 0],
  [0, 1, 0, 0, 0],
];

It would be very easy to tell if two nodes are connected. Let’s query if node 2 is connected to 3:

digraph[2][3]; //=> 0
digraph[3][2]; //=> 1

As you can see, we don’t have a link from node 2 to 3, but we do in the opposite direction. Querying arrays is constant time O(1), so not bad at all.

The issue with the adjacency matrix is the space it takes. Let’s say you want to represent the entire Facebook network on a digraph. You would have a massive matrix of 1.2 billion x 1.2 billion. The worst part is that most of it would be empty (zeros) since people are friends to at most a few thousands.

Tip
When the Graph has few connections compared to the number of nodes, we say that we have a sparse graph. Conversely, if we have almost complete maps, we say we have a dense graph.

The adjacency matrix’s space complexity is O(|V|2), where |V| is the number of vertices/nodes.

Adjacency List

Another way to represent a graph is by using an adjacency list. This time instead of using an array (matrix), we use a list.

image
Figure 7. Graph represented as an Adjacency List.

If we want to add a new node to the list, we can do it by adding one element to the end of the array of nodes O(1). In the next section, we will explore the running times of all operations in an adjacency list.

Implementing a Graph data structure

Since adjacency lists are more efficient (than adjacency matrix), we will use to implement a graph data structure.

Let’s start by creating the constructor of the Graph class.

Graph’s constructor
link:../../../src/data-structures/graphs/graph.js[role=include]

Notice that the constructor takes a parameter. The edgeDirection allows us to use one class for both undirected and directed graphs.

Adding a vertex

For adding a vertex, we first need to check if the Node already exists. If so, we return the Node.

Graphs’s addVertex method
link:../../../src/data-structures/graphs/graph.js[role=include]
  1. Check if value is already on the graph. If it is, then return it.

  2. Create new Node with the given value.

  3. Set hashMap with value and node pair.

If the Node doesn’t exist, we create the new Node and add it to a HashMap.

Tip
[tree-map-chap] stores key/pair value very efficiently. Lookup is O(1).

The key is the Node’s value, while the value is the newly created Node.

The Node class is constructed as follows:

Node’s class (for Graph data structure)
link:../../../src/data-structures/graphs/node.js[role=include]

Deleting a vertex

Graphs’s removeVertex method
link:../../../src/data-structures/graphs/graph.js[role=include]
  1. Try to find if node exists.

  2. Remove related edges. See removeAdjacent below.

  3. Remove Node with the given value.

Notice on callout 2 that we visit every edge on the Graph and remove the ones that contain the Node to remove.

For removing adjacent nodes, we use Node’s method called removeAdjacent that can be implemented as follows:

Node’s removeAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]

All adjacencies are stored as a HashSet to provide constant-time deletion.

Adding an edge

An edge is a connection between two nodes (vertices). If the Graph is undirected means that every link is a two-way street. When we create the edge from node 1 to node 2, we also need to establish a connection between nodes 2 and 1 for undirected graphs.

If we are dealing with a digraph (directed Graph), then we create one edge.

Graphs’s addEdge method
link:../../../src/data-structures/graphs/graph.js[role=include]
  1. Find or create nodes if they don’t exists yet.

  2. Create edge from source to destination.

  3. If it’s an undirected graph, create the edge in the other direction.

We can add adjacencies using the addAdjacent method from the Node class.

Node’s addAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]

Querying Adjacency

Graphs’s areAdjacents method
link:../../../src/data-structures/graphs/graph.js[role=include]
Node’s isAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]

Deleting an edge

Graphs’s removeEdge method
link:../../../src/data-structures/graphs/graph.js[role=include]
Node’s removeAdjacent
link:../../../src/data-structures/graphs/node.js[role=include]

Graph Complexity

Table 1. Time complexity for a Graph data structure

Data Structure

Vertices

Edges

Space Complexity

Add

Remove

Add

Remove

Graph (adj. matrix)

O(|V|2)

O(|V|2)

O(1)

O(1)

O(|V|2)

Graph (adj. list w/array)

O(1)

O(|V| + |E|))

O(1)

O(|E|)

O(|V| + |E|)

Graph (adj. list w/HashSet)

O(1)

O(|V|))

O(1)

O(1)

O(|V| + |E|)

As you can see, using a HashSet on for the adjacency list make a performance improvement if you need to query for connectivity. However, this is rarely required. Most graph algorithms visit all adjacent nodes one by one.