Depth First Search Algorithm

Depth First Search is a graph search algorithm that fully traverses a possible path before checking it’s neighbours. It dives in ‘depth first’. It uses the stack ADT to store nodes. Similar to it’s counterpart, BFS, it also relies on marking nodes to prevent searching the same node again.

DFS works on both directed and undirected graphs. In the case of directed graphs, neighbours are defined as nodes which have incoming edges from the current working node (any node pointed from an outgoing edge of current working node).

Process

Iterative Version

Take a node $v$ in $G$ . If it is not visited, push it to the stack.
Pop from the stack and treat the popped node as current working node cwd
Explore all cwn’s neighbours
1. If the neighbour has not been marked as visited, add it to the stack and mark as visited
2. Else, ignore it
Repeat 2-3 until the stack is empty
If the stack is empty, repeat step 1-4 until all every node in $G$ is visited.

Recursive Version

Take a start node and if it is not visited, apply steps 2-3 with that node as the cwd
Mark cwd as visited
For all neighbours of cwn:
1. If the neighbour is not marked as visited, recursively apply steps 2-3 with the neighbour as the cwd
2. Else, ignore it
Repeat 1-3 for every unvisited node in $G$

Pseudocode

Marking nodes

In the pseudocode below, there is no concrete definition of what marking means. Implementations include using node attributes, an array of nodes containing boolean values or even hash tables.

Iterative Pseudocode

Algorithm DFS_iterative(G: Graph):
	stack ← Stack 
	cwn ← Node 
	For each Node start_node in G do: //This outer loop handles disconnected components
		If start_node is not visited then:
			Push start_node to stack 
			While stack is not Empty do: #|V|
				cwn ← Pop from stack
				If cwn is not visited then:
					Mark cwn as visited
					For edge (cwn,v) in outgoing edges do: #|E|
						Push v to stack
					End do
				End if
			End do
		End if
	End do
End Algorithm

Recursive Pseudocode

Algorithm DFS_recursive(G: Graph):
	Function explore(v: Node):
		Mark v as visited
		For edge (v,w) in G do: #|E|
			If w is not visited then:
				explore(w)
			End if
		End do
	End Function


	For each Node start in G do: //This loop handls disconnected components
		If start is unvisited then:
			explore(start)
		End if
	End do:
	
End Algorithm

DFS Tree

Depth-first search can be visually modelled using a DFS tree.The final output is a tree because due to marking nodes, no node can be visited twice, i.e. there are no cycles. The tree is rooted with the start node of the DFS as the root.

If the graph is connected, then the DFS tree is a spanning tree. If the graph is disconnected, then we have a collection of trees, i.e. a DFS Forest

%%🖋 Edit in Excalidraw, and the dark exported image%%

When the DFS algorithm visits a node that has already been marked, a back edge is created. A DFS tree, combined with it’s back edges, can be used to reconstruct the original graph.

Time Complexity

Depending on the representation of the graph, DFS can have different runtimes. We assume an array is used to mark nodes, where a node $v$ is ‘marked’ if the $v$ th index in the array is marked.

Basic Operation: Traversing to a node (any statement that checks against a node)

When looking for neighbours of a node, we need to check the edge-list of the given node. Assuming the graph is connected (or that DFS is run on every disconnected subgraph), DFS always traverses every node , so the time complexity is at least $Θ (∣ V ∣)$ .

Adjacency List

For each node, finding neighbours is $Θ (de g (v))$ so the average case for checking neighbours (for every node) is always:

#tosee is this right?

i = 1 \sum ∣ V ∣ Θ (de g (v_{i})) = Θ (∣ E ∣)

*For an undirected graph

Giving our total runtime to be:

∣ V ∣ + i = 1 \sum ∣ V ∣ de g (v_{i}) = ∣ V ∣ + 2∣ E ∣ = Θ (∣ V ∣ + ∣ E ∣)

Case Scenario	Input Description	Growth Function	Asymptotic	Notes
Worst	$G$ is a complete graph (very dense)	#todo	$Θ (∥ V ∥^{2})$	In a complete graph, $∥ E ∥ = ∥ V ∥ (∥ V ∥ - 1) /2$ , so fastest growing term is $∥ V ∥^{2}$
Average			$Θ (∥ V ∥ + ∥ E ∥)$
Best	$G$ is a tree		$Θ (∥ V ∥)$	In a tree, the number of edges is always $∥ E ∥ = ∥ V ∥ - 1$ . $Θ (∥ V ∥ + ∥ V ∥ - 1) = Θ (∥ V ∥)$

Adjacency Matrix

For each node, checking neighbours is $Θ (∣ V ∣)$ always, so finding neighbours for every single node is always:

∣ V ∣ \cdot Θ (∣ V ∣) = Θ (∣ V ∣^{2})

Hence the total runtime is:

∣ V ∣ + ∣ V ∣^{2} = Θ (∣ V ∣^{2})

Case Scenario	Input Description	Growth Function	Asymptotic	Notes
Worst	$G$ is a complete graph (very dense)	#todo	$Θ (∥ V ∥^{2})$	Time complexity is unaffected because adjacency matrices have the same time complexity for every case
Average			$Θ (∥ V ∥^{2})$
Best	$G$ is a tree		$Θ (∥ V ∥^{2})$

Applications

Check if Graph is connected

We can modify the DFS algorithm to introduce a counter variable count, which increments every time the algorithm traverses an unvisited node. If $G$ is connected, then count should equal $∣ V ∣$ in one pass of the outermost loop.

Cycle detection

DFS can be modified to detect cycles, by checking for any back edges in the DFS tree, since the existence of them implies a cycle. Depending on whether the graph is directed or undirected, we can use different strategies.

If the graph is undirected, we simply need to check to see we have visited a node that is already marked as visited, excluding the ancestor node that actually called the DFS. The reason we exclude the ancestor node is because there will always be a case where DFS checks the edge $(an ces t or, c hi l d)$ twice, once from the ancestor node, and once from the child node.

This is very easy to implement in the recursive pseudocode because we can add a second parameter to explore which defines the ancestor node:

Algorithm DFS_recursive(G: Graph):
	Function explore(v: Node, ancestor: Node):
		Mark v as visited
		For edge (v,w) in G do: #|E|
			If w is not visited then:
				explore(w,v)
			Else if w != ancestor then:
				Display "Cycle Detected!"
			End if
		End do
	End Function
	
	For node v in G do: #|V|
		If v is not visited then:
			explore(v,v)
		End if
	End do

End Algorithm

The iterative version requires an auxiliary parent attribute for each node for the ancestor check:

#todo

Algorithm DFS_iterative(G: Graph):
	stack ← Stack 
	cwn ← Node 
	For each Node start_node in G do: //This outer loop handles disconnected components
		If start_node is not visited then:
			Push start_node to stack 
			While stack is not Empty do: #|V|
				cwn ← Pop from stack
				If cwn is not visited then:
					Mark cwn as visited
					For edge (cwn,v) in outgoing edges do: #|E|
						Push v to stack
					End do
				End if
			End do
		End if
	End do
End Algorithm

Topological Sorting

Other

Locating strongly connected components in a graph
Detecting k-partiteness in Graph (Maths)

Implementation

Iterative

import networkx as nx
from ADTs import Stack
 
def DFS(G: nx.Graph, start_node=None, end_node=None):
    # Setting start_node to the 'first' node in a graph
    if start_node == None:
        start_node = list(G)[0]
 
    # Initialize the Stact ADT to keep track of nodes
    # And a set of 'seen' nodes
    S = Stack()
    seen_nodes = set()
    output = []
    
    S.push(start_node)    
 
    while S.isFull():
        cwn = S.pop()    # Set Current Working Node to popped value
        # Do end_node check early
        # Note the check if a node is seen is done AFTER removing it from stack
        if cwn not in seen_nodes:
            seen_nodes.add(cwn)
            output.append(cwn)
            if cwn == end_node and end_node != None:
                return (output)
            neighbours = list(G.adj[cwn])
            neighbours.reverse()
            for neighbour in neighbours:
                S.push(neighbour)
    return output

Recursive

import networkx as nx
 
def Recursive_DFS(G: nx.Graph, start, end=None, output=list(), seen_nodes=set()):
    seen_nodes.add(start)
    output.append(start)
    if start == end and end != None:
        return (output)
    neighbours = list(G.adj[start])
    for neighbour in neighbours:
        if neighbour not in seen_nodes:
            Recursive_DFS(G, neighbour)
    return (output)

Questionably Accurate Notes

Explorer

Depth First Search Algorithm

Process

Iterative Version

Recursive Version

Pseudocode

Iterative Pseudocode

Recursive Pseudocode

DFS Tree

Time Complexity

Adjacency List

Adjacency Matrix

Applications

Check if Graph is connected

Cycle detection

Topological Sorting

Other

Implementation

Iterative

Recursive

Table of Contents

Related Concepts

See Also:

Questionably Accurate Notes

Explorer

Depth First Search Algorithm

Process §

Iterative Version §

Recursive Version §

Pseudocode §

Iterative Pseudocode §

Recursive Pseudocode §

DFS Tree §

Time Complexity §

Adjacency List §

Adjacency Matrix §

Applications §

Check if Graph is connected §

Cycle detection §

Topological Sorting §

Other §

Implementation §

Iterative §

Recursive §

Table of Contents

Related Concepts

See Also:

Process

Iterative Version

Recursive Version

Pseudocode

Iterative Pseudocode

Recursive Pseudocode

DFS Tree

Time Complexity

Adjacency List

Adjacency Matrix

Applications

Check if Graph is connected

Cycle detection

Topological Sorting

Other

Implementation

Iterative

Recursive