Depth First Search is a graph search algorithm that fully traverses a possible path before checking it’s neighbours. It dives in ‘depth first’. It uses the stack ADT to store nodes. Similar to it’s counterpart, BFS, it also relies on marking nodes to prevent searching the same node again.
DFS works on both directed and undirected graphs. In the case of directed graphs, neighbours are defined as nodes which have incoming edges from the current working node (any node pointed from an outgoing edge of current working node).
Process
Iterative Version
- Take a node in . If it is not visited, push it to the stack.
- Pop from the stack and treat the popped node as current working node
cwd
- Explore all
cwn
’s neighbours- If the neighbour has not been marked as visited, add it to the stack and mark as visited
- Else, ignore it
- Repeat
2-3
until the stack is empty - If the stack is empty, repeat step
1-4
until all every node in is visited.
Recursive Version
- Take a start node and if it is not visited, apply steps
2-3
with that node as thecwd
- Mark
cwd
as visited - For all neighbours of
cwn
:- If the neighbour is not marked as visited, recursively apply steps
2-3
with the neighbour as thecwd
- Else, ignore it
- If the neighbour is not marked as visited, recursively apply steps
- Repeat
1-3
for every unvisited node in
Pseudocode
Marking nodes
In the pseudocode below, there is no concrete definition of what marking means. Implementations include using node attributes, an array of nodes containing boolean values or even hash tables.
Iterative Pseudocode
Algorithm DFS_iterative(G: Graph):
stack ← Stack
cwn ← Node
For each Node start_node in G do: //This outer loop handles disconnected components
If start_node is not visited then:
Push start_node to stack
While stack is not Empty do: #|V|
cwn ← Pop from stack
If cwn is not visited then:
Mark cwn as visited
For edge (cwn,v) in outgoing edges do: #|E|
Push v to stack
End do
End if
End do
End if
End do
End Algorithm
Recursive Pseudocode
Algorithm DFS_recursive(G: Graph):
Function explore(v: Node):
Mark v as visited
For edge (v,w) in G do: #|E|
If w is not visited then:
explore(w)
End if
End do
End Function
For each Node start in G do: //This loop handls disconnected components
If start is unvisited then:
explore(start)
End if
End do:
End Algorithm
DFS Tree
Depth-first search can be visually modelled using a DFS tree.The final output is a tree because due to marking nodes, no node can be visited twice, i.e. there are no cycles. The tree is rooted with the start node of the DFS as the root.
If the graph is connected, then the DFS tree is a spanning tree. If the graph is disconnected, then we have a collection of trees, i.e. a DFS Forest
%%🖋 Edit in Excalidraw, and the dark exported image%%
When the DFS algorithm visits a node that has already been marked, a back edge is created. A DFS tree, combined with it’s back edges, can be used to reconstruct the original graph.
Time Complexity
Depending on the representation of the graph, DFS can have different runtimes. We assume an array is used to mark nodes, where a node is ‘marked’ if the th index in the array is marked.
Basic Operation: Traversing to a node (any statement that checks against a node)
When looking for neighbours of a node, we need to check the edge-list of the given node. Assuming the graph is connected (or that DFS is run on every disconnected subgraph), DFS always traverses every node , so the time complexity is at least .
Adjacency List
For each node, finding neighbours is so the average case for checking neighbours (for every node) is always:
#tosee is this right?
*For an undirected graph
Giving our total runtime to be:
Case Scenario | Input Description | Growth Function | Asymptotic | Notes |
---|---|---|---|---|
Worst | is a complete graph (very dense) | #todo | In a complete graph, , so fastest growing term is | |
Average | ||||
Best | is a tree | In a tree, the number of edges is always . |
Adjacency Matrix
For each node, checking neighbours is always, so finding neighbours for every single node is always:
Hence the total runtime is:
Case Scenario | Input Description | Growth Function | Asymptotic | Notes |
---|---|---|---|---|
Worst | is a complete graph (very dense) | #todo | Time complexity is unaffected because adjacency matrices have the same time complexity for every case | |
Average | ||||
Best | is a tree |
Applications
Check if Graph is connected
We can modify the DFS algorithm to introduce a counter variable count
, which increments every time the algorithm traverses an unvisited node. If is connected, then count should equal in one pass of the outermost loop.
Cycle detection
DFS can be modified to detect cycles, by checking for any back edges in the DFS tree, since the existence of them implies a cycle. Depending on whether the graph is directed or undirected, we can use different strategies.
If the graph is undirected, we simply need to check to see we have visited a node that is already marked as visited, excluding the ancestor node that actually called the DFS. The reason we exclude the ancestor node is because there will always be a case where DFS checks the edge twice, once from the ancestor node, and once from the child node.
This is very easy to implement in the recursive pseudocode because we can add a second parameter to explore
which defines the ancestor node:
Algorithm DFS_recursive(G: Graph):
Function explore(v: Node, ancestor: Node):
Mark v as visited
For edge (v,w) in G do: #|E|
If w is not visited then:
explore(w,v)
Else if w != ancestor then:
Display "Cycle Detected!"
End if
End do
End Function
For node v in G do: #|V|
If v is not visited then:
explore(v,v)
End if
End do
End Algorithm
The iterative version requires an auxiliary parent
attribute for each node for the ancestor check:
Algorithm DFS_iterative(G: Graph):
stack ← Stack
cwn ← Node
For each Node start_node in G do: //This outer loop handles disconnected components
If start_node is not visited then:
Push start_node to stack
While stack is not Empty do: #|V|
cwn ← Pop from stack
If cwn is not visited then:
Mark cwn as visited
For edge (cwn,v) in outgoing edges do: #|E|
Push v to stack
End do
End if
End do
End if
End do
End Algorithm
Topological Sorting
Other
- Locating strongly connected components in a graph
- Detecting k-partiteness in Graph (Maths)
Implementation
Iterative
import networkx as nx
from ADTs import Stack
def DFS(G: nx.Graph, start_node=None, end_node=None):
# Setting start_node to the 'first' node in a graph
if start_node == None:
start_node = list(G)[0]
# Initialize the Stact ADT to keep track of nodes
# And a set of 'seen' nodes
S = Stack()
seen_nodes = set()
output = []
S.push(start_node)
while S.isFull():
cwn = S.pop() # Set Current Working Node to popped value
# Do end_node check early
# Note the check if a node is seen is done AFTER removing it from stack
if cwn not in seen_nodes:
seen_nodes.add(cwn)
output.append(cwn)
if cwn == end_node and end_node != None:
return (output)
neighbours = list(G.adj[cwn])
neighbours.reverse()
for neighbour in neighbours:
S.push(neighbour)
return output
Recursive
import networkx as nx
def Recursive_DFS(G: nx.Graph, start, end=None, output=list(), seen_nodes=set()):
seen_nodes.add(start)
output.append(start)
if start == end and end != None:
return (output)
neighbours = list(G.adj[start])
for neighbour in neighbours:
if neighbour not in seen_nodes:
Recursive_DFS(G, neighbour)
return (output)