Depth First Search (DFS) is a search algorithm mostly used in graphs and binary tree data structures. It is useful to solve cycle detection in the graph, topological sort, and pathfinding problems. DFS can be implemented in a recursive and nonrecursive fashion using a stack.
How DFS Works?
0
heading
In a given graph or tree, DFS starts at the root or start node. It explores the node by expanding its children.
It visits the first child node and expands and explores its children. It will keep exploring until it reaches the last leaf node.
After it has explored a tree under the first child then it explores its sibling nodes.
The algorithm ends when all trees under all siblings explored
The algorithm can also terminate early if we found goal node
For Example, In the graph below we start at 0 then expand children 1,2, and 3. We first visit child 1 first and expand the tree under it. We see it has children 4 and 5. We visit 4 first. Since 4 doesn’t have any children we visit its sibling 5. Since 5 doesn't have any children and parent of 5 i.e 1 don't have any more children we move to visit its sibling 2. We explore 2 children which is 6 only. Since 2 don’t have any more children we move to 3. 3 has 7 as a child we explore that. Now since 7 don't have children we move back to parent 3. Also all of 1 sibling we already explored we move back to 1. The algorithm terminates here. So the path we traverse is 0->1->4->5->2->6->3->7
1
list
2
image
Represent Problem as Graph
3
paragraph
Many times in a problem graph is not given. Instead, a multi-array is given which has a relation between 2 nodes.
For the above example it can be presented as [[0,1],[0,2],[0,3],[1,4],[1,5],[2,6][3,7]]. We can represent this in the form of a dictionary.
The graph can be directed or undirected.
4
list
edges = collections.defaultdict(set)
for u, v in nodes:
edges[u].add(v)
edges[v].add(u) #Use this only if it's undirected graph
5
code
Recursive DFS
6
heading
Promoted
Promoted
In recursive DFS we recursively use the function every time we want to explore a new node.
To check if the node is already visited or not, we need extra space. We store all visited nodes in this space.
7
list
import collections
def traverse(nodes, n):
edges = collections.defaultdict(set)
for u, v in nodes:
edges[u].add(v)
edges[v].add(u)
def DFS(edges, node, visited):
visited[node] = True #Mark current node visited
print(node) #In future we can modify this to check conditions. For now we only print node.
for u in edges[node]:
if not visited[u]: #visit only if not visited
DFS(edges, u, visited) #recursilvely call function to visit node u
visited = [False for _ in range(n)]
DFS(edges, 0, visited)
8
code
Iterative DFS
9
heading
All recursive problems can be solved with iteration by using a stack.
We use the stack in DFS to store nodes we want to visit later.
When we visit nodes we push all its children to stacks and then pop the first child out.
We mark that child visited and push all its non-visited children to stack.
We do this until the stack is empty.
10
list
import collections
def traverse(nodes, n):
edges = collections.defaultdict(set)
for u, v in nodes:
edges[u].add(v)
edges[v].add(u)
visited = [False for _ in range(n)]
stack = [] #Stack to store non visited node to explore later
visited[0] = True
stack.append(0)
while stack:
v = stack.pop()
print(v)
for u in edges[v]:
if not visited[u]:
visited[u] = True
stack.append(u)
11
code
Analysis of DFS
12
heading
Time and Space complexity
13
subheading
Promoted
Promoted
We traverse through all nodes. Therefore time complexity is O(N), Space complexity is O(N)
We can also represent complexity in terms of branching factors. If every node has b branch and m path length. Then time complexity is O(b^m). Since we store the entire tree of a child at a time. Space complexity is O(b*m)
14
list
Is DFS Complete?
15
subheading
Let's imagine a graph is directed and it has a cycle as shown in the example below. We can’t say for sure DFS will stop.
We are required to detect the cycle and additional conditions to get out of the cycle.
For large graphs, it might get complex if there are many and longer cycles. DFS is not good if graphs have cycles.
DFS is complete for the finite acyclic graph.
16
list
17
image
Is DFS Optimal?
18
subheading
In a graph, if we have multiple goal nodes, the optimal solution will be the shortest path to any of the nodes.
DFS will stop as soon as it finds the goal node. But it may not be the best goal node.
In the example below shaded nodes are goal nodes. DFS will find a goal node connected to 16. This is not the best node. As there is another node closer to 1.
So DFS is not optimal. DFS is appropriate when the goal node is not shallow.