# Programming #6: Shortest Path Problem

Now that we have some of the programming basics under our belt, we can start using them to solve problems. Let’s start with one of the most famous problems in computer science, the shortest path problem.

As the name implies, the shortest path problem is about finding the shortest path in a graph. Imagine that we have a graph that represents our home town. Vertices or nodes would represent locations, and edges would represent streets connecting those locations. For simplicity, let’s say that each street goes both directions (undirected graph). We can also add weights to each edge that tell us the distance between two locations along a a given street. In this example, the shortest path would be the path between two locations, where the sum of the distance between streets is as small as possible.

### Unweighted Graph

Before we work out this problem, let’s imagine the solution for an unweighted graph. In this case, the shortest path between two vertices would just be the path with the fewest number of edges. The solution to this problem is just a breadth first search (BFS). Recall that with a BFS, we start at one vertex, visit all the closest vertices (as opposed to following one edge first), and then expand the search to more distant vertices until we find the one we are looking for. This ends up giving us the path with the least amount of edges. Back to the weighted graph problem.

### Dijkstra’s Algorithm

One solution to the shortest path problem is Dijkstra’s Algorithm. From wikipedia:

Let the node at which we are starting be called the initial node. Let the distance of node Y be the distance from the initial node to Y. Dijkstra’s algorithm will assign some initial distance values and will try to improve them step by step.

1. Assign to every node a tentative distance value: set it to zero for our initial node and to infinity for all other nodes.
2. Set the initial node as current. Mark all other nodes unvisited. Create a set of all the unvisited nodes called the unvisited set.
3. For the current node, consider all of its unvisited neighbors and calculate their tentative distances. Compare the newly calculated tentative distance to the current assigned value and assign the smaller one. For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8. Otherwise, keep the current value.
4. When we are done considering all of the neighbors of the current node, mark the current node as visited and remove it from the unvisited set. A visited node will never be checked again.
5. If the destination node has been marked visited (when planning a route between two specific nodes) or if the smallest tentative distance among the nodes in the unvisited set is infinity (when planning a complete traversal; occurs when there is no connection between the initial node and remaining unvisited nodes), then stop. The algorithm has finished.
6. Otherwise, select the unvisited node that is marked with the smallest tentative distance, set it as the new “current node”, and go back to step 3.

Of course, this finds the distance between a source and all other vertices. In our example, we just want to find the distance from location “A” to location “E”. To do this, we can use a min priority queue to efficiently store minimum distances for easy removal. For example, if we store all of our vertices in the priority queue and use extract min, the first element removed would be our starting location with a distance of 0. From here we can just follow each edge and update the distance values of the corresponding vertices in our queue. Using extract min again, we will move forward in our graph along the shortest edge. This process can be repeated until we find the vertex we are looking for.

This is an example of a greedy algorithm since we pick whatever option looks best at the moment. The basic run-time for Dijkstra’s algorithm is O(V2), where V represents the number of vertices in the graph. If our priority queue is optimized, we can bring the run-time down to O(E+Vlog(V)).

### Implementation

Let’s implement a version of our algorithm that uses a priority queue to speed up the process. Say we are given an undirected weighted graph G (represented as an adjacency dict), a starting vertex, and an ending vertex. Our goal is to return the shortest path between our starting vertex and ending vertex, along with the distance. For example:

```G = {'A': {'B': 3, 'C': 4},
'B': {'A': 3, 'C': 5, 'D': 6, 'E': 2},
'C': {'A': 4, 'B': 5, 'E': 7},
'D': {'B': 6},
'E': {'B': 2, 'C': 7}}
start = 'A'
end = 'E'

solution = (['A', 'B', 'E'], 5)
```

A solution built out in Python2:

```# Import heapq to use as our priority queue
import heapq

# Build shortest path algoritm
def get_shortest_path(G, start, end):
"""
Return the single shortest path form start to end
in the form of (distance, [verticies])).
"""
# Initialize queue and explored variables
queue = [(0, start, [])]
explored = set()

# Main loop
while True:
(dist, v, solution) = heapq.heappop(queue)
if v not in explored:
solution = solution + [v]