We would be remiss in our study of programming if we did not devote some time to the crafting of quality code. This subject can be a bit subjective and many programmers have a dogmatic attachment to what they believe qualifies as quality code. Luckily for us, some standards have emerged. The Python Enhancement Proposal (PEP 8) is the go-to style guide for Python code. The Google Python Style Guide is another great resource. For an interactive guide, consider Code Like a Pythonista.
This post will go through an example of crafting readable code to solve a problem. The code can be found on Github. We will stick to some best practices as summarized nicely in the Zen of Python:
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one—and preferably only one—obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than right now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea—let’s do more of those!
Continue reading “Programming #9: Style and Testing”
In computer science, complexity classes help us talk about the difficulty of a problem. If there is a fast way to the find the solution of a problem (search), the problem is said to be Polynomial-time (P). If there is a fast way to check the solution of a problem (verification), the problem is said to be Nondeterministic Polynomial-time (NP).
For example, there exists a fast way to solve the sorting problem and a fast way to verify that the sort is correct. The sorting problem is thus both P and NP. For a problem like finding the solution in Sudoku, it is easy to check if a solution is correct, just see if the numbers add up, but hard to search for the solution in the first place, try every combination? The Sudoku problem NP, until it can otherwise be proved that a quick solution exists.
NP-hard is another class of problems that may or may not be in NP, but if solved would give us the answer to every NP problem. NP-complete is another class of problems that consist of problems that are both NP and NP-hard. Solving an NP-complete problem in a reasonable amount of time would be world-changing. The Knapsack problem is an NP-complete problem.
Continue reading “Programming #8: Knapsack Problem”
To illustrate the basic approach to developing and analyzing algorithms, let’s consider a simple case of the dynamic connectivity problem. Basically, if we are given a list of connections between elements, and then given two elements, we want to return whether or not they are connected. More formally:
Continue reading “Programming #7: Dynamic Connectivity Problem”
Now that we have some of the programming basics under our belt, we can start using them to solve problems. Let’s start with one of the most famous problems in computer science, the shortest path problem.
As the name implies, the shortest path problem is about finding the shortest path in a graph. Imagine that we have a graph that represents our home town. Vertices or nodes would represent locations, and edges would represent streets connecting those locations. For simplicity, let’s say that each street goes both directions (undirected graph). We can also add weights to each edge that tell us the distance between two locations along a a given street. In this example, the shortest path would be the path between two locations, where the sum of the distance between streets is as small as possible.
Continue reading “Programming #6: Shortest Path Problem”
Imagine that we want to represent a social network in a data structure. A connection between two people will imply that they know each other. With what we have discussed so far, trees jump out as the obvious way to represent connections between nodes. But recall that a tree does not allow for cycling, i.e the bottom of a tree can’t be connected to the top. With a social network, this structure doesn’t make sense.
A graph on the other hand allows for cycling. There is no root node like with a tree, instead everything is connected in a web. In the social network example, names of people would be the vertices/nodes of our graph and each line connecting two vertices would be an edge.
Continue reading “Programming #5: Graphs”
The next data structure we’ll be talking about is a tree. Like their real life counterparts, trees as a data structure have a root, some branches, some leaves, etc. In programming, a tree is basically just an extension of a linked list. Recall that an element or node in a linked list stores its value and a pointer to the next value. With a node in a tree, we store the value of a node and the next values for its children. A tree must also be completely connected, if we start at the first node, we can reach every element in the tree. There will also be no cycles, going down a tree can’t bring us back up.
Continue reading “Programming #4: Trees”
So far, we have talked about data structures that use numerical indices to find values in an array. What if we want more freedom in how we look up a value. Consider the Oxford dictionary, if we wanted to find the definition (value) for a certain word (key), we would just look up the word.
Dictionaries are structured as maps. The main idea behind a map is that we can look something up by its key and find whatever is stored by its value. This key-value structure lends itself to many uses that an index based array could not. In Python, the dictionary data-structure is a map.
Continue reading “Programming #3: Maps and Hashing”