Programming #1: List-Based Collections

A collection can be thought of as a group of things. For example, you could have a collection of toys, books, stuff in your room, etc. Collections don’t have a particular order and the types of things they contain can vary. For example, if you reorder a collection of books, it’s still a collection of books. If you ask for book number three, it doesn’t really mean anything.

This lack of structure reduces the usefulness of collections in the context of programming. To get around this, data structures have been developed that take collections and logical rules to them. This post will explore some of the most common list-based collections.

Arrays

Arrays are just collections that have an order to them. Depending on the programming language, arrays can hold elements of different types or they can be uniform. In some languages arrays have a set size and in some they don’t. Across all languages, each element in an array will have a location called an index. Think of this as a number assignment starting from 0. For example, index 1 of the array [10, ‘purple’, 9, 30] would be ‘purple’.

Pros: Easy to access a location. For example, if you need to access the location in the middle of the array you can just keep track of how long the array is calculate the middle index from that.

Cons: Insertion and deletion can be costly. If you add something to the middle of an array, you have to shift everything that comes after it to a new index location. In the worst case, this time on the order of O(n). Deletion works the same way.

In Python, there is a built-in data structure called a ‘list’ that is built as an array. The time complexity of various list based methods can be found at https://wiki.python.org/moin/TimeComplexity. An example of lists in Python:

# Initialize list
my_list = ['red', 'green', 'blue']

# get value at index 0
print my_list[0]
# returns 'red'

# add 'orange' to the end of the list
my_list.append('orange')
print my_list
# returns ['red', 'green', 'blue', 'orange']

Linked List

Unlike an array, a linked list has no indices. Instead, elements of a linked list keep track of what element comes after them (the link). The elements may not know where they are in relation to the larger list, but they know where they are in their neighborhood. For example, an element in an Array will store a ‘value’ and an ‘index’, while an element in a linked list will store a ‘value’ and a ‘next value’.

Pros: Easy to insert and delete. Adding an element is as easy as changing the ‘next value’ that comes before it. Happens in constant time O(1).

Cons: Hard to select a single location, like the middle.

An example of a Linked List class coded in Python:

class Element(object):
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList(object):
    def __init__(self, head=None):
        self.head = head

    def append(self, new_element):
        current = self.head
        if self.head:
            while current.next:
                current = current.next
            current.next = new_element
        else:
            self.head = new_element

    def get_position(self, position):
        """
        Get an element from a particular position.
        Assume the first position is "1".
        Return "None" if position is not in the list.
        """
        counter = 1
        current = self.head
        if position < 1:
            return None
        while current and counter <= position:
            if counter == position:
                return current
            current = current.next
            counter += 1
        return None
    
    def insert(self, new_element, position):
        """
        Insert a new node at the given position.
        Assume the first position is "1".
        Inserting at position 3 means between
        the 2nd and 3rd elements.
        """
        counter = 1
        current = self.head
        if position > 1:
            while current and counter < position:
                if counter == position - 1:
                    new_element.next = current.next
                    current.next = new_element
                current = current.next
                counter += 1
        elif position == 1:
            new_element.next = self.head
            self.head = new_element
    
    
    def delete(self, value):
        """
        Delete the first node with a given value.
        """
        current = self.head
        previous = None
        while current.value != value and current.next:
            previous = current
            current = current.next
        if current.value == value:
            if previous:
                previous.next = current.next
            else:
                self.head = current.next

Stacks

A stack can be thought of like a stack of books in real life, you have easy access to add or remove a book on top of the stack, but if you want to see the book on the bottom you have remove all the books on top. Adding an element to a stack is called a push and removing an element is called a pop. Stacks function in a “last in, first out” sense. You will always pop the last element that was pushed.

Pros: Useful when you only care about the most recent element or the order in which you saved elements. Pushing and popping from the top of the stack happens in constant time O(1).

Cons: Unable to easily access elements in the middle of the stack.

In Python, the list data type can be used as a stack by using append() and pop(). An example of a stack class built from scratch:

class Element(object):
    def __init__(self, value):
        self.value = value
        self.next = None

class LinkedList(object):
    def __init__(self, head=None):
        self.head = head

    def append(self, new_element):
        current = self.head
        if self.head:
            while current.next:
                current = current.next
            current.next = new_element
        else:
            self.head = new_element

    def insert_first(self, new_element):
        """
        Insert new element as the head of the LinkedList
        """
        new_element.next = self.head
        self.head = new_element

    def delete_first(self):
        """
        Delete the first (head) element in the LinkedList
        and return it
        """
        deleted = self.head
        if self.head:
            self.head = self.head.next
            deleted.next = None
        return deleted

class Stack(object):
    def __init__(self,top=None):
        self.ll = LinkedList(top)

    def push(self, new_element):
        """
        Push (add) a new element onto the top of the stack
        """
        self.ll.insert_first(new_element)

    def pop(self):
        """
        Pop (remove) the first element off the top of the
        stack and return it
        """
        return self.ll.delete_first()

Queues

Queues are like stacks but instead of “last in, first out” they use “first in, first out”. Think of a line of people waiting to buy tickets, the first person to get in line will be the first person called up to the ticket booth. The last person to get in line will be the last. The first element in a queue is called the head and the last element is called the tail. Adding to the tail end is called an enqueue and removing from the head end is called a dequeue.

Pros: Useful when you care about ranking elements based on how long they have been in the queue and making sure you get to every element. Operations happen in constant time O(1).

Cons: Unable to easily access elements in the middle of the Queue.

In Python, the collections.deque module is built-in as a double-ended queue. An example of a queue class built from scratch:

class Queue(object):
    def __init__(self, head=None):
        self.storage = [head]

    def enqueue(self, new_element):
        self.storage.append(new_element)

    def peek(self):
        return self.storage[0]

    def dequeue(self):
        return self.storage.pop(0)

Conclusion

Your choice of data structure will always depend on your goal.

If you need to be able to access every element directly and in constant time, use an array. Think of a book. Each page can be opened independently of the others and it is easy to find only the page you are looking for.

If you need to be able to access and modify elements sequentially, use a linked list. Think of a roll of film. You have to unroll the whole thing to get the element you want, but it is easy to splice in a frame.

If you need to be able to access the most recent item added to a list, use a stack. For example, if you wanted to reverse the letters in a word, you could push each letter to a stack and then pop them out in order.

If you need to be able to access every item added to a list in order, use a queue. For example, if you want to implement a breadth-first search, you could add all the nearest possibilities to a queue and explore them each in the order added. Repeat the process until the search finds its target.

Advertisements

Author: David Shahrestani

"I have the strength of a bear, that has the strength of TWO bears."

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s