7 min read

Pairing with my good friend Code Llama

AI won't take your job, but someone who's leveraging AI will. Besides, it's easy to get started with AI, especially if you're looking for a coding buddy.

Because it’s my bag, I’ve been diving deep into the incredibly exciting and rapidly evolving world of AI. The innovations in the open source arena are astonishing; indeed, you can now run a diverse set of LLMs from the convenience of your laptop. If you are willing to roll up your sleeves, there are plenty of OSS models and frameworks that cost you only the time it takes to set things up. 

One popular tool for running open source LLMs locally is Ollama. I started using it recently and I’ve been blown away by how useful, productive, and enlightening common workflows can become if you take the time to explore, understand, and ultimately leverage AI. With Ollama, you can run various LLMs such as Meta's Llama 2, Mistral, LLaVA, and my new favorite, Meta's Code Llama.

Meta's Code Llama

Code Llama is an LLM that can use text prompts to generate and discuss code. With Code Llama, you can generate Python, Java, TypeScript, and more. As Meta's documentation for Code Llama states:

It has the potential to make workflows faster and more efficient for developers and lower the barrier to entry for people who are learning to code. 

Indeed, the evolution of AI-enhanced productivity will come in the form of boosted workflows. These workflows will have fewer speed bumps resulting from context switching and accordingly increase productive capacity.

I recently saw AI's productivity-boosting potential firsthand. It’s been quite a long time since I’ve coded in Python (circa 2005!), so in an effort to refresh my memory on the finer points of Python, I decided to play some ping-pong. But in this case, I decided that I'd use Code Llama as my coding partner.

After installing Ollama, you need to download the Code Llama LLM like so:

$ ollama run codellama

Installing the Code Llama LLM is simple!

After downloading the LLM, which could take a bit, you’ll be presented with an in-terminal chat-like interface. You prompt Code Llama and it'll magically respond with (mostly) cogent answers.

Ping-pong pair programming

In ping-pong, your partner writes a failing test case. Your job is to write the least amount of code that makes that failing test case pass. Once you’ve done that, you then write a failing test case and your partner in turn writes the least amount of code to make your test case pass. You go back and forth and in the process of playing ping-pong, you'll undoubtedly learn new techniques, styles, and ways of approaching problem solving. Over the years, I've learned valuable lessons when pairing with folks! Ping pong has certainly made me a better programmer. I've even employed ping-pong in the context of interviewing.

I figured I’d get right into AI pair programming by building a simple stack implementation in Python. Stacks commonly have push and pop methods, which follow a last-in, first-out (or LIFO) discipline: push adds items onto the stack and pop removes the last item added. I was also aware that, out of the box, Python comes with a unit testing framework dubbed unittest (akin to Java's JUnit).
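If LIFO ordering sounds abstract, a plain Python list is all a bare-bones stack really needs, as this quick sketch shows (not part of the exercise yet, just an illustration):

```python
# A plain list already behaves like a stack: append() pushes and
# pop() removes the most recently added item (last in, first out).
stack = []
for item in ["first", "second", "third"]:
    stack.append(item)  # push

# Items come back off in reverse insertion order.
assert stack.pop() == "third"
assert stack.pop() == "second"
assert stack.pop() == "first"
```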

Kicking things off, I began with the following prompt:

>>> write me a test using python unittest for a pop method of a stack

Prompting Code Llama via Ollama.

As requested, Code Llama promptly returned some code using Python’s unittest framework:

import unittest
from collections import deque

class TestStack(unittest.TestCase):
    def setUp(self) -> None:
        self.stack = deque()

    def test_pop(self) -> None:
        self.assertEqual(self.stack.pop(), None)
        for i in range(5):
            self.stack.append(i)
        self.assertEqual(self.stack.pop(), 4)
        self.assertEqual(self.stack.pop(), 3)
        self.assertEqual(self.stack.pop(), 2)
        self.assertEqual(self.stack.pop(), 1)
        with self.assertRaises(IndexError):
            self.stack.pop()

Code Llama's answer to my "create a test for me in Python" prompt.

Code Llama additionally described the usage of unittest’s setUp method as well as the usage of assertEqual and assertRaises.

Of course, I was provided with way more implementation code than I really wanted. Nevertheless, the example was enough to get me started: it quickly brought me up to speed on unittest and also informed me that Python ships a stack-like data structure via the deque object.
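As a quick aside, deque (from the standard library's collections module) works as a stack as long as you stick to append and pop, and, notably, popping an empty deque raises an IndexError, which is exactly what trips up the generated test:

```python
from collections import deque

# append() and pop() operate on the right-hand end of a deque,
# which gives LIFO (stack) behavior.
d = deque()
d.append(1)
d.append(2)
assert d.pop() == 2  # last in, first out
assert d.pop() == 1  # the deque is now empty

# Rather than returning None, an empty deque raises on pop().
try:
    d.pop()
except IndexError as e:
    print(f"empty pop raises: {e}")
```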

I copied the code into a file dubbed test_stack.py and then ran it. Unfortunately, nothing happened. Then I remembered from my Python days of yore that I needed the equivalent of a main method, but I had forgotten the syntax. No problem: I just asked Code Llama, and it told me to insert this:

if __name__ == '__main__':
    unittest.main()

A main method in Python.

I added the above code and reran the test. Unsurprisingly, the test errors out: the very first assert, self.stack.pop(), raises an IndexError because the underlying deque is initially empty. Interestingly, the last part of test_pop tries to verify that very behavior via assertRaises, though because the test pushes five items and pops only four, the deque wouldn't actually be empty by that point.

Testing, testing, 1, 2, 3

As it turns out, AI (inadvertently) wrote me a failing test! Now all I needed to do was make the test pass with as little code as possible. I figured I would trim down the test case and simply test that popping an empty stack yields None:

import unittest
from collections import deque

class TestStack(unittest.TestCase):
    def setUp(self) -> None:
        self.stack = deque()

    def test_pop(self) -> None:
        self.assertEqual(self.stack.pop(), None)

if __name__ == '__main__':
    unittest.main()

Trimmed down test case.

Next, I wanted to write my own stack implementation and remove deque. Looking at the TestStack class, I remembered some important aspects of coding in Python: methods are defined with def, whitespace is a delimiter, and don't forget about colons. But I couldn't remember exactly how imports work. No problem! All I had to do was ask Code Llama:

>>> how do I import a python class in my local directory?

Prompting my buddy for some answers.

I was informed via a nice description that I could create a file (using snake case) and define a class inside of it. Then, in another file, I could import that class using the from file import class syntax. Excellent! Next, I needed to define a minimal stack – ideally, with only a pop method. Might as well ask my groovy friend Code Llama how to define a custom stack in Python:
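Here's a self-contained sketch of that pattern; the my_stack.py file and the Stack class are illustrative names, and the file is written to a temp directory purely so the example runs end to end (normally you'd just create the file by hand):

```python
import pathlib
import sys
import tempfile

# Define a class in its own snake_case file...
module_dir = tempfile.mkdtemp()
pathlib.Path(module_dir, "my_stack.py").write_text(
    "class Stack:\n"
    "    def __init__(self):\n"
    "        self.stack = []\n"
)
sys.path.insert(0, module_dir)  # make the directory importable

# ...then, in another file, use the `from file import class` syntax.
from my_stack import Stack

s = Stack()
assert s.stack == []
```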

>>> how do I implement a custom stack class in python?

I could use this prompt as a Google search, but then I'd have to click on the webpage I think is appropriate.

Code Llama dutifully spat out the following code:

class CustomStack:
    def __init__(self):
        self.stack = []

    def push(self, item):
        self.stack.append(item)

    def pop(self):
        if len(self.stack) == 0:
            raise IndexError("Stack is empty")
        return self.stack.pop()

    def peek(self):
        if len(self.stack) == 0:
            raise IndexError("Stack is empty")
        return self.stack[-1]

    def size(self):
        return len(self.stack)

A handy stack generated by Code Llama.

Not only did I get what appear to be working pop and push methods, but I also got peek and size methods to boot! This is obviously more code than I needed; accordingly, I simplified it with the intent of making the aforementioned test pass:

class Stack:
    def __init__(self):
        self.stack = []

    def pop(self):
        return None

The least amount of code required to get my failing test to pass.

When I reran the test case, things passed! Of course, I'd merely written the least amount of code necessary to make the test pass. Now it was my friend Code Llama's turn to write a failing test case. This wasn't too hard of a prompt to fashion:

>>> write me a test that fails for the pop method in this class: class Stack:
...     def pop(self):
...         return None

Prompting Code Llama to create a failing test case.

Impressively, Code Llama did exactly what I wanted. It generated a simple test case that will force a failure:

import unittest
from my_stack import Stack

class TestStack(unittest.TestCase):
    def test_pop(self):
        stack = Stack()
        self.assertEqual(stack.pop(), 1)

Thank you Code Llama for an easy test case to expose my simple code!

I added another test case to my existing TestStack, ran it, and sure enough, I had a failing test. Time to make it pass: my brutally simple pop method needed some real logic as opposed to returning None.

Can't pop without a push

Satisfying my friend Code Llama's new test case means I need to flesh out a push method so I can test popping a value.

class Stack:
    def __init__(self):
        self.stack = []

    def pop(self):
        if len(self.stack) > 0:
            return self.stack.pop()  # remove and return the top item
        else:
            return None

    def push(self, value):
        self.stack.append(value)

A refactored Stack class with a newly defined push method.

My Stack now has a push method that adds an item to an internal list (self.stack), and the pop method removes and returns the top item, or None if the list is empty.
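The push/pop contract can also be sanity-checked outside the unittest runner with a few bare assertions; this is a minimal sketch that repeats the class so the snippet is self-contained:

```python
class Stack:
    def __init__(self):
        self.stack = []

    def push(self, value):
        self.stack.append(value)

    def pop(self):
        # Return None on an empty stack (rather than raising) so the
        # first test case passes; otherwise remove and return the top item.
        if len(self.stack) > 0:
            return self.stack.pop()
        return None

s = Stack()
assert s.pop() is None  # empty stack yields None
s.push("a")
s.push("b")
assert s.pop() == "b"   # LIFO: last pushed comes off first
assert s.pop() == "a"
assert s.pop() is None  # drained back to empty
```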

I've refactored my test case slightly and have two passing test cases now.

import unittest
from stack import Stack

class TestStack(unittest.TestCase):
    def setUp(self):
        self.stack = Stack()

    def test_pop(self):
        self.assertEqual(self.stack.pop(), None)

    def test_pop_again(self):
        self.stack.push(1)
        self.assertEqual(self.stack.pop(), 1)

if __name__ == '__main__':
    unittest.main()

A valid test case fashioned with the help of my good friend Code Llama.

After a few back-and-forth interactions with Code Llama, my Python confidence increased. I can see how this technique would work equally well with a language I'm not familiar with, too.

What a match!

The game of ping-pong need not go on forever. In fact, a few rounds is often enough to get into the flow of things and learn a little from your partner. In my case, I had a lot of fun getting to know how Code Llama approaches problem solving, and along the way, it helped reacquaint me with Python.

I dig ping-pong pairing with an AI and will continue to do it going forward. As with all things AI-related, prompting is a crucial skill, as is a keen eye for code correctness: LLMs will confidently give you an incorrect answer (i.e., code that doesn't work).

In my case, I was using a terminal; nevertheless, there are other in-IDE options ranging from GitHub's Copilot to other innovative tools. You can also take advantage of Ollama and Code Llama via CodeGPT and have yourself some sweet VSCode integration.

Can you dig it, man?