Error handling and testing solutions

Depending on the part of the course you are following, we will review two kinds of tests:

  • Part A testing with asserts: moved to SoftPython

  • Part B testing with unittest: read this notebook

Testing

  • If it seems to work, then it actually works? Probably not.

  • The devil is in the details, especially for complex algorithms.

  • We will do a crash course on testing in Python

WARNING: Bad software can cause losses of millions of $/€ or even harm people. Suggested reading: Software Horror Stories

Where Is Your Software?

As a data scientist, you will likely end up with code which is moderately complex from an algorithmic point of view, but maybe not too big in size. Either way, once the red line is crossed you should start testing properly:

[Figure: where is your software]

In a typical scenario, you are a junior programmer and your senior colleague asks you to write a function to perform some task, giving only an informal description:

def my_sum(x,y):
    """ RETURN the sum of x and y
    """
    raise Exception("TODO IMPLEMENT ME!")

Even better, your colleague might provide you with some automated tests you can run to check that your function meets his/her expectations. If you are smart, you will even write tests for your own functions, to make sure every little piece you add to your software is a solid block you can build upon.
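To give a rough idea of what such automated tests could look like, here is a minimal sketch with plain assert statements. Both the implementation of my_sum and the chosen values are our own assumptions for illustration, they are not part of the colleague's description:

```python
def my_sum(x, y):
    """ RETURN the sum of x and y
    """
    return x + y   # one possible implementation (our assumption)


# Hypothetical checks a colleague might hand over with the description.
# If any of them is false, Python stops with an AssertionError.
assert my_sum(2, 3) == 5
assert my_sum(0, 0) == 0
assert my_sum(-1, 1) == 0
```

If the function is correct the checks run silently, otherwise the first failing one stops the program.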

even_numbers example

Let’s see a slightly more complex function:

[2]:
def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    raise Exception("TODO IMPLEMENT ME!")

In this case, if you run the function as it is, you are reminded to implement it:

>>> even_numbers(5)
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-2-d2cbc915c576> in <module>()
----> 1 even_numbers(5)

<ipython-input-1-a20a4ea4b42a> in even_numbers(n)
      8     [0,2,4,6,8]
      9     """
---> 10     raise Exception("TODO IMPLEMENT ME!")

Exception: TODO IMPLEMENT ME!

Why? The instruction

raise Exception("TODO IMPLEMENT ME!")

tells Python to immediately stop execution and signal an error to the caller of the function even_numbers. If there were commands right after raise Exception("TODO IMPLEMENT ME!"), they would not be executed. Here we are calling the function directly from the prompt and we didn’t tell Python how to handle the Exception, so Python just stopped and showed the error message given as parameter to the Exception.
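The following sketch shows both points: the instruction after raise never runs, and a caller can tell Python how to handle the error with a try / except block. The function not_ready and the list steps are hypothetical names used only for this illustration:

```python
steps = []   # records which instructions actually run

def not_ready():
    steps.append("before raise")
    raise Exception("TODO IMPLEMENT ME!")
    steps.append("after raise")    # never executed

try:
    not_ready()
except Exception as ex:
    # here we decide how to handle the error, so the program goes on
    message = str(ex)
    steps.append("handled")

print(steps)     # ['before raise', 'handled']
print(message)   # TODO IMPLEMENT ME!
```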

Spend time reading the function text!

Always carefully read the function text and ask yourself questions! What is the expected input? What should the output be? Is there any output to return at all, or should you instead modify a passed parameter in place (for example, when you sort a list)? Are there any edge cases (e.g. what happens for n=0)? What about n < 0?
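To see the difference between returning new data and modifying a parameter in place, compare the built-in function sorted with the list method sort (a small sketch, the variable names are ours):

```python
numbers = [3, 1, 2]

# sorted RETURNS a new list and leaves the input untouched
new_list = sorted(numbers)
print(new_list)   # [1, 2, 3]
print(numbers)    # [3, 1, 2]

# list.sort MODIFIES the list in place and returns None
result = numbers.sort()
print(result)     # None
print(numbers)    # [1, 2, 3]
```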

Let’s code a possible solution. As often happens, the first version may be buggy. In this case, for the sake of the example, we intentionally introduce a bug:

[3]:
def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    r = [2 * x for x in range(n)]
    r[n // 2] = 3   # <-- evil bug, puts number '3' in the middle, and 3 is not even ..
    return r

Typically the first test we do is printing the output and doing some ‘visual inspection’ of the result. In this case we find that many numbers are correct, but we might miss errors such as the wrong 3 in the middle:

[4]:
print(even_numbers(5))
[0, 2, 3, 6, 8]

Furthermore, if we enter commands at the prompt, each time we fix something in the code we need to enter the commands again to check everything is ok. This is inefficient, boring, and prone to errors.

Let’s add assertions

To go beyond the dumb “visual inspection” testing, it’s better to write some extra code so that Python checks for us whether the function actually returns what we expect, and throws an error otherwise. We can do so with the assert statement, which verifies that its argument is True. If it is not, it raises an AssertionError, immediately stopping execution.

Here we check the result of even_numbers(5) is actually the list of even numbers [0,2,4,6,8] we expect:

assert even_numbers(5) == [0,2,4,6,8]

Since our code is faulty, even_numbers returns the wrong list [0,2,3,6,8], which is different from [0,2,4,6,8], so the assertion fails showing AssertionError:

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-21-d4198f229404> in <module>()
----> 1 assert even_numbers(5) == [0,2,4,6,8]

AssertionError:

We got some output, but we would like to have it more informative. To do so, we may add a message, separated by a comma:

assert even_numbers(5) == [0,2,4,6,8], "even_numbers is not working !!"
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-18-8544fcd1b7c8> in <module>()
----> 1 assert even_numbers(5) == [0,2,4,6,8], "even_numbers is not working !!"

AssertionError: even_numbers is not working !!

So if we modify the code to fix bugs, we can just launch the assert commands again and get quick feedback about possible errors.
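One way to make rerunning the checks painless is to group them in a function, as in this sketch. The helper check_even_numbers is a hypothetical name of ours, and the version of even_numbers used here is the one without the evil bug:

```python
def even_numbers(n):
    """ Return a list of the first n even numbers, starting from zero
    """
    return [2 * x for x in range(n)]   # version without the bug


def check_even_numbers():
    """ Hypothetical helper: groups the checks so we can rerun
        all of them after each change to even_numbers
    """
    assert even_numbers(1) == [0], "n=1 should give [0]"
    assert even_numbers(3) == [0, 2, 4], "n=3 should give [0,2,4]"
    assert even_numbers(5) == [0, 2, 4, 6, 8], "even_numbers is not working !!"
    return True


check_even_numbers()   # silent when every assertion holds
```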

Error kinds

As a fact of life, errors happen. Sometimes your program may receive inconsistent data, like a parameter of the wrong type passed to a function (e.g. a string instead of an integer). A good principle to follow in these cases is to have the program detect weird situations and stop as soon as such a situation is found (e.g. in the Therac 25 case, if you detect excessive radiation, showing a warning sign is not enough, it’s better to stop). Note that stopping might not always be the desirable solution (if one pigeon enters one airplane engine, you don’t want to stop all the other engines). Checking that function parameters are correct is called precondition checking.

There are roughly two cases for errors: an external user misusing your program, and just plain wrong code. Let’s analyze both:

Error kind a) An external user misuses your program.

You can assume that whoever uses your software, final users or other programmers, will try their very best to wreck your precious code by passing all sorts of nonsense to functions. Everything can come in: strings instead of numbers, empty arrays, None objects … In this case you should signal to the users that they made a mistake. The crudest signal you can give is raising an Exception with raise Exception("Some error occurred"), which will stop the program and print the stacktrace in the console. Maybe final users won’t understand a stacktrace, but at least programmers will hopefully get a clue about what is happening.

In these cases you can raise an appropriate Exception, like TypeError for wrong types and ValueError for arguments of the right type but with an unacceptable value. Other basic exceptions can be found in the Python documentation. Notice you can also define your own, if needed (we won’t consider custom exceptions in this course).
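Python built-in functions follow the same convention, as this small sketch shows. The helper kind_of_error is a hypothetical name introduced only to print which exception arrives:

```python
def kind_of_error(func, arg):
    """ Hypothetical helper: calls func(arg) and RETURN the name of
        the exception raised, or None if there was no error
    """
    try:
        func(arg)
        return None
    except Exception as ex:
        return type(ex).__name__


print(kind_of_error(len, 5))        # TypeError : an int has no length
print(kind_of_error(int, "ciao"))   # ValueError: right type (str), unusable value
print(kind_of_error(int, "42"))     # None      : no error at all
```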

NOTE: Many times, you can consider yourself the ‘careless external user’ to guard against.

Let’s enrich the function with some appropriate type checking. Note that for checking input types you can use the function type():

[5]:
type(3)
[5]:
int
[6]:
type("ciao")
[6]:
str
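A detail worth knowing: comparing with type() is a strict check, while the built-in isinstance also accepts subclasses. The sketch below uses booleans to show the difference (in Python, bool is a subclass of int); the variable names are ours:

```python
print(type(3) is int)      # True
print(type(3.0) is int)    # False

# strict check: the type of True is bool, not int
strict_check = type(True) is int
print(strict_check)        # False

# loose check: bool is a subclass of int, so isinstance accepts it
loose_check = isinstance(True, int)
print(loose_check)         # True
```

In this notebook we stick to the strict check with type().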

Let’s add the code for checking the even_numbers example:

[7]:
def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    if type(n) is not int:
        raise TypeError("Passed a non integer number: " + str(n))

    if n < 0:
        raise ValueError("Passed a negative number: " + str(n))

    r = [2 * x for x in range(n)]
    return r

Let’s pass a wrong type and see what happens:

>>> even_numbers("ciao")

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-a908b20f00c4> in <module>()
----> 1 even_numbers("ciao")

<ipython-input-13-b0b3a85f2b2a> in even_numbers(n)
      9     """
     10     if type(n) is not int:
---> 11         raise TypeError("Passed a non integer number: " + str(n))
     12
     13     if n < 0:

TypeError: Passed a non integer number: ciao

Now let’s try to pass a negative number - it should stop immediately with a meaningful message:

>>> even_numbers(-5)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-15-3f648fdf6de7> in <module>()
----> 1 even_numbers(-5)

<ipython-input-13-b0b3a85f2b2a> in even_numbers(n)
     12
     13     if n < 0:
---> 14         raise ValueError("Passed a negative number: " + str(n))
     15
     16     r = [2 * x for x in range(n)]

ValueError: Passed a negative number: -5

Now, even if you ship your code to careless users, as soon as they make a mistake they will get properly notified.
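Programmers calling the function can also react to these specific exceptions instead of letting the program stop. A minimal sketch, where describe_call is a hypothetical wrapper of ours and even_numbers is repeated to keep the snippet self-contained:

```python
def even_numbers(n):
    """ Return a list of the first n even numbers, starting from zero
    """
    if type(n) is not int:
        raise TypeError("Passed a non integer number: " + str(n))
    if n < 0:
        raise ValueError("Passed a negative number: " + str(n))
    return [2 * x for x in range(n)]


def describe_call(n):
    """ Hypothetical wrapper: RETURN a string describing the outcome
        of even_numbers(n), without ever stopping the program
    """
    try:
        return "OK: " + str(even_numbers(n))
    except (TypeError, ValueError) as ex:
        # only the two exceptions we expect are handled here
        return "ERROR: " + str(ex)


print(describe_call(3))        # OK: [0, 2, 4]
print(describe_call(-5))       # ERROR: Passed a negative number: -5
print(describe_call("ciao"))   # ERROR: Passed a non integer number: ciao
```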

Error kind b): Your code is just plain wrong

In this case, it’s 100% your fault, and this sort of bug should never pop up in production. For example, your code internally passes wrong stuff, like strings instead of integers, or wrong ranges (typically an integer outside array bounds). So if you have an internal function nobody else should call directly, and you suspect it is being passed wrong parameters or at some point it has inconsistent data, you could add an assertion to quickly spot the error:

[8]:
def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    assert type(n) is int, "type of n is not correct: " + str(type(n))
    assert n >= 0, "Found negative n: " + str(n)

    r = [2 * x for x in range(n)]

    return r

As before, the function will stop as soon as we call it with wrong parameters. The big difference is that this time we are assuming even_numbers is just for personal use and nobody else except us should call it directly.

Since assertions consume CPU time, IF we care about performance AND once we are confident our program behaves correctly, we can even have Python skip them by running the interpreter with the -O command-line flag. For more info, see Python wiki
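The effect of -O can be observed with the following sketch, which launches a tiny one-line program twice in a separate Python process. The one-line program and the variable names are assumptions made for the example:

```python
import subprocess
import sys

# A one-line program with an assertion that always fails
program = "assert 1 == 2, 'never true'; print('assert was skipped')"

# Normal run: the assertion stops the program with an AssertionError
normal = subprocess.run([sys.executable, "-c", program],
                        capture_output=True, text=True)

# Run with -O: assert statements are ignored, so the print is reached
optimized = subprocess.run([sys.executable, "-O", "-c", program],
                           capture_output=True, text=True)

print(normal.returncode != 0)      # True, the program failed
print(optimized.stdout.strip())    # assert was skipped
```

Notice that with -O the checks silently disappear, which is why assertions are not a good fit for validating input coming from external users.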

EXERCISE: try to call latest definition of even_numbers with wrong parameters, and see what happens.

NOTE: here we are using the correct definition of even_numbers, not the buggy one with the 3 in the middle of the returned list!

Testing with Unittest

NOTE: Testing with Unittest is only done in PART B of this course

Is there anything better than assert for testing? assert can be a quick way to check, but it doesn’t tell us exactly which number is wrong in the list returned by even_numbers(5). Luckily, Python offers us a better option, which is a complete testing framework called unittest. We will use unittest because it is the standard one, but if you’re doing other projects you might consider using better ones like pytest (note it can also execute tests made with unittest, so if your Visual Studio Code for some reason doesn’t work with unittest, you can try setting pytest as test framework).

So let’s give unittest a try. Suppose you have a file called file_test.py like this:

[9]:
import unittest

def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    r = [2 * x for x in range(n)]
    r[n // 2] = 3   # <-- evil bug, puts number '3' in the middle
    return r

class MyTest(unittest.TestCase):

    def test_long_list(self):
        self.assertEqual(even_numbers(5),[0,2,4,6,8])


We won’t explain what class means (for classes see the book chapter). The important thing to notice is the method definition:

def test_long_list(self):
    self.assertEqual(even_numbers(5),[0,2,4,6,8])

In particular:

  • the method is declared like a function, and its name begins with test_

  • method takes self as parameter

  • self.assertEqual(even_numbers(5),[0,2,4,6,8]) executes the assertion. Other assertions could be self.assertTrue(some_condition) or self.assertFalse(some_condition)
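Here is a small sketch with the three assertion methods just mentioned. The class name AssertionExamples is hypothetical, and only to keep the snippet self-contained the tests are launched from code with a loader and a runner instead of the terminal:

```python
import unittest

class AssertionExamples(unittest.TestCase):   # hypothetical class name

    def test_true(self):
        self.assertTrue(4 % 2 == 0)       # passes when the condition is True

    def test_false(self):
        self.assertFalse(3 % 2 == 0)      # passes when the condition is False

    def test_equal(self):
        self.assertEqual([2 * x for x in range(3)], [0, 2, 4])


# Collect the test methods of the class and run them, keeping the outcome
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
result = unittest.TextTestRunner(verbosity=0).run(suite)

print(result.testsRun, result.wasSuccessful())   # 3 True
```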

Running tests

To run the tests, enter the following command in the terminal:

python -m unittest file_test

!!!!! WARNING: In the call above, DON’T append the extension .py to file_test !!!!!!

!!!!! WARNING: Still, on the hard-disk the file MUST be named with a .py at the end, like file_test.py!!!!!!

You should see an output like the following:

[10]:
jupman.show_run(MyTest)
F
======================================================================
FAIL: test_long_list (__main__.MyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/ipykernel_21160/3269760140.py", line 19, in test_long_list
    self.assertEqual(even_numbers(5),[0,2,4,6,8])
AssertionError: Lists differ: [0, 2, 3, 6, 8] != [0, 2, 4, 6, 8]

First differing element 2:
3
4

- [0, 2, 3, 6, 8]
?        ^

+ [0, 2, 4, 6, 8]
?        ^


----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (failures=1)

Now you can see a nice display of where the error is, exactly in the middle of the list!

When tests don’t run

When -m unittest does not work, you keep seeing absurd errors like Python not finding a module, and you are getting desperate (especially because unittest is included in Python by default, so there is no need to install it!), try putting the following code at the very end of the file you are editing:

unittest.main()

Then simply run your file with:

python file_test.py

In this case it should REALLY work. If it still doesn’t, call the Ghostbusters. Or, better, the IndentationBusters: you likely have tabs mixed with spaces mixed with very bad luck.

Adding tests

How can we add (good) tests? Since the best ones are usually short, it is better to start with small boundary cases, for example n=1, which according to the function documentation should produce a list containing zero:

[11]:
class MyTest(unittest.TestCase):

    def test_one_element(self):
        self.assertEqual(even_numbers(1),[0])

    def test_long_list(self):
        self.assertEqual(even_numbers(5),[0,2,4,6,8])

Let’s run the command again:

python -m unittest file_test
[12]:
jupman.show_run(MyTest)
FF
======================================================================
FAIL: test_long_list (__main__.MyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/ipykernel_21160/1413161586.py", line 7, in test_long_list
    self.assertEqual(even_numbers(5),[0,2,4,6,8])
AssertionError: Lists differ: [0, 2, 3, 6, 8] != [0, 2, 4, 6, 8]

First differing element 2:
3
4

- [0, 2, 3, 6, 8]
?        ^

+ [0, 2, 4, 6, 8]
?        ^


======================================================================
FAIL: test_one_element (__main__.MyTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/ipykernel_21160/1413161586.py", line 4, in test_one_element
    self.assertEqual(even_numbers(1),[0])
AssertionError: Lists differ: [3] != [0]

First differing element 0:
3
0

- [3]
+ [0]

----------------------------------------------------------------------
Ran 2 tests in 0.003s

FAILED (failures=2)

From the tests we can now see there is clearly something wrong with the number 3 that keeps popping up, making both tests fail. You can see immediately how many tests have failed by looking at the two F characters at the top of the output (one per failed test), while the FAIL: lines tell you which ones. Let’s fix the code by removing the buggy line:

[13]:
def even_numbers(n):
    """
    Return a list of the first n even numbers

    Zero is considered to be the first even number.

    >>> even_numbers(5)
    [0,2,4,6,8]
    """
    r = [2 * x for x in range(n)]
    # NOW WE COMMENTED THE BUGGY LINE  r[n // 2] = 3   # <-- evil bug, puts number '3' in the middle
    return r

And run the command yet again:

python -m unittest file_test
[14]:
jupman.show_run(MyTest)
..
----------------------------------------------------------------------
Ran 2 tests in 0.002s

OK

Wonderful, both tests have passed and we got rid of the bug.

WARNING: DON’T DUPLICATE TEST CLASS NAMES AND/OR METHODS!

In the following, you will be asked to add tests. Just add NEW methods with NEW names to the EXISTING class MyTest !

Exercise: boundary cases

Think about other boundary cases, and try to add corresponding tests.

  • Can we ever have an empty list?

  • Can n be equal to zero? Add a test inside MyTest class for its expected result.

  • Can n be negative? In this case the function text tells us nothing about the expected behaviour, so we might choose it now: either the function raises an error, or it gives back something, for example a list of even negative numbers. Try to modify even_numbers and add a corresponding test inside MyTest class which expects even negative numbers (starting from zero).

Exercise: expecting assertions

What if the user passes us a float like 3.5 instead of an integer? In Python 3, even_numbers(3.5) already fails with a TypeError raised by range, but the error message says nothing about even_numbers. We might decide to be picky and explicitly refuse inputs other than integers. Try to modify even_numbers so that, when the input is not of type int, it raises TypeError itself with a meaningful message (to check the type, you can write type(n) == int).

To test for it, add the following test inside MyTest class:

def test_type(self):

    with self.assertRaises(TypeError):
        even_numbers(3.5)

The with block tells unittest to expect the code inside it to raise the exception TypeError:

  • If even_numbers(3.5) actually raises a TypeError exception, nothing happens and the test passes

  • If even_numbers(3.5) does not raise a TypeError exception, the with block raises AssertionError and the test fails

After you have completed the previous task, consider the case when the input is the float 4.0: here it might make sense to still accept it, so modify even_numbers accordingly and write a test for it.

Exercise: good tests

What difference is there between the following two test classes? Which one is better for testing?

class MyTest(unittest.TestCase):

    def test_one_element(self):
        self.assertEqual(even_numbers(1),[0])

    def test_long_list(self):
        self.assertEqual(even_numbers(5),[0,2,4,6,8])

and

class MyTest(unittest.TestCase):

    def test_stuff(self):
        self.assertEqual(even_numbers(1),[0])
        self.assertEqual(even_numbers(5),[0,2,4,6,8])

Running unittests in Visual Studio Code

You can run and debug tests in Visual Studio Code, which is very handy. First, you need to set it up.

  1. Hit Control-Shift-P (on Mac: Command-Shift-P) and type Python: Configure Tests

[Screenshot: vscode 1]

  2. Select unittest:

[Screenshot: vscode 2]

  3. Select . root directory (we assume tests are in the folder that you’ve opened):

[Screenshot: vscode 3]

  4. Select *test*.py Python files containing the word 'test':

[Screenshot: vscode 4]

Hopefully, in the currently opened test file new labels should appear above the class and the test methods, like in the following example. Try to click on them:

[Screenshot: vscode 5]

In the bottom bar, you should see a recap of the tests run (right side of the picture):

[Screenshot: vscode 6]

TROUBLESHOOTING

If you encounter problems running tests and have Anaconda, sometimes an easy solution can be just closing Visual Studio Code and running it from the Anaconda Navigator. You can also try updating it.

Running tests by console does not work:

  • remember to SAVE the files before executing tests: in Windows, a file appears as not saved when its filename in the tab is written in italics; on Linux, you might see a dot to the right of the filename

Run Test label does not show up in code:

  • if you see red squiggles in the code, most probably the syntax is not correct and thus no test will get discovered! If this is the case, fix the syntax error, SAVE, and then tell Visual Studio Code to discover tests.

  • you might also try Right click->Run current Test File.

  • try selecting another testing framework, for example pytest, which is also capable of discovering and executing unittest tests.

  • if you are really out of luck with the editor, there is always the option of running tests from the console.

Spend time using the console !!!!

During exams VSCode testing might not work, so please be prepared to use the console.

Functional programming

In functional programming, functions behave as mathematical ones so they always take some parameter and return new data without ever changing the input. They say functional programming is easier to test. Why?

Immutable data structures: all data structures are (or are meant to be) immutable -> no code can ever tweak your data, so other developers cannot (or at least should not be able to) inadvertently change it.
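In Python terms, the idea can be sketched by comparing a function which returns new data with one which modifies its parameter. The names doubled and double_in_place are hypothetical, chosen only for this example:

```python
def doubled(values):
    """ Functional style: RETURN a new tuple, the input is never changed
    """
    return tuple(2 * x for x in values)


def double_in_place(values):
    """ Imperative style: MODIFY the list passed as parameter, return nothing
    """
    for i in range(len(values)):
        values[i] = 2 * values[i]


data = (1, 2, 3)          # tuples are immutable
print(doubled(data))      # (2, 4, 6)
print(data)               # (1, 2, 3)

shared = [1, 2, 3]        # lists are mutable
double_in_place(shared)
print(shared)             # [2, 4, 6]: whoever holds 'shared' sees the change

try:
    data[0] = 100         # tuples refuse item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(tuple_changed)      # False
```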

Simpler parallel computing: the point above is particularly important in parallel computation, where the system can schedule thread executions differently each time you run the program. With multiple threads, it can be very hard to reproduce a bug in which a thread wrongly changes data that is supposed to be managed exclusively by another one: the program might fail in one run and succeed in another just because the system scheduled the code execution differently! Functional programming frameworks like Spark solve these problems very nicely.

Easier to reason about code: it is much easier to reason about functions, as we can use standard equational reasoning on inputs and outputs, as traditionally done in algebra. To understand what we’re talking about, you can see these slides: Visual functional programming (we will talk more about it in class)
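As a taste of equational reasoning, the sketch below checks that applying two functions one after the other gives the same result as applying their composition once, which holds because neither function changes its input. The functions double and increment are hypothetical examples of ours:

```python
def double(x):
    return 2 * x

def increment(x):
    return x + 1

numbers = [1, 2, 3]

# Applying two functions in two separate passes...
two_passes = [increment(y) for y in [double(x) for x in numbers]]

# ...equals applying their composition in one pass: (g o f)(x) = g(f(x))
one_pass = [increment(double(x)) for x in numbers]

print(two_passes)   # [3, 5, 7]
print(one_pass)     # [3, 5, 7]
print(numbers)      # [1, 2, 3], the input is unchanged
```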
