Unit testing and Integration testing
From DANSE
Table of contents |
First off, there are a wealth of resources available on the internet on Unit testing. Some of the better ones specifically concerning Python are:
- Unittest (http://www.python.org/doc/current/lib/module-unittest.html) within Python's Global Module Index on the official python website. Unittest is the standard unit testing framework for Python, and will be used in the examples below.
- Dive into Python (http://diveintopython.org/toc/index.html) by Mark Pilgrim. Chapters 13-15 present an excellent approach to writing code from functional requirements.
For other languages:
- CppTest (http://cpptest.sourceforge.net) is an excellent unit testing framework for C++.
- JUnit (http://junit.sourceforge.net) is a related unit testing framework for Java.
Introduction
Unit testing is an important step in the process of developing and maintaining production-quality software. In a large-scale software project, especially one that has several developers that will swap components, proper unit and integration testing is essential. Developing good functional requirements and unit tests before writing code can take some time, are far worth the effort.
If you were building a very expensive instrument (i.e. to measure the inelastic scattering of neutrons) or a very precise instrument (i.e. to release chemicals upon microvoltage changes in brain activity), you would outline all of the important requirements that your instrument needs to meet far before you begin to build the instrument. You'd engineer the dimensions, power requirements, wiring, composition, and so on until there were solid plans for every step of the instrument far before you begin to begin attaching pieces together. So, why do software any differently?
Software is commonly planned and written in what can be best described as dynamically, but is often really better described as kludged. To build durable and reliable software, it is necessary to work like a 'Software Engineer' -- to carefully determine the functional requirements for your code, to build a testing suite for your code out of your requirements, and then to finally write the code until the requirements are met.
Why bother, you still ask?
- Before writing code, it forces you to detail your requirements in a useful fashion.
- While writing code, it keeps you from over-coding. When all the test cases pass, the function is complete.
- When refactoring code, it assures you that the new version behaves the same way as the old version.
- When writing code in a team, it increases confidence that the code you're about to commit isn't going to break other peoples' code, because you can run their unittests first.
If we are looking to write code that supplies some basic calcualtor functions, we would first determine which math functions we would like to incoporate. Then, for each math function we've chosen, we would build a component that impliments the selected math function. We'll develop the components for the average and factorial functions through the course of this tutorial; then integrate them into a simple calculator. We will first outline the functional requirements for each of our components, then write the unit tests for our code, then finally write our code. After all of our code satisfies the functional requirements, we'll do any necessary refactoring.
Note: the code used in the following examples can be obtained from the DANSE cvs on arcs.cacr.caltech.edu in the 'tutorials' directory.
Functional Requirements
Functional requirements are a set of explicit expectations for your code. These expectations should not only outline the behavior of your code in situations involving the successful processing ('Success' checks) of input and output, but should also detail what happens when there are errors generated ('Failure' checks). There may also be additional expectations you have for your code that are not obvious, but are required nonetheless when you have code that contains reciprocal functions. When a conversion is performed and then the reciprocal conversion is applied, the original information may expected as the result ('Sanity' checks).
Develop expectations for your code
The first step in writing functional requirements should be to write our a set of rules and expectations for your code. So for our set of math functions, what are the expectations?
AVERAGE: ave(x) = sum(x[i])/n
- the average of n values is equal to the sum of all values divided by n
- one or more values (n >= 1) are required
- the values must be given as a sequence
- the values to be averaged can be integers, floats, or mixtures of both types
FACTORIAL: fact(n) = n * fact(n-1)
- the factorial of n is equal to the product of n and the factorial of n-1
- the value supplied must be an integer
- the value supplied must be non-negative
- a single value is required
Convert expectations into functional requirements
Now that we've developed expectations for our code, we can draft a set of requirements that fulfill these expectations. We need to make sure to define a set of explicit statements that can be tested; thus, the functional requirements should state what happens upon successful return, failure, or within a sanity check. The appropriate type of variable should also be specified where possible, as well as any computational limits that may be relevant.
Average
- ave should return the average for a given numerical sequence {i.e. list, tuple} of n values
- ave should return the given value for a given single number {i.e. int, long, float, complex, ...}
- ave should fail when given non-numerical input {i.e. bool, string, dict, ...}
- ave should fail when the given sequence contains at least one item that is not single-valued {i.e. list, tuple, dict}
- ave should fail when given an empty sequence
Factorial
- fact should return the factorial for a given integer {i.e. int, long}
- fact should fail when given a non-integer
- fact should fail when given a negative integer
- fact should fail when given a non-single-valued item
If you are not able to draft a set of explicit checks from your expectations, then either your code is not sufficiently fine-grained, or your idea of what is needed not well enough defined. Coarse-grained code translates less well to a reusable component, and is further much harder to debug and maintain. Thus if you find yourself with an extremely long list of functional requirements, you should probably consider further modularizing your code into more easily digestable components. However, if you cannot explicitly state the functional requirements for your code, then it is not yet time for you to begin programming -- you need to research your problem further before attempting to solve it!
Unit Testing
With our functional requirements in hand, we can finally begin coding. However, instead of writing the actual code, we will first translate the functional requirements into unit test code. It is a good convention to keep the test code seperate from the actual code, so we will start by creating the rather generic-named 'alltest.py'.
Build a Empty Test Frame
First we construct a empty frame for our functions. We begin with a simple comment header, and an import of python's unit testing module 'unittest'.
We will also begin to build the structure for excecuting our test suite. We plan to store all of our math functions in 'xmath.py'; thus, when we create test cases for each of our functions we will name them accordingly (Xmath_Ave_TestCase and Xmath_Fact_TestCase). The code below illustrates how to build a complete test suite from each test case, and have the output directed to stdout.
# ========================================================== # alltest.py # ========================================================== import unittest if __name__ == "__main__": suite1 = unittest.makeSuite(Xmath_Ave_TestCase) suite2 = unittest.makeSuite(Xmath_Fact_TestCase) alltests = unittest.TestSuite((suite1,suite2)) unittest.TextTestRunner(verbosity=2).run(alltests) # End of file
Create a Test Case
The next step involves a further fleshing-out of our test suite for our software components. Within 'xmath.py', we plan to have the functions 'ave' and 'fact', so here we will build placeholders for each of the unit tests for each function within the corresponding TestCase class. It is convention to name the unit tests to begin with 'test' and then to end with what is being tested.
# ========================================================== # alltest.py # ========================================================== import unittest class Xmath_Ave_TestCase(unittest.TestCase): def test_avesequence(self): """ave: return the known average for a known numerical sequence""" return def test_avenumber(self): """ave: return the given value for a given single number""" return def test_nonnumerical(self): """ave: fail when given non-numerical input""" return def test_nonsinglevaluedsequence(self): """ave: fail if given sequence contains a non-single-valued item""" return def test_emptysequence(self): """ave: fail when given an empty sequence""" return class Xmath_Fact_TestCase(unittest.TestCase): def test_factinteger(self): """fact: return the known factorial for a known integer""" return def test_noninteger(self): """fact: fail when given a non-integer""" return def test_negative(self): """fact: fail when given a negative integer""" return def test_nonsinglevalueditem(self): """fact: fail when given a non-single-valued item""" return if __name__ == "__main__": suite1 = unittest.makeSuite(Xmath_Ave_TestCase) suite2 = unittest.makeSuite(Xmath_Fact_TestCase) alltests = unittest.TestSuite((suite1,suite2)) unittest.TextTestRunner(verbosity=2).run(alltests) # End of file
Note that the above test cases have unit test placeholders that exactly match our list of functional requirements. We now have a working test suite. You can run the tests from the commandline:
>$ python alltest.py ave: return the given value for a given single number ... ok ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ok ave: fail when given non-numerical input ... ok ave: fail if given sequence contains a non-single-valued item ... ok fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ok fact: fail when given a non-integer ... ok fact: fail when given a non-single-valued item ... ok ---------------------------------------------------------------------- Ran 9 tests in 0.007s OK
You should see that all the tests pass. Having all tests pass should be the goal of each step in the code development process.
Create Tests for Each Function
Now we can start adding the content of the tests... (FIXME)
# ========================================================== # alltest.py # ========================================================== from xmath import * import unittest class Xmath_Ave_TestCase(unittest.TestCase): def test_avesequence(self): """ave: return the known average for a known numerical sequence""" sequences = ( ([1,1,1,1,1,1,1], 1), ([1,2,3,4,5], 3), ([-1,-2], -1.5), ([7], 7), ([0,33.5,66.245,100], 49.93625), ([2e101,5e100], 1.25e101), ([2e-101,5e-100], 2.6e-100), ([2e20,2e-20], 1e20), ([1.453e22,1.453e22], 1.453e22), ((1,2,3,4,5), 3), ((((1,2,3,4,5))), 3) ) for x, average in sequences: self.assertEqual(ave(x), average) return def test_avenumber(self): """ave: return the given value for a given single number""" for x in range(1,545,34): self.assertEqual(ave(x), x) for x in (0.345,1.1,3214.887,12.099621235,-23.32): self.assertEqual(ave(x), x) self.assertEqual(ave(1e100), 1e100) self.assertEqual(ave(1e-100), 1e-100) return def test_nonnumerical(self): """ave: fail when given non-numerical input""" for x in ['a', 'hello', bool(True), None, {1:1,2:2}]: self.assertRaises(XmathError, ave, x) return def test_nonsinglevaluedsequence(self): """ave: fail if given sequence contains a non-single-valued item""" for x in ( [1,2,3,'abcd',5], ([1,2],3,4), ((1,2),(3,4)) ): self.assertRaises(XmathError, ave, x) return def test_emptysequence(self): """ave: fail when given an empty sequence""" for x in ((),[]): self.assertRaises(XmathError, ave, x) return class Xmath_Fact_TestCase(unittest.TestCase): def test_factinteger(self): """fact: return the known factorial for a known integer""" exact_values = ( (0,1), (1,1), (5,120), (10,3628800), (13,6227020800) ) for x, factorial in exact_values: self.assertEqual(fact(x), factorial) return def test_noninteger(self): """fact: fail when given a non-integer""" for x in (0.345,1.1,3214.887,12.099621235,-23.32): self.assertRaises(XmathError, fact, x) for x in ['a', bool(True), None]: self.assertRaises(XmathError, fact, x) return def test_negative(self): """fact: fail when given a negative integer""" for x in (-1,-5,-13,-263): self.assertRaises(XmathError, fact, x) return def test_nonsinglevalueditem(self): """fact: fail when given a non-single-valued item""" for x in ['hello',[1,2,3,4,5],(6,7,8,9),{1:1,2:2}]: self.assertRaises(XmathError, fact, x) return if __name__ == "__main__": suite1 = unittest.makeSuite(Xmath_Ave_TestCase) suite2 = unittest.makeSuite(Xmath_Fact_TestCase) alltests = unittest.TestSuite((suite1,suite2)) unittest.TextTestRunner(verbosity=2).run(alltests) # End of file
explain assertEqual and assertRaises... (FIXME)
we also have to create a skeleton for xmath.py... (FIXME)
# ========================================================== # xmath.py # ========================================================== import math class XmathError(Exception): pass def fact(n): """fact(n) --> Compute the factorial of n""" return def ave(x): """ave(x) --> Calculate the mean of x""" return if __name__ == '__main__': print 'fact(10) = ', fact(10) x = [10,19,30,33] print 'ave([10,19,30,33]) = ', ave(x)
This skeleton code allows us to try the tests again... we should get lots of errors, because we have empty functions! The code at the end of xmath.py allows us to easily test a standard use case or two for each function as a quick check; however, these should not be used as a substitute for unit testing. (FIXME)
>$ python xmath.py fact(10) = None ave([10,19,30,33]) = None
>$ python alltest.py ave: return the given value for a given single number ... FAIL ave: return the known average for a known numerical sequence ... FAIL ave: fail when given an empty sequence ... FAIL ave: fail when given non-numerical input ... FAIL ave: fail if given sequence contains a non-single-valued item ... FAIL fact: return the known factorial for a known integer ... FAIL fact: fail when given a negative integer ... FAIL fact: fail when given a non-integer ... FAIL fact: fail when given a non-single-valued item ... FAIL ---------------------------------------------------------------------- Ran 9 tests in 0.003s FAILED (failures=9)
The details of each error have been left out of the above box... (FIXME)
Test-first Coding
Development on the unit test code is done, and we must now make the main code conform to our functional requirements. The first step is to add some of the core math code... (FIXME)
# ========================================================== # xmath.py # ========================================================== import math class XmathError(Exception): pass def fact(n): """fact(n) --> Compute the factorial of n""" if n == 0 or n == 1: return 1 else: return n*fact(n-1) def ave(x): """ave(x) --> Calculate the mean of x""" xsum = 0.0 for i in range(len(x)): xsum = xsum + x[i] return xsum/len(x) if __name__ == '__main__': print 'fact(10) = ', fact(10) x = [10,19,30,33] print 'ave([10,19,30,33]) = ', ave(x)
Now we can use the basic tests attached to xmath.py to test the anticipated behavior of our two functions... (FIXME)
>$ python xmath.py fact(10) = 3628800 ave([10,19,30,33]) = 23.0
Success! Many programmers would stop here; however, this is not good enough. We must check how our functions hold up against our unit tests... (FIXME)
>$ python alltest.py ave: return the given value for a given single number ... ERROR ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ERROR ave: fail when given non-numerical input ... ERROR ave: fail if given sequence contains a non-single-valued item ... ERROR fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ERROR fact: fail when given a non-integer ... ERROR fact: fail when given a non-single-valued item ... ERROR ---------------------------------------------------------------------- Ran 9 tests in 0.057s FAILED (errors=7)
Two things should stand out here. Firstly, the only things that passed were tests of known values (i.e. the most common uses for the functions). Secondly, the error that our unit tests expects is 'XmathError'; in our unit test suite we decided to have our functions throw this error instead of those provided by python. So let's fix the code to throw the correct errors... (FIXME)
# ========================================================== # xmath.py # ========================================================== import math class XmathError(Exception): pass def fact(n): """fact(n) --> Compute the factorial of n""" try: if n == 0 or n == 1: return 1 else: return n*fact(n-1) except: raise XmathError def ave(x): """ave(x) --> Calculate the mean of x""" try: xsum = 0.0 for i in range(len(x)): xsum = xsum + x[i] return xsum/len(x) except: raise XmathError if __name__ == '__main__': print 'fact(10) = ', fact(10) x = [10,19,30,33] print 'ave([10,19,30,33]) = ', ave(x)
Now we try our test suite again.
>$ python alltest.py ave: return the given value for a given single number ... ERROR ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ok ave: fail when given non-numerical input ... ok ave: fail if given sequence contains a non-single-valued item ... ok fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ok fact: fail when given a non-integer ... FAIL fact: fail when given a non-single-valued item ... ok ====================================================================== ERROR: should return the given value for a given single number ---------------------------------------------------------------------- Traceback (most recent call last): File "alltest.py", line 29, in test_avenumber self.assertEqual(ave(x), x) File "/home/mmckerns/dev/tutorials/native/python/xmath.py", line 45, in ave raise XmathError XmathError ====================================================================== FAIL: should fail when given a non-integer ---------------------------------------------------------------------- Traceback (most recent call last): File "alltest.py", line 71, in test_noninteger self.assertRaises(XmathError, fact, x) File "/usr/local/lib/python2.3/unittest.py", line 285, in failUnlessRaises raise self.failureException, excName AssertionError: XmathError ---------------------------------------------------------------------- Ran 9 tests in 0.159s FAILED (failures=1, errors=1)
So we didn't do as badly as initially indicated. From the full traceback we can see that the remaining problems stem from ave() failing when a single number is entered [i.e. ave(10)], and fact() not failing when a non-integer is uesd [i.e. fact(1.5)]. We can now fix these issues in the code with conditionals... (FIXME)
# ========================================================== # xmath.py # ========================================================== import math class XmathError(Exception): pass def fact(n): """fact(n) --> Compute the factorial of n""" try: if isinstance(n,bool): raise #numeric type required if int(n) != n: raise #value must be an integer if n == 0 or n == 1: return 1 else: return n*fact(n-1) except: raise XmathError def ave(x): """ave(x) --> Calculate the mean of x""" try: if isinstance(x,bool): raise #numeric type required for type in [int,long,float]: if isinstance(x,type): return x xsum = 0.0 for i in range(len(x)): xsum = xsum + x[i] return xsum/len(x) except: raise XmathError if __name__ == '__main__': print 'fact(10) = ', fact(10) x = [10,19,30,33] print 'ave([10,19,30,33]) = ', ave(x)
This time, all the tests pass.
>$ python alltest.py ave: return the given value for a given single number ... ok ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ok ave: fail when given non-numerical input ... ok ave: fail if given sequence contains a non-single-valued item ... ok fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ok fact: fail when given a non-integer ... ok fact: fail when given a non-single-valued item ... ok ---------------------------------------------------------------------- Ran 9 tests in 0.077s OK
Since all the unit tests pass, we can stop coding. Our code satisfies all of the functional requirements, and is easliy testable when modifications are made. Congratulations, we are done... (FIXME)
Refactoring
Refactoring generally means that we return to our code to make it 'better'. We do this in a variety of situations:
- Modifying our math code to make it faster, use less disk memory, use less disk space, or so on. In this case, the unit tests are untouched.
- Modifying our functional requirements. This situation usually arises when we form new expectations of our code -- such as adding (or removing) constraints, or fixing a bug (i.e. dealing with a previously unexpected scenario).
Now that our code currently passes all the unit tests, we can use our test suite as a gauge to measure the speed and efficency of our code. When we run the test suite, unittest dumps the time it took to test the code to stdout. We can modify our code, and then use the testing suite to approximate the effect on the speed of the code. Let's look at speeding up our code. Since we don't explicitly raise an exception when fact() takes a negative argument, we can now add an explicit conditional for negative numbers and then see if the code runs faster... (FIXME)
def fact(n): """fact(n) --> Compute the factorial of n""" try: if isinstance(n,bool): raise #numeric type required if int(n) != n: raise #value must be an integer if n < 0: raise #value cannot be negative if n == 0 or n == 1: return 1 else: return n*fact(n-1) except: raise XmathError
>$ python alltest ave: return the given value for a given single number ... ok ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ok ave: fail when given non-numerical input ... ok ave: fail if given sequence contains a non-single-valued item ... ok fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ok fact: fail when given a non-integer ... ok fact: fail when given a non-single-valued item ... ok ---------------------------------------------------------------------- Ran 9 tests in 0.004s OK
The new code runs noticibly faster. What about trying an alternate implimentation of factorial that doesn't use recursive functions? (FIXME)
def fact(n): """fact(n) --> Compute the factorial of n""" try: if isinstance(n,bool): raise #numeric type required if int(n) != n: raise #value must be an integer if n < 0: raise #value cannot be negative result = 1 for i in range(2,n+1,1): result = i*result return result except: raise XmathError
>$ python alltest ave: return the given value for a given single number ... ok ave: return the known average for a known numerical sequence ... ok ave: fail when given an empty sequence ... ok ave: fail when given non-numerical input ... ok ave: fail if given sequence contains a non-single-valued item ... ok fact: return the known factorial for a known integer ... ok fact: fail when given a negative integer ... ok fact: fail when given a non-integer ... ok fact: fail when given a non-single-valued item ... ok ---------------------------------------------------------------------- Ran 9 tests in 0.003s OK
Both the recursive and non-recursive functions run at nearly the same speed. However, it is interesting to note that python has a recursive limit of 999. Any recursive depth greater than 999, will cause a RuntimeError (our code catches this exception as a XmathError).
RuntimeError: maximum recursion depth exceeded
So the non-recursive implimentation of fact() may be the better one. Upon closer inspection, this recursion limit is actually the cause of the slowdown in our code before we added the explicit condtional for fact() to throw an error upon being fed a negative argument. If we choose to keep the original (recursive) implimentation of fact(), we should probably modify our functional requirements so that the recursion limit of 999 is accounted for...
Further, an astute developer may have noticed that for exceedingly large input values, fact() produces exceedingly inaccurate results. This is inaccuracy is caused by poor floating-point math; however, we will ignore it here, as it is the topic of another tutorial... (FIXME)
Integration Testing
Integration testing checks how our new component interacts with other available components, and as part of a larger application.