
Integration and Simulation Tests in Python

One of my (many) tasks lately has been to rework unit and integration tests for Review Bot, our automated code review add-on for Review Board.

The challenge was providing a test suite that could test against real-world tools, but not require them. An ever-increasing list of compatible tools has threatened to become an ever-increasing burden on contributors. We wanted to solve that.

So here’s how we’re doing it.

First off, unit test tooling

This is all Python code, which you can find in the Review Bot repository on GitHub.

We make heavy use of kgb, a package we’ve written to add function spies to Python unit tests. This goes far beyond Mock, allowing nearly any function to be spied on without having to be replaced. This module is a key component to our solution, given our codebase and our needs, but it’s an implementation detail — it isn’t a requirement for the overall approach.

Still, if you’re writing complex Python test suites, check out kgb.
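
If you’re curious what that looks like, here’s a minimal sketch of a kgb spy in action. The execute function and its module are hypothetical stand-ins, not part of Review Bot:

import kgb

from unittest import TestCase

# Hypothetical function under test; any plain Python function works.
from myproject.processes import execute


class SpyTests(kgb.SpyAgency, TestCase):
    def test_execute_is_faked(self):
        # Swap in a canned return value for execute() during this test.
        # The spy is unregistered automatically when the test finishes.
        spy = self.spy_on(execute, op=kgb.SpyOpReturn(b'fake output'))

        self.assertEqual(execute(['mytool', '--version']), b'fake output')
        self.assertTrue(spy.called)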

Deciding on the test strategy

Review Bot can talk to many command line tools, which are used to perform checks and audits on code. Some are harder than others to install, or at least annoying to install.

We decided there are two types of tests we need:

  1. Integration tests — run against real command line tools
  2. Simulation tests — run against simulated output/results that would normally come from a command line tool

Our goal is to ease contribution, but we can’t err so far in that direction that we sacrifice a reliable test suite.

We decided to make these the same tests.

The strategy, therefore, would be this:

  1. Each test would contain common logic for integration and simulation tests. A test would set up state, perform the tool run, and then check results.
  2. Integration tests would build upon this by checking dependencies and applying configuration before the test run.
  3. Simulation tests would be passed fake output or setup data needed to simulate that tool.

This would be done without any code duplication between integration and simulation tests. There would be only one test function per expectation (e.g., a successful result or the handling of an error). We don’t want to worry about tests getting out of sync.

Regression in our code? Both types of tests should catch it.

Regression or change in behavior in an integrated tool? Any fixes we apply would update or build upon the simulation.

Regression in the simulation? Something went wrong, and we caught it early without having to run the integration test.

Making this all happen

We introduced three core testing components:

  1. @integration_test() — a decorator that defines and provides dependencies and input for an integration test
  2. @simulation_test() — a decorator that defines and provides output and results for a simulation test
  3. ToolTestCaseMetaClass — a metaclass that ties it all together

Any test class that needs to run integration and simulation tests will use ToolTestCaseMetaClass and then apply either or both @integration_test/@simulation_test decorators to the necessary test functions.

When a decorator is applied, the test function is opted into that type of test. Data can be passed into the decorator, which is then passed into the parent test class’s setup_integration_test() or setup_simulation_test().

These can do whatever they need to set up that particular type of test. For example:

  • Integration test setup defaults to checking dependencies, skipping the test if they’re not met.
  • Simulation test setup may write some files or spy on a subprocess.Popen() call to fake output.


Here’s what that looks like in practice:

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    def setup_simulation_test(self, output):
        self.spy_on(execute, op=kgb.SpyOpReturn(output))

    def setup_integration_test(self, exe_deps):
        if not are_deps_found(exe_deps):
            raise SkipTest('Missing one or more dependencies')

    @integration_test(exe_deps=['mytool'])
    @simulation_test(output=(
        b'MyTool 1.2.3\n'
        b'Scanning code...\n'
        b'0 errors, 0 warnings, 1 file(s) checked\n'
    ))
    def test_execute(self):
        """Testing MyTool.execute"""
        ...

When applied, ToolTestCaseMetaClass will loop through each of the test_*() functions with these decorators applied and split them up:

  • Test functions with @integration_test will be split out into a test_integration_<name>() function, with an [integration test] suffix appended to the docstring.
  • Test functions with @simulation_test will be split out into test_simulation_<name>(), with a [simulation test] suffix appended.

The above code ends up being equivalent to:

class MyTests(kgb.SpyAgency, TestCase):
    def setup_simulation_test(self, output):
        self.spy_on(execute, op=kgb.SpyOpReturn(output))

    def setup_integration_test(self, exe_deps):
        if not are_deps_found(exe_deps):
            raise SkipTest('Missing one or more dependencies')

    def test_integration_execute(self):
        """Testing MyTool.execute [integration test]"""
        self.setup_integration_test(exe_deps=['mytool'])
        self._test_common_execute()

    def test_simulation_execute(self):
        """Testing MyTool.execute [simulation test]"""
        self.setup_simulation_test(output=(
            b'MyTool 1.2.3\n'
            b'Scanning code...\n'
            b'0 errors, 0 warnings, 1 file(s) checked\n'
        ))
        self._test_common_execute()

    def _test_common_execute(self):
        ...

Pretty similar, but less to maintain in the end, especially as tests pile up.

And when we run it, we get something like:

Testing MyTool.execute [integration test] ... ok
Testing MyTool.execute [simulation test] ... ok

...

Or, you know, with a horrible, messy error.
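
For the curious, here’s a rough sketch of how decorators and a metaclass like these could pull off that splitting. This is a simplified illustration of the pattern, not the actual Review Bot implementation:

def integration_test(**setup_kwargs):
    """Tag a test as an integration test (illustrative version)."""
    def _dec(func):
        func.integration_setup_kwargs = setup_kwargs
        return func

    return _dec


def simulation_test(**setup_kwargs):
    """Tag a test as a simulation test (illustrative version)."""
    def _dec(func):
        func.simulation_setup_kwargs = setup_kwargs
        return func

    return _dec


class ToolTestCaseMetaClass(type):
    """Split tagged test functions into per-type test variants."""

    def __new__(mcs, name, bases, attrs):
        for attr_name, func in list(attrs.items()):
            if not (attr_name.startswith('test_') and callable(func)):
                continue

            int_kwargs = getattr(func, 'integration_setup_kwargs', None)
            sim_kwargs = getattr(func, 'simulation_setup_kwargs', None)

            if int_kwargs is None and sim_kwargs is None:
                continue

            # Replace the original test with one variant per test type.
            del attrs[attr_name]
            base_name = attr_name[len('test_'):]

            if int_kwargs is not None:
                attrs['test_integration_%s' % base_name] = \
                    mcs._make_test(func, 'integration', int_kwargs)

            if sim_kwargs is not None:
                attrs['test_simulation_%s' % base_name] = \
                    mcs._make_test(func, 'simulation', sim_kwargs)

        return super().__new__(mcs, name, bases, attrs)

    @staticmethod
    def _make_test(func, test_type, setup_kwargs):
        def _test(self):
            # Run setup_integration_test() or setup_simulation_test()
            # with the decorator's arguments, then the shared test body.
            getattr(self, 'setup_%s_test' % test_type)(**setup_kwargs)
            func(self)

        _test.__doc__ = '%s [%s test]' % (func.__doc__ or '', test_type)

        return _test

The real implementation differs in its details, but this captures the general idea: the decorators stash their arguments on the test function, and the metaclass turns each tagged function into one generated test per type.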

Iterating on tests

It’s become really easy to maintain and run these tests.

We can now start by writing the integration test, modifying the code to log any data that might be produced by the command line tool, and then fake-failing the test to see that output.

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    ...

    @integration_test(exe_deps=['mytool'])
    def test_process_results(self):
        """Testing MyTool.process_results"""
        self.setup_files({
            'filename': 'test.c',
            'content': b'int main() {return "test";}\n',
        })

        tool = MyTool()
        payload = tool.run(files=['test.c'])

        # XXX
        print(repr(payload))

        results = MyTool().process_results(payload)

        self.assertEqual(results, {
            ...
        })

        # XXX Fake-fail the test
        assert False

I can run that and get the results I’ve printed:

======================================================================
ERROR: Testing MyTool.process_results [integration test]
----------------------------------------------------------------------
Traceback (most recent call last):
    ...
-------------------- >> begin captured stdout << ---------------------
{"errors": [{"code": 123, "column": 13, "filename": "test.c", "line': 1, "message": "Expected return type: int"}]}

Now that I have that, and I know it’s all working right, I can feed that output into the simulation test and clean things up:

class MyTests(kgb.SpyAgency, TestCase,
              metaclass=ToolTestCaseMetaClass):
    ...

    @integration_test(exe_deps=['mytool'])
    @simulation_test(output=json.dumps({
        'errors': [
            {
                'filename': 'test.c',
                'code': 123,
                'line': 1,
                'column': 13,
                'message': 'Expected return type: int',
            },
        ],
    }).encode('utf-8'))
    def test_process_results(self):
        """Testing MyTool.process_results"""
        self.setup_files({
            'filename': 'test.c',
            'content': b'int main() {return "test";}\n',
        })

        tool = MyTool()
        payload = tool.run(files=['test.c'])
        results = MyTool().process_results(payload)

        self.assertEqual(results, {
            ...
        })

Once it’s running correctly in both tests, our job is done.

From then on, anyone working on this code can simply run the test suite and make sure their change hasn’t broken any simulation tests. If it has, and it wasn’t intentional, they’ll have a great starting point for diagnosing the issue, without having to install anything.

Anything that passes simulation tests can be considered a valid contribution. We can then test against the real tools ourselves before landing a change.

Development is made simpler, and there’s no worry about regressions.

Going forward

We’re planning to apply this same approach to both Review Board and RBTools. Both currently require contributors to install a handful of command line tools or optional Python modules to make sure they haven’t broken anything, and that’s a bottleneck.

In the future, we’re looking at making use of python-nose’s attrib plugin, tagging integration and simulation tests and making it trivially easy to run just the suites you want.
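
We haven’t built that yet, but here’s a sketch of what it might look like with nose’s attrib plugin. The test_type attribute name is hypothetical:

import kgb

from nose.plugins.attrib import attr
from unittest import TestCase


class MyTests(kgb.SpyAgency, TestCase):
    # Hypothetically, the decorators/metaclass could apply a tag like
    # this to each generated test automatically.
    @attr(test_type='simulation')
    def test_simulation_execute(self):
        """Testing MyTool.execute [simulation test]"""
        ...


# Running only the simulation suite would then be:
#
#     nosetests -a test_type=simulation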

We’re also considering pulling out the metaclass and decorators into a small, reusable Python package, making it easy for others to make use of this pattern.

Designing Unity: The Start Menu

Early on when we began to develop Unity for Workstation, we started to look at ways to give users access to the guest’s start menu. This seemed like an easy thing to solve at first. A month later we realized otherwise. We debated for some time and discussed the pros and cons of many approaches before settling on a design.

We had a number of technical and design restrictions we had to consider:

  • The UI should be roughly the same across Windows and Linux hosts.
  • Start menu contents must always be accessible regardless of the desktop environment on Linux.
  • We needed to cleanly support start menus from many VMs at once.

Our chosen design

The design we settled on was to have a separate utility window for representing the start menu. This window can auto-hide and dock to any corner of the screen, or remain free-floating, and provides buttons for each VM. The buttons are color-coded to match the Unity window’s border and badge color. When you first go into Unity, the window briefly shows, indicating where it’s docked.

Unity Start Menu Integration

There are many advantages to this design.

  • You don’t have to re-learn how to use it between platforms or even desktop environments.
  • It’s pretty easy to get to and yet stays out of your way when you don’t need it.
  • All the start menus are easily accessible from one place.
  • The start menu buttons are color-coded to match the Unity windows.
  • Users can control whether the window is docked in a corner or free-floats on the desktop.
  • We have a lot of flexibility for feature expansion down the road.

Why not integrate with the Start Menu?

Since the first Workstation 6.5 beta, I’ve been asked why we chose the design we have instead of integrating the start menu into the notification area or into the existing Applications/Start menus. The idea seems obvious at first, but there are many reasons we didn’t go that route.

Let’s start with the host’s Applications/Start menus. This seems the most natural place to put applications, as the user is already used to going there. We began going down this route, until we realized the problems associated with it:

  • On Linux, not everyone runs GNOME, KDE or another desktop environment with an applications menu supporting the .desktop spec correctly or at all. This means we’d be drastically limiting which desktop environments we could even represent applications in.
  • In the case of GNOME, it would add more clicks to get access to any application (Applications → Virtual Machines → VM Name → Applications). This becomes tedious, quickly. Also, from my tests, adding entries three levels deep doesn’t always appear to work reliably across desktops.
  • In Windows, the situation is just as unclear. People tend to think that Windows only has one Start menu, but in reality, we’d have to support three (Classic, XP, and Vista). For quicker access, we’d need to add something to the root menu, and each of these start menus has slight differences in how we can do this. None of the solutions there are particularly good, as entries may be hidden from the user to make room for other pinned applications.
  • In summary, where you go to access the start menu contents will be different not just on each OS, but across desktop environments and even different modes of the same environment (on Windows).

What about the notification area/system tray?

Another possibility that has been brought up is to use the notification area and tie the start menu to an icon there. While this would generally work, it wouldn’t work well:

  • On Linux, it’s frowned upon to put persistent entries in the notification area. A panel applet could work, but users would have to manually add it, and it would be GNOME or KDE-specific.
  • There’s no guarantee there even is a notification area or a panel in Linux desktops.
  • On Windows, the icon may be automatically hidden in the system tray to make room for other icons.
  • The icon is such a small area to click on, making it annoying to launch applications quickly.
  • The icon is generally not too discoverable.

Tips and future improvements

While we’ll probably keep our current model, there are definitely improvements I’d personally like to make in some future release. One possibility is to allow dragging an entry onto the panel or desktop to create a shortcut/launcher. If you frequently access certain applications, you’d be able to put them wherever you want for quick access. Clicking them while the VM is powered off would power the VM back on in the background and then run the application.

A lot of this exists already. While there is no automatic launcher creation, you can create your own launchers that run:

vmware-unity-helper --run /path/to/vmx C:\path\to\application parameters

This is not a supported feature at this time and may have bugs, but in the general case it should work just fine.
