Do. Reflect. Learn. Repeat. (moved to zsoldosp.eu): September 2011

It's common for migration projects (a.k.a.: rewrite) to specify scope initially as "to behave the same way as the old system does". Thus the testing approach to automatically compare the new system's results to the old system's seems to be a perfect choice (of course, in addition to the data migration, calculations should be reconciled too - at least for the records that were calculated using the latest version of the old system).

However, once the application goes live and the old one is decommissioned, we cannot rely on these tests anymore (there is no old system anymore), and we have no regression suite to rely on for the future change requests/enhancements.

Just to avoid any misunderstanding: I don't advocate not having automated reconciliation checks (verifications), on the contrary, I think they are immensely valuable. We can write the best specifications/code, but still miss some details, which, luckily for us, do pop up in the real data. These automated checks give everyone on the project team the peace of mind they so need before go-live.

The point I'm trying to make here is that while these checks are essential, they are not enough for the long term health of the system. These are a good starting point, but just as we do with the specifications, as we work on the new system, when we find and uncovered logic case (e.g.: as part of the calculation reconciliation), we need to add a test case to the new application's test suite to ensure proper regression coverage that we can rely on in the years to come. And adding this test case is easy - we can just copypaste (after making it anonymous of course) the input that caused the problem with the verification into the unit/functional tests, implement the missing functionality, and move on. But saving on these few seconds costs a lot later down the road.

Unit/functional/integration/system tests are supposed to be self contained - we would like to create (a) clean database(s), which we put into a known state before the tests (some frameworks support this out of the box, e.g.: Django, but we can easily implement this ourselves). Migration reconciliations, by their very nature, need to work on the live (snapshot) data. Also, as described earlier, these reconciliation tests are temporary artifacts, while the other tests supposed to be permanent (at least until the client decides to change the requirements). Separation of Concerns also applies to the test suite - running tests in the same suite with different assumptions (live db we shouldn't touch vs. empty test db we can read/write as we wish) is more than risky - keep them physically separated, both at runtime and in source control.

Even if the delivered project could be summed up in that vague sentence (does what the old does), this summary is never true - few start rewrite projects just to get the exact same functionality. Usually these projects are sponsored because the old application became unmaintainable and the client is missing opportunities, because the software is not supporting, but hindering their goals. Without the self-contained test suite, our shiny new migrated application is going to become another one of these.