Thursday, November 17, 2011

CITCON London 2011

Given what a great conference CITCON 2010 was, when registration opened for CITCON 2011 I didn't hesitate - which turned out to be a good thing, since spots filled up rather quickly. So next year, watch the mailing list, and rush to the registration form!


Friday Evening - Open Spaces Introduction/Topic Proposals


It was held at Skillsmatter, so I assumed there would be no surprises for me there (great infrastructure), but since I didn't have to take the tube this time (I picked my accommodation to be within walking distance of Goswell Road on purpose), I quickly learned that Bing Maps has little to no knowledge of Goswell Road and Skillsmatter. So on the first day I took a nice, leisurely, albeit somewhat longer route to the venue.


The registration went flawlessly, run by PJ's mom. Great job!


PJ and Jeffrey made the introductions in their usual, entertaining manner. While it wasn't new to me, I was glad to see that this time they emphasized that the schedule can change throughout the day - apparently it wasn't only me who had been surprised by that last year. And of course the schedule did change.


Because Julian Simpson couldn't make it, I introduced his topic (Do you use your tests in prod?), and then two of my own - Continuous Release and Delivery in Downloadable Product Companies, and Why Most People Don't have a Rollback Plan for Releases and instead "plan" to Hack Forward.


The great thing about proposing topics is that even when a topic doesn't meet the threshold to get a whole session dedicated to it, a lot of people now know that you are interested in it, and will find you during the breaks to share their experiences.


To my surprise, we actually followed Jeffrey's recommendation and didn't run side conversations during the proposals, so the process was smooth and efficient, and there was more than enough time to chat later. I didn't worry much about the agenda anyway - proposing two topics made attending those my priority, and I knew that whatever agenda we'd have at Friday closing time wasn't final.


As we learned the next day, when you propose a topic, you should be careful which words you use. In my case, rollback was a terrible word, since many interpreted it as going back in time, undoing, while what I meant was more like backout (a planned and disciplined retreat).


Sessions I attended


I've added my notes to the conference wiki.



Sessions I wish I attended



  • BitbeamBot: The Angry Birds Playing, Mobile Testing Robot - Jason Huggins ran a session about automated UI testing of touch screens

  • there was a TDD session - given that I often need (want) to explain TDD to people new to the concept, it would have been a great learning opportunity to see someone else's approach to introducing it (was it run by Steve Freeman?). Plus, I would likely have gained a new understanding of TDD...

  • Backyard beekeeping - I ended up in a random chat instead, but I would have been curious to see one of these non-technical, non-work-related sessions

  • I couldn't stay long enough in the Slaughtered Lamb


The hallway chats


While open spaces are already like an unconference, a lot of great conversations nonetheless took place during the breaks (I didn't get to use the law of two feet this time either), and I met a lot of great people. I just wish I had more time to talk with each of them. I guess I've queued these conversations for future processing on Twitter or via email.


In contrast to last year's conference, I spent almost the whole time offline, in the analog world, and it didn't feel the least bit wrong. Pretty much the only thing I used Twitter for was following someone - instead of exchanging emails or business cards. Though I have to admit, business cards are great for jotting down small reminder notes.


Some travel lessons I've learned


While this is tangential to CITCON, I learnt a few lessons on this trip. I'll list them here, hoping they're beneficial to others.



  • have your travel plans printed out (à la TripIt). It made the check-in process much smoother, and going from the airport to my hotel was perfectly relaxed, knowing exactly which tube to take. The only leg I forgot to plan was from the Nuremberg Airport to the office (I only remembered that it would be useful when I landed). Thanks to Career Tools for the idea (and their great advice on attending conferences, approaching your boss about sending you to a conference, and more)!

  • Phones, data plans, roaming. While this isn't the reason I always buy my phones from stores rather than from carriers, being able to just buy a prepaid SIM in London made a huge difference. It's not necessarily the money, though I think I saved there (for the four days I stayed, the data plan alone would have cost me €14 in roaming, plus the calls I made - contrast that with the £15 prepaid SIM), but rather that making a call or using mobile data wasn't something I had to consciously weigh (is it worth the extra roaming charge?).

  • have enough slack in your trip. I arrived Friday afternoon and returned home on Monday morning. This allowed me to dedicate Friday and Saturday to the conference (and the Slaughtered Lamb afterwards) and to meet up with friends living in London on Sunday - and the fact that some of my friends were an hour late simply didn't cause a problem. Financially it doesn't have to be more expensive either - I believe I spent about the same amount on three nights' stay as I did last year on two. Advance planning and more research into accommodation help with that.

  • If you need travel accessories (e.g. an AC adapter for the UK), try them before leaving home - I managed to buy adapters incompatible with my laptop's charger. Luckily the reception at my hotel found one of theirs that was compatible, but it took them a long five minutes, so I wouldn't rely on that.

  • In addition to that, I forgot my phone charger at home, and while Bing Maps isn't perfect, it's much better than walking around London without GPS. Lesson learned: it's handy to have a USB cable with you that you can use to charge from somebody's laptop - a 10 cm USB cable fits in any pocket.

Friday, November 4, 2011

Data Migrations As Acceptance Tests

While I have previously said that on migration projects both verification and regression tests are important, does that mean the two should be separate? As in: first let's migrate the data, and then we'll rewrite the functionality? Or the other way around - we talk with the customer, incrementally figure out their requirements, deliver the software (with a proper regression test suite) that satisfies them, and then we migrate. Both approaches have problems:



  • customers want to use the software with their current, real data - having only the data and no application to use it with is of no value to them. Neither is an application with no data in it

  • real data has lots of surprising scenarios that the domain expert might have forgotten to specify (see caveats though)

  • requirements are not static, and new ones will be introduced during the development process; these will inevitably cause the new application's models to change, which means the migration has a moving target.


Doing them in parallel


If the data source is organized chronologically (order by date in the migration script), and in a format that resembles what the system's end users will enter into the system, then we can use the new application's outermost automatable entry point (Selenium, HTTP POST, a module's public API) to enter this data during the migration from the old system to the new.
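
To make this concrete, here is a minimal sketch in Python of such a chronological replay, using HTTP POST as the entry point. All the names (the legacy orders table, the /orders endpoint, the field names) are assumptions for illustration, not taken from any real system:

    # Replay legacy records chronologically through the new application's
    # HTTP entry point; stop at the first record it cannot accept yet.
    import sqlite3
    import requests

    legacy = sqlite3.connect("legacy.db")  # hypothetical legacy data source
    rows = legacy.execute(
        "SELECT id, customer, amount, created_at FROM orders ORDER BY created_at"
    )

    for record_id, customer, amount, created_at in rows:
        response = requests.post(
            "http://localhost:8000/orders",  # the new system's entry point
            json={"customer": customer, "amount": amount, "entered_at": created_at},
        )
        if response.status_code != 201:
            # This record is the next acceptance test to make pass.
            raise SystemExit(
                f"Migration stopped at legacy record {record_id}: {response.text}"
            )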


Why


A clear disadvantage of this approach is the speed of the migration - it will likely be slower than an INSERT INTO new_db.new_table SELECT ... FROM old_db.old_table JOIN ... WHERE ... statement - but in the case of non-trivial migrations the benefits will likely compensate for the slowness, because:



  • changes to the new system's code/data structure become a problem localized to the new application's code - no headache of updating the migration scripts in addition to the code

  • when the client requests that the demo deployment be in sync with the old system, the code is ready (save for the part that figures out which records have changed)

  • the legacy data's edge cases provide focus - no need to invent corner cases, for there will be enough in the data

  • likely there will be releasable functionality sooner than with either of the above approaches


How


First, create the acceptance tests for the migration:



  • pick the data to be migrated

  • find the view in the original system that displays this data to the users and find a way to extract the data from there

  • define the equivalent view in the new system (it's about end-to-end, user-visible features!)

  • write the verification script that compares the above two (be sure to list the offending records in the failure message! - see the sketch after this list)

  • define the data entry point in the new system

  • write the migration script - extract from the old system, transform if needed (only to match the entry point's expected format - no quality verification as in classic ETL!), then send it into the new system (using the entry point defined above)
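
For the verification script mentioned above, a minimal sketch could look like the following - the view query, the URL, and the field names are again illustrative assumptions:

    # Compare the user-visible view of the old system with the equivalent
    # view of the new one, naming every offending record on failure.
    import sqlite3
    import requests

    legacy = sqlite3.connect("legacy.db")  # hypothetical legacy data source
    old_view = {
        row[0]: row[1:]
        for row in legacy.execute("SELECT id, customer, amount FROM orders_view")
    }

    new_view = {
        item["legacy_id"]: (item["customer"], item["amount"])
        for item in requests.get("http://localhost:8000/orders").json()
    }

    missing = sorted(set(old_view) - set(new_view))
    different = sorted(
        rid for rid in set(old_view) & set(new_view) if old_view[rid] != new_view[rid]
    )

    assert not missing and not different, (
        f"Migration verification failed - missing records: {missing}, "
        f"mismatched records: {different}"
    )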


At this point both the new view and the data entry points are empty. From here on, the TDD cycle becomes a nested loop:



  • run the migration script, and see which record it failed on

  • analyze the failing acceptance test, and find the missing features for it

  • (A)TDD the missing features

  • run the migration script again to restart the cycle (see the sketch below)
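
One way to wire this outer loop into the build, sketched under the assumption that the replay and verification scripts above are saved as migrate.py and verify.py (both names are made up), is a single acceptance test that stays red until the whole data set round-trips:

    # One acceptance test for the whole migration: replay the legacy data,
    # then verify that the old and new views match, record by record.
    import subprocess
    import sys

    def test_migration_round_trips_the_legacy_data():
        subprocess.run([sys.executable, "migrate.py"], check=True)  # stops at first unsupported record
        subprocess.run([sys.executable, "verify.py"], check=True)   # names the offending records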


Caveats


While the existing data makes one focus on the real edge cases instead of imagined ones, beware - not everything has to (or can) be migrated. For instance, a payment system may have accepted many currencies in the past, but now accepts only one. In this case, the currency exchange handling logic could possibly be dropped from the new system (and the currency just stored in a char field for the old records); or in some other domain, maybe only the last ten years' data is needed. However, this should be a business decision, not a developer's decision!


Source data quality is often a problem, one that will likely cause issues. If data needs to be fixed (as above, ask the stakeholders!), the fix should stay out of your application's code and live in the Transform part of the migration script.
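
As a closing illustration, here is a minimal sketch of such a Transform step - the specific clean-ups are made-up examples of fixes a stakeholder might have signed off on:

    # Keep agreed-upon data fixes in the Transform step of the migration
    # script, out of the application code.
    def transform(legacy_record: dict) -> dict:
        """Shape a legacy row into what the new system's entry point expects."""
        fixed = dict(legacy_record)
        # Stakeholder-approved fix: some legacy rows stored currency codes
        # in lowercase or with stray whitespace.
        fixed["currency"] = fixed["currency"].strip().upper()
        # Rename a legacy field to the name the new entry point expects.
        fixed["entered_at"] = fixed.pop("created_at")
        return fixed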