Thursday, December 20, 2012

Global Day of Coderetreat 2012 - Nuremberg



If you are not familiar with the concept of a code retreat, listen to this podcast (or read the transcript).
As you may remember, I attended my first code retreat earlier this year in Frankfurt. One of the reasons for attending was a nagging thought I had throughout organizing CITCON Budapest that I should do something locally too, and a coderetreat sounded like just the perfect thing. So I came back from Frankfurt with enough enthusiasm to approach our CEO (Dirk) about Paessler AG helping me organize one in Nuremberg - and he offered the office to host it in and to pay for the lunch. Thus the biggest obstacle was cleared.
The importance of this can't be overemphasized. As I've seen on the Global Day of Coderetreat organizers' list, finding a venue and a sponsor has caused some headache and worry for fellow hosts. And in addition to the fiscal support, my colleagues went way beyond anything I expected - our sysadmins arranged the required technical & security infrastructure, the GDCR event was run on the company blog, and even non-programmer colleagues offered to come by on Saturday to help with anything if needed. Thank you all, you made it really easy! (by the way, Paessler is hiring!)
Anyhow, this post will not be about the organization process, but about what I have learned on this day as a co-facilitator & participant about code and people.
Another big thanks goes to Marco Emrich, a seasoned coderetreat facilitator, who helped get the event off the ground and helped me get started with facilitating (thanks to NicoleJohannes for introducing me to Marco!)
The attendance was low (next time we'll schedule the start 30-60 minutes later), but I was surprised to learn how well the coderetreat functioned with so few participants. To ensure there was enough variety and that people got new pairs, Marco and I took turns coding (which we had planned anyway in case of an odd number of participants).

The Sessions

  1. no constraints, getting familiar with the problem
  2. no primitives & focus on the rules (fake out the world if needed at all)
  3. Ping-Pong TDD & naive implementation (with a switch at half time)
  4. no conditional & no mouse
  5. baby steps
  6. free to choose session
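
To give a flavour of what a constraint does to the code, here is a minimal sketch of how the rules could be written for the "no conditional" session. It is not what any pair wrote on the day; the names `ALIVE`, `DEAD`, `NEXT_STATE` and `next_state` are made up for illustration, and whether a dictionary lookup is itself a loophole is up to the facilitator.

```python
# Hypothetical sketch: the rules of Conway's Game of Life without if/else.
ALIVE, DEAD = "alive", "dead"

# Next state keyed by (current state, number of live neighbours).
# Every combination not listed here falls back to DEAD.
NEXT_STATE = {
    (ALIVE, 2): ALIVE,  # survival
    (ALIVE, 3): ALIVE,  # survival
    (DEAD, 3): ALIVE,   # reproduction
}


def next_state(current, live_neighbours):
    """Look the next state up in the table instead of branching."""
    return NEXT_STATE.get((current, live_neighbours), DEAD)
```

Under- and overpopulation need no entries of their own, since they are covered by the default value of the lookup.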

It's the Global Day of Coderetreat

Even though we were only a few people here, it was great to chat with the others worldwide, saying hi to people doing exactly the same thing we did, just in various other locations. While there were audio problems sometimes, we didn't mind. And next year we won't bother with trying to schedule exact times for the calls, since we all miss those times anyway, but rather just rely on improvised video calls.
  • It's really interesting to see other people programming. It is certainly not something one gets to do during their day job (except maybe trainers, team leads, and mentors). It is even more interesting to contrast the external observer's impressions/understanding with the understanding of the people involved in the programming - the difference can be huge. This might help explain why many developers feel that bosses/managers always misunderstand their progress...
  • It's also interesting to track people across the different pairs and see whether they bring their firm opinions (battle scars?) to each session or let go of them to allow the opportunity to learn a different approach. Holding on to them is not necessarily a bad thing, but if you plan to learn, be aware of it and be explicit about what you want to learn - know whether you want to explore one idea and have it challenged by many people, or whether you simply want to see how others program and whether any of that could be applicable to you. E.g.: it was pretty interesting to see how my F#/Scala inspired ideas (case classes & types) could be materialized in Clojure. However, I have certainly learned less about Clojure than I would have had I let my partner do it as he would natively in Clojure (though the no primitives restriction was screaming for types in my opinion).
  • We developers are really creative at finding (or at least looking for) loopholes in the constraints. During the no conditional session one pair TDD'd a function returning a boolean, claimed they had no conditionals in their code, and tried to convince us that even the rest of the system wouldn't need conditionals to use this code... Or that returning booleans is not an issue since they could refactor it into something that doesn't need to return booleans once they get to the rest of the system... I'm certainly looking forward to running a "no return values allowed" session and seeing how people will circumvent that (and rationalize it away)!
  • The longer you have worked with a tool/language, the more readily you accept and work around its quirks. When we wanted to structure our tests the way we would describe the four rules of Conway's Game of Life in writing (heading, then four subheadings, and then the concrete examples under the subheadings), it didn't match RSpec's expectations at all. We agreed that both structures (RSpec's vs. the natural one) make sense, and we can understand how & why RSpec evolved this way, but we couldn't make it match the natural structure. This led to a nice brief discussion about when you want to deviate from standard tooling/processes and when one is better off following them.
  • Some people just can't put off the desire to finish the task. I will have to be conscious of this in the future, and prod people more - e.g.: to ask whether this test & app code meets their definition of perfect.
  • Sometimes when things are hard, it is an indication that you are doing the wrong thing or solving the wrong problem. E.g.: I wanted to write a test for our function to ensure that the function fails unless it gets exactly 8 parameters (number of neighbors), but we were not supposed to use primitives. It felt like enlightenment when my pair pointed out that there is nothing in the rules that mandates that requirement!
  • TDD as if you meant it is really hard unless both of you are good at the chosen language. I assumed Ruby and Python are rather similar, but learned quickly it is not the case. While we had great conversations during the session, and I've learned some interesting things about Ruby, we have not made much progress with the actual constraint.
  • It seems most people work on their desktop machines/docking stations, and thus their laptop is a secondary device, where they have not invested in their environments that much, and thus the no mouse constraint is much harder (e.g.: having ReSharper installed at work, but not on the personal laptop).
... and I could go on much longer :) Certainly, the coderetreat format is great, and I enjoyed both programming and facilitating (though the fact my German is not strong enough makes it somewhat difficult), and I'm sure we'll do more coderetreats in Nuremberg. So keep an eye out for it on the coderetreat site or on the Softwarekammer events page.

Thursday, October 25, 2012

Book Review - Exploring Everyday Things with Ruby and R by Sau Sheong Chang


Disclaimer: I received a free (electronic) copy of this ebook (Exploring Everyday Things with Ruby and R by Sau Sheong Chang) from O'Reilly as part of the O'Reilly Blogger Review Program, which also requires me to write a review about it. That aside, I would have purchased this book this year anyway, and would have reviewed it on this blog too.

About me and why I read this book

I have been programming professionally for about 8 years, mainly business applications and reporting, so I already have quite some love for data. While I haven't used math much in my day jobs, I liked (and was good at) it in high school, including taking extra classes - so I have learned basic statistics. Refreshing and advancing my data analytics skills is one of my goals this year, and reading this book was part of that plan - I have heard that R is one of the most powerful languages for statistical analysis currently available.

About the book

The book is written assuming basic understanding of programming and sets two goals:
  • to awaken the curiosity in the reader to go out and explore things and search for explanation, models, and experiments to validate understanding;
  • to show you some basic, but practical R and Ruby.
While the author intended each chapter to be more or less self sufficient, I have found it to be better read sequentially, especially the simulation chapters.

Ruby

I had no trouble with the code examples, even though I have only programmed about half an hour total in my life in Ruby. Beware that the only knowledge you gain about Ruby is the bare minimum required, so you'll have to put aside your thirst for complete understanding of the language and its ecosystem. If you need to have a proper understanding to work in a language (which I don't think is necessary), you are better off either reading a Ruby book first or using your favorite language to obtain the data - the code is easy to port.

Making me curious

I have had a lot of wow/a-ha moments, both about the topics chosen for discussion and about the math/algorithmic ideas. You may find that you disagree with some of the conclusions the author draws, and it is emphasized in the introduction that the goal of the book is not to convince you of these conclusions, but to demonstrate the journey from question to conclusion in order to equip you with the tools to do the same. This is mostly achieved.
I award extra bonus points for mentioning the limitations of the used analytical tools - I don't think I would trust any book/article/blog post which presents something without its downsides!
Not all examples are exactly everyday (e.g.: an analysis of going to work by car vs. public transportation would have been more everyday than how to simulate the flocking of birds), but they cover a wide breadth of topics. The processing and analysis of the data is always challenging enough, plus your general knowledge is expanded.
One thing I was missing is a description of a really important part - being a layman, how do I go about finding which algorithms to use? While it isn't a book about Research 101, a description of the search process would have been great. You can of course always google, but when entering a new topic I find guided search helpful - which are some of the trick keywords, which sites to prefer/avoid, etc. On the other hand, enough methods are described that just properly learning and understanding them would make me a much better statistician already. Once done with that I could just fall back reading through the R packages and methods, hoping that if I have seen a word before it would emerge from my passive knowledge when I'm faced with a matching problem.

The R language

The book does a solid job of helping you get started. It demonstrates enough language features to enable you to experiment with it for work projects (e.g.: use MySQL as a datasource, create packages, etc.); points out the R component/library hubs to look for community packages; and recommends further learning resources.
The code examples are like most programming book snippets - procedural, with (mostly) everything located in a single method/script. It is not the tangled-spaghetti mess that one despises in legacy code, but it makes for a lower signal/noise ratio and requires more effort from the reader. I guess it's a genre problem, so if you have read other programming books, you shouldn't have any problems with this one.
Technical comment: the ebook isn't formatted to play nice with the Kindle DX, and while in print the code block might be only broken between left & right pages, on the Kindle it makes for an awkward read.
The exposed APIs suggest that R is a bit too ceremonial for my taste, but that could be abstracted away for the project that warrants R's use. I have also used a number of visually great .NET UI third party components that were a pain to work with from a programmer's perspective, yet helped us create a great product. Plus things that feel alien first become second nature after enough practice, so it isn't a big deal. I plan to take a look at NumPy as well, and defer the decision whether to dive deeper into R (possibly via using F# 3.0 type providers for R).

Overall

The book hasn't left me in awe, but it didn't feel like a chore to read as some other books do. I got the taste of R that I wanted when I picked up my copy to read. On top of that, I have learned about fun things, and it also added books to my reading (wish)list (e.g.: The Grammar of Graphics by Leland Wilkinson, Armchair Economist by Stephen E. Landsburg, and more). This is no definitive guide on R, but to whet your appetite and get you started, it is a good one I can recommend without reservations.

Wednesday, September 19, 2012

My first Code Retreat - Legacy Code Retreat in Frankfurt on Sep 15, 2012

If you are not familiar with the concept of a code retreat, listen to this podcast (or read the transcript).

While I had known about Code Retreats for a while, this was the first I actually managed to attend. It was organized by the German Software Craftsmanship community group, hosted by Namics, and facilitated by Nicole Rauch and Andreas Leidig. And it was great, thanks to everyone involved in putting on the event!

The format has been described by others, so I won't cover that. I have to say though that I really like the format and I wish I had started socializing (in software related matters) at a code retreat instead of at conferences or usergroups - the format of the event guarantees one doesn't have to worry about uncomfortable silences to be filled with small talk. The day starts with coding, the retrospective is group talk, and with the exception of lunch, the breaks are only five minutes long, and you are searching for the next programming pair during that time anyway. Great way to get more comfortable interacting with strangers about software! (And if you do want to socialize, just come early for breakfast and stay after the event).

I wonder if being familiar with automated testing is a pre-requisite

My assumption is that one could attend a legacy code retreat even if (s)he has no experience with automated testing, since

  • You could learn the basics of testing from the pairs you are working with
  • You can see it applied in the real world. The most common objection I hear from people recently introduced to automated testing/TDD is that it might work on greenfield projects, but cannot be applied on their existing project

So if you are (or know of someone who is) a person who attended such a code retreat with no prior testing experience, please let me know - I would love to know whether the above hypothesis matches your experience! Unfortunately all my pairs had prior experience, so it's still just a hypothesis.

Iteration impressions, lessons learned

  • Dynamic language IDEs still have a long way to go, so for now I'll probably stick to Vim for Python
  • While it's interesting to take a guided tour of a language you don't know, the focus of the codebase is not on datastructures (only uses lists/arrays) and thus you only catch a glimpse of the language. I'll have to attend a normal code retreat to see whether this would be different there
  • Giving a language demo is interesting, and you learn a lot about the language too. People new to a language tend to ask questions about things you take for granted, yet you may not know the answer to
  • Taking baby steps and not assuming anything is a Good Thing ™ - the codebase is a devious one, crafted with care to trip you up. I.e.: it is a proper legacy codebase, despite its small size!
  • The "never assume" advice holds especially as you move between iterations. During one iteration we made a mistake that wasn't caught by the regression tests. Since in the previous iteration (with another pair) we had 100% (line) coverage, the fact that in the next iteration we might not have that didn't occur to me...
  • Discipline is hard. I was totally carried away refactoring during the last iteration. I had this craving to actually make progress with the refactoring, and I caught myself saying things like "were we responsible coders, we would now stop to write some tests, but let's just move on now", as well as tugging multiple pieces of the spaghetti at the same time. While here I might be forgiven (after all, the last iteration was a free to choose what to do (with) this codebase), it's an important reminder that I should watch myself at work - I would have never expected myself to get so off track in a matter of 10-15 minutes. And I used to pride myself that I realize when I'm in a dead end and have no trouble throwing away code to start from the last known good state!
  • The code retreat format is great for teaching people the importance of prototypes, and I will keep that in mind for the future. During the functional iteration we didn't make much progress, but on the train home I did a quick experiment to start making it functional from the outside in, starting at the main() method, introducing the GameState as a subclass of Game, and having each step return a new GameState (while still modifying the old game state, since the refactoring was incomplete, as is usually the case). This approach didn't occur to me the first time, and had I not started from a clean slate, I would not have thought of it if I had continued where I left off in the previous attempt.
  • While the facilitators keep going around, we didn't always get deep into the issues they commented on (e.g.: I think if the test case and the test name express clearly the domain and the scenario, it is totally fine to use a variable called sut, etc.).
  • However, there is a lot of time available to discuss with your pair, not having to worry whether or not the code will be finished, which is great. One caveat is that you do have a time limit on the discussion, since you don't want to bore your pair and want to actually write code, so you are forced to condense your thoughts. Luckily, this limit is not as bad as Twitter's.
  • Theory vs. practice, a.k.a. talking the talk vs. walking the walk. I've been guilty of this myself, describing what my ideal test case would look like in theory, and what guiding principles I follow while writing an actual test case. Then the pair politely points out that the theory is great, but what we have here in the code is not a manifestation of those principles...
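
The "each step returns a new GameState" idea from the functional iteration can be sketched roughly as below. This is a simplified illustration and not the retreat codebase: the fields `current_player` and `places`, the functions `move` and `next_player`, and the board size are all assumptions made up for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GameState:
    # Hypothetical fields; the real codebase tracks more than this.
    current_player: int = 0
    places: tuple = (0, 0)


def move(state, roll, board_size=12):
    """Return a new state with the current player advanced by `roll`."""
    places = list(state.places)
    places[state.current_player] = (places[state.current_player] + roll) % board_size
    return replace(state, places=tuple(places))


def next_player(state):
    """Return a new state in which it is the following player's turn."""
    following = (state.current_player + 1) % len(state.places)
    return replace(state, current_player=following)
```

Because the dataclass is frozen, a step cannot modify the old state by accident; a turn becomes a chain such as `next_player(move(state, 5))`.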

The Iteration I wish was there - working towards a change request

Each iteration had a different focus, and I assume that the list of possible restrictions is not static final (pun intended) but evolves. So even though this was my first ever code retreat, and I was told that these ideas wouldn't fit the format, I'm documenting them here so that I can refer back to them after my next code retreats and see whether I still feel the same. For now I think they would be restrictions similar to those of the traditional code retreat, where one is not allowed to speak or to use if statements in the code.

I really missed having a clear functional goal for the iterations, since one usually refactors legacy code when some new feature/enhancement is needed - and it has a huge impact on how one approaches a refactoring.

One mistake I have made (and seen made) when working with legacy code is going on a refactoring spree, touching parts of the codebase which we don't need to change. The danger is that we can easily code ourselves into a corner for days and slip on the original delivery. If it ain't broken, don't fix it (and this doesn't contradict the boy scout rule). This issue was exposed during the iterations: many of us refactored one part of the application that wasn't business logic heavy, but was low-hanging fruit. While one iteration wouldn't be enough time to finish testing that part, the conversation around it (what test cases would be needed to provide sufficient code coverage, what's the minimum refactoring we need to do to achieve that, etc.) would have been valuable.

I raised it during the final retrospective, and people agreed it's an important aspect, but they suggested it's not a fit for the format of the code retreat.

The other great benefit of having a clear goal is that it demonstrates how fragile the regression characterization tests can be. A neat little change request to the core business logic would have left us without the safety net again, and would have made us think back to the previous iterations when we felt that skipping a specific test was safe. While everyone knows it, that doesn't mean we wouldn't fall victim to it.
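
A minimal sketch of such a golden master (characterization) test shows where the fragility comes from. The `play` function below is only a stand-in for the legacy program, and `record_golden_master` and `matches_golden_master` are hypothetical helper names.

```python
import random


def play(seed):
    """Stand-in for the legacy program: deterministic output for a given seed."""
    rng = random.Random(seed)
    return "\n".join(f"roll {rng.randint(1, 6)}" for _ in range(5))


def record_golden_master(seeds):
    """Capture the current output once, before any refactoring starts."""
    return {seed: play(seed) for seed in seeds}


def matches_golden_master(master, program=play):
    """True only if the program reproduces every recorded output exactly."""
    return all(program(seed) == expected for seed, expected in master.items())
```

The comparison is all or nothing: a pure refactoring keeps it green, but any intended change to the output turns every recorded run red at once, so the master has to be re-recorded and offers no protection while the change is being made.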

And if you prefer to see a concrete example, instead of just reading through this abstract text, I have something like the Double Dawg Dare in mind.

Some technical notes for attending a code retreat:

  • double-check with the organizers what you'll need to attend. They probably plan to send out a reminder/notification email before the event, but I so rarely use my laptop in an online environment that their notice was too late for me to actually prepare my laptop for the event.
  • know your settings & IDE. There are a ton of yaks to be shaved, and many minutes were wasted on setting things up. It doesn't take away from the experience, but it did stress me a bit the first time.
  • either know how to use git, or just create two copies of the codebase so you can easily revert to a clean codebase after the sessions. We had some problems with this.

    git clean -x -f -n   # remove -n to really remove them
    git reset HEAD .     # remove everything from the changelist in case you added it
    git checkout -- .    # revert everything below the

  • bring a USB stick, and if you are not using your own laptop during all the sessions, copy the golden master textfile onto it after each of your sessions in a new programming language (my laptop was only used during the first and the last iteration, so for the last one we had no sample output textfile we could work against, and it took some time to obtain it).

  • bring your own keyboard and know how to change a Mac/Linux/Windows machine's keyboard layout (or install one). I did not type in a number of sessions because of this (try typing on a German Mac keyboard when you are used to the Windows US layout!)

In Summary

It's a great event, you meet great people, and I would be surprised if you came away from a code retreat not having learnt anything new.

Thursday, March 1, 2012

Book Review - Programming Collective Intelligence by Toby Segaran

Disclaimer: I received a free (electronic) copy of this ebook (Programming Collective Intelligence by Toby Segaran) from O'Reilly as part of the O'Reilly Blogger Review Program, which also requires me to write a review about it. That aside, I would have purchased this book this year anyway, and would have reviewed it on this blog too.

About me and why I read this book


I've been programming professionally for ~7.5 years, mainly business applications and reporting, so I already have quite some love for data. While I haven't used math much in my day jobs, I liked (and was good at) it in high school, including taking extra classes - so I have learned basic statistics. Refreshing and advancing my data analytics skills is one of my goals this year, and reading this book was part of the plan.


About the book



The book introduces lots of algorithms that can be used to gain new insight into any kind of data one might come across. The explanations are broken up into digestible chunks, and are supported by great visualizations. While understanding of the previous chunks is required for the later ones, this allowed me to read through most of the book on the train to and from work.


Each of the algorithms is illustrated with real world application examples, and examples where applying them doesn't make sense are given too. The exercises at the end of the chapters are applied and not purely theoretical - and coming up with exercises from the domain I work with every day was pretty easy! The book is really inspiring, which is great for an introductory book!


In addition to the well written, gradual introduction, the book has a concise algorithm reference at the end, so when one needs a quick refresher, there is no need to wade through the lengthy tutorials.


While the prose and the logic of the explanations are great, I have found the code samples hard to follow: really short, cryptic variable names; leaky abstractions; and an inconsistent coding style, just to name a few issues. Some code samples are actually incorrect implementations of the given algorithm, and there are antipatterns like building SQL statements by string concatenation without a warning comment to remind the reader that it's a bad practice.
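
For readers who have not met the SQL antipattern before, here is a minimal sketch of the usual alternative, placeholder binding, using Python's standard sqlite3 module. It is not code from the book; the table name `urllist` and the sample URL are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urllist (url TEXT)")  # hypothetical table

# The single quote in this value would break a statement built by concatenation.
url = "http://example.com/?q=o'reilly"

# Placeholder binding: the driver passes the value separately from the SQL text.
conn.execute("INSERT INTO urllist (url) VALUES (?)", (url,))
rows = conn.execute("SELECT rowid FROM urllist WHERE url = ?", (url,)).fetchall()
```

The statement text stays constant and the value travels as a parameter, so quoting problems and injection through the value do not arise.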


Nonetheless, it is great to have actual code to play with, just the initial reading and reviewing of it requires some extra effort.

The book claims that you don't need previous Python knowledge to understand the code samples, which I can't confirm (I use Python at my day job), but I wouldn't be surprised if not knowing Python could make understanding the code even more difficult (I've actually learned a few new language features from the samples!). Also, the Python language has come a long way since 2.4, which is the version used in the book - and that old version makes the code feel dated.

The book was written in 2007, but is not dated. First, the foundations of any topic tend to be timeless, and the most recent algorithm the book describes was published in 1990. The Table of Contents is comparable to more recently written ones (though I haven't read other introductory books yet).

In summary: I would recommend it as a great introductory book!

Friday, February 17, 2012

Inversion of Control for Continuous Integration

Problem Description



Our build structure is pretty stable, but the exact content of the steps varies as we discover more smoke tests that we'd like to add, or when we rearrange the location of these checks.



The CI servers I've used made this a rather cumbersome process:





  • First, I have to leave my development environment to go to the build server's configuration of choice - most of the time it is a web interface, and for some it is a config file


  • I have to point and click, and if it's a shell script, I have to make my modifications without syntax highlighting (config files usually take the shell command to execute as a string, so there is no syntax highlighting there either)


  • If it's a web interface, I have (or had) no versioning/backup/diff support for my changes (config files are better in this aspect).


  • If it's a config file, then I need to get it to the build server (we version control our config files), so that's at least one more command


  • I need to save my changes, and run a whole build to see whether my changes worked, which is a rather costly thing.


  • Most places have only one build server, so when I'm changing the step, I either edit the real job (bad idea) or make a copy of it, edit it, and then integrate it back to the real job. Of course, integrating back means: copy and paste.


  • If the build failed, I need to go back to the point and click and no syntax highlighting step to fix the failures


  • Last, but not least, with web interfaces, concurrent modifications of a build step lead to nasty surprises!






Normal development workflow





  • I have an idea what I want to do


  • I write the tests and code to make it happen


  • I run the relevant tests and repeat until it's working


  • I check for source control updates


  • I run the pre-commit test suite (for dvcs people: pre-push)


  • Once all tests pass I commit, and move on to the next problem




Quite a contrast, isn't it? And even the concurrent editing problem is solved!




Quick'n'Dirty Inversion of Control for builds



Disclaimer: the solution described below is a really basic, low tech, proof of concept implementation.



Since most build servers at the end of the day





  • invoke a shell command


  • and interpret exit codes, stdout, stderr, and/or log files




we defined the basic steps (update from version control, initialize database, run tests, run checks, generate documentation, notify) using the standard build server configuration, but the non-built-in steps (all except the version control update and the notification) are defined to invoke a shell script that resides in the project's own repository (e.g.: under bin/ci/oncommit/runchecks.sh). The results of these shell scripts can be interpreted in the standard ways CI servers are familiar with - exceptions and stack traces, (unit) test output, and exit codes.
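
As an illustration of what such a repository-owned step can look like, here is a minimal sketch written in Python rather than shell. It is not our actual script: the two entries in `CHECKS` are placeholders, and `run_checks` and `main` are hypothetical names.

```python
import subprocess
import sys

# Hypothetical checks; each entry is a name plus the command to run.
CHECKS = [
    ("interpreter available", [sys.executable, "--version"]),
    ("trivial smoke test", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]


def run_checks(checks):
    """Run every check and return the names of the ones that failed."""
    failed = []
    for name, command in checks:
        result = subprocess.run(command)
        print(f"{name}: exit code {result.returncode}")
        if result.returncode != 0:
            failed.append(name)
    return failed


def main():
    """Return the exit code for the build server: non-zero if any check failed."""
    return 1 if run_checks(CHECKS) else 0

# The script in the repository would end with: sys.exit(main())
```

The build server only needs to know the path of the script; adding or removing a check is an ordinary commit to the project.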




Benefits





  • adding an extra smoke test doesn't require me to break my flow, and I can more easily test my changes locally. Integrating it back into the main build means just committing it to the repository, and the next build will pick it up


  • I can run the same checks locally if I would like to


  • if I were to support a bigger team/organization with their builds, this would make it rather easy to maintain a standard build across teams, yet allow each of them to customize their builds as they see fit


  • if I were to evaluate a new build server product, I could easily and automatically see how it would work under production load, just by:


    • creating a single parameterized build (checkout directory, source code repository)


    • defining the schedule for each build I have


    • and then replaying the past few weeks/months load - OK, I still would need to write the script that would queue the builds for the replay, but it still is more effective than to run the product only with a small pilot group and then see it crash under production load









Shortcomings, Possible Improvements



As said, the above is a basic implementation, but it has served as a successful proof of concept for us. However, our builds are simple:





  • no dependencies between the build steps, it is simply chronological


  • no inter-project dependencies, such as component build hierarchy (if the server component is built successfully, rerun the UI component's integration tests in case the server's API changed, etc.)


  • the tests are executed in a single thread and process, on a single machine - no parallelization or sharding




All of the above shortcomings could be addressed by writing a build server specific interpreter that would read our declarative build config file (map steps to scripts, define step/build dependencies/workflows), and would redefine the build's definition on the server. By creating a standard build definition format, we could just as easily move our builds between different servers as we can currently do with blogs - pity Google is not a player in the CI space, so the Data Liberation Front cannot help :).
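
To make the idea of a declarative build definition a bit more concrete, here is a minimal sketch of what the server-independent part could look like. Everything in it is an assumption: the step names, the `needs` key, and all script paths other than runchecks.sh are made up, and a real interpreter would also have to translate the result into a specific build server's job format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical declarative build definition: step -> script and prerequisites.
BUILD = {
    "init_db": {"script": "bin/ci/oncommit/initdb.sh", "needs": []},
    "run_tests": {"script": "bin/ci/oncommit/runtests.sh", "needs": ["init_db"]},
    "run_checks": {"script": "bin/ci/oncommit/runchecks.sh", "needs": ["init_db"]},
    "gen_docs": {
        "script": "bin/ci/oncommit/gendocs.sh",
        "needs": ["run_tests", "run_checks"],
    },
}


def execution_order(build):
    """Return the step names so that every step comes after its prerequisites."""
    graph = {step: set(spec["needs"]) for step, spec in build.items()}
    return list(TopologicalSorter(graph).static_order())
```

Steps that do not depend on each other (here run_tests and run_checks) are exactly the ones a server-specific interpreter could schedule in parallel.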




Questions



Does this idea make sense to you? Does such a solution already exist? Or are the required building blocks available? Let me know in the comments!

Friday, February 3, 2012

There Is More To Clean Code Than Clean Code

A post written by Uncle Bob in January (I'm behind my reading list) offended me. I absolutely agree with Uncle Bob's analysis regarding the code itself, and I also prefer the refactored version, but I have a problem with insulting the programmer(s) reluctant to appreciate the change.


We write code in programming languages, and there are different levels of proficiency in a language.


As I'm currently learning a new spoken language, I'm painfully aware of this - initially I probably sounded like a caveman. The first impression you get about me is totally different depending on the language I speak - but I am the same person!


The learning curve of a language is not smooth - the steepness between consecutive levels of proficiency is different. Going from not speaking any German to speaking A1 (tourist) level was easy, getting the basic grammar required for the low intermediate (B1) level wasn't too bad, but to get my German to the level where my English is will take more effort than the sum of all my previous investments1.


Since this is the third foreign language I'm learning, I have no difficulty accepting that the level I think I speak is higher than the level I actually speak. Because of that, whenever someone rephrases my sentences in proper German 2, I start from the assumption that their version is likely better, even if I don't understand why at first - and I make the effort to understand their reasoning 3. I do that even though I was, of course, convinced when I spoke that I had expressed my thoughts in the best possible way.


However, I don't have much at stake - no ego to hurt, no reputation to lose, and the roles are clear: I'm the beginner, and the people around me are the more experienced ones. In a software team, the roles might not be so clear - I once told colleagues almost twice my age how they should write code after only a few weeks of working there. Bad idea. Since then, I have learned not to start improving the coding style of a team by rewriting the code they have written and showing off how much better my version is. Rather, I wait until a situation arises when they don't mind having me demonstrate some code improvements. I demo it, and explain why I do it that way. In my experience, the second approach is more effective, though it doesn't have the instant satisfaction and relief the first provides.


As the joke goes, you need only one psychologist to change a light bulb, but the light bulb has to want the change real bad.


Driving Technical Change is hard, because it requires a mental/cultural change, and that change has to come from the inside - though it can of course be catalyzed from the outside 4. Just forcing practices or ways of working on unwilling recipients generates resistance (e.g.: the story of the EU technocrat appointed to recalculate the 2009 Greek budget).


I would like to see more public code reviews and public refactorings (e.g.: Andrew Parker, GeePawHill), but I would like to see less public judgement passing on people at the lower proficiency levels of programming.




1 There is a great Hanselminutes episode on learning a foreign language, if you are interested. Beware, it may contain programming!


2 German readers might disagree, since most Germans I meet speak Frankish :)


3 Which, of course, is sometimes harder for natives to explain properly than it is for novices to ask questions pointing out the seeming irregularities of the grammar


4 And we won't always be able to foster change in all environments (note: this does not mean the others are at fault for not changing!). The same programmer can be highly productive in one team, and be the one slowing down another team. There is nothing wrong with changing jobs after realizing we are a net loss to a given team.

Thursday, January 12, 2012

Find The Test Structure That Fits Your Team

A number of recent posts by Phil Haack, Ayende Rahien, and Gil Zilberfeld dealt with the topic of test organization. Each approach has its pros and cons, but none of them is a silver bullet. Your (and your team's, project's) context determines which approach is right.


Without aiming to provide an exhaustive list, below are some questions that influence test organization:



  • Is the team in a consulting project where test documentation is required as part of delivery?

  • Is it a product team? Is the firm in its early stage or is it mature like Oracle with mature products?

  • What is the turnover rate of the team? What are the plans for its growth? The team might have all the knowledge in their head, but if it'll double in size in a year, then the communication value of tests could increase.

  • What is the maturity level of the team? How long have they been working together?

  • How closely and often do team members collaborate?

  • Is there collective code ownership?

  • How do the team and its customers communicate? Some customers can - and are willing to - read code, some need English (Turkish/Hungarian/German/etc.). Some teams have a level of (grown and deserved) trust where just saying the software works is accepted, some need a more formal acceptance and regression process.

  • Is there proper IDE support for discoverability? Do all people reporting bugs (as tests) have access to that IDE? If not, how do they find examples of how to write the bug-report test?


Feel free to add more questions in the comments!