Joseph Wilk

Programming bits and bobs

Mining Cucumber Features

Failure Rates vs Change Rates

I’ve been spending a lot of time recently data mining our test suite at Songkick.com. I’ve been using our own internal version of www.limited-red.com, which records all our test failures for every build on our continuous integration server.

With the Cucumber features I decided to plot the number of times a feature file has been significantly changed against the number of times the feature has failed in the build. The theory is that a change to a feature file (the plain text specification of behaviour) most often represents a change in functionality.

So I wrote a simple script checking the number of commits to each feature file.

features_folder = ARGV[0]
feature_files = Dir["#{features_folder}/**/*.feature"]

feature_files.each do |feature_file|
  change_logs = `git log --oneline #{feature_file} 2>/dev/null`
  # Each line of `git log --oneline` output is one commit touching the file
  change_count = change_logs.split("\n").count
  puts "#{feature_file},#{change_count}"
end

Feature changes compared with feature failures

Throwing that into a pretty graph, here is a snapshot of some of the features (I’ve changed the name of the features to hide top secret Songkick business).

Insights

Based on this I identified two possible groups of features:

  • Features failing more often than the code around them is changing

  • Features which are robust and are not breaking when the code around them is changing.
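The grouping can be sketched roughly as below. This is only an illustration: the `FeatureStats` struct, the feature names and the "more failures than changes" rule are assumptions, not the exact criteria used to read the graph.

```ruby
# Hypothetical input: feature name => change and failure counts.
FeatureStats = Struct.new(:changes, :failures)

# Splits features into those failing more often than they change
# (suspect) and the rest (robust). The cut-off is an assumption.
def group_features(stats)
  suspect, robust = stats.partition { |_name, s| s.failures > s.changes }
  { suspect: suspect.map(&:first), robust: robust.map(&:first) }
end

stats = {
  'feature_a' => FeatureStats.new(3, 12),
  'feature_b' => FeatureStats.new(10, 2)
}
groups = group_features(stats)
puts groups.inspect
```

A ratio or a threshold tuned to your own build history may separate the groups better than a plain comparison.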

Further investigation into the suspect features highlighted 3 causes:

  • Brittle step definitions

  • Brittle features coupled to the UI

  • Tests with a high dependency on asynchronous behaviour

Holes in the data – step definitions

We only recorded the change rate of feature files in Git. Features could be broken without ever changing the feature file, for example if a common step definition is broken. The next steps are to identify all the step definitions used by a feature and examine how often those step definitions have changed.

First find the change count for all the step definitions.

step_files = Dir["features/**/*_steps.rb"]

step_files.each do |step_file|
  change_logs = `git log --oneline #{step_file} 2>/dev/null`
  change_count = change_logs.split("\n").count
  puts "#{step_file},#{change_count}"
end

Then work out which step definitions a feature uses. We can do this by running Cucumber with the JSON formatter and matching up step definitions (and hence step definition files) to feature files:

require 'json'
require 'pp' # needed for pp on older Rubies

`cucumber features --dry-run --format json --out .json.out`
features_json = JSON.parse(File.read('.json.out'))

stats = Hash.new{|h,k| h[k] = []}
features_json['features'].each do |feature|
  feature_name = feature['name']
  #The JSON does not have the feature file. Find the file via the feature name. Messy
  feature_file = `egrep -riE "feature:? *#{feature_name}" features/`.split(":")[0]
  feature["elements"].each do |element|
    element["steps"].each do |step|
      file_location = step['match']['location']
      file, _ = file_location.split(":")
      if file =~ /_steps\.rb$/
        stats[feature_file] = (stats[feature_file] + [file]).uniq
      end
    end
  end
end
pp stats

Change rate vs Failure rate with step definition changes

Combining those two bits of data we can now add to our original graph the step definition change rates for a feature.
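One way the combination might look, as a minimal sketch: it assumes the output of the two scripts above has been loaded into two hashes, and the method name and file paths are made up for illustration.

```ruby
# feature_steps: feature file => step definition files it uses
# step_changes:  step definition file => number of commits touching it
# Both hashes are hypothetical stand-ins for the output of the scripts above.
def step_changes_per_feature(feature_steps, step_changes)
  feature_steps.each_with_object({}) do |(feature, step_files), totals|
    # Step files with no recorded commits count as zero changes
    totals[feature] = step_files.sum { |file| step_changes.fetch(file, 0) }
  end
end

feature_steps = {
  'features/login.feature' => [
    'features/step_definitions/user_steps.rb',
    'features/step_definitions/web_steps.rb'
  ]
}
step_changes = {
  'features/step_definitions/user_steps.rb' => 7,
  'features/step_definitions/web_steps.rb' => 3
}

totals = step_changes_per_feature(feature_steps, step_changes)
puts totals.inspect
```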

We can also examine an individual breakdown of the step definition change rates for a feature:

Holes in using the step definition change rate

The step definition changes from git are at the file level (the *_steps.rb file), so a change in git may not touch a step definition used by a feature. Hence we may be counting changes which are not relevant for a feature. Further work would be to examine the git diffs and check whether a change touched a step definition used by a feature.
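A rough sketch of that further work, not something we have running: it reads the hunk headers of a unified diff and checks whether any changed lines overlap the lines of a step definition. The method names and the sample diff are hypothetical, and it assumes the diff was produced with zero context lines (for example with git's `-U0` option) and that you already know the line range each step definition occupies.

```ruby
# Matches unified diff hunk headers such as "@@ -10,0 +11,2 @@"
HUNK_HEADER = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@/

# Returns the line ranges, in the new version of the file, touched by the diff.
def changed_ranges(diff_text)
  diff_text.each_line.map { |line| HUNK_HEADER.match(line) }.compact.map do |m|
    start = m[1].to_i
    length = m[2] ? m[2].to_i : 1
    # A length of 0 is a pure deletion; treat it as touching the start line
    last = [start + length - 1, start].max
    (start..last)
  end
end

# True if any changed range overlaps the lines of the step definition.
def touches_step?(diff_text, step_lines)
  changed_ranges(diff_text).any? do |range|
    range.first <= step_lines.last && step_lines.first <= range.last
  end
end

diff_text = <<~DIFF
  @@ -10,0 +11,2 @@
  +  fill_in 'user', :with => username
  +  fill_in 'password', :with => password
  @@ -40 +42 @@
  -  click_button 'login'
  +  click_button 'sign in'
DIFF

puts touches_step?(diff_text, 10..14)
puts touches_step?(diff_text, 20..30)
```

The start line of a step definition is available from the location in the Cucumber JSON output; finding where it ends would need some parsing of the step file.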

Conclusions

Our tests hide lots of interesting information that can point to areas where we can make improvements. It’s important to realise that, like anything in statistics, our data mining does not yield facts, just suggestions. At Songkick we are already mining this information with Cucumber and using it to help improve and learn about our tests.

Conferences and Accessibility: Please Try Harder

I speak at a lot of conferences around the world and I’ve been happy to hear a lot of healthy debate around the diversity of the programming community. I really enjoyed and felt inspired by the talk from Joshua Wehner on “Must It Always Be About Sex” at Nordic Ruby and Ruby Lugdunum.

It got me thinking about the issue of accessibility at conferences. I use a wheelchair and happily throw myself at any conference irrespective of the obstacles. But not everyone is as crazy or as flexible as I am. Once at the conference the organisers are always willing to help, but I wonder if we are excluding disabled people who may dismiss the chance to talk or attend as unfeasible due to a lack of information about access. Even if that’s only a few people, those people are valuable members of our communities and deserve the chance to participate if they want to.

How would we ever know if we were discriminating? If no-one turns up in a wheelchair we are none the wiser.

I understand conferences are expensive and complicated events to organise. I would ask two small things of every conference organiser:

1. Try to select a venue that is accessible to both speakers and attendees, if possible.

2. Make it clear on the conference website if the venue, party, hack sessions are accessible or not.

My life was profoundly changed through speaking and attending conferences, in many ways beyond programming. I hope we can make sure anyone who wants that opportunity won’t be put off.

Startups Infiltrating Agile and XP

I’m very excited to be talking at this year’s Agile2011 conference. It’s a goal I’ve been working towards for a long time.

I’ve been to lots of agile conferences where I have listened to consultants talk about how they have made their multi-nationals or large organisations better through x. Interesting, but often there was not much I could take away and apply at a startup.

Sadly there are only 4 talks at Agile2011 tagged with Startups and 6 tagged with Lean Startup.

I’m passionate about startups and I want there to be more focus on them at agile events. Having spoken to lots of startups all over the world and living in the London startup hub, I all too often hear people dismiss Agile and XP events as not relevant. While there is a lot of activity around the Lean Startup, little is said about Extreme Programming and startups. The rich experience of the Agile community tends to get lost. The Lean Startup has provided a movement which has helped deliver ideas from agile to startups, something agile conferences have failed at.

I’m happy to be injecting a dose of startup with my talk at Agile2011 about Acceptance testing in the world of the startup. I’ll be drawing from my experiences of working at the startup Songkick.com. Looking at the things we did badly, things we did well and things we still have no clue about.

I’m also helping organise events in London and other major cities around Extreme Programming and Startups.

The Art of Cucumber London Workshop

I’ll be hosting a ½-day workshop, “The Art of Cucumber”, at Skills Matter in London. If you are interested in learning more about Cucumber, understanding how Cucumber can fit into software development, and patterns for writing healthy Cukes, this workshop is for you.

The workshop will also give you a sneak peek into my talk at Agile2011 (Salt Lake City) and Spa2011 (London) without the big conference ticket price.

Book a place: http://theartofcucumber.eventbrite.com/

Testing Outside of the Ruby World

I recently spoke at the Scotland Ruby Conference about interesting testing tools and ideas outside of the Ruby community. The goal was to inspire the Ruby community to push the state of the art in testing.

Slides

Further Resources

Story Smells: The Valueless Story

“Why bother discussing the value or writing it down for the stories, everybody already knows what it is”

Problem

The value of a story to a stakeholder is not discussed or written on a card. The group participating in a story writing workshop feels there is no point in dealing with the value since it seems obvious.

To a degree this is true: we should (hopefully) already know the high level business values that we want to achieve before we try to write any story cards. Otherwise we may end up with a vomit of user stories.

Avoiding discussing or writing down the value on story cards can result in cards like this:

Whose value is it anyway?

While we may already know our business values when writing stories, I tend to shift the discussion to the user’s value. We know what we want, now why is a user going to do what we want? Note that this shift does not always apply: sometimes we want to force a user to do something that they don’t want to do (such as filling in a captcha).

Problems

Not discussing the value

  • Miss uncovering differences in underlying assumptions about why this feature is wanted.

    • I’ve witnessed too many groups who when jogged to think about the value start asking the right questions and discovering underlying differences in otherwise implicit assumptions.
  • Create features you don’t need 

Not writing the value on a story card

  • Difficulty prioritising cards

    • Knowing the value of a feature to a role can help guide a card’s prioritisation.
  • Physical cards on a board without any value mean it’s not quickly clear why you are building this feature.

    • When a card is being worked on by various roles, knowing the value helps guide you to re-assess and ask relevant questions throughout the card’s journey to being delivered.

Using high level obvious values

This problem can also be a symptom of expressing the value at too high a level. We all know we need to:

  • Protect revenue
  • Increase revenue
  • Manage cost
  • Increase brand value
  • Make the product remarkable
  • Provide more value to your customers

So putting it as a value for a narrative feels contrived and pointless.

Solutions

Ask ‘why?’

If a group feels the value is obvious they will have no trouble popping the why stack. Popping the why stack quickly uncovers questions and can lead to a refinement of what seemed obvious. At worst, everyone in the group ends up with a shared understanding, aligning what they all saw as obvious.

Feature Injection

Try and structure narratives using the Feature injection format:

Starting with the value as the very first thing you write can help ensure you don’t progress onto the other points until you have at least discussed it.

Avoid obvious values

Make sure when you’re discussing the value you drop down from the highest value to something that seems sensible to everyone.

But what if the value really is obvious?

Ok, sometimes a value does seem obvious, perhaps it’s the same as a previous card. It is still important to write it down for the story and at least ask the question ‘is this value obvious to everyone?’ and ‘is this value meaningful to everyone?’. Sometimes that is discussion enough.

Page Object Pattern

What Is the Page Object Pattern?

The Page Object pattern maps a UI page to a class, where a page could be, for example, an HTML page. The functionality to interact with or make assertions about that page is captured within the Page class. These methods can then be called by a test. So ultimately we are introducing a gatekeeper to the GUI of a page.

Why use the Page Object Pattern?

  • Readable DSL for tests

  • Promotes Reuse

  • Centralise UI coupling – One place to make changes around the UI.

Implementing the Page Pattern in Cucumber

Within Cucumber there are two main ways we can encapsulate the page UI:

The Page Object Pattern

features/pages/login_page.rb

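class LoginPage
  # session is whatever object provides the browser-driving methods, for
  # example the Cucumber World with a Capybara- or Webrat-style DSL mixed in.
  def initialize(session)
    @session = session
  end

  def login(user, password)
    @session.fill_in 'user', :with => user
    @session.fill_in 'password', :with => password
    @session.click_button 'login'
  end

  def visit
    # Delegate to the session; a bare call to visit here would call this method again
    @session.visit '/login'
  end
end

features/step_definitions/user_steps.rb

Given /^I login with username "([^"]*)" and password "([^"]*)"$/ do |username, password|
  # self is the Cucumber World, which supplies the browser-driving methods
  login_page = LoginPage.new(self)

  login_page.visit
  login_page.login(username, password)
end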

The Page Step definition Pattern

Cucumber step definitions are all defined at the same scope, but we use folders and files to create logical organisation. We can create folders for UI step definitions and domain step definitions.

  • features/domain/step_definitions/*

  • features/ui/step_definitions/*

We create a step definition file mapping to a UI page.

features/ui/step_definitions/login_page_steps.rb

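Given /^I login with username "([^"]*)" and password "([^"]*)"$/ do |username, password|
  # Assumes a Capybara- or Webrat-style DSL is available in the Cucumber World
  fill_in 'user', :with => username
  fill_in 'password', :with => password
  click_button 'login'
end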

What’s the right way to encapsulate the UI?

Just using step definitions for organisation within a project can have a number of problems:

Global scope within Cucumber’s World. Instance variables are global across all step definitions.

Given /^I mess with scope$/ do
  @this_can_be_seen_by_every_other_step = 'uh oh'
end

Managed and run through Cucumber. There is no easy way for step definitions to be reused outside of Cucumber or tested in isolation. By isolating the test code we can easily provide adapters for reuse in different test frameworks (for example, similar to what email-spec does).

The Page Object pattern (and adding another layer of abstraction) has a couple of nice properties:

  • Bounded scope (if you use your classes/objects nicely)

  • Isolated units that can be invoked and controlled independently of the overarching testing framework
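To illustrate the second property, here is a minimal sketch of a page object exercised with no Cucumber involved. It assumes the page object is handed the object that drives the browser; `RecordingSession` is a made-up stand-in that only records the calls it receives, and the field and button names are illustrative.

```ruby
class LoginPage
  # session: any object responding to fill_in and click_button
  def initialize(session)
    @session = session
  end

  def login(user, password)
    @session.fill_in 'user', :with => user
    @session.fill_in 'password', :with => password
    @session.click_button 'login'
  end
end

# Hypothetical stand-in for a browser session: records calls instead of
# driving a real page.
class RecordingSession
  attr_reader :calls

  def initialize
    @calls = []
  end

  def fill_in(field, options)
    @calls << [:fill_in, field, options[:with]]
  end

  def click_button(name)
    @calls << [:click_button, name]
  end
end

session = RecordingSession.new
LoginPage.new(session).login('Joseph', 'cuker')
session.calls.each { |call| puts call.inspect }
```

Because the page object only talks to whatever it was given, the same class could be driven from a step definition, a unit test or a script.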

Should I be using the Page Object Pattern?

Yes, No, Maybe.

Extra layers of abstraction introduce complexity and so the Page Object Pattern should be used carefully when there is a sufficiently high burden of maintenance (which usually means lots of step definitions).

Outside of the Page Object pattern, it’s important to realise the weaknesses of using step definitions as your only modelling tool. Irrespective of what metaphor you decide to organise around, it’s a good habit to push the code out of the step definitions.

Awari Kata

The Game

Awari is an ancient game originating from Africa which consists of 12 holes in the ground (called houses) split up into 2 rows of 6. Designed for two players, each player selects a row as their territory. Each house starts with 4 seeds in it.

Awari

A round in the game goes as follows:

The player picks a number of seeds from one of their houses. They sow the seeds one by one in an anti-clockwise direction around the board.

The number of seeds in the last sown house dictates what happens next:

  • If there are not 2 or 3 seeds then the go finishes and it’s the opponent’s turn.
  • If the last house is one of your opponent’s and the number of seeds is 2 or 3, you win those seeds.
    • If you collect, then you may look at the next-to-last house, and if that satisfies the same criteria, you collect those also.
  • If you play so many seeds that they wrap around the board (more than 11), you do not ‘sow’ a seed in the house from which you picked up.

If an opponent’s houses are all empty, the current player must make a move that gives the opponent seeds. If no such move is possible, the current player captures all seeds in his/her own territory, ending the game.
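As a starting point for the kata, the sowing and capturing rules might be sketched like this. The board layout is an assumption: an array of 12 houses, where indexes 0 to 5 belong to player 0, indexes 6 to 11 belong to player 1, and anti-clockwise means increasing index. It also assumes the player picks up every seed in the chosen house, and it leaves out the rule about giving seeds to an opponent whose houses are empty.

```ruby
HOUSES = 12

# Sows every seed from `house` anti-clockwise.
# Returns [new_board, index_of_last_sown_house].
def sow(board, house)
  board = board.dup
  seeds = board[house]
  board[house] = 0
  position = house
  seeds.times do
    position = (position + 1) % HOUSES
    # On a wrap-around, skip the house the seeds were picked up from
    position = (position + 1) % HOUSES if position == house
    board[position] += 1
  end
  [board, position]
end

# Captures from the last sown house backwards while each house belongs to
# the opponent and holds 2 or 3 seeds. Returns [new_board, seeds_captured].
def capture(board, last_house, player)
  board = board.dup
  opponent_houses = player.zero? ? (6..11) : (0..5)
  captured = 0
  position = last_house
  while opponent_houses.cover?(position) && [2, 3].include?(board[position])
    captured += board[position]
    board[position] = 0
    position -= 1
  end
  [board, captured]
end

board = [0, 0, 0, 0, 0, 2, 1, 2, 0, 0, 0, 0]
sown, last_house = sow(board, 5)
after_capture, captured = capture(sown, last_house, 0)
puts "last house: #{last_house}, captured: #{captured}"
```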

Winning

The game is over when:

  • One player has captured 25 or more seeds.
  • Each player has taken 24 seeds (draw).
  • Both players agree that the game has been reduced to an endless cycle, in which case each player captures the seeds on their side of the board.
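The first two conditions are simple enough to sketch as a check on the captured totals. The method name and return symbols are made up, and the agreed endless-cycle ending is left out because it depends on the players rather than the score.

```ruby
# Returns the state of the game given each player's captured seed count.
def outcome(captured_a, captured_b)
  return :player_a_wins if captured_a >= 25
  return :player_b_wins if captured_b >= 25
  return :draw if captured_a == 24 && captured_b == 24
  :in_progress
end

puts outcome(25, 10)
puts outcome(24, 24)
```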

The Challenge

Attempt to write an AI to play Awari against.

Good luck.

Story Smells: In Order to Do X I Want X

I keep coming across story cards written like this:

This is usually a telltale sign that the why stack was not popped. The value (In order to) has been written at the lowest possible level of abstraction, the same as the feature (I want). The card does not tell us why this feature is being built.

If you find yourself writing cards like this take it as a chance to find out a bit more about why you are adding this feature. Hopefully you will uncover a more meaningful value that you can write on the card or perhaps discover that you don’t really need this feature after all.

Allowing Features to Breathe

Your Cucumber features are living documentation. The world is evolving around them and without exposure they tend to rot.

Evolving language

When features are written we use a snapshot of the domain language at some specific timeframe. This ubiquitous language changes and grows outside the codebase and is influenced by more than just developers. When do we refactor features to reflect these changes?

Taints

Since the features are in the code base it becomes the developers’ and QAs’ responsibility to maintain them. When writing the features for the first time we have lots of discussion. But at a later date the feature language may be tweaked by a developer to allow reuse, or some new technical constraint means they want a special step added. This is where taints can sneak into the language.

Quality

It’s amazing what a difference it makes when people know something they are writing will be published and read by others. Exposing the features can help increase their quality.

Barriers to entry

It’s great that the features sit close to the code, but there is a barrier in how you gain access to that code. While developers find source control second nature, it’s something that gets in the way for non-technical people.

Allowing features to breathe

When someone wants to know about the behaviour of a feature they should be able to turn to the features irrespective of technical expertise. This engenders discussion, which helps bring about changes to the features to better reflect the ubiquitous language and remove taints. The easier it is to access the features, the more likely they are to be the first port of call.

So my advice is to expose your Cucumber features to your team (and if possible your users, which RSpec has done with Relish), allowing them to browse and question them.

Relish

To expose features I currently use Relish. This is a web based browser for features.

Relish has a command line gem for pushing features up to the website. I use a post-receive server git hook which upon every push also pushes the features to Relish.

.git/hooks/post-receive

#!/bin/sh
relish push --project air --organization breathe