Joseph Wilk

Programming bits and bobs

Fake Execution

A little RubyGem for faking out execution in your tests and inspecting afterwards what was run.

Why FakeExecution?

I’ve been creating internal tools for developers to help improve productivity. These tools, written in Ruby, ended up doing a lot of shell scripting. The scripts became fairly complicated, so I wanted some test feedback. How could I easily test execution?

Enter FakeExecution.

Installing

gem install fake_execution

How do I use it?

require 'fake_execution/safe'

FakeExecution.activate!

`echo *` # This is not executed

`git checkout git://github.com/josephwilk/fake-execution.git`
`touch monkeys`
system("git add monkeys")
system('git commit -m "needs more monkeys"')
`git push`

FakeExecution.deactivate!

cmds[0].should =~ /echo/
cmds[1].should =~ /git checkout/
cmds[2].should == 'touch monkeys'

`echo *` # This is executed for real again
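
Under the hood the idea is simple: swap out Kernel#` and Kernel#system for versions that record the command instead of running it. Here is a minimal sketch of that idea (my own illustration, not the gem’s actual implementation):

module FakeBacktick
  def self.commands
    @commands ||= []
  end

  # Overrides Kernel#` for any object that includes this module
  def `(cmd)
    FakeBacktick.commands << cmd
    ""   # pretend the command produced no output
  end

  # Overrides Kernel#system (single string form only, for brevity)
  def system(cmd)
    FakeBacktick.commands << cmd
    true # pretend the command succeeded
  end
end

include FakeBacktick

`rm -rf /tmp/cache`          # recorded, never executed
FakeBacktick.commands.first  # => "rm -rf /tmp/cache"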

But I use RSpec

require 'fake_execution/spec_helper'

describe "monkeys" do
  include FakeExecution::SpecHelpers

  it "should touch the monkey" do
    `touch monkey`

    cmds[0].should == 'touch monkey'
  end
end

Source code

http://github.com/josephwilk/fake_execution

Conferences and the Cult of Celebrity

How much of a factor should character be in selecting a conference talk? Big names sell conference tickets. Yet I believe we can do a lot more to promote conferences where content carries greater weight than character, and in the process help people who have never spoken before start speaking at conferences.

Risk.

Conference organisers take a big risk running a conference.

Will they cover their venue costs?

Which ultimately leads to: will they sell enough tickets?

Once they have sold enough tickets, will it be a good conference?

Other successful conferences define a pattern for how a conference is laid out. One obvious way of dealing with risk is to follow the example of a success.

I really respect people who organise conferences; they put a lot of effort and their own time into making a successful event.

But I wonder if maybe we could focus more on content and less on character.

Selling a presentation

When you submit a talk, part of convincing the organisers to let you speak is who you are. Persuasion by character (Ethos, if you like your ancient Greek).

Most submission forms give you plenty of space to sell yourself.

Content not character.

For NordicRuby 2012 the organisers took a step towards pushing for interesting content over character. They recognised that conscious and subconscious persuasion by character is a powerful means of convincing.

During most of their reviewing process they removed all the names from the proposals.

“Another thing we’re doing differently this year is that we’re starting out with anonymous proposals. Each card in Trello just shows the proposal’s title and description. No information about the speaker. We do this to avoid bias in the first stages.”

Talks were selected based on their content. Only later, once the talks had been whittled down, did they introduce the speaker names. Character was considered; it was just delayed until the late selection stage.

A tweet caught my attention highlighting that for Jsconf.com.au, too, proposals are selected without names attached. Content is king; character does not matter!

Lonestar Ruby conference submissions ask for your name and your email; there is no big bio to sell how great you are. Content rules.

Keynotes: where character is King

The keynote is the staple diet of most conferences. It’s where the big names are rolled out and luminaries of our industry share their wise thoughts with us.

Akin to live music, they are the headline act we pay for; the other speakers are the support acts. We might skip them, or give them half of our attention.

What effect does having two tiers of talks at a conference have?

The keynote talk is more important, irrespective of content, because of the character of the speaker. Often we don’t even know what they will be talking about, just that they are keynoting.

Some examples come to mind (I’m not placing blame on these conferences; I respect the organisers involved, they just help illustrate my point).

We assume (based on authority) that important people will have important things to tell us (and sometimes they do have important ideas to share with us).

Are the other, non-keynote talks of the conference of lesser importance? This is where content starts to win: all these speakers are non-keynoters, so we don’t have as many leading assumptions about their authority or character. We go to the talks where the content interests us.

Maybe all talks should be driven more by content and less by character. Maybe all talks are as important as each other.

Conferences are already starting to kill off the idea that you need a keynote to sell your conference. FutureRuby and NordicRuby are two examples of conferences with no keynote that were highly regarded by those who attended.

Will a keynote really make your conference better?

The Content Conference

Let’s see if we can take this a step further and define the content conference.

  1. Talk authors are not revealed during review/selection. (There is no bias by character.)

  2. Only publish the talk titles and content on the conference website. (Sell the conference on its content not its characters.)

  3. No keynotes. All talks are equal.

Conference Mentors

If we want to focus on content and remove character (as much as possible) we have to deal with people with little experience but with great ideas. We need to help make it easier to give feedback and help mentor those people so they can best express their ideas.

Being accepted to speak is not the end of contact, it’s the start.

Imagine a group of experienced speakers acting as mentors, giving feedback and helping people give the best presentations they can.

Final Words

There is a place in the conference world for events which focus on bringing big names to an audience. Looking back at the 30 conferences I’ve spoken at so far, I’ve come to the conclusion that the ones I really valued were those that pushed for content over character. Conferences that follow the content-conference ideas and understand the conscious and subconscious bias of character are popping up all the time.

I feel that if we push for content over character and improve the proposal/speaker setup, we can find new people with great, interesting, crazy ideas and encourage them to submit proposals and speak.

I hope these ideas about the content conference might help conference organisers think about how they structure their conference and proposal system.

I am offering my time to mentor anyone who wants help writing their first conference proposal.

I am also happy to join any conference that wants a set of experienced speakers to help new speakers get the best out of their presentations.

Crazy, huh?

So what do you think?

A Little Bit of Pig

Currently in the Science team at Songkick I’ve been working with Apache Pig to generate lots of interesting metrics for our business intelligence. We use Amazon’s Elastic MapReduce and Pig to avoid running complex, long-running and intensive queries on our live database; we can run them on Amazon in a timely fashion instead. So let’s dive into Pig and how we use it at Songkick.com.

Pig (what’s with all these silly names)

The Apache project Pig is a data flow language designed for analysing large datasets. It provides a high-level platform for creating MapReduce programs that run on Hadoop. It’s a little like SQL, but the structure of Pig programs makes them suitable for parallelisation, which is why they are great at handling very large data sets.

Here’s how we use Pig and Elastic MapReduce at Songkick in our Science team.

Data (Pig food)

Let’s start by uploading some huge and interesting data about Songkick’s artists onto S3. We start by dumping a table from MySQL (along with a lot of other tables) and then query that data with Pig on Hadoop. While we could extract all the artist data by querying the live table, it’s actually faster to use mysqldump and dump the table as a TSV file.

For example, it took 35 minutes to dump our artist table with the SQL query ‘select * from artists’. It takes 10 minutes to dump the entire table with mysqldump.

We format the table dump as a TSV file and push it to S3, as that makes it super easy to use Amazon’s Elastic MapReduce with Pig.

shell> mysqldump --user=joe --password  --fields-optionally-enclosed-by='"'
                  --fields-terminated-by='\t' --tab /tmp/path_to_dump/ songkick artist_trackings

Unfortunately this has to be run on the db machine, since mysqldump needs access to the file system to save the data. If this is a problem for you, there is a Ruby script for dumping tables to TSV: http://github.com/apeckham/mysqltsvdump/blob/master/mysqltsvdump.rb
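
To get the TSV dump onto S3 any S3 client will do. For example with s3cmd (the bucket layout here is just illustrative):

shell> s3cmd put --recursive /tmp/path_to_dump/ s3://songkick/db/trackings/2012/06/11/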

Launching (Pig catapult)

We will be using Amazon’s Elastic MapReduce to run our Pig scripts. We can start our job in interactive Pig mode, which allows us to ssh to the box and run the Pig script line by line.
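
With the elastic-mapreduce-ruby CLI (covered properly below) launching an interactive Pig job looks roughly like this; the flags are from memory of that tool’s README, so treat them as an assumption and check your version:

shell> elastic-mapreduce --create --alive --name "pig-sandbox" --num-instances 4 --pig-interactive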

Examples (Dancing Pigs)

An important thing to note when running Pig scripts interactively is that Pig defers execution until it has to expose a result (for example via DUMP or STORE). This means you get nice schema checks and validation, helping ensure your Pig script is valid without actually executing it over your large dataset.

We are going to calculate the average number of active users tracking an artist, where an active user is one who has logged in within the last 30 days.

The Pig script, step by step, with sample data shown after each statement:

-- Define some useful dates we will use later
%default TODAYS_DATE `date +%Y/%m/%d`
%default 30_DAYS_AGO `date -d "$TODAYS_DATE - 30 day" +%Y-%m-%d`

-- Pig is smart enough, when given a folder, to go and find the files, decompress them if necessary and load them.
-- Note we have to specify the schema, as Pig cannot infer it from our TSV files.
trackings = LOAD 's3://songkick/db/trackings/$TODAYS_DATE/' AS (id:int, artist_id:int, user_id:int);
users = LOAD 's3://songkick/db/users/$TODAYS_DATE/' AS (id:int, username:chararray, last_logged_in_at:chararray);

trackings
<1, 1, 1>
<2, 1, 2>

users
<1, 'josephwilk', '11/06/2012'>
<2, 'elisehuard', '11/06/2012'>
<3, 'tycho', '11/06/2010'>
-- Filter users to only those who logged in within the last 30 days.
-- Pig does not understand dates, so we just treat them as strings.
active_users = FILTER users BY last_logged_in_at >= '$30_DAYS_AGO';

active_users
<1, 'josephwilk', '11/06/2012'>
<2, 'elisehuard', '11/06/2012'>
active_users_and_trackings = JOIN active_users BY id, trackings BY user_id;

-- Group the joined rows by artist so we can count the users tracking each artist.
active_users_and_trackings_grouped = GROUP active_users_and_trackings BY trackings::artist_id;

<1, {<1, 'josephwilk', '11/06/2012'>, <2, 'elisehuard', '11/06/2012'>}>
trackings_per_artist = FOREACH active_users_and_trackings_grouped GENERATE group, COUNT($1) AS number_of_trackings;

<1, 2>
-- Group all the counts together so we can calculate the average.
all_trackings_per_artist = GROUP trackings_per_artist ALL;

<'all', {<1, 2>}>
-- Calculate the average
average_artist_trackings_per_active_user = FOREACH all_trackings_per_artist
  GENERATE '$TODAYS_DATE' AS dt, AVG(trackings_per_artist.number_of_trackings);

<'11/06/2012', 2.0>
-- Now we have done the work, store the result in S3.
STORE average_artist_trackings_per_active_user INTO
  's3://songkick/stats/average_artist_trackings_per_active_user/$TODAYS_DATE';

Debugging Pigs (Pig autopsy)

In an interactive Pig session there are two useful commands for debugging: DESCRIBE, to see a relation’s schema, and ILLUSTRATE, to see the schema with sample data:

DESCRIBE users;
users: {id:int, username:chararray, created_at:chararray, trackings:int}

ILLUSTRATE users;
----------------------------------------------------------------------
| users   | id: int | username:chararray | created_at | trackings:int |
----------------------------------------------------------------------
|         | 18      | Joe                | 10/10/13   | 1000          |
|         | 20      | Elise              | 10/10/14   | 2300          |
----------------------------------------------------------------------

Automating Elastic MapReduce (Pig robots)

Once you are happy with your script you’ll want to automate all of this. I currently do this with a cron task which, at regular intervals, uses the elastic-mapreduce-ruby lib to fire up an Elastic MapReduce job and run it with the Pig script to execute.

It’s important to note that I store the Pig scripts on S3 so it’s easy for elastic-mapreduce to find them.

Follow the instructions to install elastic-mapreduce-ruby: https://github.com/tc/elastic-mapreduce-ruby

To avoid having to call elastic-mapreduce with hundreds of arguments, a colleague has written a little Python wrapper to make it quick and easy to use: https://gist.github.com/2911006

You’ll need to configure where your elastic-mapreduce tool is installed and where you want Elastic MapReduce to write its logs on S3 (this means you can debug your job if things go wrong!).
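
For reference, the elastic-mapreduce tool itself reads its AWS settings from a credentials.json sitting next to the tool. The keys below are from memory of elastic-mapreduce-ruby’s README, so treat them as an assumption and verify against your copy:

{
  "access_id": "YOUR_AWS_ACCESS_KEY",
  "private_key": "YOUR_AWS_SECRET_KEY",
  "keypair": "your-ec2-keypair-name",
  "key-pair-file": "/path/to/your-ec2-keypair.pem",
  "log_uri": "s3n://songkick/emr-logs/",
  "region": "us-east-1"
}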

Now all we need to do is pass the wrapper the path to the Pig script on S3.

./emrjob s3://songkick/lib/stats/pig/average_artist_trackings_per_active_user.pig
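
Wired up to cron, the whole thing runs hands-free. A hypothetical crontab entry (the paths are illustrative):

# m h dom mon dow command
0 4 * * * /home/deploy/bin/emrjob s3://songkick/lib/stats/pig/average_artist_trackings_per_active_user.pig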

Testing with PigUnit (Simulating Pigs)

Pig scripts can still take a long time to run, even with all that Hadoop magic. Thankfully there is a testing framework, PigUnit.

http://pig.apache.org/docs/r0.8.1/pigunit.html#Overview

Unfortunately this is where you have to step into writing Java. So I skipped it. Sshhh.

References

  1. Apache Pig official site: http://pig.apache.org

  2. Nearest Neighbours with Apache Pig and JRuby: http://thedatachef.blogspot.co.uk/2011/10/nearest-neighbors-with-apache-pig-and.html

  3. Helpers for messing with Elastic MapReduce in Ruby: https://github.com/tc/elastic-mapreduce-ruby

  4. mysqltsvdump: http://github.com/apeckham/mysqltsvdump/blob/master/mysqltsvdump.rb

Examples Alone Are Not a Specification

The Gherkin syntax used by Cucumber enforces that feature files contain scenarios, which are examples of the behaviour of a feature. However, Gherkin has no constraints on whether a specification is present. Examples are great at helping us understand specifications, but they are not specifications themselves.

What do we mean when we say specification?

definition: A detailed, exact statement of particulars

In a Gherkin feature the specification lives in the feature’s narrative: the free-form description between the Feature: line and the first Scenario.

Let’s look at a real example:

A Feature with just Examples

A Cucumber example based on a feature (which I have modified) from rspec-expectations, part of the test library RSpec:

Feature: be_within matcher
  Scenario: basic usage
    Given a file named "be_within_matcher_spec.rb" with:
      """
      describe 27.5 do
        it { should be_within(0.5).of(27.9) }
        it { should be_within(0.5).of(27.1) }
        it { should_not be_within(0.5).of(28) }
        it { should_not be_within(0.5).of(27) }

        # deliberate failures
        it { should_not be_within(0.5).of(27.9) }
        it { should_not be_within(0.5).of(27.1) }
        it { should be_within(0.5).of(28) }
        it { should be_within(0.5).of(27) }
      end
      """
    When I run `rspec be_within_matcher_spec.rb`
    Then the output should contain all of these:
      | 8 examples, 4 failures                     |
      | expected 27.5 not to be within 0.5 of 27.9 |
      | expected 27.5 not to be within 0.5 of 27.1 |
      | expected 27.5 to be within 0.5 of 28       |
      | expected 27.5 to be within 0.5 of 27       |

So where is the explanation of what be_within does? If I want to know how be_within works, I want a single concise explanation, not five or six different examples. Examples add value later, to validate that specification.

A Feature with both Specification and Examples

Let’s add back in the specification part of the feature. Drum roll…

Feature: be_within matcher

  Normal equality expectations do not work well for floating point values.
  Consider this irb session:

      > radius = 3
        => 3 
      > area_of_circle = radius * radius * Math::PI
        => 28.2743338823081 
      > area_of_circle == 28.2743338823081
        => false 

  Instead, you should use the be_within matcher to check that the value
  is within a delta of your expected value:

      area_of_circle.should be_within(0.1).of(28.3)

  Note that the difference between the actual and expected values must be
  smaller than your delta; if it is equal, the matcher will fail.

  Scenario: basic usage
    Given a file named "be_within_matcher_spec.rb" with:
      """
      describe 27.5 do
        it { should be_within(0.5).of(27.9) }
        it { should be_within(0.5).of(27.1) }
        it { should_not be_within(0.5).of(28) }
        it { should_not be_within(0.5).of(27) }

        # deliberate failures
        it { should_not be_within(0.5).of(27.9) }
        it { should_not be_within(0.5).of(27.1) }
        it { should be_within(0.5).of(28) }
        it { should be_within(0.5).of(27) }
      end
      """
    When I run `rspec be_within_matcher_spec.rb`
    Then the output should contain all of these:
      | 8 examples, 4 failures                     |
      | expected 27.5 not to be within 0.5 of 27.9 |
      | expected 27.5 not to be within 0.5 of 27.1 |
      | expected 27.5 to be within 0.5 of 28       |
      | expected 27.5 to be within 0.5 of 27       |

That’s better: now we get an explanation of why this method exists and how to use it.

Imagine RSpec without the specification

I think of a Cucumber feature without a specification much like an RSpec example without any English sentence/description.

context "" do
  it "" do
    user = Factory(:user)
    user.generate_password
    user.activate

    get "/session/new", :user_id => user.id

last_response.body.should == "Welcome #{user.name}"
  end
end

Feels a little odd, doesn’t it?

Cucumber Features as Documentation (for real)

RSpec is an example of a project that has taken its Cucumber features and published them as its documentation. Just browsing through those features quickly highlights how important it is to have a specification as well as examples. Imagine an API with nothing but examples, leaving you the detective work of figuring out what the thing actually does.

Documentation needs to explain/specify what something does as well as provide examples. If you really want anyone to read your feature, provide both examples and a specification.

Co-chair of the Agile Alliance Functional Testing Tools Group

I’m happy to announce that I have joined Elisabeth Hendrickson as Co-Chair of the Agile Alliance Functional Testing Tools group (AAFTT, yes the name is a bit of a mouthful).

I came across the AAFTT group when they organised a pattern writing workshop in London in 2010, facilitated by Linda Rising. It was an opportunity to bring together some of the thought leaders in testing and share experiences. Somehow I managed to sneak in and proceeded to steal all these experts’ knowledge. The AAFTT also runs an open space prior to the Agile conference. Last year I was surprised to find people who had travelled to the conference just for the AAFTT open space!

As a developer who loves messing around with testing tools, I’ll happily admit I know little about testing as a profession. The scope and breadth of ideas that the AAFTT exposed me to has left me excited and hungry to play with new tools, ideas and patterns to make testing as fun as it should be. I hope, as a developer obsessed with testing, I can help blur the lines between developers and testers. I’m both.

Want to know more about the Agile Alliance Functional Testing Tools group?

And keep your ears open for any events!

Mining Cucumber Features

Failure Rates vs Change Rates

I’ve been spending a lot of time recently data mining our test suite at Songkick.com. I’ve been using our own internal version of www.limited-red.com, which records all our test fails for every build on our continuous integration server.

With the Cucumber features I decided to plot the number of times a feature file has been significantly changed against the number of times the feature has failed in the build. The theory is that a change to a feature file (the plain-text specification of behaviour) most often represents a change in functionality.

So I wrote a simple script checking the number of commits to each feature file.

features_folder = ARGV[0]
feature_files = Dir["#{features_folder}/**/*.feature"]

feature_files.each do |feature_file|
  change_logs = `git log --oneline #{feature_file} 2>/dev/null`
  change_count = change_logs.split("\n").count
  puts "#{feature_file},#{change_count}"
end

Feature changes compared with feature failures

Throwing that into a pretty graph, here is a snapshot of some of the features (I’ve changed the names of the features to hide top-secret Songkick business).

Insights

Based on this I identified two possible groups of features:

  • Features failing more often than the code around them is changing

  • Features which are robust and are not breaking when the code around them is changing.

Further investigation into the suspect features highlighted 3 causes:

  • Brittle step definitions

  • Brittle features coupled to the UI

  • Tests with a high dependency on asynchronous behaviour

Holes in the data – step definitions

We only recorded the change rate of feature files in Git. Features could break without the feature file ever changing, for example if a common step definition is broken. The next step is to identify all the step definitions used by a feature and examine how often those have changed.

First find the change count for all the step definitions.

step_files = Dir["features/**/*_steps.rb"]

step_files.each do |step_file|
  change_logs = `git log --oneline #{step_file} 2>/dev/null`
  change_count = change_logs.split("\n").count
  puts "#{step_file},#{change_count}"
end

Then we work out which step definitions a feature uses. We can do this by running Cucumber with the JSON formatter and matching up step definitions (and hence step definition files) to feature files:

require 'json'
require 'pp' # for pp at the end of the script

`cucumber features --dry-run --format json --out .json.out`
features_json = JSON.parse(File.read('.json.out'))

stats = Hash.new{|h,k| h[k] = []}
features_json['features'].each do |feature|
  feature_name = feature['name']
  #The JSON does not have the feature file. Find the file via the feature name. Messy
  feature_file = `egrep -riE "feature:? *#{feature_name}" features/`.split(":")[0]
  feature["elements"].each do |element|
    element["steps"].each do |step|
      file_location = step['match']['location']
      file, _ = file_location.split(":")
      if file =~ /_steps\.rb$/
        stats[feature_file] = (stats[feature_file] + [file]).uniq
      end
    end
  end
end
pp stats

Change rate vs Failure rate with step definition changes

Combining those two bits of data we can now add to our original graph the step definition change rates for a feature.
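
As a rough sketch (the CSV file names here are hypothetical), the merge is just a walk over the stats hash from the previous script, summing the step-file change counts behind each feature:

require 'csv'

# Hypothetical CSV files written out by the two scripts above:
#   feature_changes.csv => "features/foo.feature,12"
#   step_changes.csv    => "features/step_definitions/foo_steps.rb,7"
feature_changes = Hash[CSV.read('feature_changes.csv')]
step_changes    = Hash[CSV.read('step_changes.csv')]

# stats maps each feature file to the step definition files it uses
# (built by the cucumber JSON script above).
stats.each do |feature_file, step_files|
  step_change_count = step_files.inject(0) { |sum, f| sum + step_changes[f].to_i }
  puts [feature_file, feature_changes[feature_file], step_change_count].join(',')
end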

We can also examine an individual breakdown of the step definition change rates for a feature:

Holes in using the step definition change rate

The step definition changes from Git are at the file level (the *_steps.rb file), so a change in Git may not touch a step definition actually used by a feature. Hence we may be counting changes which are not relevant for a feature. Further work would be to examine the Git diffs and check whether a change touched a step definition used by a feature.
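
A rough sketch of that idea (a heuristic of mine, not something from our production code): walk the patch output of git log and count only changed lines that look like a step definition:

step_files = Dir["features/**/*_steps.rb"]

step_files.each do |step_file|
  patches = `git log -p --follow -- #{step_file} 2>/dev/null`
  # Added/removed lines in a patch start with a single '+' or '-';
  # count only those that look like a step definition line.
  step_line_changes = patches.lines.count do |line|
    line =~ /\A[+-](?![+-])\s*(Given|When|Then|And)\s*[\(\/]/
  end
  puts "#{step_file},#{step_line_changes}"
end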

Conclusions

Our tests hide lots of interesting information that can provide evidence of areas where we can make improvements. It’s important to realise that, like anything in statistics, our data mining does not yield facts, just suggestions. At Songkick we are already mining this information from Cucumber and using it to help improve and learn about our tests.

Conferences and Accessibility: Please Try Harder

I speak at a lot of conferences around the world and I’ve been happy to hear a lot of healthy debate around the diversity of the programming community. I really enjoyed, and felt inspired by, Joshua Wehner’s talk “Must It Always Be About Sex” at Nordic Ruby and Ruby Lugdunum.

It got me thinking about the issue of accessibility at conferences. I use a wheelchair and happily throw myself at any conference regardless of the obstacles. But not everyone is as crazy or as flexible as I am. Once at the conference the organisers are always willing to help, but I wonder if we are excluding disabled people who may dismiss the chance to talk or attend as unfeasible due to a lack of information about access. Even if that’s only a few people, those people are valuable members of our communities and deserve the chance to participate if they want to.

How would we ever know if we were discriminating? If no-one turns up in a wheelchair, we are none the wiser.

I understand conferences are expensive and complicated events to organise. I would ask two small things of every conference organiser:

1. Try to select an accessible venue if possible, for both speakers and attendees.

2. Make it clear on the conference website whether the venue, party and hack sessions are accessible.

My life was profoundly changed through speaking and attending conferences, in many ways beyond programming. I hope we can make sure anyone who wants that opportunity won’t be put off.

Startups Infiltrating Agile and XP

I’m very excited to be talking at this year’s Agile2011 conference. It’s a goal I’ve been working towards for a long time.

I’ve been to lots of agile conferences where I have listened to consultants talk about how they made their multinationals or large organisations better through x. Interesting, but often not much I could take away and apply at a startup.

Sadly there are only 4 talks at Agile2011 tagged with Startups and 6 tagged with Lean Startup.

I’m passionate about startups and I want there to be more focus on them at agile events. Having spoken to lots of startups all over the world, and living in the London startup hub, I all too often hear people dismiss Agile and XP events as not relevant. While there is a lot of activity around the Lean Startup, little is said about Extreme Programming and startups. The rich experience of the Agile community tends to get lost. The Lean Startup has provided a movement which has helped deliver ideas from agile to startups, something agile conferences have failed at.

I’m happy to be injecting a dose of startup with my talk at Agile2011 about acceptance testing in the world of the startup. I’ll be drawing from my experiences of working at the startup Songkick.com, looking at the things we did badly, things we did well and things we still have no clue about.

I’m also helping organise events in London and other major cities around Extreme Programming and Startups.

The Art of Cucumber London Workshop

I’ll be hosting a half-day workshop, “The Art of Cucumber”, at Skills Matter in London. If you are interested in learning more about Cucumber, understanding how Cucumber fits into software development, and patterns for writing healthy Cukes, this workshop is for you.

The workshop will also give you a sneak peek into my talk at Agile2011 (Salt Lake City) and SPA2011 (London) without the big conference ticket price.

Book a place: http://theartofcucumber.eventbrite.com/