Tag Archives: metrics

Limiting Red: Smarter Test Builds Through Metrics

15 Aug

The Current State of the Art

In the Ruby world there is a wealth of metrics which can provide insight into our code. Looking at such things as:

When it comes to metrics involving our tests we have:

  • Code coverage (Rcov)
  • Tools to help identify missed edge cases (Heckle)
  • Random testing tools (RushCheck)

Is that it? I think we can do better than that!

What useful metrics are we missing that our tests could provide and what should we be recording?

Recording Test Builds

You’re using a Continuous Integration server, right? Running all your tests at every check-in to your source control repository? The CI environment represents the pipeline through which all code needs to flow. It tends to be the place where all of the tests are run before the code flows into the outside world. Hence it is a perfect environment in which to start capturing detailed metrics about all of our tests. It’s also not the end of the world if we add a little extra time to the test build in order to capture these metrics.

Mining Metrics from Test Builds

What interesting things can we discover? Here are some suggestions:

  • Failure rates
    • Areas of your product which are prone to failure/bugs, and tests which might be fragile. This could highlight areas QA should focus extra attention on.
  • Flickering tests
    • Tests which frequently switch between failing and passing.
  • Fragile Tests
    • An all or nothing feature where all the tests fail or none fail.
  • Never failing tests
    • Tests which have never failed. Do we need to run them all the time? Are they now redundant?
  • Average build failures a day
    • How often is the build broken?
  • Discover Shotgun Surgery
    • Small code changes broke all the tests!
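A few of these can be computed from nothing more than a pass/fail record per test per build. Here is a minimal sketch, assuming each build is stored as a hash of test name to result. The data, the method names and the flicker threshold of three result changes are all made up for illustration.

```ruby
# Made-up build history: one hash per build, oldest first.
# true = the test passed in that build, false = it failed.
BUILDS = [
  { 'login' => true, 'search' => false, 'checkout' => true },
  { 'login' => true, 'search' => true,  'checkout' => true },
  { 'login' => true, 'search' => false, 'checkout' => false },
  { 'login' => true, 'search' => true,  'checkout' => true }
].freeze

# Collect each test's results in build order.
def outcomes_by_test(builds)
  builds.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |build, acc|
    build.each { |test, passed| acc[test] << passed }
  end
end

# Fraction of runs in which the test failed.
def failure_rate(outcomes)
  outcomes.count(false).to_f / outcomes.size
end

# Number of times the result changed between consecutive builds.
def flips(outcomes)
  outcomes.each_cons(2).count { |previous, current| previous != current }
end

# flicker_threshold is an arbitrary cut-off chosen for this example.
def build_report(builds, flicker_threshold: 3)
  outcomes_by_test(builds).map do |test, outcomes|
    [test, {
      failure_rate: failure_rate(outcomes),
      flickering: flips(outcomes) >= flicker_threshold,
      never_failed: outcomes.all?
    }]
  end.to_h
end

REPORT = build_report(BUILDS)
REPORT.each { |test, metrics| puts "#{test}: #{metrics}" }
```

With this sample data, login shows up as never failing, while search fails half the time and changes result on every build, so it is flagged as flickering.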

What other metrics do you think would be useful?

Kent Beck is Smart

Kent Beck has some additional ideas, let’s copy him and pretend to look smart.

Intelligent Selection of the Tests to Run

Kent Beck wrote a tool called JUnit Max, a plugin for Eclipse and JUnit which helps programmers stay focused on coding by running tests intelligently.

Max fails fast, running the tests most likely to fail first.

One of the key principles behind this tool is that:

“Tests that failed recently are more likely to fail than tests which have never failed.”

Super Fast Feedback

If we prioritise the tests that failed recently and those which have been recorded as being likely to fail, we increase the chance that a failure occurs early on in the test build. The shorter the gap between pushing the code and knowing there is a failure, the better.

One problem this helps alleviate is when a test fails 99% of the way through the build. To know your fix worked you have to sit and wait for the entire build to run.
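The ordering idea can be sketched in a few lines. This is an illustration only, not how JUnit Max or CukeMax actually rank tests: the TestHistory struct, its fields and the sample numbers are all hypothetical.

```ruby
# Hypothetical summary of what earlier builds told us about each test.
TestHistory = Struct.new(:name, :failure_rate, :builds_since_last_failure)

HISTORY = [
  TestHistory.new('checkout', 0.25, 1),
  TestHistory.new('login',    0.0,  nil), # nil: has never failed
  TestHistory.new('search',   0.5,  0),
  TestHistory.new('signup',   0.1,  12)
].freeze

# Run the most recently failing tests first, breaking ties by the higher
# failure rate. Tests that have never failed go to the back of the queue.
def prioritise(history)
  history.sort_by do |test|
    recency = test.builds_since_last_failure || Float::INFINITY
    [recency, -test.failure_rate]
  end
end

RUN_ORDER = prioritise(HISTORY).map(&:name)
puts RUN_ORDER.inspect
```

Here the run order comes out as search, checkout, signup, login, so a fix for the search failure gets its answer at the very start of the build rather than at the end.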

CukeMax (alpha-1)

CukeMax is a project that aims to:

  • Provide a web service to record Cucumber test builds
  • Provide a web-based interface to uncover juicy metrics about your tests
  • Feed recorded metrics back into the running of tests prioritising those most likely to fail.
  • Cool stuff

CukeMax is intended to be used when you run your tests on your CI server. While this initial version just supports Cucumber, there is no reason why it cannot be expanded to other test tools such as RSpec. I’m already using this for my own projects and I have a special version working at Songkick.com HQ.

Wanna Play?

You can browse around an example of the web interface at CukeMax - www.cukemax.com

Want to be one of the first guinea pigs to try out CukeMax? Let me know.

The client tool will be leaked slowly into the world to ensure we can balance server load.

What’s next?

All I can say is there is a lot of activity around this project, with some exciting tools in the pipeline.

Also, Matt Wynne has been working on some similar ideas and we are discussing whether we can combine our thoughts.

Metrics for Plain Text Acceptance Tests

10 Nov

There has been lots of activity around the value of metrics for source code and tests. In the Ruby world tools like metric_fu provide a wealth of analysis.

While working on my Cucumber talk for Rails Underground I started investigating how we could apply metrics to the customer-focused plain text of Cucumber. For those not familiar with Cucumber, it’s an acceptance testing framework which allows non-technical people to write plain-text describing the behaviors of their system. The developers/testers map the plain-text to tests.

Having spent time teaching people about the plain-text side of Cucumber, I often found myself recommending the same guidelines and pointing out the same plain-text anti-patterns. This led me to think about providing metrics scoring the customer’s plain-text.

Why would we want plain-text acceptance test metrics?

  • Help plain-text beginners avoid bad practices early on.
  • Help improve the quality of plain-text.
  • Help with quality review when there is a high volume of incoming features.

Why does the quality of the plain-text matter?

Why focus on quality when the plain-text’s primary goal is to be easy for the customer to use?

  • The developer builds the domain-specific language by mapping plain text to Ruby. Higher quality plain-text could make it easier to manage these mappings without any major impact on readability.
  • Higher quality text is easier to read, edit and understand.

Who would find it useful?

Initially, developers.

  • In some scenarios the developers write the features from discussions and give them to the customer to review.
  • Developers may tweak/review customer-written changes/features.
  • Developers often edit/tweak plain-text from the customer to enable reuse of existing test code.
  • In open source projects it is often developers who write the Cucumber features. Metrics are something they are comfortable with.

Can you measure quality in plain-text?

First, it’s important to distinguish acceptance tests from pure plain-text. Within acceptance tests we have some degree of structure, for example using Given/When/Then to describe scenarios.

Cucumber Example:

Scenario: Eating all cucumbers
  Given there are 5 cucumbers
  When I eat 5 cucumbers
  Then I should have 0 cucumbers

This structure reduces the complexity of analysing the quality of the text. It provides us with different structural elements which have different rules/guidelines on what their content should be.

The problem with measuring the quality of text is that it is far more subjective than in code. So while we cannot be absolute in our assessment of quality, we can try to codify smells that *could* indicate areas in the text that *could* be improved. This is pretty much true for all metrics: they are guidelines, not absolutes (Dan North highlights the dangers of absolute metrics in the Parable of Metrics).
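To make "codifying a smell" concrete, here is a rough sketch that scans feature text line by line and flags scenarios worth a second look. The three smells (too many steps, more than one When step, no Then step) and the limit of seven steps are placeholder assumptions picked for the example, and the naive line scan stands in for a real Gherkin parser.

```ruby
# Sample feature text: the first scenario is the one from the example above,
# the second is invented to trigger some smells.
FEATURE = <<~GHERKIN
  Scenario: Eating all cucumbers
    Given there are 5 cucumbers
    When I eat 5 cucumbers
    Then I should have 0 cucumbers

  Scenario: Buying and eating
    Given there are 5 cucumbers
    When I buy 2 cucumbers
    When I eat 7 cucumbers
GHERKIN

STEP_KEYWORDS = %w[Given When Then And But].freeze

# Naive line scan returning { scenario name => [step lines] }.
def scenarios(text)
  current = nil
  text.each_line.with_object({}) do |line, found|
    line = line.strip
    if line.start_with?('Scenario:')
      current = line.sub('Scenario:', '').strip
      found[current] = []
    elsif current && STEP_KEYWORDS.include?(line.split.first)
      found[current] << line
    end
  end
end

# Hypothetical smells; max_steps is an arbitrary limit for illustration.
def smells(steps, max_steps: 7)
  found = []
  found << 'too many steps' if steps.size > max_steps
  found << 'more than one When step' if steps.count { |s| s.start_with?('When ') } > 1
  found << 'no Then step' if steps.none? { |s| s.start_with?('Then ') }
  found
end

SMELLS = scenarios(FEATURE).map { |name, steps| [name, smells(steps)] }.to_h
SMELLS.each { |name, found| puts "#{name}: #{found.empty? ? 'ok' : found.join(', ')}" }
```

The output is a list of hints for a human reviewer rather than a pass/fail verdict, which fits the guidelines-not-absolutes point.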

So what useful metrics could we look at?

Plain text Metrics

From my experience with Cucumber I would suggest examining:

Feedback

What do you think of the idea?

Can you think of any other useful plain-text metrics?

Ruby Metric-fu Hudson plugin

4 Oct

I have written a plugin for the continuous integration server Hudson which uses a metric-fu rake task at its core to build and present graphs representing different metrics over successful builds.
It currently supports:

Hudson with RubyMetricFu graphs

The source is available on GitHub:

http://github.com/josephwilk/rubymetricfu

Installing

Currently all of Hudson’s plugins are stored in something called SVN. As I am more of a Git myself, you have to install the plugin manually rather than using Hudson’s automatic GUI install method.

Install steps:

  1. Ensure you have the Ruby and Rake Hudson plugins installed.
  2. Follow the metric-fu installation guide (http://metric-fu.rubyforge.org/)
  3. Ensure your project has a metrics:all rake task (added automatically when you require metric-fu)
  4. Download the rubymetricfu.hpi plugin file
  5. Copy the file into the plugins folder within your Hudson install. Hudson’s default is ~/.hudson/plugins
  6. Restart Hudson
  7. Go to the ‘configure’ link for a project and select the Ruby Metric-fu report option (see below)
  8. (Optionally) Pick which Rake version you want to use.
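For step 3, the Rakefile change can be as small as a single require. This is a minimal sketch, assuming metric-fu is installed as a gem that loads under the name metric_fu (check the installation guide for your version). The rescue clause is my own addition so the other rake tasks keep working on machines without the gem.

```ruby
# Rakefile (sketch)
# Requiring metric_fu is what registers the metrics:all task.
begin
  require 'metric_fu'
rescue LoadError
  # Assumption: we would rather skip the metrics tasks than break rake.
  warn 'metric_fu is not installed, so the metrics:all task is unavailable'
end
```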

Setup Metric-fu

Future

  • Further metrics to graph:
  • Configure which metrics you want on your project page.
  • Create a Crap4R meter using Rcov and Flog (Similar to Crap4J).
  • Better integration with the html reports generated by metric-fu.