Web Development & Programming Thoughts

By Timothy Elliott

How to git-bisect the Ruby Nokogiri Library

6/28/2011

I recently ran into Nokogiri issue #488: version > 1.4.4 produces duplicate elements when using Nokogiri::HTML with an invalid HTML doc.

I was able to write a test that reproduced the issue and was curious whether git could bisect it down to the commit that caused it. Here are my notes. They might come in handy if you are trying to bisect another Ruby library.

I am also looking for feedback on how the bisect cycle could be made more efficient.

  1. Clone the Nokogiri repository

    $ git clone https://github.com/tenderlove/nokogiri.git
  2. Write a failing test

    The test that I wrote in this case consists of an HTML file and a unit test, test/html/test_document2.rb.

  3. Start a git-bisect session

    $ git bisect reset
    $ git bisect start
  4. Give git-bisect the last known good version

    As you can see from the bug report, the last known good version was v1.4.4.

    $ git bisect good v1.4.4
  5. Compile Nokogiri

    $ rake clean; rake compile
  6. Run your test

    $ ruby -Itest -Ilib test/html/test_document2.rb
  7. Tell git-bisect whether your test failed or succeeded

    On the first run this should be "bad", since you are testing the latest code, which still has the bug.

    $ git bisect bad
    # or
    $ git bisect good

    If the bisection is able to infer which commit caused the bug, git will print out the details of the offending commit and you are done.

    Otherwise, git will already have checked out the next commit that needs to be tested into your working directory. In this case, go back to step 5 (Compile Nokogiri).
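
On the question of making the cycle more efficient: one option is `git bisect run`, which repeats steps 5 through 7 automatically by running a command on each candidate commit and reading its exit status (0 means good, 125 means "skip this commit", any other code from 1 to 127 means bad). The following is a minimal sketch, not a tested recipe for Nokogiri. The file name bisect-step.sh, the function bisect_step, and the variables BUILD_CMD and TEST_CMD are made up for illustration; their defaults simply reuse the commands from steps 5 and 6.

```shell
#!/bin/sh
# bisect-step.sh: one build-and-test cycle (steps 5 and 6) for `git bisect run`.
# The file name, function name, and both variables are illustrative only.

# Defaults reuse the commands from steps 5 and 6; override via the environment.
: "${BUILD_CMD:=rake clean && rake compile}"
: "${TEST_CMD:=ruby -Itest -Ilib test/html/test_document2.rb}"

bisect_step() {
  # A commit that does not compile cannot be judged either way:
  # exit status 125 tells git bisect to skip it.
  sh -c "$BUILD_CMD" || return 125

  # Exit status 0 marks the commit good, 1 marks it bad.
  if sh -c "$TEST_CMD"; then
    return 0
  fi
  return 1
}

# Run the cycle only when invoked as `sh bisect-step.sh run`,
# so the function can also be sourced and tried out by hand.
if [ "${1:-}" = "run" ]; then
  bisect_step
  exit $?
fi
```

Assuming the script is saved as an untracked file in the repository root and the current checkout shows the bug, the session might then look like this:

    $ git bisect start
    $ git bisect bad
    $ git bisect good v1.4.4
    $ git bisect run sh bisect-step.sh run

Skipping commits that fail to compile keeps a broken build from being blamed for the bug.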