the(art).of << fine.code

Technical Blog from the team at Redbubble.

Nokogiri Goes Bump (or Segfaults) in the Night...

Recently we’ve been working on upgrading the version of Ruby the Redbubble site runs on, from 1.8.7 to 1.9.3. We’re doing this for a number of reasons, including improved performance, new language features, and trying to stay relatively current with our tech.

Not for Recycling by Flibble

Things went fairly smoothly until one of our RSpec specs started to segfault (i.e. actually crash the Ruby interpreter) every time we ran it on our development machines (OS X Lion):

ruby(89240,0x7fff7b133960) malloc: *** error for object 0x24: pointer being freed was not allocated
*** set a breakpoint in malloc_error_break to debug

Or sometimes we would see the more dramatic:

.../Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/activesupport-3.0.12/lib/active_support/whiny_nil.rb:48: [BUG] Segmentation fault
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.4.0]

-- Control frame information -----------------------------------------------
c:0041 p:0054 s:0161 b:0161 l:000160 d:000160 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/activesupport-3.0.12/lib/active_support/whiny_nil.rb:48
c:0040 p:---- s:0154 b:0154 l:000153 d:000153 FINISH
c:0039 p:0019 s:0152 b:0152 l:000151 d:000151 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/header.rb:165
c:0038 p:0123 s:0148 b:0144 l:000143 d:000143 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/header.rb:160
c:0037 p:0033 s:0137 b:0136 l:000135 d:000135 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/message.rb:1105
c:0036 p:0018 s:0131 b:0131 l:000130 d:000130 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/message.rb:581
c:0035 p:0048 s:0127 b:0127 l:000126 d:000126 METHOD /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/actionmailer-3.0.12/lib/action_mailer/test_case.rb:48

... REMOVED MASSIVE AMOUNTS OF STACKTRACE ....

-- Other runtime information -----------------------------------------------

* Loaded script: /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/bin/rspec

* Loaded features:

    0 enumerator.so
    1 /Users/joff/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.4.0/enc/encdb.bundle
    2 /Users/joff/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.4.0/enc/trans/transdb.bundle
    3 /Users/joff/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/site_ruby/1.9.1/rubygems/defaults.rb
    4 /Users/joff/.rvm/rubies/ruby-1.9.3-p194/lib/ruby/1.9.1/x86_64-darwin11.4.0/rbconfig.rb

 2635 /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/elements/address_list.rb
 2636 /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/fields/to_field.rb
 2637 /Users/joff/.rvm/gems/ruby-1.9.3-p194@redbubble/gems/mail-2.2.19/lib/mail/fields/subject_field.rb
 2638 /Users/joff/dev/redbubble-19/lib/red_cloth/formatters/compact_text.rb

[NOTE]
You may have encountered a bug in the Ruby interpreter or extension libraries.
Bug reports are welcome.
For details: http://www.ruby-lang.org/bugreport.html

Abort trap: 6

A bit of searching suggested that the Nokogiri gem might be at fault here, specifically during garbage collection.

We tested this by adding a GC.disable line at the top of our spec, which prevented the crash. Unfortunately, simply turning off garbage collection isn’t a viable option, so we needed to find a real fix.
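As a diagnostic, the same switch can be scoped to a single block instead of a whole spec file. The following is a minimal sketch and not the code from our spec: without_gc is a hypothetical helper name, and only GC.disable and GC.enable come from Ruby itself.

```ruby
# Hypothetical helper (not from our codebase): run a block with GC off,
# then restore whatever state GC was in beforehand.
def without_gc
  already_disabled = GC.disable # returns true if GC was already off
  yield
ensure
  GC.enable unless already_disabled
end

# If wrapping a suspect call makes the crash disappear, garbage collection
# is probably involved. The block here is only a stand-in for that call.
without_gc { 10_000.times.map { |i| "object #{i}" } }
```

This is only useful for narrowing down a crash. Memory grows without bound while the block runs, so it is not a fix.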

Further research suggested that the issue lay somewhere in the interaction between the Nokogiri gem and libxml2, and that using a newer version of that library could alleviate the problem.

As we use Homebrew on our development machines, we wanted to use it to recompile Nokogiri against the updated library. First, we installed some prerequisites:

libiconv

Recent versions of Homebrew do not provide libiconv, so we compiled it ourselves:

curl "http://ftp.gnu.org/pub/gnu/libiconv/libiconv-1.13.1.tar.gz" -o /tmp/libiconv-1.13.1.tar.gz
cd /tmp
tar xvfz libiconv-1.13.1.tar.gz
cd libiconv-1.13.1
# This places it alongside your brew installations
./configure --prefix=/usr/local/Cellar/libiconv/1.13.1
make
sudo make install

libxml2 and libxslt

brew install libxml2 libxslt

Homebrew does provide these libraries, but by default it won’t symlink them into the library path. OS X already ships its own copies, which are somewhat core to its operation, so putting different versions on the path could break things.

This is fine, as we explicitly point to them when we compile Nokogiri.

Nokogiri

We also use bundler, so instead of manually compiling Nokogiri, we can configure bundler with the compile flags we need:

bundle config build.nokogiri \
  --with-xml2-include=/usr/local/Cellar/libxml2/2.7.8/include/libxml2 \
  --with-xml2-lib=/usr/local/Cellar/libxml2/2.7.8/lib \
  --with-xslt-dir=/usr/local/Cellar/libxslt/1.1.26 \
  --with-iconv-include=/usr/local/Cellar/libiconv/1.13.1/include \
  --with-iconv-lib=/usr/local/Cellar/libiconv/1.13.1/lib

Then we can run bundle install, and Nokogiri will be compiled against the libraries we specified.
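Each of those flags embeds a version number in its path, so the paths may stop matching after a brew upgrade. As a rough sanity check before compiling, something like the sketch below can confirm that the directories exist. The names missing_dirs and BUILD_PATHS are hypothetical, and the paths are simply the ones from the command above.

```ruby
# Hypothetical helper: return the entries of `paths` that are not directories.
def missing_dirs(paths)
  paths.reject { |path| File.directory?(path) }
end

# Paths copied from the bundle config command; adjust the version numbers
# to whatever is actually installed in your Cellar.
BUILD_PATHS = [
  "/usr/local/Cellar/libxml2/2.7.8/include/libxml2",
  "/usr/local/Cellar/libxml2/2.7.8/lib",
  "/usr/local/Cellar/libxslt/1.1.26",
  "/usr/local/Cellar/libiconv/1.13.1/include",
  "/usr/local/Cellar/libiconv/1.13.1/lib",
]

missing = missing_dirs(BUILD_PATHS)
if missing.empty?
  puts "All build paths exist"
else
  puts "Missing build paths:"
  missing.each { |path| puts "  #{path}" }
end
```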

Even after this we were still seeing segfaults. Something was not quite right: our output included a message like “Nokogiri was built against LibXML version 2.7.8, but has dynamically loaded 2.7.3”.
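That warning boils down to comparing two version strings: the libxml2 version Nokogiri was compiled against and the one actually loaded into the running process. Here is a minimal sketch of the same comparison, assuming a hash shaped like Nokogiri::VERSION_INFO (a "libxml" entry holding "compiled" and "loaded" strings). The helper name libxml_mismatch is hypothetical, and the sample data stands in for the real constant so the snippet runs without Nokogiri installed.

```ruby
# Hypothetical helper: given a hash shaped like Nokogiri::VERSION_INFO,
# return a warning string if the compiled and loaded libxml2 versions differ,
# or nil if they match (or if no libxml information is present).
def libxml_mismatch(info)
  libxml = info["libxml"] || {}
  compiled = libxml["compiled"]
  loaded = libxml["loaded"]
  return nil if compiled.nil? || loaded.nil? || compiled == loaded
  "libxml2 mismatch: compiled against #{compiled}, loaded #{loaded}"
end

# Stand-in data using the versions from the warning. In a real process you
# would pass Nokogiri::VERSION_INFO after requiring nokogiri.
sample = { "libxml" => { "compiled" => "2.7.8", "loaded" => "2.7.3" } }
puts libxml_mismatch(sample)
# prints "libxml2 mismatch: compiled against 2.7.8, loaded 2.7.3"
```

The check only means something when run inside the process that crashes (a spec helper, say), because what matters is which library that particular process has loaded.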

Running nokogiri -v showed:

warnings: []
nokogiri: 1.4.3.1
ruby:
  version: 1.9.3
  platform: x86_64-darwin11.4.0
  engine: ruby
libxml:
  binding: extension
  compiled: 2.7.8
  loaded: 2.7.8

That looked correct. So why was the older libxml version being loaded when we ran the spec? It turns out that other gems in our Gemfile also use libxml2, and they were loading the version supplied by the operating system. When Nokogiri loaded afterwards, it simply used the library that was already in memory. Listing Nokogiri at the top of our Gemfile ensures the newer library is loaded before the older one has a chance to be, and thus (hopefully) fixes the issue.
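As a rough illustration, the reordered Gemfile might start like the fragment below. It assumes the app loads its gems through Bundler.require, which requires them in Gemfile order. The second gem name is a hypothetical placeholder, not a gem from our actual Gemfile.

```ruby
# Gemfile (fragment). Only the position of nokogiri matters here.
source "https://rubygems.org"

# Listed first so that it is required first and loads the newer libxml2
# before any other gem pulls in the copy shipped with the operating system.
gem "nokogiri"

# Hypothetical stand-in for any other gem that also links against libxml2.
gem "some_libxml2_backed_gem"
```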

Once we did this (and removed Gemfile.lock and ran bundler again), our spec showed much better results:

$ rspec spec/mailers/activity_mailer_spec.rb
...........

Finished in 6.63 seconds
11 examples, 0 failures

New Cup of Awesome by breathee


Psst! Do you enjoy figuring out tough little chestnuts like the one above? Redbubble is looking for awesome people to join the engineering team! Go have a look over at http://redbubble.com/jobs and see if something there tickles your technical fancy…
