Weird behaviour from Nokogiri on Solaris
At work I have been playing around with Webrat to see if it can speed up our acceptance tests. We have been using Selenium, but the tests take far too long to run (roughly 40 minutes, at a guess), and the majority of them are not testing JavaScript or AJAX.
In the tests that I did, Webrat worked amazingly well, with a huge speed increase: one test went from 134 seconds down to 2.5 seconds. The glitch is that Webrat is designed to be used with a Ruby web app (Rails, Merb, Sinatra, etc.) and we want it to test a Java app running in Tomcat. Thankfully, this is fairly easy to do. You just need to set the mode to Mechanize and add a few lines to include Webrat::Methods and Webrat::Matchers into RSpec.
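For reference, the setup looks roughly like the sketch below. It assumes RSpec 1.x (hence Spec::Runner) and that the code lives in something like spec_helper.rb; the file name is my assumption, and the exact requires may differ between Webrat versions.

```ruby
# spec_helper.rb (assumed file name)
require 'rubygems'
require 'spec'    # RSpec 1.x
require 'webrat'

# Drive a remote app over HTTP through Mechanize
# instead of hooking into a Ruby web framework.
Webrat.configure do |config|
  config.mode = :mechanize
end

# Make Webrat's helpers (visit, click_link, ...) and its
# matchers available inside every example group.
Spec::Runner.configure do |config|
  config.include Webrat::Methods
  config.include Webrat::Matchers
end
```

With that in place, a spec can call visit with the full URL of the app running in Tomcat.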
We ran into a problem when we tried to run the tests on our Solaris box: Nokogiri only returns the doctype and comment parts of the fetched HTML. Running the following in IRB only returns the doctype:
>> require 'mechanize'
>> require 'nokogiri'
>> a = WWW::Mechanize.new
>> r = a.get("http://www.google.com")
>> s = Nokogiri::HTML(r.content.to_s)
>> puts s
=> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
On Ubuntu and my MacBook Pro, this prints out the entire HTML. Under Solaris, it only prints out the doctype. I have raised a ticket with Nokogiri, but I am wondering whether anyone else has experienced this. The only difference between Solaris and our Ubuntu dev boxes is the version of Ruby: 1.8.6 p0 and 1.8.6 p111 respectively.
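One way to narrow this down is to check whether the markup is already missing from the raw Mechanize response, or whether it only disappears once Nokogiri has parsed it. Here is a minimal sketch of a check for that. The helper doctype_only? is a name I made up (it is not part of Mechanize or Nokogiri), and it uses a crude regex test rather than a real parse, so treat it as a rough diagnostic only.

```ruby
# Hypothetical helper: returns true when the markup contains nothing
# but a doctype and/or comments, i.e. no element tags at all.
def doctype_only?(markup)
  without_comments = markup.gsub(/<!--.*?-->/m, '')          # drop comments
  remainder = without_comments.gsub(/<!DOCTYPE[^>]*>/i, '')  # drop the doctype
  remainder !~ /<[a-zA-Z]/                                   # any opening tag left?
end

# In the IRB session above, this could be used as:
#   doctype_only?(r.content.to_s)   # the raw body fetched by Mechanize
#   doctype_only?(s.to_s)           # the document after Nokogiri parsed it
```

If the first call returns false and the second returns true, the content is being lost at the parsing step rather than during the fetch.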