Monday, July 23, 2007

website screen capture in ruby - HowTo

For one of the projects I am working on, I needed a website capture utility that dumps its output into a PNG file. I couldn't find a one-stop shop for getting it done. Many websites offer this as a service (see the references at the end of the post), but I needed something with access to the source code so that I could tweak it to my needs. Something in Ruby was also preferable, so that I could integrate it with the other stuff I am working on. After a lot of googling, I finally hit upon a Ruby implementation called moz-snapshooter.rb, which has a good screen capture facility built on 'gtkmozembed', an embeddable Gecko rendering widget for GTK applications. This was pretty much it. All that was needed was the Ruby bindings for the gnome2 libraries, which can be found here.

So here is the final deal (on Ubuntu Dapper):

1. Get the ruby-gnome2-all package from here.

2. Get the firefox-dev package using apt-get. (Note: this is required for building gtkmozembed.)

3. Get moz-snapshooter.rb.

4. Edit it to your needs.

5. Run 'ruby moz-snapshooter.rb'.

Most things just work out of the box.

Now comes the slightly tricky part. All of this requires a running X server. So what if one wants to run it on a headless server? The answer is Xvfb, a virtual-framebuffer-based X server. Continuing:

6. Install Xvfb using apt-get.

7. Finally, run 'ruby moz-snapshooter.rb' inside 'xvfb-run' as follows:
xvfb-run -s "-screen 0 800x600x16" ruby moz-snapshooter.rb

Note the -s "-screen 0 800x600x16" option: Xvfb expects a screen number (0 here) before the geometry, and most websites render well at 800x600.
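
If the capture is to be launched from other Ruby code, the xvfb-run invocation can be assembled as an argument list. This is only a minimal sketch: the helper name xvfb_run_command and its defaults are made up for illustration and are not part of moz-snapshooter.rb. Xvfb's -screen option takes a screen number before the geometry, so the sketch passes screen 0 explicitly.

```ruby
# Build the argument list for running a Ruby script under xvfb-run.
# NOTE: xvfb_run_command is a hypothetical helper, not part of moz-snapshooter.rb.
def xvfb_run_command(script, width: 800, height: 600, depth: 16, screen: 0)
  # Xvfb expects "-screen <screen number> <width>x<height>x<depth>".
  server_args = "-screen #{screen} #{width}x#{height}x#{depth}"
  ["xvfb-run", "-s", server_args, "ruby", script]
end

cmd = xvfb_run_command("moz-snapshooter.rb")
puts cmd.inspect
```

Passing the list to system(*cmd) would run the capture. Using an argument list rather than a single string keeps the quoted -screen value intact without any shell quoting.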

Caveats:

1. The libgtk-mozembed-ruby package that ships with Ubuntu uses the gtkembed.so library provided by mozilla 1.7.X, which is not compiled with the pango renderer. It is better to use the firefox-dev library and therefore build the ruby-gnome2-all package by hand.

2. The output image should have a smaller resolution than the screen resolution passed to xvfb-run.

3. This is not 'perfect' yet, but a good starting point.
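
The second caveat can be checked before a capture is attempted. The following is a rough sketch, assuming the geometry is given in the same WIDTHxHEIGHTxDEPTH form that is passed to xvfb-run; parse_geometry and fits_screen? are hypothetical helper names, not something moz-snapshooter.rb provides.

```ruby
# Parse an Xvfb geometry string such as "800x600x16" into [width, height, depth].
# NOTE: both helpers are illustrative and not part of moz-snapshooter.rb.
def parse_geometry(geometry)
  match = /\A(\d+)x(\d+)x(\d+)\z/.match(geometry)
  raise ArgumentError, "unexpected geometry: #{geometry.inspect}" unless match
  match.captures.map(&:to_i)
end

# True when the requested output size is strictly smaller than the virtual screen.
def fits_screen?(out_width, out_height, geometry)
  width, height, _depth = parse_geometry(geometry)
  out_width < width && out_height < height
end

puts fits_screen?(640, 480, "800x600x16")
puts fits_screen?(1024, 768, "800x600x16")
```

With an 800x600x16 screen, a 640x480 output passes the check and a 1024x768 output does not.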

Here are some samples.

1. Google at 150x100 resolution

2. paahijen.com at 320x200 resolution

3. Marathi Wikipedia at 640x480 resolution


References:

1. A pygtk implementation by Andrew McCall
2. moz-snapshooter.rb by Mirko Maischberger
3. Ideum screenshot capture prototype

8 comments:

Abhijit said...

one more caveat. Requires LD_LIBRARY_PATH set to /usr/lib/firefox
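
When the script is started from another Ruby process, the variable can be set for the child process only. A minimal sketch, assuming the libraries live in /usr/lib/firefox as stated above; with_firefox_libs is a hypothetical helper name. The variable generally has to be in place before the child process starts, because the dynamic linker reads it at startup.

```ruby
# Return a copy of the environment with the firefox library directory
# prepended to LD_LIBRARY_PATH (added once, existing entries kept).
# NOTE: with_firefox_libs is an illustrative helper, not an existing API.
def with_firefox_libs(env = ENV.to_h, dir = "/usr/lib/firefox")
  paths = env.fetch("LD_LIBRARY_PATH", "").split(":").reject(&:empty?)
  paths.unshift(dir) unless paths.include?(dir)
  env.merge("LD_LIBRARY_PATH" => paths.join(":"))
end

puts with_firefox_libs({ "LD_LIBRARY_PATH" => "/opt/lib" })["LD_LIBRARY_PATH"]
```

Calling system(with_firefox_libs, "ruby", "moz-snapshooter.rb") would then run the script with the adjusted environment.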

Unknown said...

Have you had any luck getting this to run in a web server? I've played about with moz_snapshooter and got it working from a command line just fine, but when I try to run it from within a mongrel or webrick server I get a segfault and the whole server dies. Not good :-(

Abhijit said...

I think that is what I am going to try next. My guess is that the LD_LIBRARY_PATH issue is what is causing the trouble for you in webrick or mongrel. I will definitely update this once I have some luck with it.

Abhijit said...

Just a suggestion, in case you have not already tried it.

Try putting the firefox-dev library path in /etc/ld.so.conf, i.e. add the line "/usr/lib/firefox" to /etc/ld.so.conf and then run "sudo ldconfig" to make sure the required libraries are accessible. Then try it inside mongrel or webrick and see if there's any luck.

'Cos when I gdb'ed the segfault, I figured it was not able to find some firefox library (most probably libgtkembed.so or something like that). Here's the ldd output with the firefox dependencies shown.

$ ldd gtkmozembed.so | grep -i fire
libgtkembedmoz.so => /usr/lib/firefox/libgtkembedmoz.so (0xb7e19000)
libxpcom.so => /usr/lib/firefox/libxpcom.so (0xb77e5000)
libxpcom_core.so => /usr/lib/firefox/libxpcom_core.so (0xb75a2000)


In case this helps.
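
The same check can be scripted. Below is a rough sketch that reads ldd output in the format shown above and lists any library that ldd reports as "not found"; parse_ldd and missing_libraries are hypothetical helper names.

```ruby
# Turn ldd output into a hash of library name => resolved path.
# Libraries that ldd reports as "not found" map to nil.
# NOTE: parse_ldd and missing_libraries are illustrative helpers.
def parse_ldd(output)
  pairs = output.each_line.map do |line|
    name, target = line.strip.split(" => ", 2)
    next if target.nil? # skip lines without a "=>" mapping
    path = target.sub(/\s*\(0x[0-9a-f]+\)\z/, "") # drop the load address
    [name, path == "not found" ? nil : path]
  end
  pairs.compact.to_h
end

# Names of the libraries that could not be resolved.
def missing_libraries(output)
  parse_ldd(output).select { |_name, path| path.nil? }.keys
end
```

Feeding it the output of `ldd gtkmozembed.so` gives an empty list when every dependency resolves, as in the listing above.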

marcusherou said...

Hi guys! I love your article. I have modified the script a little so it can use arguments. Now I'm wondering whether it could use an already running Xvfb instead of xvfb-run, for performance reasons.

Any ideas?
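
One possible approach, sketched under assumptions rather than tested against moz-snapshooter.rb: start a single Xvfb server on a fixed display and point each capture at it through the DISPLAY environment variable. The display number :99 and both helper names are picked for illustration only.

```ruby
# Argument list for starting one long-lived Xvfb server.
# NOTE: the display number and helper names are illustrative assumptions.
def xvfb_server_command(display: ":99", geometry: "800x600x16", screen: 0)
  ["Xvfb", display, "-screen", screen.to_s, geometry]
end

# Environment that points an X client at the already running server.
def display_env(display: ":99")
  { "DISPLAY" => display }
end

# Intended usage (left as comments so that nothing is launched here):
#   pid = spawn(*xvfb_server_command)                  # start Xvfb once
#   system(display_env, "ruby", "moz-snapshooter.rb")  # repeat per capture
#   Process.kill("TERM", pid)                          # stop Xvfb when done
puts xvfb_server_command.inspect
```

This avoids paying the Xvfb startup cost for every capture, at the price of having to manage the server's lifetime yourself.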

Unknown said...

Hi,

Perfect article. However, I cannot get it working :(

What am I missing?

When I'm trying to run it, I'm getting the following error:

root@jam:~# ruby moz-snapshooter.rb
/usr/local/lib/site_ruby/1.8/gtk2.rb:13:in `init': Cannot open display: (RuntimeError)
from /usr/local/lib/site_ruby/1.8/gtk2.rb:13
from /usr/local/lib/site_ruby/1.8/gtkmozembed.rb:1:in `require'
from /usr/local/lib/site_ruby/1.8/gtkmozembed.rb:1
from moz-snapshooter.rb:12:in `require'
from moz-snapshooter.rb:12


root@jam:~# xvfb-run -s "-screen 800x600x16" ruby moz-snapshooter.rb
/usr/local/lib/site_ruby/1.8/gtk2.rb:13:in `init': Cannot open display: :99 (RuntimeError)
from /usr/local/lib/site_ruby/1.8/gtk2.rb:13
from /usr/local/lib/site_ruby/1.8/gtkmozembed.rb:1:in `require'
from /usr/local/lib/site_ruby/1.8/gtkmozembed.rb:1
from moz-snapshooter.rb:12:in `require'
from moz-snapshooter.rb:12
kill: 158: No such process

Any idea? System is Ubuntu Feisty x64 Server.

Unknown said...

check if you have xfonts-base package installed

and try this command.

xvfb-run -s "-screen 0 800x600x16" ruby moz-snapshooter.rb

Unknown said...

works like a charm! i modified the ruby code to pass arguments. thanks for the post!