Posted by & filed under Books, Reviews, Techie.

Review of Packt Publishing’s “Getting Started with PhantomJS” by Aries Beltran

I was asked by Packt to review several books, but I chose “Getting Started with PhantomJS” because I was actually interested in it! I’ve used various headless web browsers before, in particular webkit in GTK applications like wkhtmltopdf or with Python or PHP bindings. wkhtmltopdf (and the related wkhtmltoimage) in particular has been suffering from neglect – it has lots of dependencies and it’s difficult and unreliable to build and use (something I’ve written about before). I’ve most often used server-side browsers for tasks like generating page preview images, and seeing that phantomjs will do that, I’ve long thought I’d like to know more about it so that I could get away from custom-building complicated webkit stacks.

As far as I’m concerned, the two key things that phantomjs brings are a straightforward build process (a simple ‘brew install phantomjs’ for me) and a simple way of scripting the virtual browser, with the ability to inject scripts into the page, without having to resort to peculiar tricks, messing with virtual frame buffers or installing odd browser plugins.

I hadn’t realised phantomjs’s own scripting environment was quite so complete, supporting CommonJS module integration. The separation between browser and page contexts (using `evaluate`) is clean and easy to get to grips with, and the book presents this well.
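
To make the two contexts concrete, here is a minimal sketch (not taken from the book). It writes a small PhantomJS script and runs it if phantomjs is on the PATH. The file name page-title.js, the example.com URL and the preview.png output name are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Write a small PhantomJS script to a temporary location.
script="${TMPDIR:-/tmp}/page-title.js"
cat > "$script" <<'EOF'
// Everything out here runs in PhantomJS's own scripting context.
var page = require('webpage').create();
var system = require('system');
var url = system.args[1] || 'http://example.com/';
var output = system.args[2] || 'preview.png';

page.open(url, function (status) {
    if (status !== 'success') {
        console.log('Failed to load ' + url);
        phantom.exit(1);
        return;
    }
    // The function passed to evaluate() runs inside the loaded page,
    // so it can see `document`; only its return value comes back out.
    var title = page.evaluate(function () {
        return document.title;
    });
    console.log('Title: ' + title);
    page.render(output);   // save a preview image of the page
    phantom.exit(0);
});
EOF

# Run it only if phantomjs is actually installed.
if command -v phantomjs >/dev/null 2>&1; then
    phantomjs "$script" "http://example.com/" preview.png
else
    echo "phantomjs not found; script written to $script"
fi
```

The outer script cannot touch the page’s DOM directly, and the function given to `evaluate` cannot see variables from the outer script, which is what keeps the two contexts cleanly separated.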

The book mentions several extensions to phantomjs that I had not encountered and that look useful (particularly CasperJS).

I hadn’t spent much time reading phantomjs’s own docs, but when I looked at them I found that they are very limited. Even though it’s not long, the book goes deeper into examples and explanations than the docs, so there is genuine value in having the book.

All of the example code I tried worked without a hitch. Packt’s web site sets a wrong MIME type on zip downloads, resulting in a page full of rubbish, but it unzips ok when saved manually. There are more example files than are mentioned in the book, which is a welcome bonus.

One small error I spotted suggested using single quotes around JSON values – that’s not valid JSON, though it is valid JavaScript. It also mischaracterises JSON slightly – object syntax is part of the JavaScript language, so it’s not a separate thing when you’re already in a JavaScript context.

One formatting issue costs a little typing – while all the code samples are provided as text and in files, all the displayed command lines (for example when a long URL is passed as a param) are in images, so you can’t copy and paste commands as text. Call me lazy!

The English is generally good, concise and to the point. This is not a long book, but it doesn’t need to be as a “getting started” guide on something that is a pretty confined subject. The editing had a few holes – several typos had sneaked through, things that would have been caught by any spell checker. The code samples had been updated recently, but there were no errata. Oddly, getting to errata is annoying on Packt’s site – when you’re looking at a book’s page there is no link for it. You have to go to the “code and errata page” and select the book from the pop-up menu. This menu is sorted by the exact book title (and contains ALL their book titles!), so I had to look under ‘getting…’ rather than ‘phantomjs’. This could be made much easier.

There are several other books and resources for learning and using phantomjs, and they may be sufficient for some users as it is a fairly small subject to cover. Overall I was impressed with the book. It does exactly what the title says, provides useful links for further reading, and provides effective, useful scripts that cover much of what many will want phantomjs to do in sufficient detail to make it easy to derive your own. Well done Aries!

Posted by & filed under Uncategorized.

I needed to set up my rear derailleur from scratch yesterday and thought up a nice simple mechanism for doing it that I’ve not seen described before. This is for a ‘normal’, not inverse-pull derailleur, where increased gear cable tension makes it change down.

  1. Put the bike in middle ring at the front, set the rear shifter to top gear.
  2. There should be no tension on the gear cable – you may not even have it connected at this point.
  3. Adjust the limit screw so that the bike pedals smoothly in top gear with no clicking or rubbing.
  4. Screw in the barrel adjuster on the derailleur (and the shifter) as far as it will go.
  5. Pull the gear cable tight with your fingers and tighten the retaining bolt.
  6. On the shifter, change down ONE gear (e.g. 8th if you have a 9-speed cassette).
  7. While turning the pedals, turn the barrel adjuster until it shifts into the selected gear.
  8. Fine-tune the barrel adjuster so that the chain isn’t rubbing and the top jockey wheel aligns nicely.
  9. Check that it changes into all the gears smoothly when changing both up and down.
  10. You may also want to set the limit screw for 1st gear as well.
  11. Job done!

This whole procedure can be done in about a minute – the important bit is step 7. You may need to fine-tune the barrel adjuster slightly in some lower gears, but this procedure will get you to the right ballpark with minimal effort. It’s much easier if you have a workshop stand or similar means of holding the back wheel off the ground.

Posted by & filed under PHP, Techie.

Wkhtmltopdf is extremely cool. I’ve previously used qtwebkit for generating server-side page images, via python-webkit2png, and that’s fine (unlike using Firefox running in xvfb!), but I need to produce PDFs. So, I looked around and found several neat, simple PHP wrappers for calling wkhtmltopdf, and even a PHP extension. “Great”, I thought, “I’ll just install that and spend time working on the layouts since the code looks really simple”. I spoke too soon.

Using it requires a working copy of wkhtmltox and libwkhtmltox. Getting those is not as straightforward as it should be, and the docs are really pretty inadequate (hence this post). For Linux there is a simple download of a binary, but the OS X version (despite being the most recent version posted) is curiously supplied as an OS X app bundle. When you run it one of two things happens: nothing, or an interminable bounce requiring a force-quit, i.e. as supplied it’s apparently useless, though I eventually solved this mystery. In a bug report (why does anyone use google code? it’s horrible!) I found a reference to a binary lurking inside the app bundle, and sure enough, it’s there, and it works. Here’s the magic to make it accessible in a ‘normal’ way:

sudo ln -s /Applications/wkhtmltopdf.app/Contents/MacOS/wkhtmltopdf /usr/local/bin

That could well be enough for many uses, but this version is built for 32-bit OS X 10.4, which makes it about 327 in computer years. Homebrew has a recipe for wkhtmltopdf, but it’s not built against a custom qt stack, and so is missing several features. I figured it would be worth trying to do better than that, targeting 64-bit 10.7, so I found some build instructions (thanks to comments on Mar 13, 2012 on this page and this one (no, google code doesn’t provide IDs for comments, duh)) which I was able to adapt.

Environment

Before starting, make sure you have the latest Apple toolchain: run system update, then run Xcode, go to preferences -> downloads and make sure you’ve got the latest command line tools installed. You may also want to check your shell’s environment vars. I use these in my /etc/zshenv:

export MACOSX_DEPLOYMENT_TARGET=10.7
export CHOST='x86_64-apple-darwin11'
export CFLAGS='-arch x86_64 -O3 -fPIC -mmacosx-version-min=10.7 -pipe -march=native -m64'
export LDFLAGS='-arch x86_64 -mmacosx-version-min=10.7'
export CXXFLAGS="${CFLAGS}"

Those settings suit my MacBook Air: yours may need to be different.
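
Before starting a long build it can help to confirm what the build shell actually sees. This is only a sketch: the show_build_env function name is made up for illustration, and it simply prints the variables listed above along with the machine architecture.

```shell
#!/bin/sh
# Print each build-related variable, or <unset> if it is empty or missing.
show_build_env() {
    for name in MACOSX_DEPLOYMENT_TARGET CHOST CFLAGS LDFLAGS CXXFLAGS; do
        # Indirect lookup: copy the value of the variable named in $name.
        eval "value=\${$name-}"
        printf '%s=%s\n' "$name" "${value:-<unset>}"
    done
}

show_build_env
# Architecture reported by the running kernel (e.g. x86_64).
uname -m
```

If any of them show up as unset, the file that exports them is probably not being read by the shell you are building in.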

Building Qt

Compiling against the specially wkhtmltopdf-patched version of Qt adds several features to wkhtmltopdf that are not available in most distributed and/or statically compiled versions:

  • Printing more than one HTML document into a PDF file.
  • Running without an X11 server.
  • Adding a document outline to the PDF file.
  • Adding headers and footers to the PDF file.
  • Generating a table of contents.
  • Adding links in the generated PDF file.
  • Printing using the screen media-type.
  • Disabling the smart shrink feature of webkit.
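
Several of these features correspond to wkhtmltopdf command-line options. The following is a rough sketch of how they might be combined, assuming a wkhtmltopdf built against the patched Qt is on the PATH. The wkdemo directory, the two sample pages and the header text are invented for illustration.

```shell
#!/bin/sh
# Create two tiny sample pages to combine into one PDF.
workdir="${TMPDIR:-/tmp}/wkdemo"
mkdir -p "$workdir"
printf '<html><body><h1>Chapter one</h1><p>Hello</p></body></html>\n' > "$workdir/one.html"
printf '<html><body><h1>Chapter two</h1><p>World</p></body></html>\n' > "$workdir/two.html"

if command -v wkhtmltopdf >/dev/null 2>&1; then
    # --outline       adds a document outline (PDF bookmarks)
    # --header-center and --footer-center add a header and a page-number footer
    # toc             inserts a generated table of contents
    # two input files are printed into a single PDF
    wkhtmltopdf \
        --outline \
        --header-center 'Demo document' \
        --footer-center '[page] / [topage]' \
        toc \
        "$workdir/one.html" "$workdir/two.html" \
        "$workdir/demo.pdf"
else
    echo "wkhtmltopdf not found; sample pages are in $workdir"
fi
```

The list above suggests that a build without the patched Qt will not support these options, which makes this a quick way to tell which kind of build you have.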

It’s normal for packaging systems like MacPorts, Homebrew and Fink not to add this dependency, as it makes the build very large and slow, and many users may simply not need these features. Those packagers could perhaps add custom-Qt ‘flavours’ of the builds so that it’s at least possible without straying outside the packager, though that could have implications for the many other packages built against Qt.

First we need to compile a copy of the Qt library, and to do that we have to get the whole thing, even though we’re only going to use some of it.

git clone git://gitorious.org/+wkhtml2pdf/qt/wkhtmltopdf-qt.git
cd wkhtmltopdf-qt
git checkout staging

This takes quite a while since it’s a 970M download! In order to make it compile for x86_64 we need to change the arch option in the build config, and tell it where the 10.7 SDKs are (they’ve moved since 10.4). So I edited configure on line 4875 (of 9133 – this is a BIG configure file!) to look like this:

echo "export MACOSX_DEPLOYMENT_TARGET = 10.7" >> "$mkfile"

Now we can set it up to build, specifying the location of the 10.7 SDK and the x86_64 arch, and deleting any references to the x86 arch (if you leave them in it may try to build for both):

QTDIR=. ./bin/syncqt
./configure -sdk /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.7.sdk -arch x86_64 -release -static -fast -exceptions -no-accessibility -no-stl -no-sql-ibase -no-sql-mysql -no-sql-odbc -no-sql-psql -no-sql-sqlite -no-sql-sqlite2 -no-qt3support -xmlpatterns -no-phonon -no-phonon-backend -webkit -no-scripttools -no-mmx -no-3dnow -no-sse -no-sse2 -no-ssse3 -qt-zlib -qt-libtiff -qt-libpng -qt-libmng -qt-libjpeg -openssl -graphicssystem raster -opensource -nomake "tools examples demos docs translations" -no-opengl -no-dbus -no-framework -no-dwarf2 -no-multimedia -no-declarative -largefile -rpath -no-nis -no-cups -no-iconv -no-pch -no-gtkstyle -no-nas-sound -no-sm -no-xshape -no-xinerama -no-xfixes -no-xrandr -no-xrender -no-mitshm -no-xkb -no-glib -no-openvg -no-xsync -no-javascript-jit -no-egl -carbon --prefix=../wkqt/
make -j3
make install

This takes a long time. Using -j3 made a big difference on my 8-core Mac Pro and 4-core MacBook Air. Note that the configure step may use an i386 arch; that doesn’t mean that the build itself will.
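
The -j3 above is simply the value used here. If you would rather match the job count to the machine, one possible sketch is below. The build_jobs function name is hypothetical, and whether using every core is actually faster for this build is an assumption that has not been tested.

```shell
#!/bin/sh
# Print a job count for make: the number of online CPUs, or 1 if unknown.
build_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) \
        || n=$(sysctl -n hw.ncpu 2>/dev/null) \
        || n=1
    # Fall back to 1 if the answer is empty or not a plain number.
    case "$n" in
        ''|*[!0-9]*) n=1 ;;
    esac
    echo "$n"
}

# Show the command rather than running it, since the build takes a long time.
echo "make -j$(build_jobs)"
```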

Build the wkhtmltopdf app

cd ..
wget http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-0.11.0_rc1.tar.bz2
tar xvjf wkhtmltopdf-0.11.0_rc1.tar.bz2
rm wkhtmltopdf-0.11.0_rc1.tar.bz2
cd wkhtmltopdf-0.11.0_rc1

This code also needs to be set to build for x86_64, so edit these two files: src/image/image.pro and src/pdf/pdf.pro and change this section in each:

macx {
#    CONFIG -= app_bundle
    CONFIG += x86_64
}

This sets them to build for 64-bit and, with the CONFIG -= app_bundle line commented out, to build as app bundles.
Now build it:

../wkqt/bin/qmake
make
sudo make install

This installs two app bundles in /bin/wkhtmltopdf.app and /bin/wkhtmltoimage.app.
When I tried to actually use it, I ran into the reason why it’s built as an app – it has dependencies on a qt component resource that needs to be bundled with it (why it needs a graphical menu resource when it has no GUI of any kind is beyond me!). To fix this I copied the necessary parts into the apps and set up symlinks to the binaries:

cd ../wkhtmltopdf-qt
sudo cp -pr src/gui/mac/qt_menu.nib /bin/wkhtmltopdf.app/Contents/Resources
sudo cp -pr src/gui/mac/qt_menu.nib /bin/wkhtmltoimage.app/Contents/Resources
sudo ln -s /bin/wkhtmltopdf.app/Contents/MacOS/wkhtmltopdf /usr/local/bin
sudo ln -s /bin/wkhtmltoimage.app/Contents/MacOS/wkhtmltoimage /usr/local/bin

After this running wkhtmltopdf --version gives:

Name:
  wkhtmltopdf 0.10.0 rc2

License:
  Copyright (C) 2010 wkhtmltopdf/wkhtmltoimage Authors.

  License LGPLv3+: GNU Lesser General Public License version 3 or later
  . This is free software: you are free to
  change and redistribute it. There is NO WARRANTY, to the extent permitted by
  law.

Authors:
  Written by Jan Habermann, Christian Sciberras and Jakob Truelsen. Patches by
  Mehdi Abbad, Lyes Amazouz, Pascal Bach, Emmanuel Bouthenot, Benoit Garret and
  Mário Silva.

The version number string is wrong (it’s supposedly 0.11.0-rc1) and there’s a bug report for that. We can check we’ve built for the right architecture too:

file /usr/local/bin/wkhtmltopdf
/usr/local/bin/wkhtmltopdf: Mach-O 64-bit executable x86_64

Building the PHP extension

First I needed to copy the libs and include files somewhere the compiler would find them:

cd ..
sudo cp -r wkhtmltopdf-0.11.0_rc1/include/wkhtmltox /usr/local/include
sudo cp wkhtmltopdf-0.11.0_rc1/bin/libwkhtmltox.* /usr/local/lib

For some reason it was building for i386 (which is no use with a 64-bit lib), and specifying a host of x86_64 didn’t work – it builds, but produces a .a library instead of a .so shared object, claiming that libtool couldn’t make shared objects. A bit of rummaging led me to the correct host type for 10.7 which allowed it to link correctly.

git clone https://github.com/mreiferson/php-wkhtmltox.git
cd php-wkhtmltox
phpize
./configure --host=x86_64-apple-darwin11.4.0
make
make install

After that I added extension=phpwkhtmltox.so to an appropriate ini file and PHP then listed the extension in php -m output. There are a couple of test scripts included with the extension files, so I ran php test_pdf.php, which makes a bunch of test PDFs in /tmp, and all looks pretty good. Don’t forget to restart apache if you want it to show up in there too.
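
A small sketch for checking that the extension really is loaded. It assumes the module is listed under the name phpwkhtmltox (taken from the .so file name above, so treat it as a guess), and has_module is a helper invented for this example.

```shell
#!/bin/sh
# Succeeds if the module name given as $1 appears (case-insensitively,
# as a whole line) in a module list read from standard input.
has_module() {
    grep -qixF -- "$1"
}

if command -v php >/dev/null 2>&1; then
    php --ini    # shows which ini files the command-line PHP loads
    if php -m | has_module phpwkhtmltox; then
        echo "phpwkhtmltox is loaded"
    else
        echo "phpwkhtmltox is NOT loaded"
    fi
else
    echo "php not found on PATH"
fi
```

Bear in mind that command-line PHP and Apache can read different ini files, so a pass here does not guarantee the extension is loaded in Apache.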

I hope someone finds this useful.

Update May 18th 2012

I repeated this build on my MacBook Air and ran into several issues, and one section that worked completely differently to my original, so I’ve updated the article with these changes.

Update June 9th 2012

Added notes about what using the custom Qt libs buys you.