Notes to self

Comparing wkhtmltopdf to Prawn for generating PDFs in terms of speed, memory and usability

HTML-to-PDF or Prawn? Let’s look at how these compare by generating an invoice using PDFKit and InvoicePrinter, and examine performance, memory and usability.

When I announced InvoicePrinter, someone on Reddit asked me what’s really better about it in comparison to solutions based on wkhtmltopdf.

I replied:

The trade-off [of wkhtmltopdf] is using an external system dependency which always can introduce some troubles and probably higher memory consumption (although I should probably write some benchmark to see if that’s really valid point). I am definitely not against wkhtmltopdf but since I was writing this for a Rails Engine that might be added to existing projects having only Ruby dependency is a win.

Now I would like to put my assumption to the test. Do we really get at least lower memory consumption by using Prawn? The test is really simple: generate a simple one-page invoice with one logo. Nothing else. Hell, the examples are not even exactly the same (I didn’t feel like coding my InvoicePrinter layout in HTML today, so I grabbed the first example I found on the internet). Nevertheless, I still think the following benchmark shows us something.

Speed

So, first, regarding the speed:

require 'benchmark/ips'

# Loading libs beforehand
require 'pdfkit'
require 'invoice_printer'

Benchmark.ips do |x|
  x.report("InvoicePrinter (Prawn)") {
    # edited /examples/simple_invoice.rb, one logo loaded from file system
  }
  x.report("PDFKit (wkhtmltopdf)") {
    html = <<HTML
# https://www.nextstepwebs.com/open-source/invoice using the same logo loaded from file system
HTML
  }
  x.compare!
end

And the results:

$ ruby benchmark.rb
Warming up --------------------------------------
InvoicePrinter (Prawn)
                         5.000  i/100ms
PDFKit (wkhtmltopdf)     1.000  i/100ms
Calculating -------------------------------------
InvoicePrinter (Prawn)
                         55.480  (± 5.4%) i/s -    280.000
PDFKit (wkhtmltopdf)      1.579  (± 0.0%) i/s -      8.000

Comparison:
InvoicePrinter (Prawn):       55.5 i/s
PDFKit (wkhtmltopdf):        1.6 i/s - 35.13x slower

Wow. The Prawn-based solution gets the job done 35 times faster. If you ran each standalone script just once with the time utility, you would see only about 3 times better performance, but Benchmark/ips runs the code many times. Where the Prawn solution warms up and scales, wkhtmltopdf does not. I suspect this is due to shelling out and calling the wkhtmltopdf binary for every document.
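To get a feeling for what shelling out costs on its own, here is a minimal sketch that compares a small in-process computation with starting a child process that does nothing. It does not use PDFKit or wkhtmltopdf at all: the `average_seconds` helper is hypothetical, and the current Ruby interpreter is only a stand-in for an external binary, so the absolute numbers say nothing about wkhtmltopdf itself.

```ruby
require 'rbconfig'

# Average wall-clock seconds per call, measured with a monotonic clock.
# (Hypothetical helper written for this sketch.)
def average_seconds(runs)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  runs.times { yield }
  (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) / runs
end

# In-process work: a cheap stand-in for calling a pure Ruby library.
def in_process_work
  Array.new(1_000) { |i| i * 2 }.sum
end

# Out-of-process work: start a child process that does nothing and wait for it.
# Assumption: the Ruby interpreter stands in for an external binary here.
def spawned_work
  system(RbConfig.ruby, '--disable-gems', '-e', '')
end

puts format('in-process: %.6f s per call', average_seconds(5) { in_process_work })
puts format('spawned:    %.6f s per call', average_seconds(5) { spawned_work })
```

The spawned call pays for process creation and start-up every single time, which is the kind of fixed cost that repeated runs cannot amortize.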

Even if we reduce the HTML to a minimum:

html = <<HTML

HTML

We are still 31.28x slower with Benchmark/ips, so the fixed cost of each wkhtmltopdf run is already high.

This means I can generate 100 invoices on my hardware in under 2 seconds with Prawn (100 / 55.5 i/s), but it takes about a minute with wkhtmltopdf (100 / 1.6 i/s). I wonder whether the wkhtmltopdf approach can be optimized further.
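The estimate for 100 invoices follows directly from the iterations-per-second figures reported by Benchmark/ips. A small sketch of the arithmetic (the constant and method names are mine, and it assumes the rate stays steady over the whole batch):

```ruby
# Iterations per second as reported by Benchmark/ips in the run above.
PRAWN_IPS = 55.480
WKHTMLTOPDF_IPS = 1.579

# Seconds needed to produce `count` documents at a steady rate.
def seconds_for(count, iterations_per_second)
  count / iterations_per_second
end

puts format('Prawn:       %.1f s', seconds_for(100, PRAWN_IPS))       # 1.8 s
puts format('wkhtmltopdf: %.1f s', seconds_for(100, WKHTMLTOPDF_IPS)) # 63.3 s
```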

Memory

So what about memory?

/usr/bin/time -v ruby invoice_printer.rb
	System time (seconds): 0.02
...
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.28
...
	Maximum resident set size (kbytes): 19784

/usr/bin/time -v ruby pdfkit.rb
	System time (seconds): 0.08
...
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.85
...
	Maximum resident set size (kbytes): 75060

I am using the basic /usr/bin/time -v (it gives me somewhat reasonable numbers on Fedora, but I am no expert) and looking at peak RAM usage, reported as the maximum resident set size. For InvoicePrinter we used almost 20 MB and for PDFKit over 75 MB. That’s almost 4 times as much! And we all know projects running on a basic VPS with little memory, right?
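If you want to compare such runs without eyeballing the output, the relevant lines are easy to pull out. This is only a sketch: `parse_time_output` is a hypothetical helper, the sample text is trimmed to the lines quoted above, and it handles only the m:ss form of the elapsed time.

```ruby
# Extract peak memory (kB) and wall-clock time (s) from `/usr/bin/time -v` output.
# Hypothetical helper; only the m:ss elapsed format is handled.
def parse_time_output(text)
  rss_kb = text[/Maximum resident set size \(kbytes\): (\d+)/, 1].to_i
  elapsed = text.match(/Elapsed \(wall clock\) time \(h:mm:ss or m:ss\): (\d+):(\d+\.\d+)/)
  { max_rss_kb: rss_kb, wall_seconds: elapsed[1].to_i * 60 + elapsed[2].to_f }
end

INVOICE_PRINTER_RUN = <<~OUTPUT
  System time (seconds): 0.02
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.28
  Maximum resident set size (kbytes): 19784
OUTPUT

PDFKIT_RUN = <<~OUTPUT
  System time (seconds): 0.08
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.85
  Maximum resident set size (kbytes): 75060
OUTPUT

prawn = parse_time_output(INVOICE_PRINTER_RUN)
pdfkit = parse_time_output(PDFKIT_RUN)

puts format('peak memory: %.2fx', pdfkit[:max_rss_kb].fdiv(prawn[:max_rss_kb]))  # 3.79x
puts format('wall clock:  %.2fx', pdfkit[:wall_seconds] / prawn[:wall_seconds])  # 3.04x
```

The wall-clock ratio of these single runs is also where the "only 3 times" figure for standalone scripts comes from.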

Size

Let’s also examine another important thing: the size of the output PDF. For the example above I got around 10 kB for Prawn’s output and almost 30 kB for wkhtmltopdf. But to be fair, you can style much more using CSS, so what if we remove all CSS altogether? 15 kB! Even without any styling, the wkhtmltopdf output is still about half as large again as Prawn’s.

Usability & variety

Now the goal for both Prawn and InvoicePrinter is actually to be super easy to use. No system dependencies means no provisioning of an extra program on the server and only one requirement: Ruby. InvoicePrinter then goes one step further and doesn’t require you to style anything. You can seriously start in seconds.

Styling with CSS is, on the other hand, probably much easier for current web developers than learning Prawn. Being able to style PDF documents using CSS is a big win and sometimes necessary. It can also help you reuse the styling of a regular HTML template. InvoicePrinter offers only one opinionated layout, and pure Prawn takes some time to learn (and even then can only go so far).

Conclusion

With this little test I confirmed my assumptions about the performance of wkhtmltopdf. Using a Prawn-based solution is 3-35 times faster (35 on a server generating one document after another), needs about 4 times less memory at peak and most likely generates smaller PDF documents. On top of this, it can be done with pure Ruby alone. For me personally those are big wins.

If you haven’t yet, I suggest you check out Prawn and InvoicePrinter.

Note: Examples were run on my ThinkPad T420s running Fedora 24 with Ruby 2.3.1.

by Josef Strzibny