The E-Book Workflow for Unofficial Charterpedia

[This post follows on from this one on law libraries as publishers]

I started pulling together an e-book version of the Department of Justice’s Charterpedia as soon as I saw it.  It seems to cry out for an alternate, portable version.  Also, there were some obvious opportunities to improve the text for a general legal audience.  Law libraries that struggle to identify value-added opportunities might consider their role as a publisher.  In some cases, that means enhancing free information to make it more broadly usable.  This is the workflow I used to create an ebook based on Charterpedia.

Skip the blather and just get the e-book.

Ebook Platform

The first thing you need is a platform.  I have used WordPress and Pressbooks in the past, and Pressbooks has been used by a number of open textbook sites.  Both are open source.  Both are free to acquire.

When I first installed the Pressbooks WordPress plugin a couple of years ago, it was pretty straightforward.  Now that they have pulled the plugin from the WordPress site, it’s a lot more fiddly to get running and their documentation seems to have lapsed somewhat.

This time round – because of technical problems with the export of Pressbooks – I tried the Anthologize plugin.  It has fewer features, but it is easier in that you can drop it into any WordPress site and create a book out of any collection of posts.  It uses a similar part concept, so you can group content into sections.

Once you have your tool selected and installed, you can release multiple publications on a single platform.  You don’t need to do it this way – you can create ebooks and edit them with programs like Sigil.  But I like the WordPress-based approach because it’s the best of both worlds.  At the end, readers can access a web page on any device or they can use an ebook.  But I only put in the content once.


One thing that sites like Charterpedia show is that content developed by a government entity for an internal audience may not be ready for an external one.  As you read through Charterpedia, there are significant supra references.  That’s fine if you’re familiar with the content, but the average reader would benefit from links each time to help them with context.  The same goes for case citations, some of which pointed to commercial legal publishers.  Many of these cases were actually freely available, and using a free version will make the information more broadly accessible.

At the same time, the goal isn’t to boil the ocean.  If I was serving lawyers primarily dealing with criminal law, I might only convert the Legal Rights section of the Charterpedia.  So the resource can be adapted and narrowed to the law library’s users.

Editorial Fixes

The next thing to do is to populate the book.  Once you’ve added the content in WordPress, you can organize it.  This is easier in Pressbooks, where you can create chapters and other book components.  I decided to make each page of Charterpedia its own chapter.  But as I created the chapters, it became clear to me that this – like other government documents – needed some tweaking.

Here’s a book I did earlier, when I was first playing around with the concept, using the Ontario Residential Tenancies Act as a base.  I took a free public statute, enhanced it with cases, and created an ebook.  This is low hanging fruit:

  • the base content is free
  • landlord tenant is a large legal problem for many people, so an e-book may have access to justice possibilities, as well as being a library resource
  • lawyers and law librarians are buying annual statutes all the time and this could be a resource to provide to lawyers that they can contrast with an actual cost they incur

The Ontario statutes are numbered in a confusing way, however, so that it is often not clear to me what the subparagraph I’m reading actually applies to.  In this case, I took the liberty of renumbering the sections from (20), (1), (2) to (20), (20)(1), (20)(2).

Official Ontario statute numbering on the left. Unofficial renumbering for ebook on right.

There is a risk that doing this could fox someone who tries to cite to the subsection.  However, I think it’s a small risk.

I made the same decision with Charterpedia.  While the table of contents indicates Section 1, Paragraph 2(b), and so on, it’s really based around paragraphs.  Because you don’t want to do any editing of the base content, if this had been text, I’d have left it alone.  But since it was chapter headings, I felt confident changing it.

And I didn’t change errors in the text.  This is a significant point: unless you’re an expert in the field, you need to err on the side of not changing anything.  So I left all the text alone, and inserted a couple of parentheticals where I was pretty sure a case citation was wrong.

The same goes for linking cases – I was very careful about getting the case right, and left some citations blank when I wasn’t sure.  But the typical law librarian has plenty of skill – particularly when the sleuthing involves an opinion where there’s a pinpoint cite to a paragraph – to figure out what might otherwise be a difficult citation.

There was also the work that I didn’t do.  Legal writing is often unfriendly and I didn’t try to fix that, even though I thought this ebook might have use for people unfamiliar with the law.  Especially the use of supra, which is barely acceptable in a print document, and really should have been hyperlinked throughout.

The link checking took a long time.  This was by far – 90% of the time expended – the heaviest lifting, because the citations weren’t uniform, they weren’t always correct, and they cited cases that weren’t published on CanLII.  All of this slows down the process, so if you’re planning to do something that is link intensive – and isn’t already hyperlinked – plan for the sheer grunt work.  I used:

and various other sites for references to UN, EU, and U.S. law.  A lot of this was repetitive.  If I were doing this a second time, I’d cut the HTML into a Word document, create a Table of Authorities to eliminate the duplication, and then do the linking based on the table.
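The Table of Authorities idea can also be roughed out in a script.  This is only a minimal sketch, not what I actually did: it assumes the citations you care about are neutral citations in the "year, court, number" shape, and the sample text and case name are made up for illustration.  Older reporter citations would need their own patterns.

```python
import re
from collections import Counter

# Assumed pattern: neutral citations such as "2015 SCC 5" or "2008 ONCA 123".
# Reporter-style citations are not covered by this sketch.
NEUTRAL_CITATION = re.compile(r"\b((?:19|20)\d{2})\s+([A-Z]{2,8})\s+(\d+)\b")


def table_of_authorities(text):
    """Return each distinct citation with the number of times it appears."""
    counts = Counter()
    for year, court, number in NEUTRAL_CITATION.findall(text):
        counts[f"{year} {court} {number}"] += 1
    return sorted(counts.items())


# Made-up sample text, for illustration only.
sample = (
    "See R. v. Example, 2015 SCC 5 at para. 12. "
    "As noted in 2015 SCC 5, supra, and in 2008 ONCA 123."
)
for citation, uses in table_of_authorities(sample):
    print(citation, uses)
```

Each distinct citation then only needs to be looked up once, and the link can be applied everywhere that citation appears.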

The Ebook

This is not the full Charterpedia.  It covers Sections 1 through 14.  I copied off the entire Charterpedia and may eventually link up the rest of it, but I felt that it was enough to create this example to just complete the first part.

Download the unofficial Charterpedia

[not made in association with the Canadian government or, frankly, anyone, not for commercial use or re-use, although you should feel free to redistribute it as much as you like]  If you are using Microsoft Edge, you may need to right-click the link and choose Save As, since Edge is an ePub reader.

Of course, once you’ve created an e-book – or populated WordPress with content – it requires updating and measuring.  Your audience will determine when the content needs to get updated.  I’d think annually would be fine.  The Charterpedia site is already out of date, as some cases that are listed as being before the Canadian Supreme Court have already been decided.

You can monitor changes to the content using change monitoring sites.  WordPress’ version control may help in doing a redline comparison of different versions, although you may not want to use it for actual editing.
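If you keep a saved copy of the source text, a rough redline can also be produced locally.  The sketch below uses Python's standard difflib module; the two text snippets are invented examples, and fetching the current page is left out.

```python
import difflib


def redline(old_text, new_text):
    """Return unified-diff lines showing what changed between two versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="saved copy",
            tofile="current page",
            lineterm="",
        )
    )


# Invented snippets standing in for two versions of the same page.
old = "Section 7\nThe appeal is pending.\n"
new = "Section 7\nThe appeal has been decided.\n"
for line in redline(old, new):
    print(line)
```

Lines starting with a minus sign were removed from the saved copy and lines starting with a plus sign are new, which is enough to tell you whether an update is due.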

Is it worth it?  If no-one is going to use an epub of an online site, then that’s a project to discontinue after you’ve done some measuring.  Use analytics to monitor downloads at the very least.  If it seems worthwhile, you can queue up a couple more that you think your law library patrons would like.
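If you don't have an analytics package handy, the web server's own access log will give you a download count.  This is a sketch under assumptions: it expects the common Apache access log layout, and the file path, addresses, and dates in the sample lines are hypothetical.

```python
import re
from collections import Counter

# Assumes the common/combined Apache log layout, where the request appears
# as "GET /path HTTP/1.1" followed by the response status code.
REQUEST = re.compile(r'"GET (\S+\.epub)(?:\?\S*)? HTTP/[\d.]+" (\d{3})')


def count_downloads(log_lines):
    """Count successful (status 200) requests for .epub files, per path."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST.search(line)
        if match and match.group(2) == "200":
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [01/Jan/2018:10:00:00 +0000] "GET /books/charterpedia.epub HTTP/1.1" 200 51234',
    '192.0.2.2 - - [01/Jan/2018:10:05:00 +0000] "GET /books/charterpedia.epub HTTP/1.1" 200 51234',
    '192.0.2.3 - - [01/Jan/2018:10:06:00 +0000] "GET /books/missing.epub HTTP/1.1" 404 210',
    '192.0.2.4 - - [01/Jan/2018:10:07:00 +0000] "GET /index.html HTTP/1.1" 200 900',
]
for path, downloads in count_downloads(sample_log).items():
    print(path, downloads)
```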




E-book Readers and Project Gutenberg

A new Android app called Gutenberg Books appeared recently.  Not surprisingly, it is an e-book reader that attempts to integrate with Project Gutenberg, the 45-year-old digital text project.  I use a variety of e-book apps for reading on my Android tablet or phone.  Gutenberg Books is an interesting addition but it is not as good as other free e-book readers that have more seamless integrations with Project Gutenberg.

Gutenberg Books Gut, Could be … Besser

I wrote a review of Gutenberg Books – which enables access to the Pro version of the app – which boils down to this:

  • it’s an acceptable e-book reader with basic customization
  • it has good integration with Project Gutenberg, as you’d expect
  • its search is very weak and so the ultimate experience isn’t as strong as it could be

If I were to recommend an e-book reader that integrates with Project Gutenberg, it would be either Cool Reader or FB Reader.  Both are open source (and if you aren’t already using it, F-Droid is a great open source app store for finding these and other apps) and FB Reader’s integration with not only Project Gutenberg but many other online, free e-book services is exceptional.

Project Gutenberg’s collection is public domain titles.  I use it most often to download classics, like Jane Austen’s Persuasion or Erskine Childers’ The Riddle of the Sands, or history including American Civil War memoirs.  One of these latter is by an Alabama soldier named Samuel Watkins, and is quoted heavily in Ken Burns’ The Civil War documentary.  It’s a great book, particularly hysterical when discussing his encounters with a mule, so I wondered how easy it would be to find with each of these e-book readers.

Need to Search

Cool Reader doesn’t enable search or browse.  Once you have connected Project Gutenberg within the app, all you can do is pull down popular and random titles.  Weirdly, Gutenberg isn’t one of Cool Reader’s default libraries, but it is the example when you want to add a library.  Both Gutenberg Books and FB Reader return results.

FB Reader’s results look very much like the Project Gutenberg native search results.  It proposes a bunch of alternate search terms and then returns matches.  These include the book by Samuel Watkins.  Since Project Gutenberg is a default library for FB Reader, there’s no set up or configuration necessary to do the search.  You select the Libraries portion of the app, select Gutenberg, and you’re off.

Search results in FB Reader e-book app for “Samuel Watkins”

Gutenberg Books’ search appears to be based on a boolean OR so that it looks at each of your keywords separately.  Not only did it not return Samuel Watkins’ memoir, it wouldn’t return Ulysses Grant’s despite that relatively unique name.

Screenshot of search results in Gutenberg Books app for “Samuel Watkins”
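To show why an OR search is a problem on a large catalog, here is a toy comparison.  It is not how either app actually works (I have only guessed at that from the results), and the catalog entries are invented placeholders.

```python
def search(titles, query, match_all):
    """Return titles containing any (OR) or all (AND) of the query words."""
    words = query.lower().split()
    combine = all if match_all else any
    return [t for t in titles if combine(w in t.lower() for w in words)]


# Invented catalog entries, for illustration only.
catalog = [
    "Memoir by Samuel Watkins",
    "Essays by Samuel Smith",
    "Letters by Mary Watkins",
]

print(search(catalog, "Samuel Watkins", match_all=False))  # OR: every entry matches
print(search(catalog, "Samuel Watkins", match_all=True))   # AND: only the memoir
```

With three titles the OR results are merely noisy.  Across tens of thousands of titles, every Samuel and every Watkins competes with the one book you want.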

The inability to search over Project Gutenberg diminishes the otherwise nice approach this app takes.  Gutenberg Books is cleanly designed and very visually appealing.  Cool Reader has a very 1980s-feeling wooden background and even FB Reader starts off with an odd color combination.  But for all of its modern look, Gutenberg Books doesn’t deliver as good an app as FB Reader.

In addition to better integration with Project Gutenberg’s collection – although I’m not sure quite how they do it, exactly, since Project Gutenberg doesn’t allow automated access – FB Reader provides many more options for the actual e-book reader part of the app.  You can scroll in different directions, fiddle with the look and feel of the pages themselves, and access a variety of other e-book libraries.  If it supports the Open Publication Distribution System (OPDS), you can add it to FB Reader.  Libraries can set up their own OPDS offering, to create a custom library of e-books for, say, a law firm or corporate library.
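An OPDS 1.x catalog is an Atom feed in which each entry carries an acquisition link to the book file, so a small custom library does not take much to describe.  The sketch below builds a bare-bones feed with Python's standard library.  The catalog title, identifiers, dates, and example.org URL are all hypothetical, and a production feed would need more (author elements, for instance) before a validator would be happy with it.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)


def build_catalog(title, books):
    """Build a minimal OPDS-style acquisition feed as an XML string."""
    feed = ET.Element(f"{{{ATOM}}}feed")
    # Hypothetical identifier and timestamp for the catalog itself.
    ET.SubElement(feed, f"{{{ATOM}}}id").text = "urn:example:law-library-catalog"
    ET.SubElement(feed, f"{{{ATOM}}}title").text = title
    ET.SubElement(feed, f"{{{ATOM}}}updated").text = "2018-01-01T00:00:00Z"
    for book in books:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}title").text = book["title"]
        ET.SubElement(entry, f"{{{ATOM}}}id").text = book["id"]
        ET.SubElement(entry, f"{{{ATOM}}}updated").text = book["updated"]
        # The acquisition link is what tells a reader app where the file is.
        ET.SubElement(entry, f"{{{ATOM}}}link", {
            "rel": "http://opds-spec.org/acquisition",
            "href": book["href"],
            "type": "application/epub+zip",
        })
    return ET.tostring(feed, encoding="unicode")


# Hypothetical book record, for illustration only.
books = [{
    "title": "Unofficial Charterpedia",
    "id": "urn:example:unofficial-charterpedia",
    "updated": "2018-01-01T00:00:00Z",
    "href": "https://example.org/books/charterpedia.epub",
}]
print(build_catalog("Law Library E-Books", books))
```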

Gutenberg Books is off to a good start but it has a long way to go to catch up with other free e-book readers with better Project Gutenberg integration.

Law E-Books Without Legal Publishers

Peter Martin, a law prof at Cornell, published a research paper on SSRN on possible futures of legal texts.  He gives a history of both the consolidation of the US publishers as well as the digital shift in law library collections.  He argues that legal publishers have missed an opportunity by shifting their commentary from print to a linear e-book format, which is in fact a step backwards.  This has left an opening for others to create commentary using blogs, wikis, and other online forms.

I don’t believe crowdsourcing in legal information works.  We’re not like the open source developers.  I identified a few examples and Bob Ambrogi looked at it recently and more broadly.  It is attractive to point at free online tools like blogs or wikis and intuit that they could take share away from commercial legal publishers.  It’s appealing to many of us who are currently spending millions on ephemeral legal content.

Lawyers will contribute content – see Lexology, Mondaq, and JD Supra – but there has to be a return on the investment.  In the cases of those sites, content marketing is the underlying goal.  Without some incentive for authoritative authors to create content, replacing legal publisher content with open resources has no chance of success.

While you have some experts who create authoritative content, like William Patry did as cited by Peter Martin, it’s an ad hoc process.  It’s not always clear who the authorities are (or why some people shouldn’t be), nor is there any reason for them to continue to contribute except out of good will and personal interest.  Those of us who connect lawyers with large amounts of information would be challenged to wrangle these sources; actual practitioners would be at a greater disadvantage.

E-books have a similar challenge.  I’ve dabbled with some – on law practice technology, confidentiality, and an annotated statute – out of personal interest.  For lots of contributors to make the long term commitment to do it, you’d need a funding source to ensure they’re kept up.  Legal publishers, despite the sorry state of their e-books, make an investment in a stable of authors.  As a counterpoint, CALI is an example of success using reputation and minor financial rewards to generate a body of free law school texts.  If only it happened to content available for practitioners.

I’m sympathetic to Martin’s approach, but e-book innovation in the legal information market will only happen when enough money is invested to create new, authoritative models.

Originally posted on LinkedIn

Turned On Pressbooks Plugin

Ever since Pressbooks announced it was open sourcing its plugin, I had it on my to do list to download it and give it a try.  Their own site is pretty slick and I’d enjoyed making an e-book with their tools.  The plugin install has gone pretty smoothly, with most bumps coming as I prepared my system for it.  I’m still not 100% operational but I’m making progress.

Prepare the Backend

The plugin comes as a zipped download.  You will want to get the readme.txt file out of it for additional configuration information, particularly in relation to exports.  I’m running Ubuntu 12.04, MySQL, Apache 2, and PHP 5.  Since I was already running WordPress (both multisite and standalone), I already had the basic stack of software to get the plugin running.

Except that my version of PHP was older than the version required for Pressbooks.  To get this, I had to add a new repository for the PHP 5.4 software (I was on 5.3) and upgrade the software.  It made some additional changes to my Apache environment which caused me some problems but otherwise it was pretty straightforward.

My PHP.ini Cheese Moved

First, the PHP.ini file changes.  So if you are pointing at an extension in the /usr/lib/php5 folder, you may find that the folder name has changed or disappeared.  Mine was now /usr/lib/php5/20121212/.  The change needs to be made in your PHP.ini file, which in my case is at /etc/php5/apache2/php.ini.
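A quick way to spot stale paths after an upgrade is to scan the ini file for extension settings that point at folders or files that no longer exist.  This is a rough sketch, assuming a POSIX system and simple `name = value` lines.  The sample ini text is invented, apart from the folder name from my own setup.

```python
import os
import re

# Matches uncommented extension_dir / extension / zend_extension lines,
# with or without quotes and with an optional trailing ";" comment.
SETTING = re.compile(
    r'^\s*(zend_extension|extension_dir|extension)\s*=\s*"?([^";]+?)"?\s*(?:;.*)?$'
)


def missing_paths(ini_text):
    """Return (setting, path) pairs whose absolute path does not exist."""
    missing = []
    for line in ini_text.splitlines():
        match = SETTING.match(line)
        if not match:
            continue
        name, value = match.groups()
        # Bare names such as "curl.so" are resolved by PHP against
        # extension_dir, so only absolute paths are checked here.
        if os.path.isabs(value) and not os.path.exists(value):
            missing.append((name, value))
    return missing


# Invented sample; on a real system, read /etc/php5/apache2/php.ini instead.
sample_ini = (
    'extension_dir = "/usr/lib/php5/20121212"\n'
    ";extension=/usr/lib/php5/old/disabled.so\n"
    "extension=curl.so\n"
)
for name, path in missing_paths(sample_ini):
    print(f"{name} points at a missing path: {path}")
```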

Apache, Rewrites, and .HTACCESS

Then something else went haywire.  I have not figured out the why for this one but all of my sites went offline, reporting back a 500 error.  It was tied to how Apache was handling the .htaccess files at the root of each WordPress site.  As you may know, WordPress uses a bunch of rewrites to make the blog post and other content Web addresses more user-friendly.

When I reviewed my Apache error logs (at /var/log/apache2/error.log), I saw that it was choking on the rewrites.  Specifically, I was getting this error:  RewriteEngine not allowed here or RewriteCond not allowed here.  I fooled around with it for a bit before I found the answer, which was to add a <Directory> directive to the Apache2.conf file and enable AllowOverride.

As I mentioned, I run a bunch of WordPress sites and thought that I would need to create a Directory entry for each one.  But it applies to subdirectories, so I was able to add this:

<Directory /www>
AllowOverride All
</Directory>

to my Apache2.conf file and solve the problem for all of my sites below that www folder.

Adding the Plugin

I won’t go through installing WordPress multi-site.  This is well documented and is, in essence, installing a single WordPress site and adding a line to your wp-config.php file afterwards.  Pressbooks says to create a new, clean multisite install.  So I did, although I already have a multi-site running.  I created a new folder, dropped WordPress in, and got started.  If you’ve got that readme.txt file from the zip file, it will walk you through all the preliminaries.

This was just as easy as any other WordPress plugin.  Once you’ve downloaded the zip file, you can choose to upload the new plugin.  The file format it needs is zip, so you’re ready to go.  Make sure you’ve activated the network from the WordPress dashboard and then add the plugin.

If you haven’t used multisite before, you’ll want to use the Network dashboard to enable all of the Pressbooks e-book themes.  When I create a new book (which is a new site in the multisite network), the Pressbooks plugin is not automatically activating for me, like it does on their site.  To get the proper look and feel, navigate to the new book (site), go to plugins and activate Pressbooks.

I exported my original e-books from Pressbooks – one I’d made public and a second I was fooling around with – to see if they’d import.  I used the WordPress XML export and they imported fine.  You will need to activate the WordPress importer (free plugin) and you will need to make sure the Pressbooks plugin is active.  Otherwise, you may get “type” errors because Pressbooks uses chapters, front matter, etc., not just posts and pages.
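If you want to see which content types an export file contains before you import it, you can count them.  This is a minimal sketch: the sample export is a stripped-down stand-in I made up, and the type names in it ("chapter", "front-matter") are my assumption about how the plugin labels its content.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def post_types(wxr_xml):
    """Count items in a WordPress export (WXR) by their post_type element."""
    root = ET.fromstring(wxr_xml)
    counts = Counter()
    for item in root.iter("item"):
        for child in item:
            # Compare the local name only, so any WXR namespace version works.
            if child.tag.rsplit("}", 1)[-1] == "post_type":
                counts[child.text] += 1
    return counts


# Made-up, stripped-down export; a real file has many more elements per item.
sample = """<rss version="2.0" xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
<item><title>Intro</title><wp:post_type>front-matter</wp:post_type></item>
<item><title>One</title><wp:post_type>chapter</wp:post_type></item>
<item><title>Two</title><wp:post_type>chapter</wp:post_type></item>
</channel>
</rss>"""
for post_type, total in post_types(sample).items():
    print(post_type, total)
```

Anything listed here that isn't a plain post or page is a type the importing site needs to know about, which is why the plugin has to be active first.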

Punch List

All in all, it probably couldn’t have gone much more smoothly.  However, I’m not quite operational and need to finish some final items.  First, while I’ve downloaded the relevant EPUB and MOBI format export tools from Google and Amazon, I have not been able to successfully export with them.

The dashboard UI relies on jQuery and the installed version is out of date.  To be honest, I have no idea where this is located.  It’s an easy fix, though, and I’ll grab the latest version and knock that off the list of wrap up items.  Update 08042012:  versions 2.1 and 2.2 of the Pressbooks plugin are not compatible with WordPress 3.6; stay on 3.5.2.  You can also replace the contents of the blockUI script located in wp-content/plugins/pressbooks/symbionts/jquery/jquery.blockUI.js as some other 3.6 users are, but I’m not finding it fixes my editor problem (which is that it doesn’t appear).  Since I’m just noodling, I’m going to stick with 3.6 and keep digging around.

Also, while the dashboard works just as I expected, I cannot get any of the books to load from the visitor side.  While the books are set to be publicly visible, I’m hanging on a sending request message.  There were other weird calls, like one to Facebook even though I don’t use the Facebook button.  I will need to dig around a bit to figure out where the hang up is.  I have a feeling it’s got something to do with the themes.

Once I can get the front end sorted out, and get the export features working, I’ll be in great shape.  I’m looking forward to having the option of creating e-books in non-PDF format.  The themes are excellent for making texts readable.  While you can create e-books using desktop tools as well, it makes sense to me that a document that’s born in HTML stays there.