I am working on creating a tablet-friendly paper on confidentiality, technology, and the practice of law. The alpha version (feel free to try it out) is a PDF created from Microsoft Word 2010. It is clearly a hybrid, because at the planner’s request I still need to produce something that can be printed out for the session. The margin links point to things like podcasts and video that are tangential but might fill an information gap in the paper.
The next step is to cut out the footnotes and to create an epub format e-book. The PDF is a satisfactory first step but I’m already running into weird problems with it. For example, while it opens in the Adobe Reader app on Android and the links work there, the links don’t seem to work on iOS devices. I’ve heard of problems with links in PDFs on iOS, so it may be the reader app rather than my document.
Weird side note: the document was created in Word 2010. It shares the format with Word 2007. But the PDF converter in 2010 will retain the links that I created and applied to the text boxes and graphics in the margins. Word 2007 strips those out, leaving just the boxes/images.
In any event, I’m playing around with Sigil to take the Word 2010 document and convert it to epub format. One reason is that there appears to be some support for embedding the podcast audio and video tutorials within the file. That way, this so-called enhanced e-book could stand alone and be usable without an Internet connection.
Now to the point of this post. It was interesting to learn that the .epub file is just like the Microsoft Office .docx file: it’s a zipped folder structure, not a single monolithic file. If you rename an .epub or .docx file to .zip on a Windows PC (I think Vista, 7 or 8 is necessary), you can then explore the contents of the file’s folder structure. Windows will warn you that you are changing the file extension and it will change the icon. Your document hasn’t changed, and renaming it back will make it look and act like the original file.
Now right-click on the .zip file and select Open. You should see the folder structure inside the document. For a .docx file, navigate to the word folder and double-click document.xml to see the main text of your document. For the curious, this is where there can be metadata about changes that were made to the document.
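If you would rather not rename anything, the same peek inside the file can be done with a few lines of script. What follows is a minimal sketch in Python using the standard zipfile module, not something I used for the paper. The function names and the file name paper.docx are my own placeholders.

```python
import zipfile


def list_package_entries(path):
    """Return the names of every entry inside a .docx or .epub file."""
    # Both formats are ZIP archives, so no renaming to .zip is needed.
    with zipfile.ZipFile(path) as package:
        return package.namelist()


def read_package_entry(path, entry_name):
    """Return one entry from the archive, decoded as UTF-8 text."""
    with zipfile.ZipFile(path) as package:
        return package.read(entry_name).decode("utf-8")


# Example use, assuming a hypothetical file called paper.docx:
#   for name in list_package_entries("paper.docx"):
#       print(name)
#   print(read_package_entry("paper.docx", "word/document.xml")[:500])
```

The output of the second function is raw XML, so expect a lot of tags around the text you actually wrote.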
One of the suggestions I came across applies the same idea to the epub format in order to embed non-text content in the epub file. The essence is that you rename the .epub file to .zip, drop the new media files into the OEBPS text folder, pull out the relevant HTML file to edit it, put the edited HTML file back, and then rename the .zip to .epub.
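The repackaging half of that recipe can also be scripted. Here is a rough sketch in Python, again using only the standard zipfile module. It copies an existing epub into a new one and adds a media file along the way. The function name and the paths in the usage note are hypothetical. The one thing it does deliberately is write the mimetype entry first and uncompressed, which the epub container format expects and which a hand-made .zip may not preserve. It does not touch the HTML or the package manifest (the .opf file), and the new media file still has to be referenced in both.

```python
import zipfile


def add_media_to_epub(src_epub, dst_epub, media_path, arcname):
    """Copy src_epub to dst_epub and add media_path inside it as arcname."""
    with zipfile.ZipFile(src_epub) as src, zipfile.ZipFile(dst_epub, "w") as dst:
        # The mimetype entry goes first and is stored without compression.
        dst.writestr("mimetype", src.read("mimetype"),
                     compress_type=zipfile.ZIP_STORED)
        # Copy every other entry across unchanged, compressing as we go.
        for item in src.infolist():
            if item.filename == "mimetype":
                continue
            dst.writestr(item, src.read(item.filename),
                         compress_type=zipfile.ZIP_DEFLATED)
        # Finally add the new audio or video file at the requested location.
        dst.write(media_path, arcname, compress_type=zipfile.ZIP_DEFLATED)
```

A call might look like add_media_to_epub("paper.epub", "paper-enhanced.epub", "podcast.mp3", "OEBPS/Audio/podcast.mp3"), where every one of those names is made up for illustration. Writing to a new file rather than editing in place also leaves the original epub untouched if something goes wrong.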
So far, this method hasn’t worked for me. But I have been fiddling with the HTML and that may be my problem, since the HTML exported from Microsoft Word 2010/2007 is pretty nasty. Sigil won’t import the Microsoft Word XML or Word 2003 XML cleanly. I’m going to keep working on this because embedded content of this sort would be a practical use of ebook formats. The flip side is (a) whether anyone will ever bother to read this paper anyway and, if they do, (b) whether it wouldn’t be better to have them read it on a PC rather than a tablet.
I’ll follow up in a week or so when I have worked on this some more. The physical paper is due this week but I have some time yet to fool around with the virtual one.