This page provides an overview of the EPUB document format
and sketches out how AxAp might provide access to EPUB documents.
An EPUB file is basically a "web site in a can".
Specifically, it's a renamed ZIP archive,
containing a variety of web-related assets
(e.g., CSS, HTML)
and some associated metadata.
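Because the container is an ordinary ZIP archive, standard tools can inspect it. The sketch below assumes Info-ZIP's `unzip` is installed; the function names are illustrative, not part of AxAp. It relies on one detail of the EPUB container format: an archive member named `mimetype` holds the string `application/epub+zip`.

```shell
# List the archive members of an EPUB, one path per line.
epub_list() {
  unzip -Z1 "$1"
}

# Print the media type stored in the archive member named "mimetype".
epub_mimetype() {
  unzip -p "$1" mimetype
}

# Succeed only if the file declares itself to be an EPUB container.
is_epub() {
  [ "$(epub_mimetype "$1" 2>/dev/null)" = "application/epub+zip" ]
}
```

For example, `is_epub book.epub && epub_list book.epub` prints the member list only when the check passes.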
For a gentle introduction, see Matt Garrish's excellent books on EPUB, published by O'Reilly.
We are currently analyzing some example EPUB documents we have on hand,
attempting to discover issues that we'll need to address.
For more information, see our Examples pages.
Here's a simplified guess at an approach:
obtain an EPUB document
import it, if necessary
provide it to a browser
The "import" step clearly deserves some clarification.
Basically, we're using Git to retain immutable cached versions
of the reference data (e.g., document, generated file tree).
We save processing time and storage space
by assuming that identical documents (per Git)
will produce identical file trees.
Here's some pseudo-code:
commit the document to Git
if any new data blobs were created:
    unpack a copy of the document
    commit the file tree to Git
    add auxiliary files for browsers
    commit the file tree again
return the commit ID to Sinatra
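The pseudo-code can be sketched as a shell function. This is one possible reading under stated assumptions, not the project's actual code: the repository layout (`docs/`, `trees/`), the `import-<blob>` tag, the `axap-files.txt` auxiliary file, and the committer identity are all hypothetical. One variation on the pseudo-code: the sketch checks for an earlier import before committing, using the document's blob ID (which depends only on its bytes) and a tag that records the resulting commit.

```shell
# import_epub <document.epub> <cache-repo>: print the commit ID for the import.
import_epub() (
  epub=$1
  repo=$2

  # Run git inside the cache repository with a placeholder identity.
  g() {
    git -C "$repo" -c user.name='axap-cache' \
        -c user.email='axap-cache@example.invalid' "$@"
  }

  # Identical documents always hash to the same blob ID.
  blob=$(git hash-object --no-filters -- "$epub") || exit 1
  tag="import-$blob"

  # Already imported: report the earlier commit and skip the unpacking.
  if g rev-parse -q --verify "refs/tags/$tag^{commit}"; then
    exit 0
  fi

  # Commit the document itself.
  mkdir -p "$repo/docs" "$repo/trees/$blob" || exit 1
  cp -- "$epub" "$repo/docs/$blob.epub" || exit 1
  g add -- "docs/$blob.epub" || exit 1
  g commit -q -m "Add document $blob" || exit 1

  # Unpack a copy and commit the resulting file tree.
  unzip -q -o "$epub" -d "$repo/trees/$blob" || exit 1
  g add -- "trees/$blob" || exit 1
  g commit -q -m "Add file tree for $blob" || exit 1

  # Add an auxiliary file (here just a sorted file list) and commit again.
  (cd "$repo/trees/$blob" &&
    find . -type f ! -name axap-files.txt | LC_ALL=C sort > axap-files.txt) || exit 1
  g add -- "trees/$blob/axap-files.txt" || exit 1
  g commit -q -m "Add auxiliary files for $blob" || exit 1

  # Remember which commit belongs to this document, then report it.
  g tag "$tag" || exit 1
  g rev-parse HEAD
)
```

Calling the function twice with the same document should print the same commit ID, with the second call doing no unpacking at all.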
Although the unpacked HTML files can be displayed by a web browser,
the bare file tree offers no support for indexing or navigation.
Some of this can be provided by means of added HTML files;
alternatively, JSON files can be supplied for use by a client-side app.
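As a rough illustration of the JSON option, the sketch below writes a JSON array naming every (X)HTML file under an unpacked tree. The function name and output shape are assumptions made for this example; a real index would more likely be derived from the EPUB's own metadata than from a plain file listing.

```shell
# make_index_json <tree>: print a JSON array of the (X)HTML files under <tree>.
make_index_json() (
  tree=$1
  [ -d "$tree" ] || exit 1

  printf '[\n'
  # Collect relative paths, escape backslashes and double quotes for JSON
  # (other control characters in file names are not handled here), sort
  # them, and let awk place the commas between entries.
  (cd "$tree" && find . -type f \( -name '*.html' -o -name '*.xhtml' \)) |
    sed -e 's|^\./||' -e 's/\\/\\\\/g' -e 's/"/\\"/g' |
    LC_ALL=C sort |
    awk '{ printf "%s  \"%s\"", (NR > 1 ? ",\n" : ""), $0 }
         END { if (NR) printf "\n" }'
  printf ']\n'
)
```

The output could be saved next to the content (for instance as an extra file in the cached tree) and fetched by a client-side app to build a table of contents.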
Although there are many differences between static documents and remote web sites,
they can be handled in a similar manner, using cached file trees, Git, etc.
As a minor detail, we need to get a (renamed) copy of the archive
and unpack it into a directory tree.
Here is some Bash code that does the trick:
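A minimal sketch of that step, assuming Info-ZIP's `unzip` is installed. The function name `unpack_epub` and the temporary file name `document.zip` are illustrative rather than taken from the project:

```shell
# unpack_epub <document.epub> <dest-dir>: copy the archive under a .zip name
# and unpack that copy into <dest-dir>.
unpack_epub() (
  epub=$1
  dest=$2

  # Work on a renamed copy so the original document is never touched.
  work=$(mktemp -d) || exit 1
  cp -- "$epub" "$work/document.zip" || { rm -rf -- "$work"; exit 1; }

  # Unpack quietly, overwriting any files left from an earlier run.
  mkdir -p -- "$dest" && unzip -q -o "$work/document.zip" -d "$dest"
  status=$?

  # Remove the temporary copy and pass along the unpack result.
  rm -rf -- "$work"
  exit "$status"
)
```

For example, `unpack_epub book.epub trees/book` leaves the archive's `mimetype` file, metadata, and content documents under `trees/book`.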