[Xapian-tickets] [Xapian] #780: Support for epub (was: Support for epub and .pkl file)

Xapian nobody at xapian.org
Tue Apr 23 05:26:22 BST 2019


#780: Support for epub
----------------------------------------+-----------------------------
 Reporter:  jugnu                       |             Owner:  jugnu
     Type:  enhancement                 |            Status:  reopened
 Priority:  normal                      |         Milestone:  1.5.0
Component:  Omega                       |           Version:  1.4.11
 Severity:  normal                      |        Resolution:
 Keywords:  omega file support omindex  |        Blocked By:
 Blocking:                              |  Operating System:  Linux
----------------------------------------+-----------------------------
Description changed by jugnu:

Old description:

> Git PR for issue :  https://github.com/xapian/xapian/pull/235
>
> EPUB ------
>
> Skipping - unknown MIME type 'application/epub+zip'
> Skipping - unknown MIME type 'application/zip'
>
> .pkl (less priority) -------
>
> I was just playing around with different formats on my computer. I found
> and tested .pkl formatted. file command shows it to be 8086 relocatable
> (Microsoft). Also I ran omindex with .pkl inside, and it fails :
>
> {{{
> iamglass: Skipping - unknown MIME type 'application/octet-stream'
> polarities.pkl: Skipping - unknown MIME type 'application/octet-stream'
> Exception: DatabaseError: Modifications failed (DatabaseError: Error
> reading block 0 (Protocol error)), and couldn't open at the old revision:
> Error reading block 0.
> }}}

New description:

 Git PR for issue :  https://github.com/xapian/xapian/pull/235

 EPUB ------

 Skipping - unknown MIME type 'application/epub+zip'

 Skipping - unknown MIME type 'application/zip'

 Currently omindex is able to index epub files. However, work is still
 needed to parse the metadata correctly, such as the overall title,
 individual chapters, authors etc. Several other formats handled in
 index_file.cc are likewise indexed at a basic level but lack proper
 metadata parsing. These formats can be found by searching for:
 // FIXME: Implement support for metadata.

 To do :

 1. Tests are needed to ensure that epub support generalizes across
 different generations of epub and their file directory structures.
 2. Correct metadata parsing
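
 As a rough illustration of what item 2 involves, the sketch below reads the
 title and authors from an epub using only the Python standard library. It is
 not how omindex or index_file.cc does it. It assumes the usual epub layout,
 where META-INF/container.xml names the OPF package file and that file holds
 Dublin Core dc:title and dc:creator elements. The names
 read_epub_metadata and make_sample_epub, and the sample book contents, are
 hypothetical and exist only for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the epub container and package (OPF) documents.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub_metadata(source):
    """Return {'title': str or None, 'authors': [str]} for an epub.

    `source` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as z:
        # container.xml points at the package file via rootfile/@full-path.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        if rootfile is None:
            raise ValueError("container.xml has no rootfile entry")
        package = ET.fromstring(z.read(rootfile.get("full-path")))

    metadata = package.find("opf:metadata", NS)
    if metadata is None:
        return {"title": None, "authors": []}

    title = metadata.findtext("dc:title", default=None, namespaces=NS)
    authors = [
        e.text.strip() for e in metadata.findall("dc:creator", NS) if e.text
    ]
    return {"title": title.strip() if title else None, "authors": authors}


# Hypothetical sample data, used only to exercise the function above.
CONTAINER_XML = (
    '<container version="1.0" '
    'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
    "<rootfiles>"
    '<rootfile full-path="OEBPS/content.opf" '
    'media-type="application/oebps-package+xml"/>'
    "</rootfiles></container>"
)
OPF_XML = (
    '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
    '<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:title>Sample Book</dc:title>"
    "<dc:creator>A. Author</dc:creator>"
    "<dc:creator>B. Writer</dc:creator>"
    "</metadata></package>"
)


def make_sample_epub():
    """Build a minimal in-memory epub-like zip for demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        z.writestr("mimetype", "application/epub+zip")
        z.writestr("META-INF/container.xml", CONTAINER_XML)
        z.writestr("OEBPS/content.opf", OPF_XML)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(read_epub_metadata(make_sample_epub()))
```

 Because the package file's location is looked up through container.xml
 rather than hard-coded, the same approach should cope with epubs that use
 different directory layouts, which is the concern raised in item 1.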

--
Ticket URL: <https://trac.xapian.org/ticket/780#comment:11>
Xapian <https://xapian.org/>