GSoC 2020 Project : Text Extraction Libraries

Emmanuel Engelhart kelson at
Wed May 6 09:29:44 BST 2020

Hi Parth

It is promising to see that you will work on this specific topic.

EPUB support is at the top of my personal list. There is a ticket and
even a draft PR at

Kind regards

On 06.05.20 06:36, Parth Kapadia wrote:
> Hello
> I am Parth Kapadia, a computer science undergraduate studying at the
> Institute of Technology, Nirma University, India.
> I have been given the opportunity to work with Xapian in GSoC 2020 and
> have selected the project "Text Extraction Libraries" to work on.
> This project aims to add support for various new file formats to Omega
> using libraries such as libarchive, libcdr, libpagemaker etc.
> You can find my detailed proposal here : (which is still changing)
> If you have any ideas or suggestions for the project, I would be
> grateful for your input.
> I look forward to hearing from you.

Kiwix - Wikipedia Offline & more

More information about the Xapian-devel mailing list