GSoC 2020 Project : Text Extraction Libraries

Parth Kapadia prkandmap07 at
Wed May 6 05:36:44 BST 2020

I am Parth Kapadia, a computer science undergraduate at the Institute of
Technology, Nirma University, India.

I have been given the opportunity to work with Xapian in GSoC 2020, and I
will be working on the project "Text Extraction Libraries".
This project aims to add support for various new file formats to Omega
using libraries such as libarchive, libcdr, and libpagemaker.
You can find my detailed proposal (which is still changing) here:

If you have any ideas or suggestions for the project, I would be grateful
for your input.

I look forward to hearing from you.

More information about the Xapian-devel mailing list