GSoC 2020 Project : Text Extraction Libraries

Emmanuel Engelhart kelson at kiwix.org
Wed May 6 09:29:44 BST 2020


Hi Parth

It is promising to see that you will be working on this specific topic.

EPUB support is at the top of my personal list. There is a ticket and
even a draft PR at https://github.com/xapian/xapian/pull/235.

Kind regards
Emmanuel

On 06.05.20 06:36, Parth Kapadia wrote:
> Hello
> I am Parth Kapadia, a computer science undergraduate studying at the
> Institute of Technology, Nirma University, India.
> 
> I have been given the opportunity to work with Xapian in GSoC 2020 and
> have selected the project "Text Extraction Libraries" to work on.
> This project aims to add support for various new file formats to Omega
> using libraries such as libarchive, libcdr, libpagemaker etc.
> You can find my detailed proposal (which is still changing) here:
> https://github.com/Exter-dg/xapian-gsoc-proposal/blob/draft/proposal.rst
> 
> If you have any ideas or suggestions for the project, I would be
> grateful for your input.
> 
> 
> I look forward to hearing from you.


-- 
Kiwix - Wikipedia Offline & more
* Web: https://kiwix.org/
* Twitter: https://twitter.com/KiwixOffline
* Wiki: https://wiki.kiwix.org/
