[Xapian-tickets] [Xapian] #583: Spin off Omega's filetype conversion code as a library

Xapian nobody at xapian.org
Wed Jul 29 06:11:58 BST 2020


#583: Spin off Omega's filetype conversion code as a library
-------------------------+-------------------------------
 Reporter:  Olly Betts   |             Owner:  Olly Betts
     Type:  enhancement  |            Status:  new
 Priority:  low          |         Milestone:
Component:  Omega        |           Version:
 Severity:  normal       |        Resolution:
 Keywords:               |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+-------------------------------
Comment (by Olly Betts):

 This isn't really something that's being actively worked on, I'm afraid.

 One key problem is that making something a public API locks down the
 design significantly - to actually be useful a public API needs to come
 with a commitment to stability, whereas a private API inside our code we
 can change pretty much at will (the only real concern is any patches in
 progress which touch the same code).

 Another issue is that design decisions that make total sense in a narrower
 context can be problematic for other potential uses.

 Just hacking out code from inside omindex into a library and advertising
 it as a public API isn't really enough - this requires quite a lot of work
 to do properly.

 If you look at the design in the old email message, it requires passing a
 filename.  That's a reasonable requirement if you always have a local file
 you want to process, but requires a temporary file to be created in other
 situations (like spidering websites to index or indexing files extracted
 from compound file formats like ISOs, tarballs, ZIP archives, attachments
 from emails, etc).
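
 To illustrate the cost, here is a minimal sketch of what a caller ends up
 writing against a filename-only API when the data is already in memory.
 All names here (extract_text_from_file, extract_text_from_memory) are
 hypothetical and are not Omega's actual code; the "extraction" is a
 stand-in that just returns the file's bytes, and POSIX is assumed.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <stdlib.h>   // mkstemp (POSIX)
#include <unistd.h>   // write, close, unlink (POSIX)

// Hypothetical filename-only API.  A real converter would parse the file;
// this stand-in simply returns its contents unchanged.
std::string extract_text_from_file(const std::string& filename) {
    std::ifstream in(filename, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + filename);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// A caller holding data in memory (fetched by a spider, unpacked from an
// archive, etc.) has to round-trip through a temporary file on disk.
std::string extract_text_from_memory(const std::string& data) {
    char path[] = "/tmp/extract-XXXXXX";  // mkstemp fills in the XXXXXX
    int fd = mkstemp(path);
    if (fd < 0) throw std::runtime_error("mkstemp failed");

    // Write the whole buffer, allowing for short writes.
    const char* p = data.data();
    size_t left = data.size();
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            unlink(path);
            throw std::runtime_error("write failed");
        }
        p += n;
        left -= static_cast<size_t>(n);
    }
    close(fd);

    std::string text;
    try {
        text = extract_text_from_file(path);
    } catch (...) {
        unlink(path);  // don't leave the temporary file behind on error
        throw;
    }
    unlink(path);
    return text;
}
```

 Every caller in this position pays for the extra disk I/O and has to get
 the temporary file cleanup right on every error path.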

 And indeed in the meantime omindex's code has evolved to allow extracting
 files from a file descriptor.  If we had been working with a public API,
 that would have been much harder to do, because we'd have had to maintain
 compatibility with the existing API.  Or else we'd have had to make an
 incompatible major version bump and force all users of the API to rewrite
 their code.  There are libraries that do that - I've used a few and they
 aren't fun to be a user of.
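
 As a rough sketch of the shape this takes (not omindex's actual code; the
 names text_from_fd and text_from_path are hypothetical, the "extraction"
 again just returns the bytes read, and POSIX is assumed), a file
 descriptor entry point lets data arrive from a pipe or socket, while a
 filename entry point can remain as a thin wrapper around it:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <fcntl.h>    // open (POSIX)
#include <unistd.h>   // read, close, pipe, write (POSIX)

// Hypothetical fd-based entry point: reads until EOF.  A real converter
// would parse the stream; this stand-in returns the bytes unchanged.
std::string text_from_fd(int fd) {
    std::string out;
    char buf[4096];
    while (true) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n < 0) throw std::runtime_error("read failed");
        if (n == 0) break;  // EOF
        out.append(buf, static_cast<size_t>(n));
    }
    return out;
}

// Filename entry point kept as a thin wrapper, so callers with a local
// file are unaffected by the change.
std::string text_from_path(const std::string& filename) {
    int fd = open(filename.c_str(), O_RDONLY);
    if (fd < 0) throw std::runtime_error("cannot open " + filename);
    try {
        std::string text = text_from_fd(fd);
        close(fd);
        return text;
    } catch (...) {
        close(fd);
        throw;
    }
}
```

 With a private API both functions can be reshaped freely as needs change;
 once published, the signature and behaviour of each would have to be kept
 stable.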

 If we're going to add a public API for this (or anything else really) I
 think we need to do it well.  Doing it badly doesn't actually help users,
 but still takes developer energy away from other areas.
-- 
Ticket URL: <https://trac.xapian.org/ticket/583#comment:4>
Xapian <https://xapian.org/>