[Xapian-tickets] [Xapian] #771: omindex: Handle "directory documents"

Xapian nobody at xapian.org
Sat Mar 25 21:25:00 GMT 2023


#771: omindex: Handle "directory documents"
--------------------------+-------------------------------
 Reporter:  Olly Betts    |             Owner:  Olly Betts
     Type:  enhancement   |            Status:  new
 Priority:  normal        |         Milestone:  1.4.x
Component:  Omega         |           Version:
 Severity:  normal        |        Resolution:
 Keywords:  GoodFirstBug  |        Blocked By:
 Blocking:                |  Operating System:  All
--------------------------+-------------------------------
Comment (by Olly Betts):

 I have something that basically works.

 > Not sure what's best to do about the checksum we store to support
 collapsing duplicates (it seems we'd have to iterate the directory
 recursively in sorted order, which is more awkward to do, and checksum
 across all the files, or something like that). For now I'm going to leave
 the checksum blank for directory documents I think, which means duplicates
 won't get collapsed.
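
 For illustration only, here is a rough sketch of the "iterate in sorted
 order and checksum across all the files" idea.  It assumes C++17
 `std::filesystem`, and `checksum_directory()` and `fnv1a_update()` are
 hypothetical names, not existing omindex functions.  FNV-1a is used purely
 as a self-contained stand-in for whatever digest the stored checksum really
 uses.

```cpp
#include <algorithm>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper, not omindex code: fold bytes into a running 64-bit
// FNV-1a hash (a stand-in for the real digest).
static void fnv1a_update(std::uint64_t& h, const char* p, std::size_t n) {
    for (std::size_t i = 0; i != n; ++i) {
        h ^= static_cast<unsigned char>(p[i]);
        h *= 1099511628211ULL;  // FNV-1a 64-bit prime
    }
}

// Hypothetical: checksum a directory document by hashing each regular
// file's relative path and contents.  Files are visited in sorted order so
// the result doesn't depend on the order the filesystem returns entries in.
std::uint64_t checksum_directory(const fs::path& dir) {
    std::vector<std::string> files;
    for (const auto& entry : fs::recursive_directory_iterator(dir)) {
        if (entry.is_regular_file())
            files.push_back(
                entry.path().lexically_relative(dir).generic_string());
    }
    std::sort(files.begin(), files.end());

    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a offset basis
    for (const auto& rel : files) {
        // Hash the name plus its NUL terminator, so renaming or moving a
        // file inside the directory changes the checksum.
        fnv1a_update(h, rel.c_str(), rel.size() + 1);
        std::ifstream in(dir / rel, std::ios::binary);
        char buf[4096];
        // The second condition picks up the final short read at EOF.
        while (in.read(buf, sizeof buf) || in.gcount() > 0)
            fnv1a_update(h, buf, static_cast<std::size_t>(in.gcount()));
    }
    return h;
}
```

 Collecting the relative paths first and sorting them is what makes the
 traversal more awkward than a plain recursive walk.  Unreadable files and
 iteration errors are not handled in this sketch.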

 Other issues:

 * The size we currently store is what `stat()` reports for the directory.
 That's somewhat filesystem-dependent, but tends to vary with the number of
 entries in the directory, which isn't really a useful number for our
 purposes.  If we iterated the contents we could sum the sizes of the files
 to get a better number.

 * The mtime and ctime we store are for the directory, which means that
 modifications to a directory document may not always be correctly
 detected, since rewriting a file in place doesn't update the timestamps of
 the directory containing it.  It depends on how the programs which save
 them do it: if they always create a new directory with a temporary name
 and then, once saved, delete the old one and rename the new one into
 place, we'll be good.  If we iterated the contents we could find the
 newest mtime and newest ctime from among the files inside.  For ctime, we
 should also include the directory itself if we take the user and group
 from it (as we currently do, and as seems reasonable).
-- 
Ticket URL: <https://trac.xapian.org/ticket/771#comment:5>
Xapian <https://xapian.org/>