[Xapian-tickets] [Xapian] #771: omindex: Handle "directory documents"
Xapian
nobody at xapian.org
Sat Mar 25 21:25:00 GMT 2023
#771: omindex: Handle "directory documents"
--------------------------+-------------------------------
Reporter: Olly Betts | Owner: Olly Betts
Type: enhancement | Status: new
Priority: normal | Milestone: 1.4.x
Component: Omega | Version:
Severity: normal | Resolution:
Keywords: GoodFirstBug | Blocked By:
Blocking: | Operating System: All
--------------------------+-------------------------------
Comment (by Olly Betts):
I have something that basically works.
> Not sure what's best we do about the checksum we store to support
> collapsing duplicates (seems like we'd have to iterate the directory
> recursively in a sorted order (which is more awkward to do) and checksum
> across all the files, or something like that). For now I'm going to leave
> the checksum blank for directory documents I think, which means duplicates
> won't get collapsed.
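
For illustration only, the "iterate recursively in a sorted order and
checksum across all the files" idea could be sketched roughly as below.
This is not omindex code: `directory_checksum` and `Fnv1a` are hypothetical
names, and FNV-1a is used purely as a dependency-free stand-in for whatever
checksum omindex actually stores. It assumes C++17 `std::filesystem`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Stand-in hash (64-bit FNV-1a); an assumption, not omindex's checksum.
class Fnv1a {
    std::uint64_t h = 14695981039346656037ULL;
  public:
    void update(const char* p, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            h ^= static_cast<unsigned char>(p[i]);
            h *= 1099511628211ULL;
        }
    }
    std::uint64_t value() const { return h; }
};

// Hypothetical helper: checksum every regular file under root, visiting
// them in sorted order so the result doesn't depend on readdir() order.
std::uint64_t directory_checksum(const fs::path& root) {
    std::vector<fs::path> files;
    for (const auto& entry : fs::recursive_directory_iterator(root)) {
        if (entry.is_regular_file())
            files.push_back(entry.path().lexically_relative(root));
    }
    std::sort(files.begin(), files.end());

    Fnv1a sum;
    for (const auto& rel : files) {
        // Mix in the relative name (plus its NUL terminator as a separator)
        // so moving content between files changes the checksum.
        const std::string name = rel.generic_string();
        sum.update(name.c_str(), name.size() + 1);

        std::ifstream in(root / rel, std::ios::binary);
        char buf[4096];
        while (in.read(buf, sizeof buf) || in.gcount() > 0)
            sum.update(buf, static_cast<std::size_t>(in.gcount()));
    }
    return sum.value();
}
```

Because names are taken relative to the directory, two copies of the same
directory document under different paths would get the same value, which is
what collapsing duplicates needs.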
Other issues:
* The size we currently store is what `stat()` reports for the directory.
That's somewhat filesystem-dependent, but tends to vary with the number of
entries in the directory, which isn't really a useful number for our
purposes. If we iterated the contents we could sum the sizes of the files
to get a better number.
* The mtime and ctime we store are for the directory, which means that
modifications to a directory document may not always be correctly
detected. It depends on how programs which save them do it: if they always
create a new directory with a temporary name and then, once saved, delete
the old one and rename, we'll be good. If we iterated the contents we
could find the newest mtime and newest ctime from among the files inside.
For ctime, we should also include the directory itself if we take the user
and group from it (as we currently do, and as seems reasonable).
--
Ticket URL: <https://trac.xapian.org/ticket/771#comment:5>
Xapian <https://xapian.org/>