deleting terms with a given prefix

David Bremner david at tethera.net
Thu Aug 28 12:09:07 BST 2025


I'm looking at some benchmarks where the bottleneck is the function
below, and wondering whether there is a better way to delete all the
terms with a given prefix from a document. According to perf, 41% of the
time is spent in this function, almost all of it (40%) in
Xapian::Document::Internal::remove_term.

At a higher level, this particular benchmark is replacing all of the
tags for a bunch of messages, and the current strategy is to delete all
existing ones and then add the new ones.

void
_notmuch_message_remove_terms (notmuch_message_t *message, const char *prefix)
{
    Xapian::TermIterator i;
    size_t prefix_len = 0;

    prefix_len = strlen (prefix);

    while (1) {
	i = message->doc.termlist_begin ();
	i.skip_to (prefix);

	/* Terminate loop when no terms remain with desired prefix. */
	if (i == message->doc.termlist_end () ||
	    strncmp ((*i).c_str (), prefix, prefix_len))
	    break;

	try {
	    message->doc.remove_term ((*i));
	    message->modified = true;
	} catch (const Xapian::InvalidArgumentError &) {
	    /* Ignore failure to remove non-existent term. */
	}
    }

    _notmuch_message_invalidate_metadata (message, "property");
}
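
For comparison, here is a minimal sketch of one variation: walk the term
list once, copy the matching terms, and only then remove them, so the
iterator is not re-created and re-positioned for every removal. The
helper name collect_terms_with_prefix is hypothetical (it is not part of
notmuch or Xapian). It is written as a template so that it also works on
plain standard containers. It relies on the terms being visited in
sorted order, which a Xapian term list does. I have not measured this,
and it does nothing about the cost inside remove_term itself, so it may
not help with the 40% reported above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of notmuch or Xapian.
// Copies the leading run of terms in [it, end) that start with `prefix`.
// Assumes the range is sorted and that `it` already points at the first
// candidate (e.g. after calling skip_to (prefix) on a Xapian::TermIterator).
template <typename Iter>
std::vector<std::string>
collect_terms_with_prefix (Iter it, Iter end, const std::string &prefix)
{
    std::vector<std::string> matches;

    for (; it != end; ++it) {
	const std::string term = *it;

	// Terms are sorted, so the first non-matching term ends the run.
	if (term.compare (0, prefix.size (), prefix) != 0)
	    break;

	matches.push_back (term);
    }

    return matches;
}

// Possible use inside _notmuch_message_remove_terms (untested sketch):
//
//     Xapian::TermIterator i = message->doc.termlist_begin ();
//     i.skip_to (prefix);
//     for (const std::string &term :
//	      collect_terms_with_prefix (i, message->doc.termlist_end (), prefix)) {
//	   message->doc.remove_term (term);
//	   message->modified = true;
//     }
```

Copying the terms first is a deliberately conservative choice: it avoids
relying on a TermIterator remaining usable while the document it came
from is being modified, which I am not sure is guaranteed.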



More information about the Xapian-discuss mailing list