[Xapian-discuss] Q prefix / Unique ID not being found

Sig Lange sig.lange at gmail.com
Fri Mar 11 15:39:53 GMT 2005


I'm creating a unique ID for every document. I have about 3500
documents so far and seem to have run into a problem while testing.
Here's what I did to "discover" my issue.

The first term I add to a document is built like this (Python):
uid = sha.new( str(random.random()) + str(time.time()) ).hexdigest()
"Q" + uid
which is basically the SHA-1 hex digest of a random float concatenated
with the Unix timestamp (also a float). I used .add_term() to add it.
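
A minimal sketch of the ID scheme described above, written for current Python, where the old sha module has been replaced by hashlib. The helper name make_uid_term is hypothetical and not part of the original script; the second value it returns is the string that would be passed to .add_term().

```python
import hashlib
import random
import time


def make_uid_term(prefix="Q"):
    """Return (uid, term): a SHA-1 hex digest and the prefixed term.

    make_uid_term is a hypothetical helper, not from the original script.
    """
    # Same recipe as above: random float + Unix timestamp, both as strings.
    seed = str(random.random()) + str(time.time())
    # hashlib works on bytes, so encode the seed before hashing.
    uid = hashlib.sha1(seed.encode("ascii")).hexdigest()
    return uid, prefix + uid


uid, term = make_uid_term()
print(term)
```

The digest is always 40 lowercase hexadecimal characters, so some IDs begin with a digit and others with a letter.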

I have verified that every key is unique and is actually being added to
the document. So I came up with some code like this to list all terms:
-- listterms.py --
# walk every term in the database and print it
it = xapdb.allterms_begin()
end = xapdb.allterms_end()
while not it == end:
    print it.get_term()
    it.next()
-- listterms.py --

Using a small bash loop like this, I then requested each document from
my server:

./misc/listterms.py | grep ^Q | cut -c2- | while read id; do
    curl -s "http://localhost:8080/bin/read?id=$id" | grep -n ^ERROR
done

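For illustration, the filtering and URL-building steps of the loop above can be sketched in Python. The function names ids_from_terms and read_url are hypothetical, and the sample terms are made up; real input would be the output of listterms.py. No HTTP request is made here.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8080/bin/read"  # endpoint from the loop above


def ids_from_terms(terms, prefix="Q"):
    """Keep terms starting with prefix and strip it (grep ^Q | cut -c2-)."""
    return [t[len(prefix):] for t in terms if t.startswith(prefix)]


def read_url(doc_id):
    """Build the request URL for one ID, URL-encoding the value."""
    return BASE_URL + "?" + urlencode({"id": doc_id})


# Made-up sample terms, for illustration only.
sample_terms = ["Qdeadbeef", "hello", "Q0123abcd"]
for doc_id in ids_from_terms(sample_terms):
    print(read_url(doc_id))
```
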
-- the /bin/read (hacked down for brevity)--
sys.stderr = sys.stdout
FieldStore = cgi.FieldStorage()
print "Content-Type: text/html"
print
xapdb = xapian.Database("..")

enquire = xapian.Enquire(xapdb)
stemmer = xapian.Stem("english")

qp = xapian.QueryParser()
# I do have other prefixes, but only Q matters for this example
qp.set_prefix("id", "Q")

id = FieldStore.getvalue("id", "")
q = "id:" + id

query = qp.parse_query(q)

enquire.set_query(query)
matches = enquire.get_mset(0, 1)

if matches.get_matches_upper_bound() == 0:
    print "ERROR: Oops, unable to find a message %s" % (id)
    sys.exit(0)

match = iter(matches).next()

print "ID %i %i%% [%s]" % \
(match[xapian.MSET_DID], match[xapian.MSET_PERCENT],
match[xapian.MSET_DOCUMENT].get_data())

-- the /bin/read --

I tried calling flush, thinking it could be a Python side effect: every
50 documents, after every record, and not at all. I ended up with pretty
much the same results each time. There are about 200 unique IDs that
are not found. How can this be?

TIA
Sig


