Logging the click data

James Aylett james at tartarus.org
Fri Jun 9 06:55:16 BST 2017

On 8 Jun 2017, at 12:07, Vivek Pal <vivekpal.dtu at gmail.com> wrote:

> > In case I wasn't clear: I don't think you have to modify the command
> > at all. Just create a template that uses the command as it currently
> > works.
> I thought we needed a new template only for the second log file?

Yes, that's correct.

> To generate the first log file using the existing $log command, I have
> introduced another $log command in the query template that looks like:
> $log{search.log,"$qid{$query}\t$query\t$did\t$topdoc"}

You'll need to be a little more subtle to get all the document ids in there.

> - $did: to return a list of doc ids on the result page. I'm aware of
>   the $id command that returns the doc id of the "current" doc, but
>   I'm not sure what "current" doc means there.

Within $hitlist{} (to iterate over the MSet), $id is the current document. It's probably clearest to see this in the opensearch template in the source code.

> I'm currently working towards implementing support for the new
> commands, i.e. $qid and $did.

You don't need $did, as indicated above.

> An example log entry assuming that we allow only 4 docs on a
> single result page:
> q101 "simple query text" [doc0, doc1, doc2, doc3] 0
> q101 "simple query text" [doc4, doc5, doc6, doc7] 4
> q101 "simple query text" [doc8, doc9, doc10, doc11] 8

Look at $csv{} for escaping (and quoting) the query string.
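
To illustrate why CSV-style quoting matters for the query field, here is a minimal Python sketch. It uses Python's csv module, not Omega's $csv{} itself, on the assumption that both follow the usual CSV convention of quoting fields that contain a comma or double quote and doubling embedded quotes; check the OmegaScript documentation for the exact behaviour. The function names and the four-column layout (query id, query, doc ids, offset) are hypothetical.

```python
import csv
import io


def write_log_row(qid, query, doc_ids, offset):
    """Format one hypothetical log entry as a single CSV line."""
    buf = io.StringIO()
    # Default QUOTE_MINIMAL: only fields that need quoting get quoted.
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow([qid, query, " ".join(doc_ids), offset])
    return buf.getvalue()


def read_log_row(line):
    """Parse one CSV log line back into a list of string fields."""
    return next(csv.reader(io.StringIO(line)))


if __name__ == "__main__":
    row = write_log_row("q101", 'say "hello", world', ["doc0", "doc1"], 0)
    print(row, end="")
    print(read_log_row(row))
```

Without the quoting, a query containing the field separator would split into extra columns when the log is read back.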

> Also, I noticed that the existing log command in the query template,
> i.e. $log{query.log}, doesn't really log anything. I created query.log
> in log_dir as specified in omega.conf, with read and write permission
> granted to the current system user, but I see no logs in that file.
> Should the log command be included inside the html body for it to
> work (it currently appears after the closing html tag)?

The log command will be executed wherever it is in the template. By "current system user", what do you mean? The CGI process needs write permission on the log file, and it probably runs as the same identity as the web server process.

> Another thing that concerns me is whether logging happens
> whenever a new result page is loaded or just once for each
> search. We certainly don't want to log the same page again
> if the user returns to an already visited page, but we do want
> to log each page once, as that is how we'll be able to record
> offset values.

Why does it matter? You can dedupe in post-processing, which is probably easier than detecting a genuinely new search.
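
As a rough sketch of that post-processing step, the Python below keeps only the first log entry for each (query id, offset) pair. It assumes a hypothetical tab-separated layout of four fields (query id, query, doc ids, offset), which is not necessarily the format the final template will write; the function name is also illustrative.

```python
def dedupe_log_lines(lines):
    """Keep the first entry seen for each (query id, offset) pair.

    Assumes four tab-separated fields per line:
    query id, query text, document ids, offset.
    """
    seen = set()
    unique = []
    for line in lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if len(fields) != 4:
            # Skip blank or malformed lines rather than guess at their meaning.
            continue
        qid, _query, _doc_ids, offset = fields
        key = (qid, offset)
        if key in seen:
            # Same result page logged again, e.g. the user went back to it.
            continue
        seen.add(key)
        unique.append(line)
    return unique


if __name__ == "__main__":
    import sys

    # Usage: python dedupe.py < search.log > search.deduped.log
    for entry in dedupe_log_lines(sys.stdin):
        print(entry)
```

Keying on the query id and offset means a revisited page is dropped while each distinct page of the same search is kept once.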


 James Aylett
 devfort.com — spacelog.org — tartarus.org/james/

More information about the Xapian-devel mailing list