[Xapian-devel] Applying the Google Summer code

Zhang Fan zhangfan555 at gmail.com
Fri Apr 1 17:44:08 BST 2011


Hi all,

Glad to meet you!

My name is Zhang Fan, and I am a PhD student at Nankai University, China. I
have been doing information retrieval research for 3 years and have
several papers published in top-tier computer science conferences such
as WSDM, VLDB, CIKM and ACL. I have many years of coding experience
and have participated in several projects on search engines.

I want to take part in the suggested project "weighting schemes". It is a
good chance for me to contribute to the open source community and add my
ideas to Xapian.

Besides DfR, I would like to add two more interesting weighting schemes:
term proximity and document structure information.
Term proximity means that a document in which the query terms
appear close to each other should get a higher relevance score. Some
research work has already confirmed this idea.
Document structure information means that we distinguish the different
parts of a document and assign different weights to the title, body,
anchor text and URL of each document.
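
To make the two ideas a bit more concrete, here is a minimal Python sketch.
It is only an illustration under assumptions: the field weights, the 1/d^2
proximity bonus and all function names are hypothetical, and none of this is
Xapian's API or an existing implementation. It also assumes every query term
has at least one recorded position in the document.

```python
from itertools import product

# Hypothetical per-field weights; the values are illustrative only.
FIELD_WEIGHTS = {"title": 3.0, "anchor": 2.0, "url": 1.5, "body": 1.0}


def min_pair_distance(positions_a, positions_b):
    """Smallest absolute gap between any position of one term and any of the other."""
    return min(abs(a - b) for a, b in product(positions_a, positions_b))


def proximity_bonus(term_positions):
    """Sum 1 / d^2 over all pairs of query terms, where d is their closest gap.

    term_positions maps each query term to the (non-empty) list of word
    positions at which it occurs in one document.
    """
    terms = sorted(term_positions)
    bonus = 0.0
    for i, t1 in enumerate(terms):
        for t2 in terms[i + 1:]:
            # Clamp to 1 so two terms indexed at the same position cannot divide by zero.
            d = max(min_pair_distance(term_positions[t1], term_positions[t2]), 1)
            bonus += 1.0 / (d * d)
    return bonus


def field_weighted_tf(field_counts):
    """Combine per-field term counts into one weighted term frequency."""
    return sum(FIELD_WEIGHTS.get(field, 1.0) * count
               for field, count in field_counts.items())
```

For example, proximity_bonus({"open": [1, 10], "source": [2, 30]}) rewards the
adjacent occurrence at positions 1 and 2, and field_weighted_tf({"title": 1,
"body": 4}) counts the single title hit three times as much as a body hit. In
a real scheme these quantities would feed into a base weight such as BM25 or
DfR rather than being used on their own.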

I have two papers involving weighting schemes; please refer to the
following:

*Fan Zhang*, Shuming Shi, Hao Yan, and Ji-Rong Wen. Revisiting Globally
Sorted Indexes for Efficient Document Retrieval. Third ACM International
Conference on Web Search and Data Mining (*WSDM'10*), New York, 2010.

Hao Yan, Shuming Shi, *Fan Zhang*, Torsten Suel and Ji-Rong Wen. Efficient
Term Proximity Search with Term-Pair Indexes. In CIKM'10.


Please give me some feedback on my ideas. Thank you very much.
--
My Homepage: http://sites.google.com/site/zhfan555/

PhD Student at Nankai U and Intern at MSRA