[Xapian-discuss] Xapian Terms vs. Document Partition.
Alex Brasetvik
alex-xapian at brasetvik.com
Thu May 8 16:28:32 BST 2008
On Tue, 6 May 2008 16:48:01 -0700, "Kevin Duraj" <kevin.softdev at gmail.com>
wrote:
> Xapian Terms vs. Document Partition.
>
> In December 2007, Diego Puppin from Google gave an interesting talk about
> a parallel architecture that distributes the index based on terms rather
> than documents.
> Reference:
> http://youtube.com/watch?v=KpZpsu2wM1s
[snip]
> I would again like to encourage the Xapian community to
> start looking into distributing the index based on terms rather than
> documents. Making each server responsible for a set of terms rather
> than a set of documents would enable us to scale our search engine to
> Google's level.
If you watch the talk again and read their paper[1], you'll see that the
gist of the talk is *not* about either document- or term-partitioning.
Also, in their paper, they state that ``Document partitioning is the
strategy usually chosen by the most popular web search engines'', citing
Page and Brin's paper on Google's architecture. You may want to read it.
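
For readers unfamiliar with the two schemes being contrasted, here is a
minimal sketch of the difference. It is a toy illustration only: it is not
Xapian code and is not taken from the talk or the paper. The corpus, the
shard count, and every function name (build_index, document_partition,
term_partition, shard_for_term, and the two search functions) are
hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus: document id -> text.
DOCS = {
    1: "xapian search engine",
    2: "distributed search index",
    3: "xapian index terms",
    4: "google search architecture",
}


def build_index(docs):
    """Build an inverted index: term -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def shard_for_term(term, n_shards):
    """Deterministic stand-in for a hash function (an assumption of this sketch)."""
    return sum(term.encode()) % n_shards


def document_partition(docs, n_shards):
    """Each shard indexes all terms, but only for its own subset of documents."""
    subsets = [dict() for _ in range(n_shards)]
    for doc_id, text in docs.items():
        subsets[doc_id % n_shards][doc_id] = text
    return [build_index(subset) for subset in subsets]


def term_partition(docs, n_shards):
    """Each shard holds complete posting lists, but only for its own subset of terms."""
    shards = [dict() for _ in range(n_shards)]
    for term, postings in build_index(docs).items():
        shards[shard_for_term(term, n_shards)][term] = postings
    return shards


def search_doc_partitioned(shards, terms):
    """AND query: every shard answers locally, and the results are unioned."""
    hits = set()
    for index in shards:
        postings = [index.get(term, set()) for term in terms]
        hits |= set.intersection(*postings)
    return hits


def search_term_partitioned(shards, terms):
    """AND query: only shards owning a query term are contacted;
    their posting lists are then intersected in one place."""
    n_shards = len(shards)
    postings = [
        shards[shard_for_term(term, n_shards)].get(term, set())
        for term in terms
    ]
    return set.intersection(*postings)


if __name__ == "__main__":
    query = ["xapian", "index"]
    print(search_doc_partitioned(document_partition(DOCS, 2), query))
    print(search_term_partitioned(term_partition(DOCS, 2), query))
```

Both layouts return the same documents for a query. What differs is which
servers are contacted and how much posting-list data has to be moved
between them.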
[1] http://scholar.google.no/scholar?hl=en&lr=&cluster=10013139656811614516
--
Alex Brasetvik