[Xapian-discuss] using Xapian as backend for google
rurban at x-ray.at
Wed Dec 13 17:08:34 GMT 2006
I read in an early Google document that they favored a larger number of cheap
servers with cheap IDE drives (some Maxtor 250GB) spread all around the world
over fewer, better servers with better discs.
Indices were not accessed over the network (as with xapian-tcpsrv) but
rsync'ed overnight, so there were local copies everywhere.
So the web overhead was far more important than the search time.
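
A minimal sketch of what such a nightly sync could look like, purely as an illustration. The host name "indexmaster" and the paths are hypothetical, and I don't know what options Google actually used; -a and --delete are just the usual rsync flags for keeping a mirror.

```python
import subprocess


def build_rsync_command(master, local_dir):
    """Return the rsync argv that mirrors a remote index directory locally."""
    # -a preserves permissions and timestamps; --delete removes local files
    # that no longer exist on the master, so the copy matches it exactly.
    # The trailing slashes make rsync copy the directory contents.
    return [
        "rsync", "-a", "--delete",
        master.rstrip("/") + "/",
        local_dir.rstrip("/") + "/",
    ]


def sync_index(master, local_dir):
    """Run the sync; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(master, local_dir), check=True)


# Hypothetical host and paths; a nightly cron job would call sync_index().
print(" ".join(build_rsync_command("indexmaster:/var/lib/xapian/db",
                                   "/var/lib/xapian/db")))
```

Each search box would then open its local copy directly instead of talking to a remote server.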
2006/12/13, Chris Good <chris at g2.nu>:
> Olly Betts wrote:
> > Webtop used xapian-tcpsrv to spread searches over a number of boxes
> > (10 or so IIRC). The index size was around 500 million documents, but
> > with modern hardware that's much less of a challenge than it was more
> > than 6 years ago.
> Gosh, is it that long ago? Back then we used 20 dual-processor boxes
> per cluster, each cluster having a complete dataset. Processor speeds
> ranged from 500 to 850MHz, and I recall 2GB
> of RAM being fitted to the machines.
> For webtop I don't think that we bothered with any redundancy in the
> disks, I certainly wouldn't do so these days as I'd just keep a couple
> of spare machines around and upon failure assign one of those to
> take over the DB file of the failed machine. This necessitates having
> a centralised repository of all your data that machines can sync from,
> a nice fast network (gigabit is essentially free these days and is
> more than adequate) and the means to do an automated boot/installation
> of machines.
> For webtop we used a NetApp server for the central store and had a
> kickstart configuration and bootp/tftp server (you'd use PXE and kickstart
> these days). When a machine failed you'd replace the disk or whatever,
> network boot, which would install the local OS and RPMs containing all
> our localised configuration.
> In this day and age I'd be looking at lowish cost servers with a couple
> of SATA drives (you don't really need the capacity but the cost of SAS
> is too high to justify in my mind) and as much memory as I could shove
> in the box.
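
The spare-machine scheme described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the Cluster class and the machine and DB file names are made up, and the actual sync from the central store and the network reinstall are left out.

```python
class Cluster:
    """Tracks which machine serves which DB file, plus a pool of idle spares."""

    def __init__(self, assignments, spares):
        self.assignments = dict(assignments)  # machine name -> DB file
        self.spares = list(spares)            # idle machine names

    def handle_failure(self, failed):
        """Hand the failed machine's DB file to a spare and return that spare."""
        if failed not in self.assignments:
            raise KeyError(f"{failed} is not serving a DB file")
        if not self.spares:
            raise RuntimeError("no spare machines left")
        db_file = self.assignments.pop(failed)
        spare = self.spares.pop(0)
        # In a real setup the spare would now sync db_file from the central
        # repository before it starts answering searches.
        self.assignments[spare] = db_file
        return spare

    def add_spare(self, machine):
        """Return a repaired, freshly reinstalled machine to the spare pool."""
        self.spares.append(machine)


# Hypothetical names, for illustration only.
cluster = Cluster({"box01": "db.part1", "box02": "db.part2"},
                  ["spare01", "spare02"])
replacement = cluster.handle_failure("box02")
print(replacement, cluster.assignments)
```

Once the failed box has been repaired and network booted, add_spare() puts it back in the pool, so the number of spares stays constant over time.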