[Xapian-discuss] using Xapian as backend for google

James Aylett james-xapian at tartarus.org
Mon Dec 11 18:04:40 GMT 2006


On Fri, Dec 08, 2006 at 05:23:27AM +0000, Olly Betts wrote:

> Incidentally, there are many more RAID configurations than just these
> two.  Wikipedia has an overview:
> 
> http://en.wikipedia.org/wiki/RAID

In a vaguely similar system we use RAID 10 (also written RAID 1+0; the
names keep on changing). Basically: striping (RAID 0) over RAID 1
mirrored pairs. It gives very good performance with pretty solid
resilience, at the cost of half the raw capacity. But we consider disk
to be relatively cheap compared to lots of other things.
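
To make the layout concrete, here is a minimal Python sketch of striping
over mirrored pairs. It assumes a stripe unit of one block and that disks
2*p and 2*p+1 form pair p; the function names are made up for
illustration and real controllers differ in the details.

```python
def raid10_locate(block, pairs):
    """Map a logical block number to (pair index, offset within the pair).

    Assumption: stripe unit of one block. Both disks of the chosen
    pair hold a copy of the block.
    """
    if pairs < 1:
        raise ValueError("need at least one mirrored pair")
    return block % pairs, block // pairs


def raid10_usable(disks, disk_size_gb):
    """Usable capacity: half the raw capacity, as every block is stored twice."""
    if disks < 4 or disks % 2:
        raise ValueError("expected an even number of disks, at least 4")
    return (disks // 2) * disk_size_gb


def survives(failed, pairs):
    """True if the array survives the given set of failed disk numbers.

    Assumption: disks 2*p and 2*p+1 form pair p. Data is lost only
    when both members of the same pair have failed.
    """
    failed = set(failed)
    return not any({2 * p, 2 * p + 1} <= failed for p in range(pairs))
```

With four pairs, losing disks 0 and 2 is survivable (they sit in
different pairs), while losing disks 2 and 3 is not.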

> > SCSI VS SATA
> 
> It depends on budget and how big you want to grow.  SATA is cheaper and
> probably similar in speed to where SCSI was a few years ago, but iSCSI
> and Fibre Channel are likely to end up faster in most cases.

However, lots of SATA drives with striping may give better performance
than FC-AL (Fibre Channel Arbitrated Loop) with only one or two disks.
For large systems a better option may be FC connected to a JBOD (Just
a Bunch Of Disks :-) of SATA disks, because you have a good chance of
getting close to fibre speeds if you have enough SATA spindles behind
it.
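
The arithmetic behind that can be sketched in a few lines of Python. This
is a best-case model for sequential transfers only, and the figures in
the usage note (60 MB/s per SATA disk, 400 MB/s for the link) are assumed
round numbers for illustration, not measurements.

```python
import math


def aggregate_mb_s(spindles, per_disk_mb_s, link_mb_s):
    """Best-case throughput of a striped set behind a single link.

    The disks add up until the link itself becomes the limit.
    """
    return min(spindles * per_disk_mb_s, link_mb_s)


def spindles_to_saturate(per_disk_mb_s, link_mb_s):
    """Smallest number of disks whose combined rate reaches the link rate."""
    return math.ceil(link_mb_s / per_disk_mb_s)
```

Under those assumed figures, two disks top out at 120 MB/s however fast
the link is, and it takes seven spindles before the link is the limit.
Random I/O, which is closer to what a search workload does, will be well
below this.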

It all gets very complex at this level, though. If you genuinely need
to go to this level, you probably want to get expert advice from your
channel partner. (If you don't have a channel partner, find one -
they can save you lots of money!)

> > What would be the bottleneck (i think DISC I/O)?
> 
> It's likely to be.  Note that there's scope for improving matters with
> enhancements to Xapian here - there are some obvious things to improve
> (which I'm working my way through), and profiling should reveal more.

We keep vaguely mentioning putting together a set of tests which
stress a setup, to make this kind of tuning investigation easier. Of
course that only enables tuning for that particular profile, but even
so it would point the way for others.

If someone has some time to help on building the initial tests, that
would probably be worth doing. (But Olly, Richard, feel free to
correct me if there's more useful stuff in the short term.)
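
As a possible starting point for such tests, here is a minimal timing
harness in Python. It is only a sketch: run_query is a hypothetical
callable standing in for whatever executes one search against the setup
under test, and nothing here is part of Xapian itself.

```python
import statistics
import time


def time_queries(run_query, queries, repeats=3):
    """Time each query and summarise the results.

    run_query is a hypothetical callable taking one query. Each query
    is run `repeats` times and the fastest run is kept, so this
    measures warm-cache behaviour; to study disk I/O you would drop
    the caches between runs and use repeats=1.
    """
    if not queries:
        raise ValueError("need at least one query")
    timings = []
    for query in queries:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(query)
            best = min(best, time.perf_counter() - start)
        timings.append(best)
    timings.sort()
    return {
        "queries": len(timings),
        "median_s": statistics.median(timings),
        "max_s": timings[-1],
    }
```

Running the same query list before and after a change (a different RAID
layout, say) gives two summaries that can be compared directly, though
only for that particular query profile.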

J (who has been thinking about lots and lots of disks recently :-)

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


