[Xapian-discuss] Writing a Quick Start Guide to Xapian

Chris chris at s-4-u.net
Thu Jul 28 15:27:47 BST 2011


inline...

On 07/28/2011 03:36 PM, Justin Finkelstein wrote:
> I think what would be good is if we had a representative from each of
> the major programming languages that use it, so that the documentation
> is compiled of:
>
>     o Core concepts: what Xapian is (and isn't); how it works; what each
> of the different capabilities it offers can do
I think most of this is already online; maybe it needs some reformatting
or cleanup (but I never had any problems finding this information, so I
can't tell).
>     o Then a series of worked-through examples, in each language, of
> how to do pretty much everything
Sounds great, but I'd also be fine with Ruby only ;)

Some things/problems I remember immediately (and it's been a few months):

- Understanding the whole picture.
I couldn't find a document that gives the big overview of how a
complete system is built, and on what software and hardware.
If my memory serves, there was even a thread a while back about which
filesystem to use (for example), and there is still no reliable data
available.
The same goes for combinations of block sizes, software RAID (I'm limited
to a 2-disk setup), not using SSDs, how to prevent host caching, how to
lower sync aggressiveness, etc.

- How to work with remote databases, multiple readers, and a nonstop updater.
In my specific case, the fsync was so aggressive that I had to find a
way to separate the sync from the readers, as it was slowing down the
whole system by a factor of 10 or more (only when flushing; no matter what
transaction size I chose, even a single document meant a 10-second 'near stop').

I did some (non-scientific) tests and went with an rsync job instead of
software RAID, with XFS and a 4 KB block size (filesystem and Xapian), but
I'm not sure it is the optimal solution and I don't have time to test all
the possible combinations (any pointers welcome). I believe this
background information, measured against a predefined dataset (for
example a fixed dump of Wikipedia) that is ready to download and benchmark,
would be quite helpful for spreading the word about Xapian and its
unsurpassed performance, and would also help an admin decide whether
Xapian is the right choice for their problem.
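
The rsync approach described above could be sketched roughly as follows in
Python: once the updater has committed and is idle, copy its database into a
fresh snapshot directory, then atomically repoint a symlink that the readers
open. This is only an illustration under stated assumptions, not the setup
used in this thread: the function name publish_snapshot, the directory
layout and the "current" symlink are hypothetical, and shutil.copytree
stands in for rsync so that the example is self-contained.

```python
import os
import shutil
import tempfile


def publish_snapshot(writer_db, publish_root, link_name="current"):
    """Copy writer_db into publish_root and repoint a symlink at the copy.

    Assumption: the updater has committed and is not writing during the copy.
    Returns the path of the new snapshot directory.
    """
    os.makedirs(publish_root, exist_ok=True)

    # One fresh directory per snapshot, so readers never see a partial copy.
    snapshot = tempfile.mkdtemp(prefix="snapshot-", dir=publish_root)
    os.chmod(snapshot, 0o755)  # mkdtemp creates the directory as 0700
    shutil.copytree(writer_db, snapshot, dirs_exist_ok=True)  # Python 3.8+

    # Create the new symlink under a temporary name, then rename it over the
    # old one. The rename replaces the link itself in a single step on POSIX.
    link = os.path.join(publish_root, link_name)
    tmp_link = link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(os.path.basename(snapshot), tmp_link)
    os.replace(tmp_link, link)
    return snapshot
```

Readers would open publish_root/current and reopen it after each publish.
Old snapshot directories are left in place here and would need a separate
cleanup step once no reader is using them.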

- How does Xapian work in a Rails context?
I'm still not sure, but as far as I understand, the only cache Xapian
uses is the Linux disk cache?
So if I have 8 Rails webservers running, they all access the same cached
area, which is quite nice, but...
If I use 2 Xapian databases, 1 for readers and 1 for the updater, which gets
rsynced after each update, then the updater job is using 50% of the
disk cache, which it shouldn't: the writer can be slow and I don't
care, but I need speedy readers.
So what's the preferred solution here? Multiple systems? Weird virtual
machine setups? Or just more RAM?
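
On the question of the updater's share of the disk cache, one option on Linux
is to hint to the kernel after each update that the updater's database files
no longer need to stay cached. The sketch below uses Python's
os.posix_fadvise with POSIX_FADV_DONTNEED. The function name
drop_from_page_cache and the idea of running it after each rsync are
assumptions for illustration. The hint is advisory and does not drop dirty
pages, so it should run after the updater has committed, and whether it
actually helps the readers would have to be measured.

```python
import os


def drop_from_page_cache(db_dir):
    """Advise the kernel to evict cached pages for every file under db_dir.

    Returns the number of files advised. This is only a hint; the kernel
    may keep some or all of the pages.
    """
    advised = 0
    for root, _dirs, files in os.walk(db_dir):
        for name in files:
            fd = os.open(os.path.join(root, name), os.O_RDONLY)
            try:
                # Offset 0 with length 0 covers the whole file.
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
                advised += 1
            finally:
                os.close(fd)
    return advised
```

Because the readers' database is a separate copy with its own files, advising
only the updater's directory leaves the readers' cached pages alone.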

As I haven't rechecked the current docs for a few months, please have
mercy and provide pointers in case I missed something.

> To me, this would produce what I believe is commonly called a
> 'cookbook', giving people a real head-start in adoption.
+1
> If we could get this done, then we're well on the way to an excellent
> set of docs.
Which is IMHO the single most needed thing for Xapian world domination :)
> In addition, my company'd still like to offer our design services for a
> fancy redesign of the site (for free, in return for a link/credit
> somewhere), if there's interest.
You know the old saying: "a picture says more than a thousand words" ;)


Greets, Chris

> On Thu, 2011-07-28 at 14:14 +0100, Andrew Betts wrote:
>
>> I would also be interested in this, and happy to contribute from my experience of developing a number of Xapian powered sites, the most complex of which is http://tilt.ft.com.  I'm also based in London.
>>
>>> Date: Wed, 27 Jul 2011 23:16:08 +1200
>>> From: Olly Betts <olly at survex.com>
>>> Subject: [Xapian-discuss] Writing a Quick Start Guide to Xapian
>>> To: xapian-discuss at lists.xapian.org
>>> Message-ID: <20110727111607.GA3346 at cavity>
>>> Content-Type: text/plain; charset=us-ascii
>>>
>>> Google are holding a GSoC "Doc Camp" this year the week before the
>>> annual mentor summit - the dates for Doc Camp are 17-21 October, and the
>>> location is Google HQ in Mountain View, California, USA.
>>>
>>> A major part of this will be several Book Sprints for writing Quick
>>> Start guides for specific organisations taking part in GSoC.  They're
>>> currently inviting proposals, and I'd like to submit one.  I took part
>>> in a book sprint of this format in 2009, and have been wanting to hold
>>> one to do a "Xapian book" ever since.  The issue has been how to get
>>> enough people together in one place, and this seems a good opportunity,
>>> particularly as a Quick Start guide would fix one of the weaker aspects
>>> of the current docs.
>>>
>>> Details are at the URL below, but a key thing to note is that
>>> participation is open to *ANYONE*, not only those involved as mentors
>>> this year.  Google are offering to pay for food and accommodation costs,
>>> and you can also apply for some or all of your travel costs.  It's right
>>> before the mentor summit, so if you're attending that (for Xapian or
>>> another org) transport costs aren't an issue anyway.  The URL is:
>>>
>>> https://sites.google.com/site/docsprintsummit/home
>>>
>>> As part of the proposal, we can nominate up to 5 people.  It would be
>>> good to have a range of familiarity with Xapian - we clearly need there
>>> to be at least one person with plenty of Xapian experience in the team
>>> to ensure the content is accurate, but we also really need some people
>>> who are fairly new to Xapian and are better able to put themselves in
>>> the shoes of the people who most need such a book.
>>>
>>> We need to submit a proposal before 5th August, so we have a week and a
>>> bit.  So if you can be available that week and are interested, or would
>>> just like to know more, please get in touch.
>>>
>>> Cheers,
>>>     Olly
>>
>> _______________________________________________
>> Xapian-discuss mailing list
>> Xapian-discuss at lists.xapian.org
>> http://lists.xapian.org/mailman/listinfo/xapian-discuss



