Amount of writes during index creation
Bron Gondwana
brong at fastmail.fm
Sat Feb 2 12:43:07 GMT 2019
This is quite possibly part of the underlying write explosion that we ran into when we wrote:
https://fastmail.blog/2014/12/01/email-search-system/
Now, almost 5 years on, it has been running like a champion! We're really pleased with how well it works. Reading from multiple Xapian databases is really easy, and the immediate writes onto tmpfs plus daily compacts work really well. We also have a cron job which runs hourly and does an immediate compact from memory to disk if the tmpfs hits more than 50% of its nominal size, which keeps us from almost ever needing to do any manual management, even though this thing indexes millions of new emails per day across our cluster.
And then when we do the compact down to disk, it's a single thread compacting indexes while new emails still get indexed to tmpfs, so there's always tons of IO available for searches.
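In case it helps anyone picture it, here's a minimal sketch of that layout with the stock Xapian C++ API (the paths, query and compaction step are made up for illustration, it's not our actual code):

    #include <xapian.h>
    #include <iostream>

    int main() {
        // Search side: open the big on-disk index and the small tmpfs index
        // together, so a single Enquire sees both old and freshly-indexed mail.
        Xapian::Database db("/srv/search/disk.db");
        db.add_database(Xapian::Database("/dev/shm/search/tmpfs.db"));

        Xapian::Enquire enquire(db);
        enquire.set_query(Xapian::Query("example"));
        Xapian::MSet matches = enquire.get_mset(0, 10);
        std::cout << matches.get_matches_estimated() << " matches\n";

        // Maintenance side (the cron job): merge and compact everything down
        // to a fresh on-disk database, then swap it into place.
        db.compact("/srv/search/disk.db.new");
        return 0;
    }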
I think even with more efficient IO patterns, I'd still stick with the design we have. It's really nice :)
Bron.
On Fri, Feb 1, 2019, at 06:47, Jean-Francois Dockes wrote:
> Olly Betts writes:
> > On Mon, Jan 21, 2019 at 03:25:01PM +0100, Jean-Francois Dockes wrote:
> > > I have had a problem report from a Recoll user about the amount of writes
> > > during index creation.
> > >
> > > https://opensourceprojects.eu/p/recoll1/tickets/67/
> > >
> > > The issue is that the index is on SSD and that the amount of writes is
> > > significant compared to the SSD life expectancy (index size > 250 GB).
> > >
> > > From the numbers he supplied, it seems to me that the total amount of block
> > > writes is roughly quadratic with the index size.
> > >
> > > First question: is this expected, or is Recoll doing something wrong ?
> >
> > It isn't expected.
> >
> > I think this is probably due to a bug which coincidentally was
> > discovered earlier this week by Germán M. Bravo. I've now fixed it
> > and backported ready for 1.4.10. If you're able to test to confirm
> > if this solves your problem that would be very useful - see
> > f19bcb96857419469f74f748e7fe8eaccaedc0fd on the RELEASE/1.4 branch:
> >
> > https://git.xapian.org/?p=xapian;a=commitdiff;h=f19bcb96857419469f74f748e7fe8eaccaedc0fd
> >
> > Anything which uses a term for a unique document identifier is likely to
> > be affected.
> >
> > Cheers,
> > Olly
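> 
> For context, the pattern Olly refers to looks roughly like the sketch
> below in the C++ API. It's a simplified illustration, not Recoll's exact
> code; the "Q" prefix and the 200 MB flush threshold are just examples
> standing in for the idxflushmb setting described further down:
> 
>     #include <xapian.h>
>     #include <string>
> 
>     void index_doc(Xapian::WritableDatabase& db, const std::string& udi,
>                    const std::string& text, size_t& pending)
>     {
>         Xapian::Document doc;
>         doc.set_data(udi);
> 
>         Xapian::TermGenerator tg;
>         tg.set_document(doc);
>         tg.index_text(text);
> 
>         // One boolean term per document acts as its unique identifier;
>         // replace_document() then updates any existing entry carrying the
>         // same term instead of adding a duplicate.
>         std::string idterm = "Q" + udi;
>         doc.add_boolean_term(idterm);
>         db.replace_document(idterm, doc);
> 
>         // Commit every ~200 MB of input text, i.e. what idxflushmb controls.
>         pending += text.size();
>         if (pending >= 200 * 1024 * 1024) {
>             db.commit();
>             pending = 0;
>         }
>     }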
>
> I have run a number of tests, with data mostly from a Project Gutenberg DVD
> and other books, with relatively modest index sizes, from 1 to 24 GB.
>
> Quite curiously, in this range, with all the Xapian versions I tried, the
> total amount of writes is roughly proportional to the index size to the
> power 1.5:
>
> TotalWrites / (IndexSize**1.5) ~= K
>
> So, not quadratic, which is good news. For big indexes, 1.5 is not so good
> but probably somewhat expected.
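> 
> (As a check against the first xapian 1.4.5 row below:
> 6941286 / 1544724**1.5 ~= 0.0036, i.e. K*1000 ~= 3.6, which is the value
> in the third column.)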
>
> The other good news is that the patch above decreases the amount of writing
> by a significant factor, around 4.5 for the biggest index I tried.
>
> The amount of writes is estimated with iostat before/after. The disk has
> nothing else to do.
>
> idxflushmb is the number of megabytes of input text between Xapian commits.
>
> xapiandb (kB)   writes (kB)   K*1000   writes/size
>
> xapian 1.4.5 idxflushmb 200
>
> 1544724 6941286 3.62 4.49
> 3080540 16312960 3.02 5.30
> 4606060 21054756 2.13 4.57
> 6123140 33914344 2.24 5.54
> 7631788 50452348 2.39 6.61
>
> xapian git master latest idxflushmb 200
>
> 1402524 1597352 0.96 1.14
> 2223076 3291588 0.99 1.48
> 2678404 4121024 0.94 1.54
> 3842372 7219404 0.96 1.88
> 4964132 10850844 0.98 2.19
> 6062204 14751196 0.99 2.43
> 19677680 125418760 1.44 6.37
>
> xapian git master before patch idxflushmb 200
>
> 24707840 750228444 6.11 30.36
>
> So that was 750 GB of writes for the big index before the patch...
>
> As you can see, my beautiful law does not hold so well for the biggest index :)
> (K*1000 = 1.44)
> It's not quite the same data though, so I would need more tests, but I
> think I'll stop here...
>
> The improvement brought by the patch is nice. Still, for people using big
> indexes on SSD, the amount of writes is something to consider, and splitting
> the index probably makes sense? What do you think?
>
> I'll run another test tonight with a smaller flush interval to see if it
> changes things.
>
> Cheers,
>
> jf
>
>
--
Bron Gondwana
brong at fastmail.fm