[Xapian-tickets] [Xapian] #698: invalid cross-device link (NFS?)
Xapian
nobody at xapian.org
Thu Dec 10 20:50:31 GMT 2015
#698: invalid cross-device link (NFS?)
---------------------------+--------------------------
Reporter: srepmub | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone:
Component: Backend-Chert | Version:
Severity: normal | Resolution:
Keywords: | Blocked By:
Blocking: | Operating System: Linux
---------------------------+--------------------------
Changes (by olly):
* component: Other => Backend-Chert
Old description:
> hi,
>
> we've recently switched to using xapian for version 7.2 of our groupware
> solution called zarafa (https://www.zarafa.com/), and are very happy with
> it so far!
>
> we've run into a weird error recently though, which at first glance looks
> like a bug in xapian when used with NFS. we don't have an strace yet, but
> we do have one traceback:
>
> File "/usr/lib/python2.7/dist-packages/zarafa_search/__init__.py", line
> 159, in main
> plugin.commit()
> File "/usr/lib/python2.7/dist-packages/zarafa_search/plugin_xapian.py",
> line 115, in
> commit
> db.delete_document('XK:'+doc['sourcekey'].lower())
> File "/usr/lib/python2.7/contextlib.py", line 154, in __exit__
> self.thing.close()
> DatabaseError: Couldn't update base file
> /srv/zarafa/index/1909D712B7DF49A0B1253DC64DD954CF-
> 7B15C461919C4934A43DFC2D7479B7B8/spelling.baseB:
> Invalid cross-device link
>
> the customer has recently moved their xapian databases to NFS, and is
> experiencing this issue now and then.
>
> according to someone here who is more into file systems, xapian should
> perhaps check for this error, and if it occurs retry the respective
> operation in a safer way..?
New description:
hi,
we've recently switched to using xapian for version 7.2 of our groupware
solution called zarafa (https://www.zarafa.com/), and are very happy with
it so far!
we've run into a weird error recently though, which at first glance looks
like a bug in xapian when used with NFS. we don't have an strace yet, but
we do have one traceback:
{{{
  File "/usr/lib/python2.7/dist-packages/zarafa_search/__init__.py", line 159, in main
    plugin.commit()
  File "/usr/lib/python2.7/dist-packages/zarafa_search/plugin_xapian.py", line 115, in commit
    db.delete_document('XK:'+doc['sourcekey'].lower())
  File "/usr/lib/python2.7/contextlib.py", line 154, in __exit__
    self.thing.close()
DatabaseError: Couldn't update base file /srv/zarafa/index/1909D712B7DF49A0B1253DC64DD954CF-7B15C461919C4934A43DFC2D7479B7B8/spelling.baseB: Invalid cross-device link
}}}
the customer has recently moved their xapian databases to NFS, and is
experiencing this issue now and then.
according to someone here who is more into file systems, xapian should
perhaps check for this error, and if it occurs retry the respective
operation in a safer way..?
--
Comment:
I'd generally not recommend hosting databases on NFS. There are many
corner cases that NFS doesn't really handle correctly, and good
performance is rather too dependent on the exact configuration.
But aside from NFS infelicities, I would expect it to work.
The operation which fails works like so:
* The new base file is created in a temporary file in the same directory
as where the final file should be.
* We then call `rename()` to move the temporary file to its final name.
This is a very standard pattern for creating a file without having a
partial file in place - it effectively allows atomic creation of a file.
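As a rough illustration of the pattern described above (this is not Xapian's actual code, which is C++), a minimal Python sketch might look like the following. The function name `write_file_atomically` and the `.tmp-` prefix are made up for the example, and it assumes a POSIX system where `os.rename()` maps onto `rename()`.

```python
import os
import tempfile


def write_file_atomically(final_path, data):
    """Write bytes to final_path so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temporary file in the same directory as the target, so
    # the rename below should stay within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # On POSIX, rename() swaps the new file into place in one step,
        # replacing any existing file of that name.
        os.rename(tmp_path, final_path)
    except BaseException:
        # If anything failed before the rename, remove the temporary file.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the temporary file and the final file share a directory, an `EXDEV` error from the rename step is unexpected, which is why the failure in this ticket is surprising.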
It seems rename is failing because the source and destination aren't on
the same filing system, which doesn't seem like it should be the case here
- from `man rename`:
{{{
EXDEV  oldpath and newpath are not on the same mounted filesystem.
       (Linux permits a filesystem to be mounted at multiple points,
       but rename() does not work across different mount points, even
       if the same filesystem is mounted on both.)
}}}
I suspect your "someone" is suggesting the file should be moved in a way
which works across filing systems, but that breaks it being an atomic
update, which is the whole point of creating it as a temporary file in the
first place.
I think you need to work out why `rename()` thinks it's being asked to
rename across filing systems when the rename is within a directory.
Perhaps there's some sort of overlay or union filing system in play too?
If so, I think it needs to be configured such that `rename()` within a
directory works.
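One way to gather that information might be a small probe run on the affected machine, against the directory holding the database. The sketch below is only a suggestion: `probe_rename` is a hypothetical helper, not part of Xapian or zarafa. It creates a scratch file in the given directory, renames it within that directory, and reports the errno if the rename fails. It also reports whether the new file and the directory show the same `st_dev`, which may hint at an overlay or union mount.

```python
import errno
import os
import tempfile


def probe_rename(directory):
    """Rename a scratch file within `directory`.

    Returns (same_device, error), where same_device says whether the
    scratch file and the directory report the same st_dev, and error is
    None on success or the errno of the failed rename.
    """
    fd, src = tempfile.mkstemp(dir=directory, prefix=".probe-")
    os.close(fd)
    dst = src + ".renamed"
    same_device = os.stat(src).st_dev == os.stat(directory).st_dev
    try:
        os.rename(src, dst)
    except OSError as exc:
        # The rename failed, so the scratch file still has its old name.
        os.unlink(src)
        return same_device, exc.errno
    os.unlink(dst)
    return same_device, None


def describe(result):
    """Turn a probe_rename() result into a short human-readable message."""
    same_device, error = result
    if error is None:
        outcome = "rename succeeded"
    elif error == errno.EXDEV:
        outcome = "rename failed with EXDEV"
    else:
        outcome = "rename failed: " + os.strerror(error)
    return "%s (same st_dev: %s)" % (outcome, same_device)
```

Since the problem is reported as happening only now and then, a single successful probe would not rule anything out. The probe would probably need to be run repeatedly, for example `print(describe(probe_rename(db_dir)))` in a loop, where `db_dir` is the database directory.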
Or it could be an NFS bug - older kernels had bugs which could return
`EXDEV` incorrectly, such as:
http://www.spinics.net/lists/linux-nfs/msg17306.html
That's from 2010, and a quick look at recent kernel source suggests that
it's since been addressed, but perhaps they're running an old enough
kernel to be affected, or it's not entirely fixed, or there's another
similar bug.
I think we need more info to determine what's actually going on.
Also, for completeness, which Xapian version is being used?
--
Ticket URL: <http://trac.xapian.org/ticket/698#comment:1>
Xapian <http://xapian.org/>