[Xapian-tickets] [Xapian] #714: Use C++ API optional reference counting in bindings
Xapian
nobody at xapian.org
Sun Oct 9 23:32:47 BST 2022
#714: Use C++ API optional reference counting in bindings
-----------------------------+-------------------------------
Reporter: Olly Betts | Owner: Olly Betts
Type: defect | Status: new
Priority: normal | Milestone: 1.4.x
Component: Xapian-bindings | Version: git master
Severity: normal | Resolution:
Keywords: | Blocked By:
Blocking: | Operating System: All
-----------------------------+-------------------------------
Comment (by Olly Betts):
Having dug into this some more, I don't think this approach can work, at
least for languages with directors. I can get a segfault out of the new
PHP8 bindings with this patch to `smoketest.php`:
{{{
diff --git a/xapian-bindings/php/smoketest.php b/xapian-
bindings/php/smoketest.php
index dbc8a6ce455d..eea7d4c9b995 100644
--- a/xapian-bindings/php/smoketest.php
+++ b/xapian-bindings/php/smoketest.php
@@ -281,7 +281,13 @@ class testfieldprocessor extends XapianFieldProcessor
{
}
}
-$qp->add_prefix('spam', new testfieldprocessor);
+$fp = new testfieldprocessor;
+{
+ $qptmp = new XapianQueryParser;
+ $qptmp->add_prefix('spam', $fp);
+ unset($qptmp);
+}
+$qp->add_prefix('spam', $fp);
$qp->add_boolean_prefix('filter', new testfieldprocessor);
$query = $qp->parse_query('spam:ignored');
if ($query->get_description() !== 'Query(spam)') {
}}}
I'll talk about this in the context of PHP and setting
`Xapian::FieldProcessor` on `Xapian::QueryParser`, but I believe this
applies to all Xapian optional-reference-counted classes in all SWIG
target languages where we enable directors.
Consider the case of `testfieldprocessor` which is a PHP subclass of
`XapianFieldProcessor`.
We need to do a "director disown" for a `testfieldprocessor` object when
it gets set on a `XapianQueryParser` object so that the underlying C++
object owns the PHP proxy object because we need the PHP proxy to live on
until Xapian deletes the underlying C++ object so that virtual method
calls can be routed to it.
However with the patched testcase above, the underlying C++ object gets
deleted at `unset($qptmp)` because the C++ side only knows about the C++
references to this object - it doesn't know that the PHP proxy object
still exists, so `$fp` now has a live PHP proxy object pointing to a
deleted C++ object, and when we try to set it again we're into C++
undefined behaviour.
SWIG's `ref` and `unref` features seem like them might allow us to address
this, but they don't actually solve the problem completely. We can
increment the reference count of the underlying C++ object while the PHP
proxy object is alive, then decrement it when the PHP proxy object is
destroyed, but this effectively creates a reference loop: the PHP proxy
has a Xapian-specific C++ reference to the Xapian C++ object, and the swig
subclass of that created to implement directors has a PHP reference to the
PHP proxy object. Since this reference loop is partly outside the target
language's reference tracking there's not even any hope that it can be
detected by the target language garbage collector (if there even is one).
It seems fundamentally "director disown" assumes that a target language
reference isn't kept (in Python a weak reference is even returned to
assist with that).
We could potentially wrap these methods in a C++11 move-like way, where
the target language object is deliberately left unusable (SWIG 4.1.0 adds
explicit support for move semantics) but that's an incompatible change.
So I think we need to stay with keeping a PHP-reference to the currently
set `XapianFieldProcessor` objects, (which for PHP specifically will need
some custom typemaps now there's no longer a PHP wrapper script generated
we can patch).
--
Ticket URL: <https://trac.xapian.org/ticket/714#comment:5>
Xapian <https://xapian.org/>
Xapian
More information about the Xapian-tickets
mailing list