R bindings for Xapian: API modifications
Amanda Jayanetti
amandajayanetti at gmail.com
Sat Apr 30 16:02:54 BST 2016
Hi,
I'm currently reviewing my originally proposed API design, and I have added
two new parameters (idField and stemmer) to the xapian_index() function. As my
next task I plan to determine the output data structure and format of the
xapian_search() function. Afterwards I will return to xapian_index() and
review the format of the valueSlots parameter.
An outline of the 'simple indexing' functionality:
xapian_index(dbpath="", datapath="", idField=c(0), indexFields=NULL,
             stemmer="", valueSlots=NULL, ...)
dbpath: Path to a Xapian database
datapath: Path to a data source
idField: Number of the column in the data frame whose value in each row will
be used as the unique identifier
indexFields: A list of character vectors, each containing a field name and a
prefix
stemmer: Language of the stemmer to use (for example "en")
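To make the expected argument shapes concrete, here is a minimal sketch of an input check for the signature outlined above. The helper name check_index_args() is hypothetical and is not part of the proposal; it never touches Xapian and only encodes the parameter descriptions as assertions.

```r
# Hypothetical helper (not part of the proposed API): checks that arguments
# have the shapes described in the parameter list. It performs no indexing.
check_index_args <- function(dbpath, datapath, idField = c(0),
                             indexFields = NULL, stemmer = "") {
  # dbpath and datapath: single non-empty strings
  stopifnot(is.character(dbpath), length(dbpath) == 1, nzchar(dbpath))
  stopifnot(is.character(datapath), length(datapath) == 1, nzchar(datapath))
  # idField: a single column number
  stopifnot(is.numeric(idField), length(idField) == 1)
  # stemmer: a single string such as "en" (empty string allowed)
  stopifnot(is.character(stemmer), length(stemmer) == 1)
  # indexFields: NULL, or a list of c(fieldName, prefix) pairs
  if (!is.null(indexFields)) {
    ok <- vapply(indexFields,
                 function(f) is.character(f) && length(f) == 2,
                 logical(1))
    if (!all(ok)) stop("each entry of indexFields must be c(fieldName, prefix)")
  }
  invisible(TRUE)
}

check_index_args("path/to/database", "location/of/data.csv",
                 indexFields = list(c("Title", "S"), c("Description", "XD")),
                 stemmer = "en")
```

A check like this could run at the top of xapian_index() so that malformed field lists fail early with a readable message.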
The xapian_index() function can be used to index the content of a data frame.
Convert the data frame (df) to a CSV file (skip this step if the data source
is already a CSV file), then build the field list and index it:
>> write.csv(df, "location/of/data.csv")
>> f1 <- c("Title", "S")
>> f2 <- c("Description", "XD")
>> fields <- list(f1, f2)
>> idField <- c(0)
>> xapian_index("path/to/database", "location/of/data.csv", idField=c(0),
                indexFields=fields, stemmer="en")
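One detail of the CSV step affects column numbering: by default write.csv() writes the row names as an extra, unnamed first column. The sketch below uses a made-up two-row data frame and temporary files to show the difference; whether idField is meant to count that extra column is not stated in the proposal, so treat this only as an illustration of base R behaviour.

```r
# Made-up example data; column names mirror the fields used above.
df <- data.frame(Title = c("A", "B"),
                 Description = c("first", "second"),
                 stringsAsFactors = FALSE)

csv_default <- tempfile(fileext = ".csv")
csv_plain <- tempfile(fileext = ".csv")

write.csv(df, csv_default)                   # row names become the first column
write.csv(df, csv_plain, row.names = FALSE)  # data columns only

header_default <- readLines(csv_default, n = 1)
header_plain <- readLines(csv_plain, n = 1)

cat(header_default, "\n")  # "","Title","Description"
cat(header_plain, "\n")    # "Title","Description"
```

Passing row.names = FALSE keeps the CSV columns aligned with the data frame columns, which may matter when a column is selected by number.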
For indexing multiple data frames of similar format:
>> dataLoc <- c("path1", "path2", "path3", ...)
>> for (dataSource in dataLoc) {
     xapian_index("path/to/database", dataSource, idField=c(0),
                  indexFields=fields, stemmer="en")
   }
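The loop above could be wrapped so that missing data sources are skipped instead of stopping the run. This is only a sketch: index_all() is a hypothetical name, and index_fn stands in for the proposed xapian_index(), which is passed in as an argument so the example runs without any Xapian bindings.

```r
# Hypothetical wrapper: calls index_fn(dbpath, source, ...) for every data
# source that exists on disk, and warns about the ones that do not.
index_all <- function(dbpath, sources, index_fn, ...) {
  indexed <- character(0)
  for (src in sources) {
    if (!file.exists(src)) {
      warning("skipping missing data source: ", src)
      next
    }
    index_fn(dbpath, src, ...)
    indexed <- c(indexed, src)
  }
  invisible(indexed)  # paths that were actually passed to index_fn
}

# Demonstration with a stub in place of the real indexing function.
existing <- tempfile(fileext = ".csv")
file.create(existing)
missing_src <- tempfile(fileext = ".csv")  # never created

seen <- character(0)
stub_index <- function(dbpath, datapath, ...) seen <<- c(seen, datapath)

done <- suppressWarnings(
  index_all("path/to/database", c(existing, missing_src), stub_index,
            idField = c(0), stemmer = "en")
)
```

Here done contains only the path of the file that exists, and the stub was called once.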
Best regards,
Amanda