[Xapian-discuss] How to Retrieve content of the document?

Rohit 76.rohit at gmail.com
Thu Apr 21 12:07:14 BST 2011


When you say the terms and values are already stored, can you tell me how I
can retrieve the words of a particular document (assuming I don't store
anything in the document's data)?

Thanks,
rohit.

On Thu, Apr 21, 2011 at 6:12 AM, Sym Roe <sym.roe at talusdesign.co.uk> wrote:

> On Thu, Apr 21, 2011 at 10:45 AM, Rohit <76.rohit at gmail.com> wrote:
> > Hi,
> > Another question. I have already read this somewhere, but I just need
> > clarification: Xapian is able to handle about 6 GB of data, right?
>
> Yes, I expect there are much bigger indexes out there.  Having said
> that, your (total) index size will be much bigger if you store every
> document you are searching through in the document data.  You don't
> actually need to store *anything* in the document's data, as the terms
> and values are already stored for each document.
>
>
> > On Thu, Apr 21, 2011 at 5:43 AM, Rohit <76.rohit at gmail.com> wrote:
> >>
> >> Oops, my bad. Newbie mistake indeed. Thanks for the prompt reply, much
> >> appreciated.
> >>
> >> Cheers,
> >> Rohit.
> >>
> >> On Thu, Apr 21, 2011 at 5:39 AM, Sym Roe <sym.roe at talusdesign.co.uk>
> >> wrote:
> >>>
> >>> On Thu, Apr 21, 2011 at 10:24 AM, Rohit <76.rohit at gmail.com> wrote:
> >>> > This returns 8 documents, which I know is the correct answer because
> >>> > I have made a search engine which gives me the same results. The
> >>> > problem is that I only get the document numbers (ids) but not the
> >>> > content. $doc->get_data(); is supposed to give me the content, if I
> >>> > am not mistaken. It isn't doing so. Any help would be appreciated.
> >>>
> >>> I don't know Perl, so forgive me if I make an obvious mistake here, but
> >>> this:
> >>>
> >>> > if ($doc->set_data("$File::Find::name")){
> >>>
> >>> Looks like it's setting the file name as the document data, and then
> >>> $doc->get_data() is correctly returning the file name you set.
> >>>
> >>> So everything is working fine, you're just not actually setting the
> >>> data to what you want.
> >>>
> >>> Am I missing something here?  You'll need to read the file content and
> >>> store that, or, when the results are used you could open the file
> >>> based on the file name you're storing (this would save index size).
> >>>
> >>>
> >>> --
> >>> E: sym.roe at talusdesign.co.uk
> >>> M: 07742079314
>
> --
> E: sym.roe at talusdesign.co.uk
> M: 07742079314
>

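On the question at the top (getting the words of a document back when nothing is stored in its data), here is a minimal sketch using the Search::Xapian Perl bindings. It walks the document's term list, then falls back to the suggestion quoted above: if the document data holds a file name, re-open that file. The command-line arguments (database path and document id) are assumptions for illustration only. Note that the term list gives the indexed terms in sorted order, possibly stemmed or prefixed depending on how the index was built, so it is not the original text.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Search::Xapian;

# Hypothetical arguments: path to an existing Xapian database and a docid.
my ($db_path, $docid) = @ARGV;
die "usage: $0 DB_PATH DOCID\n"
    unless defined $db_path && defined $docid;

my $db  = Search::Xapian::Database->new($db_path);
my $doc = $db->get_document($docid);

# Walk the terms that were indexed for this document.
# These come back in sorted order, not in the order of the original text.
my $it  = $doc->termlist_begin();
my $end = $doc->termlist_end();
while ($it != $end) {
    print $it->get_termname(), "\n";
    $it++;
}

# Assumption: the indexer stored the file name as the document data,
# as in the set_data("$File::Find::name") call quoted in this thread.
my $path = $doc->get_data();
if (defined $path && -f $path) {
    open my $fh, '<', $path or die "cannot open $path: $!";
    local $/;                 # slurp the whole file in one read
    my $content = <$fh>;
    close $fh;
    print $content;
}
```

Re-reading the file at display time keeps the index small, at the cost of depending on the file still being at the stored path.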
