[Xapian-devel] My Introduction and Ideas

Olly Betts olly at survex.com
Fri Feb 28 12:25:23 GMT 2014


On Thu, Feb 27, 2014 at 09:03:27AM +0800, Chu Bingxiang wrote:
> Before I sent my first e-mail to the mailing list, I already knew about
> the "SCWS" project. But it seems a little unreliable. The most recent
> commit was a month ago and that was a "web module" commit. Most of the
> commits are almost a year old.

While it doesn't seem to be a hugely active project, it's clearly not
dead.

> And I did some tests on the demo page, and there are some errors in the
> results.

Can you give some examples of errors it makes?

And how does its output compare with other segmentation algorithms for
Chinese?

Perfection probably isn't attainable, but good search results should be
possible even if the segmentation is occasionally imperfect.  Most
English documents contain the occasional spelling error or typing error,
but you can still produce good search results.

> I hope we can make our own Chinese segmentation algorithm based on Dai
> Youli's work, or a new one in the future. And if the work is too big,
> maybe we can continue it in another GSoC season, as GSoC
> recommends.

Another option would be to improve SCWS.

> >But if you (or anyone else) wants to work on translations outside of
> >GSoC, I'd suggest the newer "Getting Started with Xapian" guide would be
> >the best document to work on:
> 
> >http://getting-started-with-xapian.readthedocs.org/en/latest/
> 
> Yes, and actually I didn't want to do it in the GSoC season. I've
> already been working on this for some time. But my English is poor, so
> there could be a long way to go.

If you have some done already, I'm happy to merge it in.  If there was a
partial translation there, people would probably be more inclined to
contribute to it.

> And I noticed that this year there are
> many students from China, maybe two from Peking University, one from
> Fudan University and one in Canada now. In China, almost 80% of
> students who are studying Computer Science, and 99% of all, don't know
> what Open Source is. I hope we can work on pushing the development
> of Open Source projects in China. And these words are also for
> students whose country's situation is like China's. Also I hope Jiarong
> Wei and the other Chinese students can help with the translations.

It'd be great if people wanted to collaborate on it.

Cheers,
    Olly


