[Xapian-discuss] GSoC Project, Support Erlang Language

Vladimir Zaytsev vladimir at zvm.me
Mon Mar 28 20:41:53 BST 2011


"Support Erlang Language" By Vladimir Zaytsev, Xapian, 2011

*About me*


Name: Vladimir Zaytsev

E-mail address: vladimir at zvm.me

WWW: zvm.me/, facebook.com/vladimir.zaytsev <http://www.facebook.com/vladimir.zaytsev>

Emergency contact phone number: +79028195844

Short biography:

I was born on 5 February 1991 in Donetsk, USSR, and now live in
Khanty-Mansiysk, Russia. In 2008 I graduated from the Ugra Lyceum of Physics
and Mathematics. I now study at the Ugra State University, Institute of
Applied Mathematics, Computer Science and Management (GPA: 5.0/5.0). I
participated in the 9th Estonian Summer School on Computer and Systems
Science (August 2010) and the XLIX International Scientific Students
Conference in Novosibirsk (April 2010). My career and research interests
include software engineering, functional programming, information retrieval,
machine learning and data mining.


*Eligibility*


I fulfil the eligibility requirements.


*Background Information*



   - Have you taken part in GSoC and/or GHOP
   <http://code.google.com/opensource/ghop/2007-8/> and/or GCI
   <http://code.google.com/gci> before?

I have not taken part in GSoC, GHOP or GCI before.



   - Please tell us about any previous experience you have with Xapian, or
   other systems for indexed text search.

I don’t have any practical experience with Xapian or other indexed text
search systems, but I have some theoretical knowledge: I’ve read Manning’s
“Introduction to Information Retrieval”, Segaran's “Programming Collective
Intelligence” and similar books, so this project would give me a chance to
put that theory into practice.



   - Do you have previous experience with Free Software and Open Source
   other than Xapian?

I have previous experience with open source software such as Python,
Erlang/OTP, Linux (Debian), GCC, PostgreSQL and so on.



   - Do you have any other relevant prior experience?

I have similar experience: from November 2009 to June 2010 I worked on a
project developing a fact extraction system (Erlang and Python) at the Ugra
Research Institute of Information Technologies.



   - What development platforms, tools and methods do you prefer to use?

I prefer to use:

   - OS: Mac OS X and Linux;
   - Languages: Erlang, C++, Python;
   - Environment: Emacs, Textmate, Eclipse, git, make, gdb, valgrind;
   - I prefer to use a functional programming style where it is appropriate.



   - Have you previously been responsible (as an
   employee/volunteer/student/etc) for a project of a similar size?

No, I have not.



   - What timezone will you be in during the coding period?

GMT/UTC + 6:00



   - Will your Summer of Code project be the main focus of your time during
   the program?

Yes, it will.



   - How many hours a week will you realistically be able to devote to your
   project?

I plan to invest 10-15 hours per week until 25 April, and 40 hours per week
during GSoC.



   - Are you applying for other projects in GSoC 2011? If so, with which
   organisations?

Yes, I’m also applying for Shogun Toolbox.


*Project*


Title: “Support Erlang Language”

Summary: Add bindings to Xapian which allow Xapian to be used from the
Erlang language.


There are three reasons why I have chosen this project. Firstly, Erlang is
an increasingly popular language for developing distributed, scalable web
(and other) applications, which often need fast search, so it would be good
to have convenient Xapian support. Secondly, I’m interested in information
retrieval and similar areas, so this project would be a good starting point
for practice. Thirdly, I am familiar with Erlang and enthusiastic about
using my knowledge and skills to help the open source community and gain new
experience. In addition, I have my own Erlang-driven project where I plan to
use Xapian.

*Benefits*


Nowadays many companies, small and large (Amazon, Facebook, Mochimedia,
JS-Kit, etc.), use Erlang and need search engines in their projects, so I
think some of them would be interested in using a Xapian-Erlang interface.


*Project Details*


Main concepts:

   - I plan to use the Erlang NIF
   <http://www.erlang.org/doc/tutorial/nif.html> mechanism, which allows
   C++ code to run inside the Erlang VM, to minimize latency.
   - I also plan to use OTP primitives (primarily gen_server and
   supervisors), which provide the most useful behaviour patterns, so as not
   to reinvent the wheel.
   - I’m familiar with various Erlang interfaces to applications such as
   DBMSs, format converters, web servers, etc., so I think it would be better
   to implement the Xapian interface in a similar Erlang style, to be
   consistent with them.
   - On the other hand, I plan to take all of Xapian’s features into account
   to provide the most complete access to the library.
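
As a rough illustration of the first point, here is a minimal C++ sketch of
the kind of bookkeeping a native layer needs: C++ objects live on the native
side and the caller only holds an opaque handle. Everything in it is an
assumption made for illustration. FakeIndex is a hypothetical stand-in and
not the Xapian API, HandleTable is a hypothetical name, and the Erlang glue
is left out entirely (a real NIF would more likely use the resource objects
provided by erl_nif.h than a map like this one).

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for a search index; NOT the Xapian API.
struct FakeIndex {
    std::vector<std::string> docs;

    // Store a document and return its 1-based id.
    std::size_t add(const std::string& text) {
        docs.push_back(text);
        return docs.size();
    }

    // Return the ids of documents that contain `term` as a substring.
    std::vector<std::size_t> search(const std::string& term) const {
        std::vector<std::size_t> hits;
        for (std::size_t i = 0; i < docs.size(); ++i) {
            if (docs[i].find(term) != std::string::npos) {
                hits.push_back(i + 1);
            }
        }
        return hits;
    }
};

// Hypothetical registry mapping opaque integer handles to native objects.
// A mutex guards the map because calls may arrive from several threads.
class HandleTable {
public:
    // Create a new index and return a handle for it.
    std::uint64_t open() {
        std::lock_guard<std::mutex> lock(mu_);
        std::uint64_t handle = next_++;
        table_[handle] = std::make_shared<FakeIndex>();
        return handle;
    }

    // Look up a handle; returns nullptr if it is unknown or already closed.
    std::shared_ptr<FakeIndex> get(std::uint64_t handle) {
        std::lock_guard<std::mutex> lock(mu_);
        auto it = table_.find(handle);
        return it == table_.end() ? nullptr : it->second;
    }

    // Drop a handle; returns false if it was not open.
    bool close(std::uint64_t handle) {
        std::lock_guard<std::mutex> lock(mu_);
        return table_.erase(handle) == 1;
    }

private:
    std::mutex mu_;
    std::uint64_t next_ = 1;
    std::map<std::uint64_t, std::shared_ptr<FakeIndex>> table_;
};
```

In a design like this, each gen_server process on the Erlang side could own
one handle and close it when the process terminates, which is one way the
NIF and OTP points above might fit together.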


I think it is important to implement the basic parts of the interface first
and keep them simple. If not everything works out exactly as planned, we
will at least have these parts working.


*Approximate Project Timeline*



Before 30 April

Read documentation and source code to familiarize myself with the
functionality, architecture and C++ API of Xapian.

2 - 16 May

Learn and understand the bindings for other languages and SWIG.

16 - 21 May

Prepare the coding environment.

23 May - mid June

Define and implement all the required Erlang modules.

mid-June - 26 June

Improve speed and functionality, clean up code.

26 June - mid-July

Integrate code into Xapian. Write tests, fix bugs.

After mid-July

Write documentation, tests and examples.

