Full text search - status?

Re: Full text search - status?

Postby Nazar » Thu Jun 16, 2011 12:33 pm

Nazar
 
Posts: 14
Joined: Wed May 06, 2009 6:27 am

Re: Full text search - status?

Postby ikm » Thu Jun 16, 2011 6:27 pm

No one's working on it right now. If someone wants to, that someone's welcome.
ikm
GoldenDict author
 
Posts: 1595
Joined: Wed Feb 04, 2009 10:40 am

Re: Full text search - status?

Postby Tvangeste » Thu Jun 16, 2011 8:01 pm

ikm wrote: No one's working on it right now. If someone wants to, that someone's welcome.

I've been looking into full text search libraries for C++ that we could use, just in case there is something that already works.

So far, it seems that Xapian is the most interesting library (GPL licensed C++): http://xapian.org/features

Looks promising. Are you aware of any other projects that could be of use?
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am

Re: Full text search - status?

Postby ikm » Fri Jun 17, 2011 3:01 am

Xapian indeed looks the most promising. As for the FTS implementation in GoldenDict, well, here's my proposal: I am willing to implement the dictionary crawling required to perform the FTS indexing, if someone (wink, wink!) is willing to implement the GUI for indexing and searching (using Xapian as a backend).
ikm
GoldenDict author
 
Posts: 1595
Joined: Wed Feb 04, 2009 10:40 am

Re: Full text search - status?

Postby Tvangeste » Fri Jun 17, 2011 8:16 am

ikm wrote: Well, here's my proposal: I am willing to implement the dictionary crawling required to perform the FTS indexing, if someone (wink, wink!) is willing to implement the GUI for indexing and searching (using Xapian as a backend).


Whoa! Abgemacht! I mean, yeah, I'll do my best. Hopefully, my abilities with Qt will be advanced enough by then! 8-)
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am

Re: Full text search - status?

Postby ikm » Fri Jun 17, 2011 8:27 am

Ok then. I'll post back once I'm done.
ikm
GoldenDict author
 
Posts: 1595
Joined: Wed Feb 04, 2009 10:40 am

Re: Full text search - status?

Postby betwee » Fri Jun 17, 2011 1:17 pm

yay, :D.
betwee
 
Posts: 33
Joined: Tue May 17, 2011 8:10 am

Re: Full text search - status?

Postby ikm » Sat Jun 18, 2011 7:43 am

Ok, the crawling interface is ready. Dictionary::Class objects now have the following two functions: isCrawlingSupported() and crawl(). If the former returns true, then the latter can be used to create a Dictionary::Crawler object. From that moment on, it is completely independent from the originating dictionary object, and can be used to traverse through all the articles. For each article, a list of headwords (first one is the main one, the others are alternates, if any) and a body in html is returned. A simple crawling example:
    Dictionary::Class & d = .....  // Obtain a dictionary instance from somewhere

    if ( d.isCrawlingSupported() )
    {
      printf( "Gonna crawl it!\n" );

      File::Class f( "/tmp/crawled.txt", "wb" );

      // The crawler is independent of the dictionary object it came from
      sptr< Dictionary::Crawler > crw = d.crawl();

      vector< string > headwords; // headwords[ 0 ] is the main one, the rest are alternates
      string body;                // Article body in html
      while( crw->fetchNextArticle( headwords, body ) )
      {
        // Write every headword on its own line, followed by the body
        for ( size_t x = 0; x < headwords.size(); ++x )
        {
          string const & str = headwords[ x ];
          f.write( str.c_str(), str.size() );
          f.write( "\n", 1 );
        }
        f.write( body.c_str(), body.size() );
        f.write( "\n", 1 );
      }

      printf( "Done!\n" );
    }

This interface right now is implemented for Dsl dictionaries. Support for others will follow.

All the code lives in the 'fts' branch at github. This should be sufficient to implement dictionary indexing and searching with Xapian.
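
For anyone picking up the indexing side, here is a rough sketch of the kind of loop the crawler could feed. Everything in it is an assumption made for illustration: FakeCrawler is a hypothetical in-memory stand-in that only mimics the fetchNextArticle() call described above, and a toy std::map inverted index takes the place of Xapian, so no Xapian API is shown. The tokenizer is deliberately naive (ASCII only, it just skips anything between '<' and '>').

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical article record: headwords (main one first) plus an html body.
struct Article
{
  std::vector< std::string > headwords;
  std::string body;
};

// Hypothetical stand-in for Dictionary::Crawler, serving articles from memory.
class FakeCrawler
{
public:
  explicit FakeCrawler( std::vector< Article > articles ):
    articles_( std::move( articles ) ), pos_( 0 ) {}

  // Same shape as the call in the post: returns false once all articles are consumed.
  bool fetchNextArticle( std::vector< std::string > & headwords, std::string & body )
  {
    if ( pos_ >= articles_.size() )
      return false;
    headwords = articles_[ pos_ ].headwords;
    body = articles_[ pos_ ].body;
    ++pos_;
    return true;
  }

private:
  std::vector< Article > articles_;
  std::size_t pos_;
};

// Splits html into lowercase ASCII words, ignoring everything inside tags.
std::vector< std::string > tokenize( std::string const & html )
{
  std::vector< std::string > words;
  std::string cur;
  bool inTag = false;
  for ( unsigned char c : html )
  {
    if ( c == '<' )
      inTag = true;
    if ( !inTag && std::isalnum( c ) )
      cur += static_cast< char >( std::tolower( c ) );
    else if ( !cur.empty() )
    {
      words.push_back( cur );
      cur.clear();
    }
    if ( c == '>' )
      inTag = false;
  }
  if ( !cur.empty() )
    words.push_back( cur );
  return words;
}

// Toy inverted index: word -> main headwords of the articles containing it.
typedef std::map< std::string, std::set< std::string > > Index;

Index buildIndex( FakeCrawler & crw )
{
  Index index;
  std::vector< std::string > headwords;
  std::string body;
  while ( crw.fetchNextArticle( headwords, body ) )
  {
    if ( headwords.empty() )
      continue; // Nothing to attribute the words to
    for ( std::string const & word : tokenize( body ) )
      index[ word ].insert( headwords[ 0 ] );
  }
  return index;
}

// Looks up a single lowercase word; returns an empty set if it was never seen.
std::set< std::string > search( Index const & index, std::string const & word )
{
  Index::const_iterator it = index.find( word );
  return it == index.end() ? std::set< std::string >() : it->second;
}
```

In a real implementation the body of the while loop is presumably where each article would be handed to the search backend instead of a std::map, and the query word would need the same normalization as the indexed text.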
ikm
GoldenDict author
 
Posts: 1595
Joined: Wed Feb 04, 2009 10:40 am

Re: Full text search - status?

Postby Nazar » Mon Jun 27, 2011 6:46 pm

Well, it seems Konstantin has done his part as promptly as one could wish. But what comes next? We want our full-text search!!! lol.
Nazar
 
Posts: 14
Joined: Wed May 06, 2009 6:27 am

Re: Full text search - status?

Postby Tvangeste » Mon Jun 27, 2011 7:04 pm

Nazar wrote: Well, it seems Konstantin has done his part as promptly as one could wish. But what comes next? We want our full-text search!!! lol.

Heh, Konstantin was WAAAY too fast for me. :) But no worries, once I'm done with the UI tweaks, this is the biggest priority.
Tvangeste
 
Posts: 893
Joined: Thu Jun 02, 2011 11:42 am
