xapian search plugin

Francis Irving sent me a note about his work on a new Rails search plugin, acts_as_xapian. It uses the Xapian engine, which is a C++ indexer similar to Lucene. A particularly neat feature is built-in spellcheck.

I still plan to benchmark all these plugins on the Wikipedia dataset, but it’s been delayed by the new job. If anyone has a big piece of iron I could use for a couple weeks I would appreciate it (16GB RAM, hundreds of GB of free disk space, no production load).

7 Comments

  1. Posted May 26, 2008 at 11:11 PM

    If you do a new benchmark, please don’t leave out acts_as_searchable/HyperEstraier.

    At the moment, an updated and better-commented version can be found on my branch. Install instructions are here.

  2. Posted May 27, 2008 at 11:00 AM

    “If anyone has a big piece of iron I could use for a couple weeks I would appreciate it.”

    Try an EC2 X-Large Instance. $0.80 per hour (15GB RAM, 4 cores).

  3. Posted May 27, 2008 at 11:14 AM

    I thought about that, but I’d rather not spend the $268 it takes to keep it running for two weeks, since I can’t finish the whole task at once.

  4. Posted July 16, 2008 at 7:49 PM

    Any results so far? We’re really curious here about the results… :-)

  5. Posted August 11, 2008 at 5:34 PM

    I’ve used HyperEstraier before on a small dataset (< 100,000 entries) and it was fine. I did some complex filtering with it too, and that worked a treat. The only problem is that after I implemented it, I discovered a number of comments saying it has scaling issues, especially as it approaches 1 million entries: slowness, lots of long reindexing needed all the time, etc. I didn’t experience this myself, as our dataset didn’t get anywhere near that large. I’d be interested to see how it performs with Wikipedia. I don’t have any iron for you though, sorry :-(

  6. Posted October 14, 2008 at 4:47 AM

    gh

  7. Posted November 19, 2008 at 11:29 PM

    Acts_as_searchable is dead, long live search_do ;)

    It basically does the same thing, but has a simpler architecture, far fewer bugs and quirks, and is 100% tested.

    It’s an x-search-backend plugin, but so far the only implemented module is HyperEstraier.
