xapian search plugin

Francis Irving sent me a note about his work on a new Rails search plugin, acts_as_xapian. It uses the Xapian engine, which is a C++ indexer similar to Lucene. A particularly neat feature is built-in spellcheck.

I still plan to benchmark all these plugins on the Wikipedia dataset, but it’s been delayed by the new job. If anyone has a big piece of iron I could use for a couple of weeks, I would appreciate it (16 GB of RAM, hundreds of GB of free disk space, no production load).

7 responses

  1. I thought about that, but I’d rather not spend the $268 it takes to keep it running for two weeks, since I can’t finish the whole task at once.

  2. I’ve used HyperEstraier before on a small dataset (< 100,000 entries) and it was fine. I did some complex filtering with it too, which worked a treat. The only problem is that _after_ I implemented it I discovered a number of comments saying it has scaling issues, especially as it approaches 1 million entries: slowness, frequent long reindexing, and so on. I didn’t experience this myself, as our dataset didn’t get anywhere near that large. I’d be interested to see how it performs with Wikipedia. I don’t have any iron for you though, sorry :-(

  3. Acts_as_searchable is dead, long live search_do ;)

    It basically does the same thing, but has a simpler architecture, far fewer bugs and quirks, and is 100% tested.

    It’s a cross-backend search plugin, but so far the only implemented module is HyperEstraier.