Database/language options for super fast partial text matching [closed]

I am building a project and require a super fast way of supplying an autocomplete feed with results based on a partial text match.

I will be indexing/searching on only one field in the database; each row will carry additional data, but I won't be indexing those fields. I will have approx. 25k rows.

Requirements:

  • Must match anywhere in the field ("Lorem Ipsum Dolor Sit Amet" would be found when starting to type "Lor", "Ipsum", "olor", or "Sit Amet")
  • Needs to be extremely quick at returning results in a JSON feed (though the original source of the data doesn't matter too much)
  • Scalable solution for high traffic

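For reference, the matching rule can be pinned down as a case-insensitive substring test. This naive linear scan (all names are illustrative) is the baseline that any of the indexed options has to beat:

```python
def matches(query: str, field: str) -> bool:
    """Case-insensitive match anywhere in the field."""
    return query.lower() in field.lower()

def autocomplete(query, rows, limit=10):
    """Naive linear scan over every row; fine for testing,
    but O(rows) per keystroke, so too slow under heavy traffic."""
    return [r for r in rows if matches(query, r)][:limit]

rows = ["Lorem Ipsum Dolor Sit Amet", "Consectetur Adipiscing Elit"]
```

At 25k rows a scan like this may actually be tolerable in memory, but it does linear work on every keystroke, which is what the index-based options avoid.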
I have reviewed a few options...

  • MongoDB, using a LIKE-style query (i.e. a regex match)
  • ElasticSearch - not sure if it's overkill for what I need to do, and I haven't seen any examples of matching partial text as above
  • An SQL LIKE query, but I imagine this won't be nearly fast enough?
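For comparison, the SQL LIKE option looks like this with SQLite (table and column names are illustrative; any SQL engine is similar). The catch is that a leading `%` wildcard defeats ordinary B-tree indexes, so the engine scans every row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE phrases (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO phrases (title) VALUES (?)",
    [("Lorem Ipsum Dolor Sit Amet",), ("Consectetur Adipiscing Elit",)],
)

def like_search(term, limit=10):
    # '%term%' matches anywhere in the field, but the leading wildcard
    # prevents index use, forcing a full table scan.
    cur = conn.execute(
        "SELECT title FROM phrases WHERE title LIKE ? LIMIT ?",
        (f"%{term}%", limit),
    )
    return [row[0] for row in cur]
```

(SQLite's LIKE is case-insensitive for ASCII by default; other engines vary, e.g. PostgreSQL needs ILIKE.)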

Programming language isn't too much of an issue but Python or PHP would be preferred.

As others have mentioned, a full-text index that performs linguistic and syntactic analysis (tokenizing, stemming, case and accent-normalization, etc) will give you the best results. But this won't come without a certain amount of setup and configuration.

Check out Solr's Suggester component: http://wiki.apache.org/solr/Suggester. There is also a newer one - I believe it's called AnalyzingSuggester or some such - which is available in Lucene only, I think, so if you want an in-memory solution you could use that (Java only, though).
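Once the Suggester is wired up, querying it from Python is just an HTTP GET against the suggest handler. A sketch of building that request URL, assuming typical defaults - the handler path (`/suggest`), dictionary name, and parameter names all depend on how your solrconfig.xml is set up:

```python
from urllib.parse import urlencode

def suggest_url(base, core, prefix, dictionary="mySuggester"):
    # Handler path and parameter names are the common defaults for
    # Solr's Suggester; adjust to match your solrconfig.xml.
    params = urlencode({
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": prefix,
        "wt": "json",  # ask Solr for a JSON response, ready for the feed
    })
    return f"{base}/solr/{core}/suggest?{params}"
```

You would fetch the resulting URL with any HTTP client and relay the JSON body straight to the autocomplete widget.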

This sounds like a typical full-text search problem. Depending on your application and where the data lives, an in-process library like Whoosh might do what you need (it is to Python what Lucene is to Java).

You're right that an SQL LIKE query will perform horribly compared to a real full-text index. MongoDB might not be a very good fit either, though it can be tuned to do roughly what you want.
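To illustrate why the full-text index wins: the matching work moves from query time to index time. Here is a minimal in-memory n-gram inverted index in pure Python - the same idea behind ElasticSearch's ngram analyzer and Whoosh's NGRAM field (this is a sketch, not any library's actual implementation):

```python
from collections import defaultdict

class NGramIndex:
    """Maps every lowercase trigram to the set of row ids containing it,
    so a query only ever touches candidate rows, not all 25k."""

    def __init__(self, n=3):
        self.n = n
        self.grams = defaultdict(set)
        self.rows = {}

    def add(self, row_id, text):
        self.rows[row_id] = text
        t = text.lower()
        for i in range(len(t) - self.n + 1):
            self.grams[t[i:i + self.n]].add(row_id)

    def search(self, query, limit=10):
        q = query.lower()
        if len(q) < self.n:
            # Query shorter than a trigram: fall back to a substring scan.
            ids = [i for i, t in self.rows.items() if q in t.lower()]
        else:
            # Intersect the posting sets of the query's trigrams,
            # then verify the full substring against each candidate.
            candidates = set.intersection(
                *(self.grams.get(q[i:i + self.n], set())
                  for i in range(len(q) - self.n + 1)))
            ids = [i for i in candidates if q in self.rows[i].lower()]
        return [self.rows[i] for i in sorted(ids)][:limit]
```

Solr, ElasticSearch, and Whoosh add analysis (stemming, accent folding), ranking, and on-disk persistence on top of this core structure, which is the setup overhead mentioned above.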