Intelligent Ajax / PHP FAQ script

I work as a programmer for a major travel agency. I'm quite experienced, but I've run into a problem that seems to require some kind of AI. I know scripts like this exist everywhere, but I can't seem to find anything useful.

Basically we're building a FAQ script. We get flooded with e-mails asking the same kinds of questions every day, so we want to build a contact form that works just like the one I'm using to write this: on the right-hand side it presents a number of already-answered questions that are similar to what I'm typing. The same happens while I write the subject.

Well, to get down to business: I'm making a contact form, and as the client writes the subject and/or message, I want a number of pre-defined Q&As to be presented to them. I don't believe I can use soundex, as the FAQ will be in Danish and so won't sound phonetically like English.

So, how would I:

  • Build up the database? Should I use fulltext or tags, and extract the known tags from the message?
  • Build up the PHP script? Are there any functions that you think I should know of?

Basically I'm researching, so I would be very grateful for anything from simple SQL queries to full scripts designed for the purpose! Anything is useful.

Was going to add this as a comment to Zane, but it got rather long:

Depending on Danish grammar, you might need a rather large cut-off point for the Levenshtein distance to find probable matches.
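To make the cut-off idea concrete, here is a minimal sketch using PHP's built-in `levenshtein()`. The helper name `closestQuestions`, the sample questions and the cut-off value are all made up for illustration; a usable cut-off would have to be tuned against your real questions. Keep in mind that `levenshtein()` compares bytes, so distances involving æ, ø or å in UTF-8 text may come out slightly different from a per-letter count.

```php
<?php
// Hypothetical helper: returns the stored questions whose edit distance to
// the input is at most $cutoff, closest first.
function closestQuestions(string $input, array $questions, int $cutoff): array
{
    $matches = [];
    foreach ($questions as $question) {
        // strtolower() only reliably folds ASCII letters; mb_strtolower()
        // (mbstring extension) would also handle Æ, Ø and Å.
        $distance = levenshtein(strtolower($input), strtolower($question));

        // Before PHP 8.0, levenshtein() returned -1 for strings longer than
        // 255 bytes, so negative results are skipped.
        if ($distance >= 0 && $distance <= $cutoff) {
            $matches[] = ['question' => $question, 'distance' => $distance];
        }
    }

    // Smallest distance first.
    usort($matches, fn(array $a, array $b): int => $a['distance'] <=> $b['distance']);

    return array_column($matches, 'question');
}

// Made-up sample data, not taken from a real FAQ.
$faq = [
    'Hvordan afbestiller jeg min rejse?',
    'Hvor meget bagage kan jeg medbringe?',
];
print_r(closestQuestions('Hvordan afbestiller jeg min reise', $faq, 5));
```

Comparing whole sentences like this is sensitive to length: a short message and a long stored question are far apart even when they are about the same thing, which is one reason the cut-off may need to be large.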

If you have some more time to spend on this, you might want to split the text at word boundaries, stem the individual words, and then compare the counts of those stems with what you already have in the database. There appears to be a stemming library at http://pecl.php.net/package/stem (I've never used it, but it appears to support Danish).

Since pecl-stem appears to have no formal documentation that I could find (and, well, I was curious), you'd use it like this after installing the PECL package:

$stem = stem($myInputWord, STEM_DANISH);
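Putting the split, stem and compare steps together could look roughly like the sketch below. The function names (`stemCounts`, `stemOverlap`, `rankFaq`) and the scoring rule (sum of the smaller count per shared stem) are my own assumptions, not something pecl-stem provides. The stemmer is passed in as a callable so the code also runs without the extension. For simplicity it stems the stored questions on every call; in practice you would probably store the stems in the database once.

```php
<?php
// Lowercase with mbstring when available, since strtolower() only reliably
// folds ASCII letters.
function lowerCase(string $s): string
{
    return function_exists('mb_strtolower') ? mb_strtolower($s, 'UTF-8') : strtolower($s);
}

// Split text into words and count how often each stem occurs.
// $stemmer is any callable that maps a word to its stem.
function stemCounts(string $text, callable $stemmer): array
{
    // With the /u flag, \p{L}+ matches runs of Unicode letters,
    // so æ, ø and å stay inside words.
    preg_match_all('/\p{L}+/u', $text, $m);

    $counts = [];
    foreach ($m[0] as $word) {
        $stem = $stemmer(lowerCase($word));
        $counts[$stem] = ($counts[$stem] ?? 0) + 1;
    }
    return $counts;
}

// Overlap score: for every stem in $a, add the smaller of the two counts.
function stemOverlap(array $a, array $b): int
{
    $score = 0;
    foreach ($a as $stem => $count) {
        $score += min($count, $b[$stem] ?? 0);
    }
    return $score;
}

// Returns [faqId => score], highest score first.
function rankFaq(string $message, array $faqQuestions, callable $stemmer): array
{
    $input = stemCounts($message, $stemmer);
    $scores = [];
    foreach ($faqQuestions as $id => $question) {
        $scores[$id] = stemOverlap($input, stemCounts($question, $stemmer));
    }
    arsort($scores);
    return $scores;
}

// Use pecl-stem when it is loaded; otherwise leave words unstemmed.
$stemmer = function_exists('stem')
    ? fn(string $w): string => stem($w, STEM_DANISH)
    : fn(string $w): string => $w;
```

You would call `rankFaq($message, $faq, $stemmer)` from the Ajax handler and show the top few entries with a non-zero score. Very common words will dominate the counts, so filtering out a stop-word list before scoring is probably worth trying.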

And since I was perusing the PHP manual anyway, I might as well add that for larger applications (I wouldn't introduce it just for your case) you might want to take a look at the search engine section of the PHP manual for setting up Solr or the like. But again, that's probably overkill in your case.