I'm working as a programmer for a major travel agency. I'm quite experienced, but I've run into a problem that seems to require some kind of AI. I know scripts like this exist everywhere, but I can't seem to find anything useful.
Basically, we're building an FAQ script. We get flooded with e-mails asking the same kinds of questions every day, so we want a contact form that works just like this page does while I'm writing this: on the right side it presents a number of already answered questions that are somehow similar to what I'm writing right now. The same happens while I write the subject.
To get down to business: I'm making a contact form, and as the client writes the subject and/or message, I want a number of pre-defined Q&As to be presented to them. I don't believe I can use soundex, as the FAQ will be in Danish and therefore won't sound phonetically like English.
So.. How would I:
Basically I'm researching, so I would be very grateful for anything from simple SQL queries to full scripts designed for the purpose! Anything is useful.
Look into Levenshtein distance
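A minimal sketch of how that could look in PHP, using the built-in levenshtein() function. The sample questions, the function name closestQuestions and the cut-off of 10 are made up for illustration; real entries would come from your FAQ table. Keep in mind that levenshtein() compares bytes, so in UTF-8 each of æ, ø and å counts as two characters, and before PHP 8.0 it rejected strings longer than 255 characters.

```php
<?php
// Return the stored questions whose edit distance to the input is at most
// $maxDistance, closest match first.
function closestQuestions(string $input, array $questions, int $maxDistance): array
{
    // strtolower() only lowercases ASCII reliably; for Æ, Ø, Å you would
    // want mb_strtolower() instead.
    $input = strtolower(trim($input));
    $matches = [];
    foreach ($questions as $question) {
        $distance = levenshtein($input, strtolower($question));
        if ($distance <= $maxDistance) {
            $matches[$question] = $distance;
        }
    }
    asort($matches); // smallest distance first, keys preserved
    return array_keys($matches);
}

// Hypothetical sample data standing in for rows from the FAQ table.
$faq = [
    'How do I change my booking?',
    'How do I cancel my booking?',
    'Where is my invoice?',
];

$suggestions = closestQuestions('how do i cancel my bookin', $faq, 10);
```

Comparing whole sentences this way is fairly crude, since a long message will be far from every stored question, so it probably works better on the subject line than on the message body.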
Was going to add this as a comment to Zane, but it got rather long:
Depending on Danish grammar, you might need a rather large cut-off point for the Levenshtein distance to find probable matches.
If you have some more time to spend on this, you might want to split the text at word boundaries, stem the individual words, and then compare the counts of those stems with what you already have in the database. There appears to be a stemming library at http://pecl.php.net/package/stem (I've never used it, but it appears to support Danish).
Since pecl-stem appears to have no formal documentation that I could find (and, well, I was curious), you'd use it like this after installing the PECL package:
$stem = stem($myInputWord, STEM_DANISH);
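To make the split/stem/count idea a bit more concrete, here is a rough sketch. The function names stemCounts and stemOverlap and the scoring rule (sum of shared stem counts) are my own assumptions, not part of pecl-stem; the stemmer is passed in as a callable so the code still runs, without any stemming, when the extension is not installed.

```php
<?php
// Split text into lowercase words, stem each one, and count the stems.
function stemCounts(string $text, callable $stemmer): array
{
    $lower = function_exists('mb_strtolower')
        ? mb_strtolower($text, 'UTF-8')
        : strtolower($text);
    // Split on any run of characters that are not letters or digits.
    $words = preg_split('/[^\p{L}\p{N}]+/u', $lower, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    return array_count_values(array_map($stemmer, $words));
}

// Score two stem-count arrays by how many stem occurrences they share.
function stemOverlap(array $a, array $b): int
{
    $score = 0;
    foreach ($a as $stem => $count) {
        if (isset($b[$stem])) {
            $score += min($count, $b[$stem]);
        }
    }
    return $score;
}

// Use pecl-stem when it is loaded, otherwise fall back to the raw word.
$stemmer = function_exists('stem')
    ? fn(string $word) => stem($word, STEM_DANISH)
    : fn(string $word) => $word;

$inputCounts = stemCounts('Change my booking, change my seat', $stemmer);
$faqCounts   = stemCounts('How do I change my booking?', $stemmer);
$score       = stemOverlap($inputCounts, $faqCounts);
```

In practice you would probably precompute and store the stem counts for each FAQ entry, then show the entries with the highest score as the user types.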
And since I was perusing the PHP manual anyway, I might as well add that for larger applications (I wouldn't introduce it just for your case) you might want to take a look at the Search Engine section of the PHP manual for setting up Solr or the like. But again, that's probably overkill in your case.