I'm trying to use PHP to translate user input, in real time, into what is effectively a language the user has defined.
For example, a user creates the following dictionary (where the left-hand side is the input, and the right-hand side is the output):
[
"A" => "alpha",
"B" => "bravo",
"CD" => "charlie delta"
]
Then, the user inputs the following (see the EDIT below for details):
"A", "B", "C", "D"
How do I translate those inputs in real time, like this?
|-------------|---------------------------------|
| input       | output                          |
|-------------|---------------------------------|
| "A"         | "alpha"                         |
| "A" + "B"   | "alpha" + "bravo"               |
| "AB" + "C"  | "alpha bravo" + ?               |
| "ABC" + "D" | "alpha bravo" + "charlie delta" |
|-------------|---------------------------------|
If it were a one-to-one relation between input strings and output strings, it would be no problem. However, multiple input strings may relate to a single output string (e.g., "C" followed by "D" is "charlie delta").
Possible solution
I thought about tokenizing the input string into n-grams, where n is the maximum number of inputs for a single output in the user's dictionary (in the example above, n would be 2 because of "CD").
Something like this algorithm:
I tokenize the input string:
|--------|---------|
| tokens | hits    |
|--------|---------|
| "A"    | "alpha" |
|--------|---------|
I tokenize the new input into bigrams:
|--------|---------|
| tokens | hits    |
|--------|---------|
| "B"    | "bravo" |
| "AB"   |         |
|--------|---------|
I tokenize the new input into bigrams:
|--------|--------|
| tokens | hits   |
|--------|--------|
| "C"    |        |
| "BC"   |        |
|--------|--------|
I tokenize the new input into bigrams:
|--------|-----------------|
| tokens | hits            |
|--------|-----------------|
| "D"    |                 |
| "CD"   | "charlie delta" |
|--------|-----------------|
Of course, the number of n-grams to check at each step grows with n, the largest number of inputs that can map to a single output. Is there a simpler or faster solution that I'm not seeing?
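To make the steps above concrete, here is a minimal sketch of the suffix n-gram lookup. It is written in JavaScript rather than PHP purely for illustration, and it assumes each input is a single character and that the dictionary fits in an in-memory Map. The names `dictionary`, `maxKeyLength`, `suffixNgrams`, and `lookupStep` are hypothetical.

```javascript
// Hypothetical in-memory copy of the example dictionary.
const dictionary = new Map([
  ["A", "alpha"],
  ["B", "bravo"],
  ["CD", "charlie delta"],
]);

// n is the length of the longest key (2 here, because of "CD").
function maxKeyLength(dict) {
  let n = 0;
  for (const key of dict.keys()) n = Math.max(n, key.length);
  return n;
}

// Build the n-grams that end at the newest input: the last 1..n characters.
function suffixNgrams(inputSoFar, n) {
  const grams = [];
  for (let len = 1; len <= Math.min(n, inputSoFar.length); len++) {
    grams.push(inputSoFar.slice(-len));
  }
  return grams;
}

// Look up every suffix n-gram and pair it with its hit (null when there is none).
function lookupStep(dict, inputSoFar) {
  const n = maxKeyLength(dict);
  return suffixNgrams(inputSoFar, n).map((token) => ({
    token,
    hit: dict.has(token) ? dict.get(token) : null,
  }));
}

// Replay the example inputs one at a time, collecting one table per step.
let typed = "";
const steps = [];
for (const ch of ["A", "B", "C", "D"]) {
  typed += ch;
  steps.push(lookupStep(dictionary, typed));
}
```

Each step performs at most n lookups regardless of how much has been typed so far, so the per-input cost depends on the longest key, not on the total input length. This sketch only reports hits; deciding what to do with an input such as "C" that has no hit yet is left open, as in the tables above.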
EDIT March 19, 2015:
The user's dictionary may involve tens of thousands of terms. So, I store it in a database. I also store the output in a database for later use.
On the front-end, the user enters their input in a text input, and the input's value is sent to PHP via an AJAX request in the background.
For example...
I might collect the text input and send it every 30 seconds or so for processing on the server so that requests don't start to stack up, but you get the idea.
PHP isn't going to be a great fit for this application. PHP is a server-side technology, meaning you'd have to fire a submit each time you wanted it to interpret and change the input value. The only way this would be feasible is to have the user complete the entry (fill in the entire field), submit it to the server, split the string, parse and replace, and then return the value with a page refresh. Not terribly user-friendly.
For that reason, you'll almost certainly want to go with JavaScript.
In JavaScript, it's not terribly difficult. You'd define your conversion list, likely via an AJAX call that gets it from your server, and assign the resulting data to an object that you can do lookups on. Then you'd attach a keyup or change event to your input field; in that handler you'd evaluate the input, determine the output from your definition object, and write it to another field.
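As a rough sketch of that idea, the snippet below retranslates the whole field value on each keyup using greedy longest-match lookups against a plain object. The `conversions` object stands in for data that would come from the server. The `translate` function, the element ids `source` and `result`, and the choice to skip characters that have no translation yet are all assumptions made for illustration.

```javascript
// Hypothetical conversion list; in practice it would be fetched from the server.
const conversions = {
  A: "alpha",
  B: "bravo",
  CD: "charlie delta",
};

// Translate the whole input, preferring the longest key at each position.
function translate(input, table) {
  const maxLen = Math.max(0, ...Object.keys(table).map((k) => k.length));
  const out = [];
  let i = 0;
  while (i < input.length) {
    let matched = false;
    for (let len = Math.min(maxLen, input.length - i); len >= 1; len--) {
      const chunk = input.slice(i, i + len);
      if (Object.prototype.hasOwnProperty.call(table, chunk)) {
        out.push(table[chunk]);
        i += len;
        matched = true;
        break;
      }
    }
    // Assumption: input with no translation (yet) is skipped, not echoed.
    if (!matched) i += 1;
  }
  return out.join(" ");
}

// Browser-only wiring; the element ids "source" and "result" are hypothetical.
if (typeof document !== "undefined") {
  const source = document.getElementById("source");
  const result = document.getElementById("result");
  if (source && result) {
    source.addEventListener("keyup", () => {
      result.value = translate(source.value, conversions);
    });
  }
}
```

With the example dictionary, `translate("ABC", conversions)` yields `"alpha bravo"` and `translate("ABCD", conversions)` yields `"alpha bravo charlie delta"`, matching the table in the question. Retranslating the full value on every keystroke keeps the handler stateless, at the cost of redoing work for long inputs.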