I'm trying to write a small routine responsible for permuting all possible abbreviations for a string. This string is a full name, separated by spaces. Like this:
James Mitchell Rhodes
I want to output:
J. Mitchell Rhodes
James M. Rhodes
J. M. Rhodes
And so on. However, I also have to consider "stopwords":
James the Third Rhodes
I want to output:
J. the Third R.
James the Third R.
Is there a known algorithm for this? I've been trying to fix this problem for quite some time now.
UPDATE: Getting each word into an array is dead easy. Just explode(' ', $string) and then array_map to exclude the stop words, checking in_array($word, $stopWordsMap). This is NOT the problem, and NOT the focus of the question. The problem is how to discover the combinations of possible Original words (O) and Abbreviated words (A):
O A A
O A O
O A A
A A A
O O O
I am not gonna write the full code, since you said permuting is not the issue. This is about figuring out which words to permute for all scenarios.
I had to think of the binary system, stay with me on this one xD. If you want to have all the possible inputs to a function with n inputs, you need 2^n input scenarios.
So for 2 inputs you would have
0 0
0 1
1 0
1 1
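Alright? We can get this as an array in PHP with:
$map = array();
$inputs = 2;
// pow(2, $inputs) rows; note that ^ is XOR in PHP, not exponentiation
for ($i = 0; $i < pow(2, $inputs); $i++) {
    // decbin() returns a string without leading zeros, so pad it to $inputs digits
    $bin = str_pad(decbin($i), $inputs, '0', STR_PAD_LEFT);
    $array = preg_split('//', $bin, -1, PREG_SPLIT_NO_EMPTY); // but I want an array
    $map[] = $array;
}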
Now if the string you want to permute has three words, see them as three inputs. Each $map row then tells you which words to abbreviate to get every possible string: if the first item in the row is 0, don't abbreviate the first word; if it is 1, abbreviate it; and so on.
Here are all the rows and the resulting strings for your example:
0 0 0 James Mitchell Rhodes
0 0 1 James Mitchell R.
0 1 0 James M. Rhodes
0 1 1 James M. R.
1 0 0 J. Mitchell Rhodes
1 0 1 J. Mitchell R.
1 1 0 J. M. Rhodes
1 1 1 J. M. R.
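To make the table above concrete, here is a minimal sketch of how each row could be applied to the words. The helper name applyRow is made up for illustration, and the code assumes single-space separators and single-byte first letters (for multibyte names you would probably want mb_substr instead of $word[0]).

```php
<?php
// Hypothetical helper: abbreviate every word whose flag in $row is '1'.
function applyRow(array $words, array $row): string
{
    $out = [];
    foreach ($words as $i => $word) {
        // '1' means "initial plus a period", '0' means "keep the word as it is".
        $out[] = $row[$i] === '1' ? $word[0] . '.' : $word;
    }
    return implode(' ', $out);
}

$words = explode(' ', 'James Mitchell Rhodes');
$n = count($words);

$results = [];
for ($i = 0; $i < (1 << $n); $i++) {
    // Left-pad so every row has exactly $n flags, e.g. 1 becomes "001".
    $row = str_split(str_pad(decbin($i), $n, '0', STR_PAD_LEFT));
    $results[] = applyRow($words, $row);
}

print_r($results);
```

The rows come out in the same order as the table, from "James Mitchell Rhodes" for 0 0 0 to "J. M. R." for 1 1 1.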
My first instinct is to iterate over the binary permutations with a for loop: strip out the stop words (remembering their positions if you wish), loop over 2^numOfRemainingElements values, and use a bitwise AND to pick out the state (abbreviated or not) of each word:
$names = array('James', 'Earl', 'Jones');
$nameCount = count($names);
$permCount = pow(2, $nameCount);
// Start at 1 to skip the variant with nothing abbreviated
for ($p = 1; $p < $permCount; $p++) {
    for ($n = 0; $n < $nameCount; $n++) {
        // Bit $n of $p decides whether word $n is abbreviated
        echo $p & pow(2, $n) ? $names[$n][0] . '.' : $names[$n];
        echo ' ';
    }
    echo "\n";
}
/* output:
J. Earl Jones
James E. Jones
J. E. Jones
James Earl J.
J. Earl J.
James E. J.
J. E. J.
*/
You can tweak it further but you can see where I'm going with it.
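One possible tweak, as a rough sketch: rather than stripping the stop words, remember the positions of the words that may be abbreviated and give only those a bit. The function name nameVariants and the stop word list are assumptions made up for this example; it also assumes single-space separators and a case-insensitive stop word match.

```php
<?php
// Hypothetical helper: all abbreviated variants of $name, leaving stop words untouched.
function nameVariants(string $name, array $stopWords): array
{
    $words = explode(' ', $name);
    $stop = array_map('strtolower', $stopWords);

    // Positions of the words that are allowed to be abbreviated.
    $positions = [];
    foreach ($words as $i => $word) {
        if (!in_array(strtolower($word), $stop, true)) {
            $positions[] = $i;
        }
    }

    $variants = [];
    $total = 1 << count($positions);
    // Start at 1 to skip the variant in which nothing is abbreviated.
    for ($mask = 1; $mask < $total; $mask++) {
        $current = $words;
        foreach ($positions as $bit => $i) {
            // Bit $bit of $mask decides whether the word at position $i is abbreviated.
            if ($mask & (1 << $bit)) {
                $current[$i] = $words[$i][0] . '.';
            }
        }
        $variants[] = implode(' ', $current);
    }
    return $variants;
}

print_r(nameVariants('James the Third Rhodes', ['the', 'third']));
```

With "the" and "third" treated as stop words, this yields "J. the Third Rhodes", "James the Third R." and "J. the Third R.", which covers the two outputs asked for in the question.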