Using a PHP regular expression to find an exact repeating sequence (and remove all the repeats)

I'm looking for a regex that will find an exact repeating pattern (case-sensitive). For instance, middle initials in a name string. Examples:

  • Jim G G Bob is a match on " G " x 2
  • Jim G. G. Bob is a match on " G. " x 2
  • Tom H H H Ford is a match on " H " x 3 and so on ...
  • Sarah H Howard is NOT a match because we have "h ", "H ", and " H" (which are all unique)

I only want to keep the 1st occurrence and remove all others. What will find and remove exact duplicates?

You can use this in PHP:

$repl = preg_replace('/\b(\pL\W+)\1+/u', '$1', $str);

RegEx Demo

RegEx Breakup:

\b           # word boundary
(            # capturing group #1 start
   \pL       # match a single Unicode letter
   \W+       # match 1 or more non-word characters
)            # capturing group #1 end
\1+          # match 1 or more of captured group #1 to match *repeats*
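To check the breakup against the names from the question, here is a minimal sketch that wraps the pattern (written with `\pL`, the Unicode-letter form the breakup describes) in a small function. The function name `collapseRepeatedInitials` is only an illustrative assumption, not part of the answer.

```php
<?php
// Hypothetical helper name, chosen for illustration only.
function collapseRepeatedInitials(string $str): string
{
    // \pL relies on the /u modifier so the subject is handled as UTF-8.
    // preg_replace() returns null on error (e.g. invalid UTF-8), so fall
    // back to the untouched input in that case.
    return preg_replace('/\b(\pL\W+)\1+/u', '$1', $str) ?? $str;
}

$samples = [
    'Jim G G Bob',
    'Jim G. G. Bob',
    'Tom H H H Ford',
    'Sarah H Howard',
];

foreach ($samples as $sample) {
    echo $sample, ' => ', collapseRepeatedInitials($sample), PHP_EOL;
}
// Jim G G Bob => Jim G Bob
// Jim G. G. Bob => Jim G. Bob
// Tom H H H Ford => Tom H Ford
// Sarah H Howard => Sarah H Howard
```

"Sarah H Howard" is left alone because the captured text is "H " (letter plus space), and what follows it is "Ho", which is not an exact repeat of the group.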

A non-regex way to solve the general problem: explode the string on spaces, loop over the resulting array and unset each element whose value equals the previous one, then implode what remains to get a string free of consecutive duplicates.

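$words = explode(' ', $string);
$previous = null;
foreach ($words as $key => $value) {
    // Strict comparison: with ==, an empty first token would equal null
    // and numeric-looking tokens such as "1" and "1.0" would compare equal.
    if ($value === $previous) {
        unset($words[$key]);
    }
    $previous = $value;
}
$string = implode(' ', $words);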
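As a rough sketch of how this loop behaves on the question's examples, the same idea can be wrapped in a function. The name `dropConsecutiveDuplicates` is an assumption made for illustration. Unlike the regex, it is not limited to single-letter initials: any token that exactly repeats the one before it is dropped, and tokens are assumed to be separated by single spaces.

```php
<?php
// Hypothetical helper name, not from the original answer.
function dropConsecutiveDuplicates(string $string): string
{
    $kept = [];
    foreach (explode(' ', $string) as $word) {
        // Keep a word only when it differs from the last word kept.
        if ($kept === [] || end($kept) !== $word) {
            $kept[] = $word;
        }
    }
    return implode(' ', $kept);
}

echo dropConsecutiveDuplicates('Tom H H H Ford'), PHP_EOL;     // Tom H Ford
echo dropConsecutiveDuplicates('Sarah H Howard'), PHP_EOL;     // Sarah H Howard
echo dropConsecutiveDuplicates('New York York City'), PHP_EOL; // New York City
```

The last call shows the wider scope: a repeated whole word is collapsed too, which the single-letter regex would leave untouched.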