I am new to regular expressions.
I want to replace repeated characters in my string. Here are some examples:
$str1 = "aaa bbb cc dddd"; // output : a b c d
$str2 = "Google is the best"; // output : Google is the best
I found many questions related to this on Stack Overflow, but none of them satisfy my requirement.
I tried this: (\w)\1
but it does not solve my problem.
Any ideas? Thanks in advance.
Edit :
More examples
$str1 = "this is tesaaat. are you ook?"; // output : this is tesaaat. are you ook?
$str2 = "Good morning mmmm yyyy friendssss "; // output : Good morning m y friends
$str3 = "Hello friendd okk"; // output : Hello friend okk
In short, I want to replace repeated characters only when they are followed by a space.
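You can use the following regex: \b(\w)\1+\b
Explanation:
\b # a word boundary
(\w) # a single word character, captured in group 1
\1+ # one or more repetitions of the captured character
\b # a word boundary
EDIT: With more details, I would say you can get rid of the first \b. So, it becomes: (\w)\1+\b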
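As a rough sketch, the pattern without the leading \b could be applied with preg_replace as follows. The helper name collapseTrailingRepeats is only illustrative and not part of any library.

```php
<?php
// Hypothetical helper: collapse a run of one repeated word character
// that ends at a word boundary into a single character.
function collapseTrailingRepeats(string $text): string
{
    // Single quotes keep \1 as a regex backreference; inside double quotes
    // PHP would treat \1 as an octal escape sequence.
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace('/(\w)\1+\b/', '$1', $text) ?? $text;
}

echo collapseTrailingRepeats("aaa bbb cc dddd"), "\n";
echo collapseTrailingRepeats("Good morning mmmm yyyy friendssss "), "\n";
```

Note that \b also matches at the end of the string and before punctuation, not only before a space, so a trailing "okk" would become "ok" with this pattern.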
$text = "aaa bbb cc dddd";
$replacedText = preg_replace('{(\w)\1+}','$1',$text);
If you also want to collapse repeated whitespace, try the following:
$replacedText = preg_replace('{(.)\1+}','$1',$text);
Try something like:
preg_replace('/(\b)(\w)\2+(\b)/', '$2', $string);
The following regex works for letters in any language, using the u (Unicode) flag:
/([\p{L}\W])\1+(?= )/u
Explanations:
( # beginning of 1st capturing group
[ # beginning of characters class
\p{L} # any letter from any language
\W # any non-word character
] # end of character class
) # end of 1st capturing group
\1 # back reference to our 1st capturing group for repetition
+ # one or more character repetition
(?= ) # using positive lookahead to be sure it's followed by a space
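Using preg_replace to achieve the job (note the single quotes: inside a double-quoted PHP string, \1 would be read as an octal escape instead of a backreference):
$string = preg_replace('/([\p{L}\W])\1+(?= )/u', '$1', $string);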
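To try this against several inputs at once, something like the following script should work. It is only a sketch: the helper name collapseBeforeSpace is hypothetical, and the inputs are the examples from the question.

```php
<?php
// Hypothetical helper wrapping the lookahead-based pattern.
function collapseBeforeSpace(string $text): string
{
    // Single-quoted pattern so that \1 reaches the regex engine unchanged.
    // preg_replace returns null on error, so fall back to the input.
    return preg_replace('/([\p{L}\W])\1+(?= )/u', '$1', $text) ?? $text;
}

// Inputs taken from the question.
$examples = [
    "aaa bbb cc dddd ",
    "Google is the best",
    "this is tesaaat. are you ook?",
    "Good morning mmmm yyyy friendssss ",
    "Hello friendd okk",
];

// Print each input next to its result.
foreach ($examples as $input) {
    printf("\"%s\" => \"%s\"\n", $input, collapseBeforeSpace($input));
}
```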
Output for your examples:
"aaa bbb cc dddd " => "a b c d "
"Google is the best" => "Google is the best"
"this is tesaaat. are you ook?" => "this is tesaaat. are you ook?"
"Good morning mmmm yyyy friendssss " => "Good morning m y friends "
"Hello friendd okk" => "Hello friend okk"