Add a new line after every dot in a paragraph, except dots between digits and dots after abbreviations such as Dr., Mr., BSc., etc.
For example, consider this paragraph:
My name is Ayman. I'm 31 years. I'm 1.92M. I have BSc. degree in Computer Engineering
I want to apply a regex that converts it to the following:
My name is Ayman.
I'm 31 years.
I'm 1.92M. <===== note the '.' between 1 and 92 was not replaced with a new line
I have BSc. degree in Computer Engineering <=== likewise, the '.' after BSc was not replaced with a new line
I tried the following, but this regex replaces all dots:
$desc['contents']=preg_split("/(?<!\..)([\?\!\.]+)\s(?!.\.)/",$desc['contents'],-1, PREG_SPLIT_DELIM_CAPTURE);
Try splitting only where the punctuation is followed by whitespace and a capital letter:
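$str = "My name is Ayman. I'm 31 years. I'm 1.92M. I have BSc. degree in Computer Engineering";
// Split on sentence punctuation that is followed by whitespace and a capital letter.
// The punctuation itself is consumed by the split.
$sentences = preg_split('/[?!.]+(?=\s+[A-Z])/', $str);
foreach ($sentences as $sentence) {
    // Re-append the dot that preg_split() removed.
    echo trim($sentence) . ".<br />";
}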
Output
My name is Ayman.
I'm 31 years.
I'm 1.92M.
I have BSc. degree in Computer Engineering.
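If you would rather keep the original punctuation (so that ? and ! are not turned into dots), one possible variant is to split on the whitespace instead of on the punctuation. This is only a sketch: split_sentences is a made-up helper name, and like the code above it assumes every new sentence starts with a capital letter, so "Dr. Smith" would still be split.

```php
<?php
// Hypothetical helper (name chosen for illustration only).
function split_sentences(string $text): array
{
    // Split on whitespace that is preceded by ., ! or ? and followed by
    // a capital letter. The punctuation stays attached to its sentence.
    return preg_split('/(?<=[.!?])\s+(?=[A-Z])/', $text);
}

$str = "My name is Ayman. I'm 31 years. I'm 1.92M. I have BSc. degree in Computer Engineering";
echo implode("\n", split_sentences($str)), "\n";
```

Because nothing is removed by the split, there is no need to re-append a dot afterwards.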
You can use this regex for search:
(?:BSc|[JSMD]r|Mr?s|\d)\.(*SKIP)(*F)|(\.\h*)
and replace with "$1\n":
$str = preg_replace('/(?:BSc|[JSMD]r|Mr?s|\d)\.(*SKIP)(*F)|(\.\h*)/i', "$1\n", $str);
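You can add more word patterns that you want to ignore before the dot to the group (?:BSc|[JSMD]r|Mr?s|\d).
Together, (*SKIP)(*F) provide a neat way around the restriction that you cannot have a variable-length lookbehind in the above regex: whatever the left-hand alternative matches is skipped and discarded, so only the remaining dots are replaced.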
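To make the list of ignored words easier to extend, the pattern could be built from an array. This is a rough sketch only: break_after_sentences and the default abbreviation list are my own illustrative choices, not part of the regex above. It keeps the same \d rule, so a dot directly after a digit is never replaced.

```php
<?php
// Hypothetical helper; the default $abbreviations list is just an example.
function break_after_sentences(
    string $text,
    array $abbreviations = ['BSc', 'Dr', 'Mr', 'Mrs', 'Ms']
): string {
    // Escape each abbreviation so it is safe inside the pattern.
    $quoted = array_map(function ($abbr) {
        return preg_quote($abbr, '/');
    }, $abbreviations);

    // Left side: abbreviation or digit followed by a dot, matched and then
    // thrown away by (*SKIP)(*F). Right side: any other dot plus trailing
    // horizontal whitespace, which is replaced by a dot and a new line.
    $pattern = '/(?:\b(?:' . implode('|', $quoted) . ')|\d)\.(*SKIP)(*F)|\.\h*/i';

    return preg_replace($pattern, ".\n", $text);
}

$str = "My name is Ayman. I'm 31 years. I'm 1.92M. I have BSc. degree in Computer Engineering";
echo break_after_sentences($str), "\n";
```

Adding another title is then just a matter of passing a longer array, for example break_after_sentences($str, ['BSc', 'Dr', 'Prof']).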
I think you can use capturing group like this:
/\.\d|BSc\.|Mrs?\.|Dr\.|([.!?])/
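Then replace only the matches where group \1 captured something, using the captured character followed by a new line, and leave the other matches unchanged.
Note that I think you need to ignore a . before a number, like .1, not after a number, as in "And counter is 30."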
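One way this capturing-group idea could be written in PHP is with preg_replace_callback, so that matches without a captured group are put back untouched. Treat it as a sketch under a few assumptions of mine: the function name newline_after_sentences is hypothetical, I used a lookahead \.(?=\d) so the digit after the dot is not consumed, and I added \b and \h* for tidiness.

```php
<?php
// Hypothetical helper illustrating the capturing-group approach.
function newline_after_sentences(string $text): string
{
    // The first two alternatives match dots that must be kept as they are.
    // Only the last alternative captures, so group 1 marks a real sentence end.
    $pattern = '/\.(?=\d)|\b(?:BSc|Mrs?|Dr)\.|([.!?])\h*/';

    return preg_replace_callback($pattern, function (array $m) {
        // Group 1 is present only when sentence punctuation was matched.
        return isset($m[1]) ? $m[1] . "\n" : $m[0];
    }, $text);
}

echo newline_after_sentences("And counter is 30. Next value is .1 here. Ask Dr. Who?");
```

Since only a dot before a digit is ignored, the dot in "And counter is 30." still ends the line.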