I see many examples here of how to split a string into sentences based on the period (.),
but my question is how to split a string into chunks based on word count, ignoring any . or , characters.
For example:
function splitToSentences($wordsCount){
....
}
$string = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
print_r( splitToSentences(10) );
Output:
[
'0' => 'orem ipsum dolor sit amet, consectetur adipiscing elit, sed do',
'1' => 'eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut',
.....
]
$content = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
$dataSet = explode(' ', $content);
$dataSet = array_chunk($dataSet, 10);
$dataSet = array_map(function($string) {
return implode(' ', $string);
},$dataSet);
var_dump($dataSet);
Result:
array(7) {
[0]=>
string(62) "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do"
[1]=>
string(62) "eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut"
[2]=>
string(68) "enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi"
[3]=>
string(57) "ut aliquip ex ea commodo consequat. Duis aute irure dolor"
[4]=>
string(64) "in reprehenderit in voluptate velit esse cillum dolore eu fugiat"
[5]=>
string(71) "nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in"
[6]=>
string(54) "culpa qui officia deserunt mollit anim id est laborum."
}
Working example: http://ideone.com/zNXPi3
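The same explode / array_chunk / implode idea can be wrapped in a reusable function. This is only a sketch: the name chunkWords and its signature are hypothetical and not part of the answer above. It also swaps explode(' ', ...) for preg_split on runs of whitespace, on the assumption that the input may contain double spaces or newlines, which explode would otherwise turn into empty "words".

```php
<?php
// Hypothetical helper: the name and signature are illustrative, not from the answer.
function chunkWords(string $text, int $wordsCount): array
{
    if ($wordsCount < 1) {
        throw new InvalidArgumentException('wordsCount must be at least 1');
    }
    // Split on any run of whitespace and drop empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    // Group the words, then glue each group back together with single spaces.
    return array_map(
        function (array $chunk): string {
            return implode(' ', $chunk);
        },
        array_chunk($words, $wordsCount)
    );
}

print_r(chunkWords("one  two\nthree four five", 2));
```

Note that whitespace inside each chunk is normalized to single spaces, which may or may not be what you want.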
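Try this. It splits the text into sentences first, then joins whole sentences until a group holds at least 10 words, so each group ends on a sentence boundary rather than at an exact word count: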
$string = 'Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source.';
$sentences = preg_split('/(?<=[.?!;])\s+(?=\p{Lu})/', $string);
$ii = 0;
$paragraphs = array();
foreach ( $sentences as $value ) {
if ( isset($paragraphs[$ii]) ) {
$paragraphs[$ii] .= ' ' . $value; // keep a space between joined sentences
} else {
$paragraphs[$ii] = $value;
}
if ( str_word_count($paragraphs[$ii]) > 9 ) {
$ii++;
}
}
print_r($paragraphs);
Hope this helps.
Peace! xD
I'm not familiar with regex in PHP, but I believe this regex will do the trick:
((?:\s*\S+){10})\s*
It matches 10 words, each preceded by any amount of whitespace (spaces or newlines), and then consumes any trailing whitespace. The '10' is the number of words per match.
Demo: https://regex101.com/r/yR5uZ8/3
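For use from PHP, here is a minimal sketch of a variant of that pattern. It is not the regex above: it requires whitespace between words and uses a {0,N-1} range, so a shorter final group is matched explicitly instead of relying on backtracking, and the matches carry no trailing whitespace. The function name splitByWordCount and its signature are hypothetical.

```php
<?php
// Hypothetical helper: the name and signature are illustrative only.
function splitByWordCount(string $text, int $wordsCount): array
{
    if ($wordsCount < 1) {
        throw new InvalidArgumentException('wordsCount must be at least 1');
    }
    // One word, then up to ($wordsCount - 1) more words, each after whitespace.
    $pattern = '/\S+(?:\s+\S+){0,' . ($wordsCount - 1) . '}/';
    preg_match_all($pattern, $text, $matches);
    // $matches[0] holds the full text of every match, in order.
    return $matches[0];
}

print_r(splitByWordCount('a b c d e f g', 3));
```

Whitespace inside each match is kept exactly as it appears in the input, so newlines between words survive.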
This seems to work:
<?php
function splitToSentences($wordsCount) {
$str = "orem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
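// Capture group 1 holds each run of $wordsCount words; the trailing \s* stays outside it.
preg_match_all("/((?:\s*\S+){" . (int) $wordsCount . "})\s*/", $str, $match);
// $match[0] would include the trailing whitespace, so return the captured groups instead.
return $match[1];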
}
print_r(splitToSentences(10));
Test online: http://ideone.com/ADpFff