Given a text, how could I count the number (and density) of words of each length, so that I get an output like this?
I found this, but it is for Python.
You could start by splitting your text into words, using either explode() (as a very/too simple solution) or preg_split() (allows for stuff that's a bit more powerful):
$text = "this is some kind of text with several words";
$words = explode(' ', $text);
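For text that contains punctuation or irregular spacing, the preg_split() route mentioned above could look roughly like this. It is only a sketch: the sample text is made up, and the pattern (split on anything that is not a letter, digit or apostrophe) is one possible choice, not the only sensible one.

```php
<?php
// Hypothetical sample text with punctuation and irregular spacing.
$text = "Hello, world!  This is some text; it has punctuation.";

// Split on runs of anything that is not a Unicode letter, digit or apostrophe.
// PREG_SPLIT_NO_EMPTY drops the empty strings produced at the edges.
$words = preg_split("/[^\p{L}\p{N}']+/u", $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($words);
```

With explode(' ', ...) the same text would yield entries such as "Hello," and an empty string for the double space, which would skew the length counts.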
Then, iterate over the words, getting, for each one of those, its length, using strlen(); and putting those lengths into an array:
$results = array();
foreach ($words as $word) {
    $length = strlen($word);
    if (isset($results[$length])) {
        $results[$length]++;
    }
    else {
        $results[$length] = 1;
    }
}
If you're working with UTF-8, see mb_strlen().
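To illustrate why that matters, here is a small sketch comparing the two functions. It assumes the file is saved as UTF-8 and that the mbstring extension is enabled; the sample word is made up.

```php
<?php
// Hypothetical sample word; assumes this source file is saved as UTF-8
// and that the mbstring extension is available.
$word = "café";

$byteLength = strlen($word);             // counts bytes: "é" takes 2 bytes in UTF-8
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

echo "strlen: $byteLength, mb_strlen: $charLength\n"; // strlen: 5, mb_strlen: 4
```

So with strlen() a word containing accented letters would be counted in the wrong length bucket.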
At the end of that loop, $results would look like this:
array
4 => int 5
2 => int 2
7 => int 1
5 => int 1
The total number of words, which you'll need to calculate the percentage, can be found either:
- by incrementing a counter inside the foreach loop,
- or by calling array_sum() on $results after the loop is done.
And for the percentages' calculation, it's a bit of maths, so I won't be that helpful about that ^^
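For what it's worth, the percentage step could be sketched like this, taking the array_sum() route. The $results array is hard-coded here to the values shown above so the snippet runs on its own; rounding to one decimal and the $percentages name are just choices made for this example.

```php
<?php
// $results as shown above for the sample sentence
// (word length => number of words).
$results = array(4 => 5, 2 => 2, 7 => 1, 5 => 1);

$total = array_sum($results); // 9 words in total

ksort($results); // sort by word length for readable output

$percentages = array();
foreach ($results as $length => $count) {
    // Share of all words that have this length, in percent.
    $percentages[$length] = round($count * 100 / $total, 1);
    echo "$length letters: $count word(s), {$percentages[$length]}%\n";
}
```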
You could explode the text by spaces and then for each resulting word, count the number of letters. If there are punctuation symbols or any other word separator, you must take this into account.
$lettercount = array();
$text = "lorem ipsum dolor sit amet";
foreach (explode(' ', $text) as $word)
{
    @$lettercount[strlen($word)]++; // @ for avoiding E_NOTICE on first addition
}
foreach ($lettercount as $numletters => $numwords)
{
    echo "$numletters letters: $numwords<br />\n";
}
PS: I have not tested this, but it should work.
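If you would rather not rely on the @ operator, one possible alternative (a different technique from the loop above, not a correction of it) is to map each word to its length and let array_count_values() do the counting:

```php
<?php
$text = "lorem ipsum dolor sit amet";

// Map every word to its length, then count how often each length occurs.
$lengths = array_map('strlen', explode(' ', $text));
$lettercount = array_count_values($lengths);

foreach ($lettercount as $numletters => $numwords) {
    echo "$numletters letters: $numwords\n";
}
```

The same caveat about punctuation and other word separators applies here.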
You can be smarter about removing punctuation by using preg_replace.
$txt = "Sean Hoare, who was first named News of the World journalist to make hacking allegations, found dead at Watford home. His death is not being treated as suspiciou";
$txt = str_replace( "  ", " ", $txt ); // collapse double spaces into single ones
$txt = str_replace( ".", "", $txt );
$txt = str_replace( ",", "", $txt );
$a = explode( " ", $txt );
$cnt = array();
foreach ( $a as $b )
{
if ( isset( $cnt[strlen($b)] ) )
$cnt[strlen($b)] += 1;
else
$cnt[strlen($b)] = 1;
}
foreach ( $cnt as $k => $v )
{
echo $k . " letter words: " . $v . " " . round( ( $v * 100 ) / count( $a ) ) . "%\n";
}
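The preg_replace idea mentioned at the top of this answer could be sketched as follows. The sample text and both patterns are assumptions made for illustration; in particular, this simple pattern also strips apostrophes and hyphens, so "don't" would become "dont".

```php
<?php
// Hypothetical sample text.
$txt = "Found dead at Watford home.  His death is not being treated as suspicious, police said.";

// Remove everything that is not a letter, digit or whitespace...
$txt = preg_replace('/[^\p{L}\p{N}\s]+/u', '', $txt);
// ...then collapse runs of whitespace into a single space and trim the ends.
$txt = trim(preg_replace('/\s+/', ' ', $txt));

$words = explode(' ', $txt);
echo count($words) . " words\n";
```

After this cleanup, explode() no longer produces empty entries, so count($words) is a safer denominator for the percentages.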
My simple way to check whether any word in a string exceeds a given number of characters, in PHP.
function checkWord_len($string, $nr_limit) {
    $d = null; // stays null when no word exceeds the limit
    $text_words = explode(" ", $string); // split the text into an array of words
    $text_count = count($text_words);
    for ($i = 0; $i < $text_count; $i++) {
        $cc = strlen($text_words[$i]); // length in characters of the current word
        if ($cc > $nr_limit) { // check the limit
            $d = "0";
            break; // one over-long word is enough
        }
    }
    return $d; // "0" if a word is too long, otherwise null
}
$string_to_check = " here is your text to check"; // text to check
$nr_string_limit = 5; // maximum allowed word length
$rez_fin = checkWord_len($string_to_check, $nr_string_limit);
if ($rez_fin === '0')
{
    echo "false";
    // execute the "false" code
}
elseif ($rez_fin === null)
{
    echo "true";
    // execute the "true" code
}
?>