I'm trying to trim some HTML text. I found a related thread, but I can't comment on it yet because I'm new (Using PHP substr() and strip_tags() while retaining formatting and without breaking HTML).
First I created the function preview() (input: HTML or plain text, the number of characters, and a boolean that requests plain-text output), but when I tried to extend it to work with HTML tags, the problems began.
I used the function html_cut() from the other post to close tags, but I need some nested tags, and I think the function closes every tag it finds, so it breaks the hierarchy. (Is that in fact the problem, or am I wrong?)
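function preview($text, $char, $sinhtml){
    $len = strlen($text);
    if($len > $char){
        // Move the cut point to the nearest space so no word is split.
        $post = substr($text, $char, 1);
        if ($post != " "){
            $i = true; // true while scanning backward for a space
            // The $char < $len check keeps a text without spaces from looping forever.
            while($post != " " && $char < $len){
                if($char > 0 && $i){
                    $char--;
                    $post = substr($text, $char, 1);
                }elseif($char == 0){
                    // No space before the cut point: scan forward instead.
                    $i = false;
                    $char++;
                }else{
                    $char++;
                    $post = substr($text, $char, 1);
                }
            }
        }
        $post = substr($text, 0, $char);
        $post .= " …";
        if($sinhtml){
            return strip_tags($post);
        }else{
            return $post; // <-- the HTML is returned as cut, so tags can be left unclosed
        }
    }else{
        return $text;
    }
}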
The input text is something like this:
<p> Some text… </p>
<ul>
<li>Technical Description</li>
<li>or Details (weight, size, etc.)</li>
<li>…</li>
</ul>
<p>may be some more text</p>
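To make the failure concrete, here is a minimal sketch of what a plain substr() cut does to markup like the sample above. The shortened input string, the variable names, and the cut length of 45 are illustrative assumptions, not part of my real code.

```php
<?php
// Illustrative input modeled on the sample above (assumed and shortened).
$html = '<p>Some text</p><ul><li>Technical Description</li><li>Details</li></ul>';

// Cutting by byte count ignores tag boundaries.
$cut = substr($html, 0, 45);
echo $cut, PHP_EOL;

// Count the <ul>/<li> tags that were opened and closed inside the cut string.
$opened = substr_count($cut, '<li>') + substr_count($cut, '<ul>');
$closed = substr_count($cut, '</li>') + substr_count($cut, '</ul>');
echo "opened: $opened, closed: $closed", PHP_EOL;
```

The cut string ends inside the first list item, so both the <li> and the enclosing <ul> stay open, and whatever closes them has to do so in the right order.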
The function html_cut() has a line that I've never seen before, and I don't know what it does: $symbol = $text{$i}
function html_cut($text, $max_length)
{
    $tags = array();
    $result = "";
    $is_open = false;
    $grab_open = false;
    $is_close = false;
    $in_double_quotes = false;
    $in_single_quotes = false;
    $tag = "";
    $i = 0;
    $stripped = 0;
    $stripped_text = strip_tags($text);
    while ($i < strlen($text) && $stripped < strlen($stripped_text) && $stripped < $max_length)
    {
        // Character at offset $i (written $text{$i} in the original post; PHP 8 no longer accepts that form).
        $symbol = $text[$i];
        $result .= $symbol;
        switch ($symbol)
        {
            case '<':
                $is_open = true;
                $grab_open = true;
                break;
            case '"':
                if ($in_double_quotes)
                    $in_double_quotes = false;
                else
                    $in_double_quotes = true;
                break;
            case "'":
                if ($in_single_quotes)
                    $in_single_quotes = false;
                else
                    $in_single_quotes = true;
                break;
            case '/':
                if ($is_open && !$in_double_quotes && !$in_single_quotes)
                {
                    $is_close = true;
                    $is_open = false;
                    $grab_open = false;
                }
                break;
            case ' ':
                if ($is_open)
                    $grab_open = false;
                else
                    $stripped++;
                break;
            case '>':
                if ($is_open)
                {
                    $is_open = false;
                    $grab_open = false;
                    array_push($tags, $tag);
                    $tag = "";
                }
                else if ($is_close)
                {
                    $is_close = false;
                    array_pop($tags);
                    $tag = "";
                }
                break;
            default:
                if ($grab_open || $is_close)
                    $tag .= $symbol;
                if (!$is_open && !$is_close)
                    $stripped++;
        }
        $i++;
    }
    // Close whatever is still open, innermost tag first.
    while ($tags)
        $result .= "</".array_pop($tags).">";
    return $result;
}
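About the $symbol = $text{$i} line: as far as I can tell, it reads the single character (byte) at zero-based offset $i of the string. The curly-brace form was deprecated in PHP 7.4 and removed in PHP 8.0; square brackets do the same thing. A small sketch, where the sample string and variable names are assumptions for illustration:

```php
<?php
$text = '<li>abc</li>';

// Square-bracket offset: the byte at a zero-based position.
$first = $text[0];
$fifth = $text[4];

// Walking a string one byte at a time, the way html_cut() does.
$symbols = [];
for ($i = 0; $i < strlen($text); $i++) {
    $symbols[] = $text[$i];
}

echo $first, ' ', $fifth, ' ', count($symbols), PHP_EOL; // prints "< a 12"
```

Note that the offset counts bytes, not characters, so a multibyte character such as "…" in UTF-8 text spans several offsets.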
Try using an HTML parser or HTML Tidy to check and close the nested tags.
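A rough sketch of the parser approach, assuming the DOM extension (DOMDocument) is available and the input is UTF-8: load the truncated fragment inside a temporary wrapper element, let libxml close whatever was left open, and serialize the wrapper's children back out. The function name close_open_tags and the wrapper div are hypothetical choices for illustration, and the exact whitespace of the output may differ between libxml versions.

```php
<?php
// Hypothetical helper: closes tags left open in a truncated HTML fragment.
// Assumes the DOM extension is loaded and the input is UTF-8.
function close_open_tags(string $fragment): string
{
    $doc = new DOMDocument();

    // Truncated markup triggers parser warnings; collect them silently.
    $previous = libxml_use_internal_errors(true);

    // The meta tag tells libxml the bytes are UTF-8; the div is a temporary
    // wrapper so the fragment's top-level nodes are easy to find afterwards.
    $doc->loadHTML(
        '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
        . '<div>' . $fragment . '</div>'
    );

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The first div in document order is the wrapper; emit only its children.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $html = '';
    foreach ($wrapper->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

// Example: a fragment cut in the middle of a list item.
$repaired = close_open_tags('<p>Some text</p><ul><li>Technical Description');
echo $repaired, PHP_EOL;
```

Because the parser builds a tree before anything is written out, nested tags are closed in the correct order, which is the part that is hard to get right by hand.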