I have the following HTML, which I extracted from an email using imap_fetchbody:
<div dir=\"ltr\"><br><div class=\"gmail_quote\"><div dir=\"ltr\"><br><div class=\"gmail_quote\"><div class=\"\">
---------- Forwarded message ----------<br>
<span style=\"font-family:"Helvetica","sans-serif"\"><\/span>
From: <span style=\"font-family:"Helvetica","sans-serif"\">"
<span>xyz<\/span>" <<a href=\"mailto:support@xyz.com\" target=\"_blank\">support@<span>xyz<\/span>.com<\/a>><\/span><br>
Date: Fri, Apr 18, 2014 at 7:17 PM<br>
Subject: Bla bla xyz<br><\/div><div><div class=\"h5\">To: XYZ <<a href=\"mailto:xyz@gmail.com\" target=\"_blank\">xyz@gmail.com<\/a>><br><br><br>
<div dir=\"ltr\">
<div class=\"gmail_quote\"><div><div><div dir=\"ltr\"><div class=\"gmail_quote\"><div dir=\"ltr\"><div><div class=\"gmail_quote\">
<div dir=\"ltr\"><div><div><div class=\"gmail_quote\"><div style=\"word-wrap:break-word\" lang=\"EN-US\">
<div>
<div>
<div>
<blockquote style=\"margin-top:5pt;margin-bottom:5pt\">
<div><div>
<table style=\"width:100%;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"100%\">
<tbody>
<tr>
<td style=\"width:325pt;padding:0in\" width=\"650\">
<div align=\"center\"><table style=\"width:325pt;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"650\">
<tbody><tr>
<td style=\"padding:0in 0in 5.25pt\"><p style=\"text-align:center\" align=\"center\">
<span style=\"font-size:7.5pt;font-family:"Arial","sans-serif";color:rgb(64,64,64)\">If you are unable to see this message,
<a href=\"http:\/\/click.e.xyz.com\/?qs=3771d7c90c958f02a4b2e78494f12a3116ddb15df79b8d04cdf5aeba42012b118\" target=\"_blank\">
<span style=\"color:rgb(64,64,64)\">click here<\/span><\/a> to view.<br>
To ensure delivery to your inbox, please add <a href=\"mailto:support@xyz.com\" target=\"_blank\">support@xyz.com<\/a> to your address book. <\/span><\/p>
<\/td>
<\/tr>
<\/tbody>
<\/table>
<\/div><\/div><\/div><\/div>
I want to get rid of all the \, \r and \n sequences and still keep the < and > of the HTML as is. I have tried stripslashes, stripcslashes, nl2br and htmlspecialchars_decode, but I am not able to achieve what I want. Here is what I have tried, along with the imap_qprint function:
$text = stripslashes(imap_qprint($text));
$body = preg_replace('/(\v|\s)+/', ' ', $text );
Result: it doesn't remove all the whitespace characters.
Match the following regex:
(\\r|\\n|\\)
with the g modifier and replace it with '' (the empty string).
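In PHP the g modifier is not needed, because preg_replace replaces every match by default. The following is a minimal sketch, assuming the message holds literal backslash sequences (a backslash followed by r or n) rather than real line breaks. The sample string and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample: literal backslash sequences, similar to the dump in the question.
$raw = '<div dir=\"ltr\">Hello<\/div>\r\n<br>';

// Inside a single-quoted PHP string, \\\\ reaches PCRE as \\, which matches
// one literal backslash. The longer alternatives come first so that the
// two-character sequences \r and \n are removed as a unit.
$clean = preg_replace('/\\\\r|\\\\n|\\\\/', '', $raw);

echo $clean, PHP_EOL; // <div dir="ltr">Hello</div><br>
```

Note that this also strips a backslash followed by r or n inside ordinary text (a Windows path, for example), so it only suits input where every backslash is an escape.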
If string functions can do the trick, always favor string functions over regexes. Performance will be better than with a regex, and they are easier to read in the code:
$message = str_replace("\r\n", '', $message); // replace all newlines, use double quotes!
$message = stripslashes( $message );
First you have to remove the newlines. As far as I can tell, the \r and \n always come together, so I replace them in one go. After that, stripslashes will remove all the escaping backslashes.
You have to run stripslashes after removing the newlines, otherwise \r\n would turn into rn, making them harder to find.
This works perfectly in my tests:
echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';
echo '<hr />';
$message = str_replace("\r\n", '', $message); // use double quotes!
echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';
echo '<hr />';
$message = stripslashes($message);
echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';
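One caveat on quoting: a double-quoted "\r\n" matches real CR LF characters. If the dump instead contains the two-character text sequences (a backslash followed by r or n), which is what the remark about rn suggests, the search string would need single quotes. Here is a small sketch under that assumption, with a made-up sample string, showing why the order of the two steps matters:

```php
<?php
// Hypothetical sample containing the literal text \r\n (not real line breaks).
$raw = '<p class=\"x\">one<\/p>\r\n<p>two<\/p>';

// Wrong order: stripslashes() turns the literal \r\n into the letters "rn".
$wrong = stripslashes($raw);

// Right order: remove the literal \r\n first (single quotes keep the
// backslashes in the search string), then unescape what is left.
$right = stripslashes(str_replace('\r\n', '', $raw));

echo $wrong, PHP_EOL; // <p class="x">one</p>rn<p>two</p>
echo $right, PHP_EOL; // <p class="x">one</p><p>two</p>
```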
$html = preg_replace('/[\\\\\r\n]/', '', $html);
Match a single character present in the list below «[\\\r\n]»
A \ character «\\»
A carriage return character «\r»
A line feed character «\n»
UPDATE:
Based on your comment I've updated my answer:
$html = preg_replace('%\\\\/%sm', '/', $html);
$html = preg_replace('/\\\\"/sm', '"', $html);
$html = preg_replace('/[\r\n]/sm', '', $html);
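As a quick check of the three replacements, here is a sketch run against a made-up input that has escaped quotes, escaped slashes and one real CRLF line break. The s and m modifiers are left out because they only affect the dot and the ^ and $ anchors, none of which appear in these patterns.

```php
<?php
// Hypothetical input: escaped quotes and slashes plus a real CRLF line break.
$html = '<a href=\"http:\/\/example.com\/\">link<\/a>' . "\r\n" . '<br>';

$html = preg_replace('%\\\\/%', '/', $html);  // \/ becomes /
$html = preg_replace('/\\\\"/', '"', $html);  // \" becomes "
$html = preg_replace('/[\r\n]/', '', $html);  // drop real CR and LF characters

echo $html, PHP_EOL; // <a href="http://example.com/">link</a><br>
```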
You could use something like this to interpret the escape sequences:
function interpret_escapes($str) {
    return preg_replace_callback('/\\\\(.)/u', function($matches) {
        $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'v' => "\v", 'e' => "\e", 'f' => "\f"];
        return isset($map[$matches[1]]) ? $map[$matches[1]] : $matches[1];
    }, $str);
}
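A possible way to use it, sketched with a made-up sample string: once the escapes have been interpreted, \r and \n are real characters, so a plain str_replace can drop them. The function is repeated here only so that the snippet runs on its own.

```php
<?php
// Same function as above, repeated so the snippet is self-contained.
function interpret_escapes($str) {
    return preg_replace_callback('/\\\\(.)/u', function ($matches) {
        $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'v' => "\v", 'e' => "\e", 'f' => "\f"];
        return isset($map[$matches[1]]) ? $map[$matches[1]] : $matches[1];
    }, $str);
}

// Hypothetical sample with literal backslash sequences.
$raw = '<div dir=\"ltr\">a<\/div>\r\n<br>';

// Step 1: turn \" into ", \/ into /, and \r\n into a real CR LF pair.
$decoded = interpret_escapes($raw);

// Step 2: remove the real line-break characters.
$flat = str_replace(["\r", "\n"], '', $decoded);

echo $flat, PHP_EOL; // <div dir="ltr">a</div><br>
```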
If you can open the file in vi, it would be as easy as:
%s/\\|\ //g
in vi command mode.