Remove \r\n and escape characters from HTML

I have the following HTML, which I extracted from an email using imap_fetchbody:

<div dir=\"ltr\"><br><div class=\"gmail_quote\"><div dir=\"ltr\"><br><div class=\"gmail_quote\"><div class=\"\">
---------- Forwarded message ----------<br>
<span style=\"font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;\"><\/span>
From: <span style=\"font-family:&quot;Helvetica&quot;,&quot;sans-serif&quot;\">&quot;
<span>xyz<\/span>&quot; &lt;<a href=\"mailto:support@xyz.com\" target=\"_blank\">support@<span>xyz<\/span>.com<\/a>&gt;<\/span><br>




Date: Fri, Apr 18, 2014 at 7:17 PM<br>
Subject: Bla bla xyz<br><\/div><div><div class=\"h5\">To: XYZ &lt;<a href=\"mailto:xyz@gmail.com\" target=\"_blank\">xyz@gmail.com<\/a>&gt;<br><br><br>

<div dir=\"ltr\">




<div class=\"gmail_quote\"><div><div><div dir=\"ltr\"><div class=\"gmail_quote\"><div dir=\"ltr\"><div><div class=\"gmail_quote\">
<div dir=\"ltr\"><div><div><div class=\"gmail_quote\"><div style=\"word-wrap:break-word\" lang=\"EN-US\">




<div>
<div>
<div>
<blockquote style=\"margin-top:5pt;margin-bottom:5pt\">
<div><div>
<table style=\"width:100%;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"100%\">
<tbody>
<tr>




<td style=\"width:325pt;padding:0in\" width=\"650\">

<div align=\"center\"><table style=\"width:325pt;background:none repeat scroll 0% 0% rgb(207,207,207)\" cellpadding=\"0\" cellspacing=\"0\" border=\"0\" width=\"650\">




<tbody><tr>
<td style=\"padding:0in 0in 5.25pt\"><p style=\"text-align:center\" align=\"center\">
<span style=\"font-size:7.5pt;font-family:&quot;Arial&quot;,&quot;sans-serif&quot;;color:rgb(64,64,64)\">If you are unable to see this message, 
<a href=\"http:\/\/click.e.xyz.com\/?qs=3771d7c90c958f02a4b2e78494f12a3116ddb15df79b8d04cdf5aeba42012b118\" target=\"_blank\">
<span style=\"color:rgb(64,64,64)\">click here<\/span><\/a> to view.<br>










To ensure delivery to your inbox, please add <a href=\"mailto:support@xyz.com\" target=\"_blank\">support@xyz.com<\/a> to your address book. <\/span><\/p>
<\/td>
<\/tr>
<\/tbody>
<\/table>
<\/div><\/div><\/div><\/div>

I want to get rid of all the \, \r and \n, and still keep the < and > of the HTML as is. I have tried stripslashes, stripcslashes, nl2br and htmlspecialchars_decode, but I have not been able to achieve what I want. Here is what I tried, together with the imap_qprint function:

$text = stripslashes(imap_qprint($text));
$body = preg_replace('/(\v|\s)+/', ' ', $text );

Result: it doesn't remove all the whitespace characters.

Match the following regex:

(\\r|\\n|\\) with the g modifier

and replace with

'' (empty string)

Demo: http://regex101.com/r/mS3wM2
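
In PHP, the pattern above could be applied with preg_replace along these lines. This is only a sketch: the $text sample is made up for illustration, and it assumes the input holds literal backslash sequences (a backslash followed by r or n) rather than real control characters.

```php
<?php
// Hypothetical sample: single quotes keep \r, \n, \" and \/ as literal characters.
$text = '<div dir=\"ltr\">Hello<\/div>\r\n<br>';

// Four backslashes in a single-quoted PHP string reach the regex engine as two,
// which the engine reads as one literal backslash.
$clean = preg_replace('/\\\\r|\\\\n|\\\\/', '', $text);

echo $clean, PHP_EOL; // <div dir="ltr">Hello</div><br>
```

Because the last alternative is a bare backslash, the same call also strips the backslash in front of any other escaped character, such as \" and \/.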

If string functions can do the trick, always favor string functions over regexes. Performance will be better than with a regex, and they are easier to read in the code:

$message = str_replace("\r\n", '', $message ); // replace all newlines, use double quotes!
$message = stripslashes( $message );

First you have to remove the newlines. As far as I can tell, the \r and \n always come together, so I replace them in one go. After that, stripslashes will remove all the escaping backslashes.
You have to call stripslashes after removing the newlines, otherwise \r\n would turn into rn, making them harder to find.
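
Whether double or single quotes are needed depends on what the string really contains, so here is a small sketch of both cases and of the ordering issue. The $message sample is hypothetical and assumes the input holds the literal characters backslash, r, backslash, n rather than a real CR+LF pair.

```php
<?php
// Hypothetical sample containing the literal four-character sequence \r\n.
$message = '<p>one<\/p>\r\n<p>two<\/p>';

// "\r\n" in double quotes is a real CR+LF pair, which this sample does not contain.
$unchanged = str_replace("\r\n", '', $message);

// '\r\n' in single quotes is the literal sequence backslash, r, backslash, n.
$noNewlines = str_replace('\r\n', '', $message);

// Only now strip the remaining escaping backslashes.
$result = stripslashes($noNewlines);

// Wrong order: stripslashes first turns the literal \r\n into "rn".
$wrongOrder = stripslashes($message);

echo $result, PHP_EOL;     // <p>one</p><p>two</p>
echo $wrongOrder, PHP_EOL; // <p>one</p>rn<p>two</p>
```

If the body contains real carriage returns and line feeds instead, the double-quoted "\r\n" is the form that matches.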


This works perfectly in my tests:

echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';
echo '<hr />';

$message = str_replace("\r\n", '', $message); // use double quotes!
echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';
echo '<hr />';

$message = stripslashes($message);
echo '<textarea style="width:100%; height: 33%;">'.$message.'</textarea>';

$html = preg_replace('/[\\\\\r\n]/', '', $html);

Match a single character present in the list below «[\\\r\n]»
   A \ character «\\»
   A carriage return character «\r»
   A line feed character «\n»

UPDATE:

Based on your comment I've updated my answer:

$html = preg_replace('%\\\\/%sm', '/', $html);
$html = preg_replace('/\\\\"/sm', '"', $html);
$html = preg_replace('/[\r\n]/sm', '', $html);

You could use something like this to interpret the escape sequences:

function interpret_escapes($str) {
    return preg_replace_callback('/\\\\(.)/u', function($matches) {
        $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'v' => "\v", 'e' => "\e", 'f' => "\f"];
        return isset($map[$matches[1]]) ? $map[$matches[1]] : $matches[1];
    }, $str);
}
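
As a rough usage sketch, the function could be combined with str_replace to drop the line breaks once they have been decoded. The function is repeated so that the snippet runs on its own, and the $raw sample is hypothetical: it assumes the input holds literal backslash sequences.

```php
<?php
// Copy of the function above so that this snippet runs on its own.
function interpret_escapes($str) {
    return preg_replace_callback('/\\\\(.)/u', function ($matches) {
        $map = ['n' => "\n", 'r' => "\r", 't' => "\t", 'v' => "\v", 'e' => "\e", 'f' => "\f"];
        return isset($map[$matches[1]]) ? $map[$matches[1]] : $matches[1];
    }, $str);
}

// Hypothetical sample with literal backslash sequences (single quotes keep them as-is).
$raw = '<a href=\"mailto:support@xyz.com\">support<\/a>\r\n<br>';

// Step 1: turn \" into ", \/ into / and \r\n into a real CR+LF pair.
$decoded = interpret_escapes($raw);

// Step 2: the line breaks are now real control characters, so double quotes match them.
$html = str_replace(["\r", "\n"], '', $decoded);

echo $html, PHP_EOL; // <a href="mailto:support@xyz.com">support</a><br>
```

Decoding first and removing the line breaks afterwards avoids the "rn" leftovers that appear when the backslashes are stripped blindly.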

If you can open the file in vi, it would be as easy as:

%s/\\r\|\\n//g

in vi command mode.