Remove http: from img src [closed]

Using PHP, is it possible to remove the http: protocol from an img src?

So img src will be:

<img src="//www.example.com/image.jpg" />

instead of

<img src="http://www.example.com/image.jpg" />

Would str_replace be a good option here? I know I can define:

$contentImg = str_replace(array('http', 'https'), '', $filter);

I'm just not sure how to define $filter.

Assuming that $filter works and the source is fetched correctly, you can also use a regular expression replace:

$contentImg = preg_replace('/^https?:/','', $string);

'/^https?:/' is a regex:

- the ^ character matches the beginning of the string, so only a protocol at the very front is removed;
- the ? is a special character that makes the preceding s optional, so the pattern matches both http: and https:.
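To see the anchor and the optional s at work, here is a minimal sketch. The helper name stripHttpScheme and the sample URLs are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical helper (name chosen for this example only).
function stripHttpScheme(string $url): string
{
    // ^ anchors the match at the start of the string; s? makes the "s" optional.
    return preg_replace('/^https?:/', '', $url);
}

// Sample inputs: two with a leading scheme, one where "http:" appears later in the string.
$samples = [
    'http://www.example.com/image.jpg',
    'https://www.example.com/image.jpg',
    '/mirror/http:/image.jpg',
];

foreach ($samples as $url) {
    echo stripHttpScheme($url), "\n";
}
```

The third value comes back unchanged, because its http: is not at the start of the string.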

Using regexes, you can write some replacements more compactly. Say (for the sake of the answer) that you also wish to remove ftp and sftp; you can use:

'/^(https?|s?ftp):/'

Here | means "or", and the parentheses are for grouping.
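As a rough illustration of the grouped alternation, the sketch below runs that pattern over a few sample URLs (hypothetical values, chosen only to cover each branch plus one non-match):

```php
<?php
// (https?|s?ftp) matches http, https, ftp or sftp at the start of the string.
$pattern = '/^(https?|s?ftp):/';

// Hypothetical sample inputs.
$samples = [
    'http://files.example.com/a.png',
    'https://files.example.com/a.png',
    'ftp://files.example.com/a.png',
    'sftp://files.example.com/a.png',
    'mailto:someone@example.com',
];

$stripped = [];
foreach ($samples as $url) {
    $stripped[] = preg_replace($pattern, '', $url);
}

print_r($stripped);
```

The first four entries lose their scheme; the mailto: entry is left alone because it is not one of the listed alternatives.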

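You also forgot to remove the colon (:) in your str_replace call, so http://www.example.com/image.jpg would become ://www.example.com/image.jpg.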
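A quick sketch of what the missing colon does in practice, using the example URL from the question plus an https variant of it. Note that str_replace applies the search strings in order, so 'http' is removed before 'https' is ever tried:

```php
<?php
$http  = 'http://www.example.com/image.jpg';
$https = 'https://www.example.com/image.jpg';

// Search strings as written in the question (no colon).
$withoutColon = [
    str_replace(array('http', 'https'), '', $http),
    str_replace(array('http', 'https'), '', $https),
];

// Search strings including the colon.
$withColon = [
    str_replace(array('http:', 'https:'), '', $http),
    str_replace(array('http:', 'https:'), '', $https),
];

print_r($withoutColon); // "://www.example.com/image.jpg" and "s://www.example.com/image.jpg"
print_r($withColon);    // both "//www.example.com/image.jpg"
```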

However, I'm more worried that your $filter will contain the entire page source. In that case, the replacement can do more harm than good, since any other text containing http: is altered as well. To parse and process XML/HTML, it is better to use a DOM parser (in PHP, DOMDocument). This introduces some overhead, but as some software engineers argue: "Software engineering is engineering systems against fools, the universe currently produces more and more fools, the small bit of additional overhead is thus justifiable".
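To make that concern concrete, here is a small sketch with a made-up page fragment ($page is hypothetical). A plain str_replace over the whole source also rewrites the visible text and the link, while the ^-anchored regex does nothing at all, because the page does not start with http:.

```php
<?php
// Hypothetical page source: visible text, a link and an image.
$page = '<p>Type http: before the host name.</p>'
      . '<a href="http://www.example.com/">home</a>'
      . '<img src="http://www.example.com/image.jpg" />';

// Replaces every occurrence, not only the one inside the img src.
$viaStrReplace = str_replace(array('http:', 'https:'), '', $page);

// Anchored at the start of the whole string, so nothing matches here.
$viaRegex = preg_replace('/^https?:/', '', $page);

echo $viaStrReplace, "\n";
echo $viaRegex, "\n";
```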

Example:

You should definitely use a DOM parser, as argued above, since that approach is more failsafe:

$dom = new DOMDocument;
$dom->loadHTML($html); // $html is the input document
foreach ($dom->getElementsByTagName('img') as $image) {
    // Strip a leading http: or https: from each img src
    $image->setAttribute('src',preg_replace('/^https?:/','',$image->getAttribute('src')));
}
$html = $dom->saveHTML(); // $html now stores the new version

(running this in php -a gives you the expected output for your test example).
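If it helps, the same loop can be wrapped in a function. This is only a sketch: the function name and the sample markup are hypothetical, and the libxml_use_internal_errors calls are an addition that is not in the code above, there to keep warnings about imperfect markup quiet. Keep in mind that saveHTML() returns a full document, so a doctype and html/body wrappers end up around a fragment.

```php
<?php
// Hypothetical wrapper around the DOMDocument loop.
function makeImgSrcProtocolRelative(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Only img elements are touched; links and text stay as they are.
    foreach ($dom->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src');
        $image->setAttribute('src', preg_replace('/^https?:/', '', $src));
    }

    return $dom->saveHTML();
}

// Made-up input: text and a link that must survive, plus one image.
$input = '<p>Type http: here.</p>'
       . '<a href="https://www.example.com/">home</a>'
       . '<img src="https://www.example.com/image.jpg" />';

echo makeImgSrcProtocolRelative($input);
```

Unlike a replace over the raw source, this leaves the paragraph text and the a href alone and only changes the img src.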

Or in a post-processing step:

$html = get_the_content();
$dom = new DOMDocument;
$dom->loadHTML($html); // $html is the input document
foreach ($dom->getElementsByTagName('img') as $image) {
    $image->setAttribute('src',preg_replace('/^https?:/','',$image->getAttribute('src')));
}
$html = $dom->saveHTML();
echo $html;

Performance:

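Performance was tested in the php -a interactive shell. The run is described as 1'000'000 iterations, but the loops as shown count to 10000000, so the s/call figures further down (which divide by 1'000'000) may be off by a factor of ten: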

$ php -a
php > $timea=microtime(true); for($i = 0; $i < 10000000; $i++) { str_replace(array('http:', 'https:'), '', 'http://www.google.com'); }; echo (microtime(true)-$timea);  echo "\n";
5.4192590713501
php > $timea=microtime(true); for($i = 0; $i < 10000000; $i++) { preg_replace('/^https?:/','', 'http://www.google.com'); }; echo (microtime(true)-$timea);  echo "\n";
5.986407995224
php > $timea=microtime(true); for($i = 0; $i < 10000000; $i++) { preg_replace('/https?:/','', 'http://www.google.com'); }; echo (microtime(true)-$timea);  echo "\n";
5.8694758415222
php > $timea=microtime(true); for($i = 0; $i < 10000000; $i++) { preg_replace('/(https?|s?ftp):/','', 'http://www.google.com'); }; echo (microtime(true)-$timea);  echo "\n";
6.0902049541473
php > $timea=microtime(true); for($i = 0; $i < 10000000; $i++) { str_replace(array('http:', 'https:','sftp:','ftp:'), '', 'http://www.google.com'); }; echo (microtime(true)-$timea);  echo "\n";
7.2881300449371

Thus:

str_replace:           5.4193 s     0.0000054193 s/call
preg_replace (with ^): 5.9864 s     0.0000059864 s/call
preg_replace (no ^):   5.8695 s     0.0000058695 s/call

With more protocols to match (including sftp and ftp):

str_replace:           7.2881 s     0.0000072881 s/call
preg_replace (no ^):   6.0902 s     0.0000060902 s/call
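For anyone who wants to repeat the measurement outside the interactive shell, here is a minimal sketch of a timing harness. The function name timeCalls and the default iteration count are hypothetical, and the numbers will differ per machine and PHP version. The per-call figure is simply the total time divided by the number of iterations.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and
// returns [total seconds, seconds per call].
function timeCalls(callable $fn, int $iterations = 100000): array
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    $total = microtime(true) - $start;

    return [$total, $total / $iterations];
}

$url = 'http://www.google.com';

// The three variants compared above, wrapped in closures.
$candidates = [
    'str_replace'           => function () use ($url) { return str_replace(array('http:', 'https:'), '', $url); },
    'preg_replace (with ^)' => function () use ($url) { return preg_replace('/^https?:/', '', $url); },
    'preg_replace (no ^)'   => function () use ($url) { return preg_replace('/https?:/', '', $url); },
];

foreach ($candidates as $label => $fn) {
    list($total, $perCall) = timeCalls($fn);
    printf("%-22s %.4f s   %.10f s/call\n", $label, $total, $perCall);
}
```

The closure call adds the same small overhead to every variant, so the absolute numbers will be a little higher than with an inline loop; the comparison between variants is what matters.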

Yes, str_replace works fine here. The result is a protocol-relative link.

<?php echo str_replace(array('http:', 'https:'), '', 'http://www.google.com'); ?>

It outputs

//www.google.com

That works as expected. Alternatively, you can use preg_replace, which lets you match with regular expressions. CommuSoft posted an answer with a good example.