How do I get an HTML tag from a string in PHP?

I have HTML output that I'm pulling from an RSS feed. It looks something like this:

<div>
    <p>
        Some text
    </p>
    <iframe src="http://www.source.com"></iframe>
</div>

The problem is that I only need the "src" attribute of the iframe tag. Is there a way to get it with PHP? With a regex, maybe?

Thanks in advance!

I'd recommend DOMDocument or SimpleXML.

Something like this might give you the idea.

var_dump(simplexml_load_string($rss_feed));
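For the iframe case specifically, a minimal DOMDocument sketch could look like the following. It assumes the markup from the question is held in a string ($html is a name chosen here for illustration) and that warnings from imperfect HTML can be suppressed.

```php
<?php
// Sample markup copied from the question; in practice this would come from the feed.
$html = '<div>
    <p>
        Some text
    </p>
    <iframe src="http://www.source.com"></iframe>
</div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$sources = [];
foreach ($doc->getElementsByTagName('iframe') as $iframe) {
    // getAttribute() returns an empty string when the attribute is missing.
    $sources[] = $iframe->getAttribute('src');
}

echo $sources[0] ?? '', "\n";
```

Because this walks the parsed document rather than the raw text, extra quote marks or attributes elsewhere in the feed should not affect the result.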

I'm not an expert with regex, but an alternative would be to use explode on the " marks and take the element at index 1, like this:

$rssFeed = '<div>
    <p>
        Some text
    </p>
    <iframe src="http://www.source.com"></iframe>
</div>';

$rssArray = explode('"', $rssFeed);

echo $rssArray[1];

This requires your RSS feed to be very consistent, though. If the "Some text" part contained " marks, the indexes would shift and you'd get the wrong string.

You could look through the array for everything starting with http or www to work around errors, but again, this requires a very consistent RSS feed, so you'll have to judge for yourself whether it does the job well enough.
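A rough sketch of that lookup, assuming a sample feed whose text contains extra " marks (the $urls name and the exact pattern are illustrative choices, not requirements):

```php
<?php
// Sample input: the text now contains quote marks, which shifts the array indexes.
$rssFeed = '<div><p>Some "quoted" text</p><iframe src="http://www.source.com"></iframe></div>';

$rssArray = explode('"', $rssFeed);

// Keep only the pieces that start with http:, https: or www.
$urls = array_values(array_filter($rssArray, function ($piece) {
    return preg_match('/^(https?:|www\.)/i', $piece) === 1;
}));

echo $urls[0] ?? '', "\n";
```

This still picks up any quoted URL in the feed, not just the iframe's, so it only helps when the iframe src is the first such value.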

If you're consistently getting just the data you listed above, you could use a simple substring, using the string positions of src=" and "></iframe to specify which substring you want:

$html = '<div><p>Some text</p><iframe src="http://www.source.com"></iframe></div>';

$start = strpos($html, 'src="') + 5;
$length = strpos($html, '"></iframe') - $start;
$src = substr($html, $start, $length);

echo $src;

EDIT - fixed the code and split it into multiple lines. This could easily be a one-liner, but I thought it would be easier to understand broken into multiple lines.
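One caveat with the substring approach is that strpos returns false when the marker is not found, which would silently produce a wrong offset. A slightly more defensive sketch, using a hypothetical helper named extractSrc and ending at the next " rather than at "></iframe, might be:

```php
<?php
// Hypothetical helper: returns the value of the first src="..." in $html, or null.
function extractSrc(string $html): ?string
{
    $marker = 'src="';
    $pos = strpos($html, $marker);
    if ($pos === false) {
        return null; // no src attribute at all
    }

    $start = $pos + strlen($marker);
    // Search for the closing quote, starting after the opening one.
    $end = strpos($html, '"', $start);
    if ($end === false) {
        return null; // unterminated attribute value
    }

    return substr($html, $start, $end - $start);
}

$html = '<div><p>Some text</p><iframe src="http://www.source.com"></iframe></div>';
echo extractSrc($html) ?? 'no src found', "\n";
```

Note that this takes the first src=" in the string, so it assumes the iframe is the only element carrying a src attribute.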

You could parse this output with a little command-line Perl script. How robust this is depends on how general you make the regular expression.

For example,

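// $html holds your HTML output. escapeshellarg() quotes it (and the Perl script) safely for the shell.
// The script prints only what is between src=" and the closing quote; [^"]* stops at the first closing quote.
$perl = 'print $1 if /src="([^"]*)"/';
$command = 'printf %s ' . escapeshellarg($html) . ' | perl -ne ' . escapeshellarg($perl);

$output = shell_exec($command);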
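If spawning a shell is not desirable, the same kind of pattern can be applied inside PHP with preg_match. This is only a sketch: it assumes the attribute value is wrapped in double quotes, and the pattern shown is one possible choice that is restricted to iframe tags.

```php
<?php
$html = '<div><p>Some text</p><iframe src="http://www.source.com"></iframe></div>';

// Match an iframe tag and capture everything between src=" and the next ".
if (preg_match('/<iframe\b[^>]*\bsrc="([^"]*)"/i', $html, $matches) === 1) {
    $src = $matches[1];
} else {
    $src = null; // no iframe with a double-quoted src was found
}

echo $src ?? 'no src found', "\n";
```

Like any regex over HTML, this can break on unusual markup, such as single-quoted attributes or a > inside an attribute value.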