Grab the text in between into a variable [duplicate]

Possible Duplicate:
PHP DOMDocument - get html source of BODY

I have the following markup in a variable and am trying to grab everything between the body tags (while keeping the p tags etc.). What's the best way of doing this?

  • preg_match
  • strpos / substr

    <head>
    <title></title>
    </head>
    <body>
        <p>Services Calls2</p>
    </body>
    

Neither. You can use an XML/HTML parser, like DOMDocument:

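// Parse the markup held in $var into a DOM tree
$dom = new DOMDocument();
$dom->loadHTML($var);

// Take the first (and normally only) <body> element
$body = $dom->getElementsByTagName('body')->item(0);

$content = '';

// Serialize each child of <body>, so the <p> tags are kept
foreach ($body->childNodes as $child) {
    $content .= $dom->saveXML($child);
}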
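A minimal sketch of the same idea wrapped in a function, in case it helps. The function name `bodyInnerHtml` is hypothetical. It uses `saveHTML()` instead of `saveXML()` so that void elements such as `<br>` are serialized as HTML rather than as `<br/>`, and it assumes you would rather silence libxml warnings about sloppy markup than print them.

```php
<?php
// Hypothetical helper, not part of the answer above.
function bodyInnerHtml(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return ''; // no <body> element found
    }

    $content = '';
    foreach ($body->childNodes as $child) {
        // saveHTML($node) serializes a single node as HTML
        $content .= $dom->saveHTML($child);
    }

    // Drop the whitespace-only text around the child elements
    return trim($content);
}

$var = "<head>\n<title></title>\n</head>\n<body>\n    <p>Services Calls2</p>\n</body>";
echo bodyInnerHtml($var), "\n";
```

The `null` check matters if the input might not produce a body element at all; without it, reading `childNodes` on `null` would fail.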

I recommend using preg_match: the content between the tags (here <p>Services Calls2</p>) can change at any time, so substr or strpos would require quite convoluted code.

Example:

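$a = '<h2><p>Services Calls2</p></h2>';
// [\w\s]+ matches one or more word characters (letters, digits, underscore) or whitespace
preg_match('/<p>[\w\s]+<\/p>/', $a, $ar);
var_dump($ar); // $ar[0] holds the matched <p>...</p> element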

The regex allows only letters, digits, underscores, and whitespace between the <p> tags.
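The example above matches a single <p> element rather than the whole body. If you go the regex route for the body itself, something like the following might work. This is only a sketch: the function name `bodyByRegex` and the pattern are my own illustration, and a regex will still misfire on things like a literal "</body>" inside a comment or script.

```php
<?php
// Hypothetical helper; the pattern is an illustration, not the regex from the answer above.
function bodyByRegex(string $html): ?string
{
    // <body[^>]*>  tolerates attributes on the opening tag
    // (.*?)        lazily captures everything up to the first </body>
    // s modifier   lets "." match newlines; i modifier ignores case
    if (preg_match('~<body[^>]*>(.*?)</body>~is', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null; // no body element matched
}

$html = "<head>\n<title></title>\n</head>\n<body>\n    <p>Services Calls2</p>\n</body>";
var_dump(bodyByRegex($html));
```

Using `~` as the delimiter avoids having to escape the slash in `</body>`.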

Try this, where $html holds the markup:

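// Offset of the first character after the opening <body> tag
$s = strpos($html, '<body>') + strlen('<body>');
$f = '</body>';

// Take everything from $s up to the closing tag and trim surrounding whitespace
echo trim(substr($html, $s, strpos($html, $f) - $s));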
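Note that strpos() returns false when the needle is missing, and the snippet above would then silently compute a wrong offset. A guarded variant could look like this; the function name `bodyBySubstr` is hypothetical, and it still assumes the opening tag is exactly `<body>` with no attributes.

```php
<?php
// Hypothetical helper; same strpos/substr approach with missing-tag checks added.
function bodyBySubstr(string $html): ?string
{
    $open  = '<body>';
    $close = '</body>';

    $start = strpos($html, $open);
    if ($start === false) {
        return null; // opening tag not found
    }
    $start += strlen($open);

    // Search for the closing tag only after the opening tag
    $end = strpos($html, $close, $start);
    if ($end === false) {
        return null; // closing tag not found
    }

    return trim(substr($html, $start, $end - $start));
}

$html = "<head>\n<title></title>\n</head>\n<body>\n    <p>Services Calls2</p>\n</body>";
var_dump(bodyBySubstr($html));
```

The strict `=== false` comparison is needed because a match at offset 0 is also falsy in a loose comparison.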