DOMDocument - extract a tag's textContent, but first remove certain child elements

Sample source HTML:

<p>
 <strong>Byline:</strong> Introductory text. 

 <a href="1.html" target="">Link 1</a> |
 <span class="foo"></span> 
 <a href="2.html">Link 2</a>
 <a href="3.html">Link 3</a>
</p>

What I'm trying to do:

I'd like to load the HTML, strip out the links and other extraneous tags (it's not a problem if I have to specify which ones) along with stray characters such as the '|' separators, and keep only the "Byline:" label and the introductory text. This is a script that parses a third-party site, so I can't add CSS classes or otherwise change the markup.
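To make the goal concrete, here is a minimal sketch of the kind of result I'm after, assuming the unwanted tags are just `a` and `span` and that the '|' separators can simply be stripped from the text. The variable names and the tag list are illustrative, not taken from any real script:

```php
<?php
// Sample markup from above, standing in for the third-party page.
$somehtml = '<p>
 <strong>Byline:</strong> Introductory text. 

 <a href="1.html" target="">Link 1</a> |
 <span class="foo"></span> 
 <a href="2.html">Link 2</a>
 <a href="3.html">Link 3</a>
</p>';

$doc = new DOMDocument();
$doc->loadHTML($somehtml);

// getElementsByTagName() returns a list, so take the first <p> out of it.
$p = $doc->getElementsByTagName('p')->item(0);

// Assumed list of tags to discard; adjust to whatever the real page contains.
foreach (['a', 'span'] as $tag) {
    // The node list is live, so copy it to an array before removing anything;
    // otherwise removals shift the list and some nodes get skipped.
    foreach (iterator_to_array($p->getElementsByTagName($tag)) as $node) {
        $node->parentNode->removeChild($node);
    }
}

// Strip the leftover '|' separators and collapse runs of whitespace.
$text = trim(preg_replace('/\s+/', ' ', str_replace('|', '', $p->textContent)));

echo $text, "\n";
```

For the sample above this should leave just the byline label and the introductory sentence in `$text`.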

I first attempted this with PHP Simple HTML DOM Parser (which isn't widely used any more), and more recently I have been trying DOMDocument.

However, I'm getting absolutely nowhere. For example, right now I can't even traverse the tree underneath <p>:

$doc = new DOMDocument();
$doc->loadHTML($somehtml);

$p = $doc->getElementsbyTagName('p');

foreach($p->childNodes as $item) {
  ...    
}

The above gives me an 'Undefined property: DOMNodeList::$childNodes' error for the foreach line.
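As far as I can tell, the error comes from the fact that getElementsByTagName() returns a DOMNodeList (a collection of matches) rather than a single element, and childNodes only exists on individual nodes. A minimal sketch of the traversal, assuming only the first <p> matters and using a shortened copy of the sample markup:

```php
<?php
// Shortened copy of the sample markup, used as a stand-in for the real page.
$somehtml = '<p><strong>Byline:</strong> Introductory text. '
    . '<a href="1.html" target="">Link 1</a> | <span class="foo"></span> '
    . '<a href="2.html">Link 2</a> <a href="3.html">Link 3</a></p>';

$doc = new DOMDocument();
$doc->loadHTML($somehtml);

// Pick one element out of the DOMNodeList before asking for its children.
$p = $doc->getElementsByTagName('p')->item(0);

$names = [];
foreach ($p->childNodes as $item) {
    // nodeName is "#text" for text nodes and the tag name for elements.
    $names[] = $item->nodeName;
}

echo implode(', ', $names), "\n";
```

Note that the children include text nodes as well as elements, so the '|' separator shows up as part of a "#text" node rather than as a tag.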

Also, I'm finding it frustrating that I apparently can't visualise the DOM using print_r, var_dump, etc. When I looped through the links using xpath->query, print_r showed me the link text but not the contents of href="". (XPath seems inappropriate here anyway, as I don't really want to search for or extract specific items; I'd rather take the HTML, get rid of the nodes I don't want, and then save it.)
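From what I understand, the usual way to inspect a node is to serialise it with saveHTML() and to read attributes through getAttribute(), since print_r and var_dump don't expand DOM objects in a useful way. A rough sketch, again assuming a shortened copy of the sample markup:

```php
<?php
// Shortened copy of the sample markup, used as a stand-in for the real page.
$somehtml = '<p><strong>Byline:</strong> Introductory text. '
    . '<a href="1.html" target="">Link 1</a> | <a href="2.html">Link 2</a></p>';

$doc = new DOMDocument();
$doc->loadHTML($somehtml);

// Attributes are read with getAttribute() on each element.
$hrefs = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $hrefs[] = $a->getAttribute('href');
}

// Passing a node to saveHTML() serialises just that subtree as markup.
$p = $doc->getElementsByTagName('p')->item(0);
$markup = $doc->saveHTML($p);

print_r($hrefs);
echo $markup, "\n";
```

Echoing saveHTML() output at each step is a simple way to see what the tree looks like after nodes have been removed.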

Could anyone recommend an understandable guide to DOMDocument? The PHP manual seems very short on practical examples.