php - loadHTML() - every <p> until a certain class

I'm fetching some Wikipedia content and reading it in two different ways:

$html = file_get_contents('https://en.wikipedia.org/wiki/Sans-serif');

The first gets the first paragraph of the page:

$dom = new DomDocument();
@$dom->loadHTML($html);
$p = $dom->getElementsByTagName('p')->item(0)->nodeValue;
echo $p;

The second gets the first paragraph inside the element with a specific $id:

$dom = new DOMDocument();
@$dom->loadHTML($html);
// Pass $id unquoted: inside single quotes, '$id' is the literal string "$id".
$p = $dom->getElementById($id)->getElementsByTagName('p')->item(0);
echo $p->nodeValue;

I'm looking for a third way that gets the whole introductory section. My idea is to get every <p> that comes before the element with the id or class "toc", which is the id/class of the table of contents.

Any idea how to do that?

You could use DOMDocument and DOMXPath with, for example, an XPath expression like:

//div[@id="toc"]/preceding-sibling::p

$doc = new DOMDocument();
// load() expects XML; loadHTMLFile() uses the HTML parser. @ hides markup warnings.
@$doc->loadHTMLFile("https://en.wikipedia.org/wiki/Sans-serif");
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@id="toc"]/preceding-sibling::p');

foreach ($nodes as $node) {
    echo $node->nodeValue;
}

That would give you the text of every <p> that is a sibling of the div with id="toc" and comes before it, so the paragraphs must share a parent with that div.
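To try the expression without hitting the network, here is a minimal sketch that runs it against a small inline document. The sample markup is made up for illustration and is much simpler than a real Wikipedia page, whose structure (including whether the table of contents is a div with id "toc") may differ.

```php
<?php
// Hypothetical stand-in for the fetched page: two intro paragraphs,
// the table of contents, then a paragraph that should be excluded.
$html = <<<'HTML'
<html><body><div id="content">
<p>First intro paragraph.</p>
<p>Second intro paragraph.</p>
<div id="toc">Contents</div>
<p>Body paragraph after the table of contents.</p>
</div></body></html>
HTML;

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Keep the text of each <p> sibling that precedes the #toc div.
$intro = [];
foreach ($xpath->query('//div[@id="toc"]/preceding-sibling::p') as $node) {
    $intro[] = trim($node->textContent);
}

echo implode("\n\n", $intro), "\n";
```

If the query matches nothing (for example because the page has no element with that id), the loop simply does not run and $intro stays empty, so it is worth checking for that case before using the result.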

If you're just looking for the intro in plain text, you can simply use Wikipedia's API:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=Sans-serif

If you want HTML formatting as well (excluding inner images and the like):

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&titles=Sans-serif
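A rough sketch of how the API response could be consumed from PHP. The function names introFromApiResponse and fetchIntro, and the User-Agent string, are made up for this example. It assumes the default JSON layout, where the text sits under query → pages → <page id> → extract, and that allow_url_fopen is enabled.

```php
<?php
// Pull the "extract" field of the first page out of an API response body.
// Returns null when the JSON is malformed or no page carries an extract.
function introFromApiResponse(string $json): ?string
{
    $data = json_decode($json, true);
    $pages = $data['query']['pages'] ?? [];
    if (!is_array($pages)) {
        return null;
    }
    // Pages are keyed by page id, which is not known in advance.
    foreach ($pages as $page) {
        if (isset($page['extract'])) {
            return $page['extract'];
        }
    }
    return null;
}

// Fetch the intro of $title; plain text by default, HTML when $plainText is false.
function fetchIntro(string $title, bool $plainText = true): ?string
{
    $params = [
        'format'  => 'json',
        'action'  => 'query',
        'prop'    => 'extracts',
        'exintro' => '',
        'titles'  => $title,
    ];
    if ($plainText) {
        $params['explaintext'] = '';
    }
    // Hypothetical User-Agent value; replace it with one identifying your script.
    $context = stream_context_create([
        'http' => ['user_agent' => 'IntroFetcher/0.1 (example script)'],
    ]);
    $url = 'https://en.wikipedia.org/w/api.php?' . http_build_query($params);
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : introFromApiResponse($body);
}
```

Calling fetchIntro('Sans-serif') would request the first URL above, and fetchIntro('Sans-serif', false) the second. Both return null if the request fails or the page has no extract.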