php DOMDocument - list child elements into an array

For the following HTML:

<html>
<body>
<div whatever></div>
<div id="archive-wrapper">
<ul class="archive-list">
    <li><div><a href="#1">A</a></div></li>
    <li><div><a href="#2">B</a></div></li>
    <li><div><a href="#3">C</a></div></li>
</ul>
</div>
</body>
</html>

How can I retrieve, with PHP DOMDocument (http://php.net/manual/es/class.domdocument.php), an array containing the href values (#1, #2, #3) in the most efficient way? I have tried a few things and I am not asking for finished code; I just need some guidelines so that I can do it and understand it on my own. Thanks :)

A simple example using PHP DOMDocument:

<?php
$html = <<<HTML
<html>
<body>
<div whatever></div>
<div id="archive-wrapper">
<ul class="archive-list">
    <li><div><a href="#1">A</a></div></li>
    <li><div><a href="#2">B</a></div></li>
    <li><div><a href="#3">C</a></div></li>
</ul>
</div>
</body>
</html>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

//get all links
$links = $dom->getElementsByTagName('a');
$linkArray = array();

//loop through each link
foreach ($links as $link){
    $linkArray[] = $link->getAttribute('href');
}
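To check the result, and to keep the parser quiet on less tidy markup, the loop above could be wrapped roughly as follows. This is only a minimal sketch: the function name extractHrefs is made up for illustration, and switching on libxml_use_internal_errors() is an assumption about wanting to silence parser warnings on real-world pages, not something the snippet above requires.

```php
<?php
// Hypothetical helper; the name is an assumption, not part of DOMDocument.
function extractHrefs(string $html): array
{
    // Collect libxml parse warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($dom->getElementsByTagName('a') as $link) {
        // Skip anchors that have no href attribute at all.
        if ($link->hasAttribute('href')) {
            $hrefs[] = $link->getAttribute('href');
        }
    }
    return $hrefs;
}

$html = '<ul class="archive-list">'
      . '<li><div><a href="#1">A</a></div></li>'
      . '<li><div><a href="#2">B</a></div></li>'
      . '<li><div><a href="#3">C</a></div></li>'
      . '</ul>';

$hrefs = extractHrefs($html);
print_r($hrefs);
```

print_r() is used here only to inspect the array; in real code you would return or process $hrefs instead.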

Edit: to get only the links inside ul->li, you could do something like this:

$dom = new DOMDocument();
$dom->loadHTML($html);

$linkArray = array();

//loop through every ul, then each li inside it, then each link inside the li
foreach ($dom->getElementsByTagName('ul') as $ul){
    foreach ($ul->getElementsByTagName('li') as $li){
        foreach ($li->getElementsByTagName('a') as $link){
            $linkArray[] = $link->getAttribute('href');
        }
    }
}

Or, if you just want the first ul, you could simplify to:

//get 1st ul using ->item(0)
$ul = $dom->getElementsByTagName('ul')->item(0);
foreach ($ul->getElementsByTagName('li') as $li){
    foreach ($li->getElementsByTagName('a') as $a){
        $linkArray[] = $a->getAttribute('href');
    }
}
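If the nested loops feel heavy, the same selection can probably be expressed as a single XPath query with DOMXPath. The sketch below is an alternative, not the method used above. It assumes the wrapper keeps the id archive-wrapper from the question, and the extra div with the #skip link is invented only to show that links outside the wrapper are ignored.

```php
<?php
$html = '<div id="other"><a href="#skip">skip</a></div>'
      . '<div id="archive-wrapper"><ul class="archive-list">'
      . '<li><div><a href="#1">A</a></div></li>'
      . '<li><div><a href="#2">B</a></div></li>'
      . '<li><div><a href="#3">C</a></div></li>'
      . '</ul></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// Select the href attribute nodes of every <a> inside an <li>
// that sits somewhere under the div with id="archive-wrapper".
$nodes = $xpath->query('//div[@id="archive-wrapper"]//li//a/@href');

$linkArray = [];
foreach ($nodes as $attr) {
    // Each result is a DOMAttr; its value property holds the attribute text.
    $linkArray[] = $attr->value;
}

print_r($linkArray);
```

The query returns attribute nodes in document order, so no separate getAttribute() call is needed.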

What do you mean by PHP DOM? Do you mean PHP and jQuery? You can set this up in a few ways:

  • You can put all of that in a form and post it to a script.
  • You can also wrap it in a select, which will only store the selected data.
  • A better idea would be to use jQuery to collect the items into an array on the same page and post them, with PHP as the processor for server-side manipulation. I think this is better in the long run, since it is the more current way of connecting HTML with server-side scripts.

For example, you can try something like this:

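$("#form").submit(function(e){ // #form is the id of the form
    e.preventDefault(); // stop the normal submit so the AJAX post can run
    var items = [];
    // archive-list is a class in the markup, so it is selected with a dot
    $(".archive-list li a").each(function(n){
        items[n] = $(this).attr("href");
    });

    $.post(
        "munipilate-data.php",
        {items: items},
        function(data){
            $("#result").html(data);
        });
});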
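On the server side, the script that receives this post might look roughly like the sketch below. It is an assumption from start to finish: the answer does not show what munipilate-data.php contains, and the function name renderItems is hypothetical. It relies on jQuery sending the array as items[]=... fields, which PHP exposes as an array under $_POST['items'].

```php
<?php
// Hypothetical receiving script; the function name is illustrative only.
function renderItems(array $post): string
{
    // Fall back to an empty list if nothing usable was posted.
    $items = isset($post['items']) && is_array($post['items']) ? $post['items'] : [];

    $out = "<ul>\n";
    foreach ($items as $item) {
        // Ignore nested arrays or other non-string values.
        if (!is_string($item)) {
            continue;
        }
        // Escape anything that came from the client before echoing it back.
        $out .= '<li>' . htmlspecialchars($item, ENT_QUOTES, 'UTF-8') . "</li>\n";
    }
    return $out . "</ul>\n";
}

// The returned markup is what the $.post callback receives as "data".
echo renderItems($_POST);
```

Keeping the logic in a function that takes the posted array as a parameter makes it easy to try out without an actual HTTP request.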

I suggest using a regex to parse it.

$html = '<html>
    <body>
       <div whatever></div>
       <div id="archive-wrapper">
       <ul class="archive-list">
            <li><div><a href="#1">A</a></div></li>
            <li><div><a href="#2">B</a></div></li>
            <li><div><a href="#3">C</a></div></li>
       </ul>
       </div>
    </body>
</html>';
// group 1 captures the href value; the surrounding quotes are optional
$reg = '/a href=["\']?([^"\' ]*)["\' ]/';
preg_match_all($reg, $html, $m);
// $m[1] already holds the captured values, so no string cleanup is needed
$arr = $m[1];

print '<pre>';
print_r($arr);
print '</pre>';

Output:

Array
(
    [0] => #1
    [1] => #2
    [2] => #3
)
