Echo the contents of <a> tags with class="pret" from an HTML document

I have an HTML document in a PHP variable, $content. I can echo it, but I only need the <a ...> tags with class="pret". Once I have them, I need the code-like token (e.g. d3852) from the href attribute of each <a>, and the number (e.g. 2352.2345) from between <a> and </a>.

I have tried several examples from the web, but I get either empty arrays or PHP errors.

A regex example that gives me an empty array (the <a> tag is inside a table):

$pattern = "#<table\s.*?>.*?<a\s.*?class=[\"']pret[\"'].*?>(.*?)</a>.*?</table>#i";
preg_match_all($pattern, $content, $results);
print_r($results[1]);

Another example that just gives an error:

$a=$content->getElementsByTagName(a);

Reasons for the various errors: invalid HTML and non-UTF-8 characters.

I then did the same on another website and matched the contents in a single SQL table; the result is a copied website with updated data from my country. I no longer have to search the web for matching single results.

Assuming you're trying to parse a valid (or at least valid enough) HTML document, you should use DOM for this:

// Simple example from php manual from comments
$xml = new DOMDocument(); 
$xml->loadHTMLFile($url); 
$links = array(); 

foreach($xml->getElementsByTagName('a') as $link) { 
    $links[] = array('url' => $link->getAttribute('href'),
                     'text' => $link->nodeValue); 
}

Note the use of loadHTMLFile (or loadHTML for a string such as $content) rather than load; the HTML loaders are more robust against malformed markup. You may also set the DOMDocument recover property (as suggested in a comment by hakre) so the parser will try to recover from errors.
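Since the question already has the page in a string, here is a minimal sketch of the same idea using loadHTML. The sample markup assigned to $content is made up for illustration (it deliberately leaves some tags unclosed), and libxml_use_internal_errors() is used on the assumption that you would rather collect parser warnings than have them printed.

```php
<?php
// Hypothetical sample standing in for the real page held in $content.
$content = '<table><tr><td><a href="/x_d3852_y.htm" class="pret">2352.2345</a>'
         . '<td><a href="/other.htm">ignored</a></table>';

// Collect libxml warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$xml = new DOMDocument();
$xml->recover = true;      // ask libxml to keep going past malformed markup
$xml->loadHTML($content);  // parse from a string rather than a URL or file
libxml_clear_errors();     // discard the collected warnings

$links = array();
foreach ($xml->getElementsByTagName('a') as $link) {
    // Keep only links whose class attribute is exactly "pret".
    if ($link->getAttribute('class') === 'pret') {
        $links[] = array('url'  => $link->getAttribute('href'),
                         'text' => $link->nodeValue);
    }
}
print_r($links);
```

Filtering on getAttribute('class') inside the loop is just one option; it only matches when pret is the sole class on the tag.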

Or you could use XPath:

$xpath = new DOMXPath($xml); // $xml is the DOMDocument loaded above
$elements = $xpath->query("//a[@class='pret']");

if ($elements !== false) { // query() returns false for a malformed expression
    foreach ($elements as $element) {
        $links[] = array('url' => $element->getAttribute('href'),
                         'text' => $element->nodeValue);
    }
}
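One caveat with @class='pret' is that it only matches when the attribute value is exactly pret. If the links might carry several classes, a common XPath 1.0 idiom pads the class list with spaces and looks for the padded token. The sketch below uses made-up markup and variable names ($html, $found) purely for illustration.

```php
<?php
// Hypothetical markup: the first link has two classes, the second has a
// class that merely starts with "pret".
$html = '<p><a class="pret big" href="/a_1.htm">10.5</a>'
      . '<a class="pretty" href="/b_2.htm">20.5</a></p>';

libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Surround the normalized class list with spaces, then search for " pret ".
$query = "//a[contains(concat(' ', normalize-space(@class), ' '), ' pret ')]";

$found = array();
foreach ($xpath->query($query) as $element) {
    $found[] = $element->getAttribute('href');
}
print_r($found);
```

Here only /a_1.htm is collected, because "pretty" does not contain the space-delimited token " pret ".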

And for the case of invalid HTML, you may use a regexp like this:

$a1 = '\s*[^\'"=<>]+\s*=\s*"[^"]*"'; # Attribute with " - space tolerant
$a2 = "\s*[^'\"=<>]+\s*=\s*'[^']*'"; # Attribute with ' - space tolerant
$a3 = '\s*[^\'"=<>]+\s*=\s*[\w\d]*'; # Unquoted values - space tolerant
# [^'"=<>]* # Junk - I'm not inserting this to regexp but you may have to

$a = "(?:$a1|$a2|$a3)*"; # Any number of attributes
$class = 'class=([\'"])pret\\1'; # Using ?: carefully is crucial for \\1 to work
                                 # otherwise you can use ["']
$reg = "#<a{$a}\s*{$class}{$a}\s*>(.*?)</a>#is"; # Link content is in group 2

And then just call preg_match_all. (All regexps are written off the top of my head; you may have to debug them.)
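For reference, this is roughly what the preg_match_all call could look like. The sketch uses a deliberately simplified stand-in pattern, not the attribute-by-attribute one above, and a made-up $content. The pattern needs delimiters, and because the tag may span several lines the s modifier matters: without it, "." stops at newlines, which is a plausible reason the pattern in the question returned an empty array.

```php
<?php
// Hypothetical input: the <a> tag is split across two lines inside a table.
$content = "<table>\n<tr><td><a href='/w_34670_n.htm'\n class=\"pret\">"
         . "<span>3340.3570 word</span></a></td></tr>\n</table>";

// '#' delimiters; 'i' = case-insensitive, 's' = '.' also matches newlines.
// Group 1 captures the quote character so \1 can require the same closing quote.
$reg = '#<a\b[^>]*\bclass=(["\'])pret\1[^>]*>(.*?)</a>#is';

preg_match_all($reg, $content, $matches, PREG_SET_ORDER);

$inner = array();
foreach ($matches as $m) {
    $inner[] = $m[2]; // group 2 is everything between <a ...> and </a>
}
print_r($inner);
```

PREG_SET_ORDER is used here only so that each $m holds one complete match; the default ordering works as well if you read $matches[2] directly.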

I got the links like this:

preg_match_all('/<a[^>]*class="pret">(.*?)<\\/a>/si', $content, $links);
print_r($links[0]);

and the result is:

Array(
[0] => <a href='/word_word_34670_word_number.htm' class="pret"><span>3340.3570 word</span></a>..........)

So now I need to get the first number inside the href and the number inside the <span>.
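One possible way to pull out those two values, sketched against the single link shown in the output above. The variable names ($link, $code, $price) are made up here, and the patterns assume the href contains at least one run of digits and that the link text starts with a decimal number.

```php
<?php
// One matched link, copied from the print_r output above.
$link = '<a href=\'/word_word_34670_word_number.htm\' class="pret">'
      . '<span>3340.3570 word</span></a>';

$code  = null;
$price = null;

// First run of digits inside the quoted href value.
if (preg_match('/href=["\'][^"\']*?(\d+)/i', $link, $m)) {
    $code = $m[1];
}

// Drop the tags, then take the first decimal number from the remaining text.
if (preg_match('/\d+(?:\.\d+)?/', strip_tags($link), $m)) {
    $price = $m[0];
}

echo $code, ' ', $price, "\n"; // prints "34670 3340.3570"
```

In practice this would run inside a loop over $links[0]; both values come back as strings, so cast $price to float if you need arithmetic.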