PHP regex to retrieve the text between HTML tags, but not the tags themselves

Similar questions have probably been asked many times, but mine is a bit more complex.
I know that when we want to extract only the text inside the <title> tag in this scenario,

<title>My work</title>
<p>This is my work.</p> <p>Learning regex.</p>

we can form a Regex like this:

>([^<]*)<

Source

But that works only because the <title> tag comes first. If the tag I want is the second one, it won't work.
Here is my scenario:

<td class="td1" headers="searchth1">JAVA1</td>
<td class="td2" headers="searchth2">JAVA2</td>
<td class="td3" headers="searchth3">JAVA3</td>

<td class="td1" headers="searchth1">PHP1</td>
<td class="td2" headers="searchth2">PHP2</td>
<td class="td3" headers="searchth3">PHP3</td>

There are many similar tags in the file, and I want to retrieve only the text between the <td class="td1" headers="searchth1"> and </td> tags.
I've used '#<td class="td1" headers="searchth1">(.*)</td>#', which does match, but the output also includes all the other <td> tags, which I don't want.
I want only the texts JAVA1 and PHP1, and I guess that if I could retrieve the text between the tags while excluding the tags themselves, I could achieve it.
Am I right or wrong? Either way, how can I achieve what I want?
Thanks in advance!

I think your regex approach, while technically possible, is going to cause more trouble down the line. For example, if the source HTML changed so that the headers attribute appeared before the class attribute, the regex would fail. Your code will also become unreadable very quickly if you use regex to search through HTML source.
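
To illustrate that fragility, here is a minimal sketch using the pattern from the question. The two input strings ($original and $reordered) are made-up samples for this illustration, not taken from the real file:

```php
<?php
// Pattern copied from the question.
$pattern = '#<td class="td1" headers="searchth1">(.*)</td>#';

// Hypothetical sample cells: same content, attributes in a different order.
$original  = '<td class="td1" headers="searchth1">JAVA1</td>';
$reordered = '<td headers="searchth1" class="td1">JAVA1</td>';

// preg_match_all() returns the number of full matches it found.
$hitsOriginal  = preg_match_all($pattern, $original, $m1);
$hitsReordered = preg_match_all($pattern, $reordered, $m2);

var_dump($hitsOriginal);  // int(1)
var_dump($hitsReordered); // int(0)
```

The second cell is equivalent HTML, yet the pattern no longer finds it because the regex matches the literal attribute order.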

To parse HTML you should use PHP's DOMDocument class (together with DOMXPath), which is more robust in the face of changing HTML and far more readable to whoever may be maintaining your code (including you). This method also makes it easier to look at other element attributes. The sample code below should work for your use case:

$doc = '<td class="td1" headers="searchth1">JAVA1</td>
<td class="td2" headers="searchth2">JAVA2</td>
<td class="td3" headers="searchth3">JAVA3</td>
<td class="td1" headers="searchth1">PHP1</td>
<td class="td2" headers="searchth2">PHP2</td>
<td class="td3" headers="searchth3">PHP3</td>';
$dom = new DOMDocument();
$dom->loadHTML($doc);
$xpath = new DOMXPath($dom);
$tds = $xpath->query("//td[@class='td1']");
// the query could also be "//td[@headers='searchth1']" or even
// "//td[@headers='searchth1'][@class='td1']" depending on what you want to target
foreach($tds as $td){
    var_dump($td->nodeValue);
}
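
As a variation on the code above, the following sketch collects the cell texts into an array instead of dumping them. It rests on a few assumptions of my own: the cells are wrapped in a table row, one cell has its attributes reordered to show that the XPath query does not care about attribute order, and libxml_use_internal_errors() is used to keep parser warnings about imperfect markup quiet. The variable names $html and $texts are illustrative only.

```php
<?php
// Hypothetical sample markup; the last td1 cell has its attributes swapped.
$html = '<table><tr>'
      . '<td class="td1" headers="searchth1">JAVA1</td>'
      . '<td class="td2" headers="searchth2">JAVA2</td>'
      . '<td headers="searchth1" class="td1">PHP1</td>'
      . '<td class="td2" headers="searchth2">PHP2</td>'
      . '</tr></table>';

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Keep only the text of cells that carry both attribute values.
$texts = [];
foreach ($xpath->query("//td[@class='td1'][@headers='searchth1']") as $td) {
    $texts[] = trim($td->textContent);
}

print_r($texts); // Array ( [0] => JAVA1 [1] => PHP1 )
```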

If you want to learn more about building and using xpath queries, I suggest the article PHP DOM: Using XPath over at SitePoint.com.

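You want preg_match_all(). Make sure you're not using the "s" pattern modifier, because with it the dot also matches newlines and the greedy (.*) would run across lines up to the last </td>: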

$regexp = '%<td class="td1" headers="searchth1">(.*)</td>%';
// $html holds the page source; the captured cell texts end up in $matches[1]
preg_match_all($regexp, $html, $matches);
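
Note that this pattern relies on every cell sitting on its own line. If several cells share a line, the greedy (.*) swallows everything up to the last </td> on that line. A minimal sketch of the difference, assuming a sample $html where the question's cells are joined on one line, and using a lazy (.*?) as one possible fix:

```php
<?php
// Sample input: cells from the question, deliberately placed on a single line.
$html = '<td class="td1" headers="searchth1">JAVA1</td><td class="td2" headers="searchth2">JAVA2</td>'
      . '<td class="td1" headers="searchth1">PHP1</td><td class="td2" headers="searchth2">PHP2</td>';

// Greedy: (.*) extends to the last </td> on the line, giving one oversized match.
preg_match_all('%<td class="td1" headers="searchth1">(.*)</td>%', $html, $greedy);

// Lazy: (.*?) stops at the first </td> after each opening tag.
preg_match_all('%<td class="td1" headers="searchth1">(.*?)</td>%', $html, $lazy);

var_dump(count($greedy[1])); // int(1)
print_r($lazy[1]);           // Array ( [0] => JAVA1 [1] => PHP1 )
```

The lazy version still depends on the exact attribute order and spacing in the opening tag, so it shares the other limitations of matching HTML with a regex.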