I need a regular expression for finding the content within <td> and </td> tags, for use in PHP. I have tried...
preg_split("<td>([^\"]*)</td>", $table[0]);
But that gives me the PHP error...
Warning: preg_split(): Unknown modifier '(' in C:\xampp\htdocs\.....
Can anyone tell me what I am doing wrong?
Try this:
preg_match("/<td>([^\"]*)<\/td>/", $table[0], $matches);
But as a general rule, please do not try to parse HTML with regexes... :-)
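If a parser is an option, the regex can be avoided altogether. Here is a minimal sketch using PHP's bundled DOM extension; the `$html` string is a made-up sample standing in for `$table[0]`, and the `$cells` name is only for illustration:

```php
<?php
// Hypothetical sample input; in the question this would be $table[0].
$html = '<table><tr><td>first</td><td>second "quoted"</td></tr></table>';

// Collect libxml warnings internally instead of printing them,
// since real-world HTML is rarely perfectly formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$cells = [];
foreach ($doc->getElementsByTagName('td') as $td) {
    // textContent is the text inside the cell with nested tags stripped.
    $cells[] = $td->textContent;
}
print_r($cells);
```

This returns the text of each cell only; if the nested markup is needed as well, a different approach (for example saving each node back to HTML) would be required.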
Use preg_match instead of preg_split:
preg_match("|<td>([^<]*)</td>|", $table[0], $m);
print_r($m);
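As for why the original call produced "Unknown modifier '('": PHP treats the first character of the pattern as its delimiter, and < is paired with > as a bracket-style delimiter. So "<td>(...)" is read as the pattern td followed by modifiers, the first of which is (. A small sketch of that behaviour, with made-up sample strings:

```php
<?php
// "<td>" is parsed as delimiter "<", pattern "td", delimiter ">".
// It therefore matches the bare letters "td", not the tag.
$bare = preg_match("<td>", "std");               // 1

// With explicit delimiters the angle brackets become part of the pattern.
$tagOnText = preg_match("|<td>|", "std");        // 0
$tagOnTag  = preg_match("|<td>|", "<td>x</td>"); // 1

var_dump($bare, $tagOnText, $tagOnTag);
```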
Keep in mind that you need to do some extra work to make sure that the * between <td> and </td> in your regular expression doesn't slurp up entire lines of <td>some text</td>. That's because * is greedy.
To turn off the greediness of *, you can put a ? after it. This tells it to grab only up to the first place where whatever comes after the * can match. So, the regular expression you're looking for is something like:
/<td>(.*?)<\/td>/
Remember, since the regular expression starts and ends with a /, you have to be careful about any / that is inside your regular expression: they have to be escaped. Hence the \/.
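The difference between the greedy and non-greedy forms can be seen on a small made-up row. The sketch also shows one way to sidestep the escaping, by choosing a delimiter such as # that does not occur in the pattern (the variable names are only for illustration):

```php
<?php
// Made-up sample row with two cells.
$row = '<td>one</td><td>two</td>';

// Greedy: .* runs to the LAST </td>, swallowing the tags in between.
preg_match('/<td>(.*)<\/td>/', $row, $greedy);

// Non-greedy: .*? stops at the FIRST </td>.
preg_match('/<td>(.*?)<\/td>/', $row, $lazy);

// Same non-greedy pattern with "#" as delimiter, so "/" needs no escaping.
preg_match('#<td>(.*?)</td>#', $row, $lazyHash);

print_r([$greedy[1], $lazy[1], $lazyHash[1]]);
// $greedy[1]   is 'one</td><td>two'
// $lazy[1]     is 'one'
// $lazyHash[1] is 'one'
```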
From your regular expression, it looks like you're also trying to exclude any " character that might be between a <td> and </td>. Is that correct? If so, you would change the regular expression to the following:
/<td>([^\"]*?)<\/td>/
But, assuming you don't want to exclude the " character in your matches, your PHP code could look like this, using preg_match_all instead of preg_match:
preg_match_all("/<td>(.*?)<\/td>/", $str, $matches);
print_r($matches);
What you're looking for is in $matches[1].
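To make the shape of the result concrete, here is a sketch on made-up input. It also illustrates one caveat worth keeping in mind: by default . does not match a newline, so a cell whose content spans several lines is only matched when the s modifier is added. The `$multi`, `$noS`, and `$withS` names are hypothetical:

```php
<?php
// Made-up sample; $str stands in for the table markup.
$str = '<tr><td>alpha</td><td>say "hi"</td></tr>';
preg_match_all('/<td>(.*?)<\/td>/', $str, $matches);

// $matches[0] holds the full matches, $matches[1] the first capture group.
print_r($matches[0]); // '<td>alpha</td>', '<td>say "hi"</td>'
print_r($matches[1]); // 'alpha', 'say "hi"'

// A cell containing a line break: "." stops at "\n" unless /s is used.
$multi = "<td>line1\nline2</td>";
preg_match_all('/<td>(.*?)<\/td>/', $multi, $noS);    // no match
preg_match_all('/<td>(.*?)<\/td>/s', $multi, $withS); // one match
```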
First of all, you forgot to wrap the regex in delimiters. Also, you shouldn't specify the closing td tag in the regex.
Try the following code. Assuming $table[0] contains the HTML between the <table> and </table> tags, it extracts any content (including HTML) from the cells of the table:
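// Split on every opening <td ...> tag and drop the chunk before the first cell,
// then strip the closing </td> from each remaining chunk.
$a_result = array_map(
    function($v) { return preg_replace('/<\/td\s*>/i', '', $v); },
    array_slice(preg_split('/<td[^>]*>/i', $table[0]), 1)
);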
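A quick run on made-up input shows what this produces. Note that whatever follows a closing tag (for example </tr>) stays attached to that cell; the second array_map below is a variant of my own, not part of the answer above, that drops the closing tag together with everything after it. The sample markup and the `$cells` and `$a_trimmed` names are assumptions for illustration:

```php
<?php
// Hypothetical content of $table[0]: the markup between <table> and </table>.
$table = ['<tr><td class="a">x <b>bold</b></td><TD>y</TD ></tr>'];

// Chunks that start right after each opening <td ...> tag.
$cells = array_slice(preg_split('/<td[^>]*>/i', $table[0]), 1);

// As in the answer: remove only the closing tag itself.
$a_result = array_map(
    function ($v) { return preg_replace('/<\/td\s*>/i', '', $v); },
    $cells
);

// Variant (an assumption): remove the closing tag and everything after it,
// so row markup such as </tr> does not leak into the cell.
$a_trimmed = array_map(
    function ($v) { return preg_replace('/<\/td\s*>.*/is', '', $v); },
    $cells
);

print_r($a_result);  // 'x <b>bold</b>', 'y</tr>'
print_r($a_trimmed); // 'x <b>bold</b>', 'y'
```

Neither version copes with nested tables, since an inner <td> would be treated as the start of a new cell.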