Python regex: my pattern also captures content I don't want. How should I handle this?

text = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-1"width="333">2021-05-11 10:00</td><td class="rich-table-cell" id="time2-1"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time3-1"width="333">2021-05-11 12:00</td></tr><tr class="rich-table"><td class="rich-table-cell" id="time1-2"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-2"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-2"width="333"></td></tr><tr class="rich-table"><td class="rich-table-cell" id="time1-3"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-3"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-3"width="333"></td>2021-05-11 13:00</tr><tr class="rich-table"><td class="rich-table-cell" id="time1-4"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-4"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-4"width="333"></td></tr>'''


1 = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-1"width="333">2021-05-11 10:00</td><td class="rich-table-cell" id="time2-1"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time3-1"width="333">a time is here</td></tr>'''

2 = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-2"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-2"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-2"width="333"></td>this is empty!!!</tr>'''

3 = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-3"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-3"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-3"width="333"></td>a time is here</tr>'''

4 = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-4"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-4"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-4"width="333"></td>this is empty!!!</tr>'''

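The real content may not have exactly 4 rows; this is only an example. The rows are not separated by line breaks but run together in one continuous string, as in text.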

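Using .*? and (.*?) in my regex I can extract all 4 rows from text, but I don't need rows 1 and 3. I only need rows like row 2, the ones with no time at the end.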

r'''<tr class="rich-table">(.*?)id=.*?width="333"></td></tr>'''
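
To see why a pattern like the one above returns more than wanted, here is a minimal sketch on a shortened stand-in for the table. The two-row sample string and its ids are made up for illustration and are not the real data. The search starts at the first <tr, and the lazy .*? keeps expanding until the required ending width="333"></td></tr> can be satisfied, even when that ending only appears in a later row.

```python
import re

# Hypothetical, shortened stand-in for the real table:
# row "a" ends with a time, row "b" ends with an empty cell.
sample = (
    '<tr class="rich-table"><td id="a"width="333">10:00</td></tr>'
    '<tr class="rich-table"><td id="b"width="333"></td></tr>'
)

# Same shape as the pattern in the question, without the capture group.
lazy = r'<tr class="rich-table">.*?id=.*?width="333"></td></tr>'
matches = re.findall(lazy, sample)

# Row "a" cannot satisfy the ending on its own, so the lazy .*? grows
# across the row boundary until row "b" supplies it.
print(len(matches))           # number of matches found
print(matches[0] == sample)   # does the one match cover both rows?
```

A lazy quantifier is only lazy from the position where the match started. It does not search for the shortest possible substring anywhere in the text.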

import re
text = '''<tr class="rich-table"><td class="rich-table-cell" id="time1-1"width="333">2021-05-11 10:00</td><td class="rich-table-cell" id="time2-1"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time3-1"width="333">2021-05-11 12:00</td></tr><tr class="rich-table"><td class="rich-table-cell" id="time1-2"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-2"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-2"width="333"></td></tr><tr class="rich-table"><td class="rich-table-cell" id="time1-3"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-3"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-3"width="333"></td>2021-05-11 13:00</tr><tr class="rich-table"><td class="rich-table-cell" id="time1-4"width="333">2021-05-11 11:00</td><td class="rich-table-cell" id="time2-4"width="333">2021-05-11 12:00</td><td class="rich-table-cell" id="time3-4"width="333"></td></tr>'''

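# (?:(?!<tr).)*? matches one character at a time, but only while the next characters are not "<tr",
# so a match can never run past the row it started in.
s = re.findall(r'''<tr class="rich-table">(?:(?!<tr).)*?id=(?:(?!<tr).)*?width="333"></td></tr>''', text)
print(*s, sep="\n\n")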
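
An alternative that avoids the lookahead, sketched under the assumption that rows are never nested and every row ends with </tr>: first cut the string into rows, then keep only the rows with the wanted ending. The sample string below is a shortened, made-up stand-in for text, and the names rows and empty_tail are only illustrative.

```python
import re

# Hypothetical, shortened stand-in for the table in the question.
sample = (
    '<tr class="rich-table"><td id="t1"width="333">10:00</td></tr>'
    '<tr class="rich-table"><td id="t2"width="333"></td></tr>'
    '<tr class="rich-table"><td id="t3"width="333"></td>13:00</tr>'
    '<tr class="rich-table"><td id="t4"width="333"></td></tr>'
)

# Step 1: cut the continuous string into individual rows.
# The lazy .*? stops at the first </tr>, which is the end of the current row.
rows = re.findall(r'<tr class="rich-table">.*?</tr>', sample)

# Step 2: keep only rows whose last cell is empty and directly followed by </tr>.
empty_tail = [row for row in rows if row.endswith('width="333"></td></tr>')]

print(len(rows))
print(*empty_tail, sep="\n\n")
```

Splitting first keeps each regex simple, at the cost of a second pass over the rows. For HTML that is less regular than this example, a real parser such as the standard library's html.parser is usually the safer choice.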

What does "no time at the end" mean?

Impressive, really impressive.