Parsing an HTML page in PHP

Today I was parsing a page with the Simple HTML DOM parser and got no results at all. That seemed strange, so I looked at the page's HTML source and found that it contains many mistakes.

So here is my question: what should I do when the parser itself works correctly but the HTML is a mess? Can anyone suggest an approach, or another parser that can handle this sort of markup?

Thank you all for help.

Run it through Tidy before trying to load it into a DOM tree: http://php.net/manual/en/book.tidy.php
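A rough sketch of that idea, assuming the tidy extension is installed and enabled. The function name `repair_html`, the sample markup, and the config values are made up for illustration, so adjust them to your page:

```php
<?php
// Sketch only: requires the tidy extension (ext-tidy).
// repair_html() is a hypothetical helper name, not part of any library.
function repair_html(string $html): string
{
    $config = [
        'wrap' => 0, // do not re-wrap long lines
    ];

    $tidy = new tidy();
    $tidy->parseString($html, $config, 'utf8');
    $tidy->cleanRepair();            // close open tags, fix nesting, etc.

    return tidy_get_output($tidy);   // repaired markup as a string
}

// Deliberately broken sample input: <p> and <b> are never closed.
$clean = repair_html('<div><p>Unclosed paragraph<b>bold text</div>');

// The repaired string can then go to whichever parser you use.
$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($clean);
libxml_clear_errors();
libxml_use_internal_errors($previous);
```

If you stay with Simple HTML DOM, you would pass `$clean` to it instead of to `DOMDocument`.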

PHP's built-in DOM extension should work fine for HTML that is not well written. Have a read through the comments on the manual page, as some people have posted useful information about it.

http://docs.php.net/manual/en/domdocument.loadhtml.php
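Something along these lines might do as a starting point. It is only a sketch: `load_messy_html` and the sample markup are invented for the example. The libxml calls just keep the parser's complaints about bad markup from being emitted as PHP warnings:

```php
<?php
// Sketch only: load_messy_html() is a hypothetical helper name.
function load_messy_html(string $html): DOMDocument
{
    // Collect libxml parse errors internally instead of raising warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();                  // discard the collected errors
    libxml_use_internal_errors($previous);  // restore the earlier setting

    return $doc;
}

// Deliberately broken sample input: unclosed <li>, <p> and <a>.
$doc = load_messy_html('<ul><li>first<li>second</ul><p>unclosed <a href="/x">link');

// Query the resulting tree with XPath.
$xpath = new DOMXPath($doc);
foreach ($xpath->query('//li') as $li) {
    echo trim($li->textContent), PHP_EOL;
}
```

If you want to see what the parser objected to, call `libxml_get_errors()` before `libxml_clear_errors()`.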