I am trying to load XML content from a URL that returns about 60 MB of data. When I do that with PHP's built-in XML library, I keep getting the following error:
PHP Warning: DOMDocument::loadXML(): internal error: Huge input lookup in Entity, line: 845125
The script then stops. What's wrong, and how can I deal with this?
Sample URL I use:
http://foo.com/feed.xml
The libxml2 changelog contains "608773 add a missing check in xmlGROW (Daniel Veillard)", which seems to be related to input buffering. Note that I don't know anything about libxml2 internals, but it seems conceivable that you have hit a 2.7.6 bug that was fixed in 2.7.7.
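To see which libxml2 version your PHP build is linked against, a quick check like the one below may help. It is only a sketch: the variable names are made up for illustration, and being on 2.7.7 or later does not by itself confirm that the xmlGROW fix is what resolves this error.

```php
<?php
// LIBXML_DOTTED_VERSION holds the libxml2 version PHP was built against, e.g. "2.7.6".
$libxmlVersion = LIBXML_DOTTED_VERSION;

// 2.7.7 is the release the changelog entry above belongs to; treating it as
// the threshold is an assumption, not a confirmed diagnosis.
$atLeast277 = version_compare($libxmlVersion, '2.7.7', '>=');

echo "libxml2 version: {$libxmlVersion}\n";
echo $atLeast277 ? "At or past 2.7.7\n" : "Older than 2.7.7\n";
```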
Check if the behavior is any different when you use simplexml_load_file() directly, and try setting libxml parser-related options, e.g.
simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_COMPACT | LIBXML_PARSEHUGE)
Specifically, you might want to try the LIBXML_PARSEHUGE flag.
According to http://php.net/manual/en/libxml.constants.php, LIBXML_PARSEHUGE sets the XML_PARSE_HUGE flag, which relaxes any hardcoded limit from the parser. This affects limits like the maximum depth of a document or the entity recursion, as well as limits on the size of text nodes.
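A minimal sketch of passing the flag, assuming the XML is already in a string. The helper name parseLargeXml() and the tiny sample document are hypothetical stand-ins for illustration; the real input would be the 60 MB feed.

```php
<?php
/**
 * Hypothetical helper: parse an XML string with libxml's hardcoded limits relaxed.
 */
function parseLargeXml(string $xml): SimpleXMLElement
{
    // Collect parser errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = simplexml_load_string(
        $xml,
        'SimpleXMLElement',
        LIBXML_COMPACT | LIBXML_PARSEHUGE
    );

    // Grab any errors, then restore the caller's error-handling mode.
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        $messages = array_map(function ($e) {
            return trim($e->message);
        }, $errors);
        throw new RuntimeException('XML parse failed: ' . implode('; ', $messages));
    }

    return $doc;
}

// Small stand-in document, not the real feed.
$sample = '<feed><item id="1">first</item><item id="2">second</item></feed>';
$feed = parseLargeXml($sample);
echo count($feed->item), " items parsed\n";
```

The same constant can be passed as the options argument elsewhere, e.g. `$dom->loadXML($xml, LIBXML_PARSEHUGE)` for DOMDocument or `simplexml_load_file($url, 'SimpleXMLElement', LIBXML_PARSEHUGE)` when reading straight from the URL. Because the flag removes the parser's built-in limits, it is probably best reserved for feeds you trust.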