For the last couple of days I have been updating a script to work with Joomla 2.5. It's almost done, but there is one thing that I haven't been able to solve yet, and it's a weird one.
The script has a cron job which parses an affiliate XML file. To do this it uses the PHP function xml_parse, as seen below:
if (!($fp = @$file_function($url, 'rb'))) {
$this->error("Cannot open {$url}");
return;
}
while (($data = fread($fp, 8192))) {
if ( defined ('LIBXML_BUG') ) {
# fix for LIBXML bug
$data=str_replace("&","XMLLIBHACK",$data);
}
if (!xml_parse($this->parser, $data, feof($fp))) {
printf('XML error in %s at line %d column %d',
$url,
xml_get_current_line_number($this->parser),
xml_get_current_column_number($this->parser));
unset ($this->items);
}
}
xml_parser_free( $this->parser );
As said, the problem lies in the xml_parse call. On this line the whole page/script stops working and only the output written before this line is returned. No error is shown, even though error_reporting is E_ALL and display_errors is On. When I trigger an error on purpose I do see it, so error reporting is working. The parser ($this->parser) is created in another file, which is loaded (I var_dumped $this->parser to check).
The code where $this->parser is created (I believe this class is called MagpieRSS):
function create_parser($out_enc, $in_enc, $detect) {
if ( substr(phpversion(),0,1) == 5) {
$parser = $this->php5_create_parser($in_enc, $detect);
}
else {
$parser = $this->php4_create_parser($in_enc, $detect);
}
if ($out_enc) {
$this->encoding = $out_enc;
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $out_enc);
}
return $parser;
}
/**
* Instantiate an XML parser under PHP5
*
* PHP5 will do a fine job of detecting input encoding
* if passed an empty string as the encoding.
*
* All hail libxml2!
*
*/
function php5_create_parser($in_enc, $detect) {
// by default php5 does a fine job of detecting input encodings
if(!$detect && $in_enc) {
return xml_parser_create($in_enc);
}
else {
return xml_parser_create('');
}
}
/**
* Instantiate an XML parser under PHP4
*
* Unfortunately PHP4's support for character encodings
* and especially XML and character encodings sucks. As
* force to UTF-8 use admin settings to change this
*/
function php4_create_parser($in_enc, $detect) {
if ( $detect ) {
$in_enc = 'UTF-8';
}
return xml_parser_create($in_enc);
}
I am out of ideas on how to solve this. I tried different encodings (ISO, UTF-8, etc.) and checked $data, but everything seems fine.
An example XML file can be found here: http://pastebin.com/wT1pVZLQ
I would recommend using SimpleXMLElement, which is much more flexible.
All you need to do is loop over the elements you want.
Example Using Your XML
header('Content-Type: text/html; charset=utf-8');
$sxe = simplexml_load_file("log.xml", "SimpleXMLElement");
echo "<pre>";
foreach ( $sxe->dataHeader as $element ) {
foreach ( $element as $key => $value )
echo $key, " = ", $value, PHP_EOL;
}
echo PHP_EOL;
foreach ( $sxe->data as $record ) {
// $name is unused; it only avoids reusing $key in the inner loop
foreach ( $record as $name => $element ) {
foreach ( $element as $key => $value )
echo $key, " = ", $value, PHP_EOL;
}
}
Output
exportType = stream
exportId = 256106
rows = 1
lastChecked = 2012-11-04 14:03:06.26
lastUpdated = 2012-11-04 00:03:02.822
parserLocale = nl_NL
streamCurrency = EUR
name = befit2day.nl
description =
recordHash = 1124208770
url = http://clicks.m4n.nl/_c?aid=14375&adid=695437&_df=true&turl=http%3A%2F%2Fbefit2day.nl%2F
title = Universeel krachtstation
description = Géén verzendkosten Verwachte levertijd 5 werkdagen Universeel krachtstation In hoogte verstelbare haltersteunen Rugleuning in 6 standen te verstellen Biceps curl steun in 3 standen te verstellen Been curl in 3 standen te verstellen (te belasten tot 60 kg) Geschikt voor halterschijven met stang opening van 20 - 31 mm Ook te gebruiken voor weighted crunches (met behulp van de kabel), tot 60 kg Ook te gebruiken voor dips Maximaal belastbaar tot 280 kg (inclusief gebruikersgewicht) Bankdrukken tot 180 kg Totaal gewicht 40 kg Geleverd exclusief halterstangen en gewichten Afmetingen krachtstation (L x B x H): 180 cm x 106 cm x 90-110 cm Afmetingen rugleuning (L x B x H): 68 cm x 28 cm x 4 cm Afmetingen curl steun (L x B x H): 28 cm x 44,5 cm x 4 cm Met aan de onderzijde gummi bekleding voor bescherming van de vloer
offerid =
image = http://befit2day.nl/img/products/11749/103/universeel-krachtstation.jpg
price = 149.90
category = dagaanbieding
subcategory =
stock = 1
timetoship =
ean =
price_shipping = 0.00
price_old = 299.90
vendor =
category_path =
publisher = befit2day.nl
column0 = dagaanbieding
time = 0:00
logo = http://befit2day.nl/themes/store_4/images/logo.jpg
merchantID = 20420
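If the file is not well-formed, the SimpleXML loader returns false and only emits warnings. The following is a minimal sketch of how libxml's own error list could be collected instead, so that a failing file reports the line and message. The helper name load_xml_or_errors and the sample strings are made up for illustration; they are not part of the script in the question.

```php
<?php
// Hypothetical helper: load XML from a string and return a pair of
// (SimpleXMLElement or false, list of error messages).
function load_xml_or_errors($xml)
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $sxe = simplexml_load_string($xml);

    $errors = array();
    if ($sxe === false) {
        foreach (libxml_get_errors() as $error) {
            $errors[] = sprintf(
                'line %d, column %d: %s',
                $error->line,
                $error->column,
                trim($error->message)
            );
        }
    }

    // Restore the previous libxml error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return array($sxe, $errors);
}

// Usage with made-up sample data: a well-formed and a broken document.
list($sxe, $errors) = load_xml_or_errors('<data><record><title>ok</title></record></data>');
echo $sxe->record->title, PHP_EOL;

list($sxe, $errors) = load_xml_or_errors('<data><record></data>');
foreach ($errors as $message) {
    echo $message, PHP_EOL;
}
```

For a remote feed, the same idea should work by fetching the content first and passing the string to the helper.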
Check your logs to see what the parser is throwing, and/or try ini_set('display_errors', 1);
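As a rough sketch of that advice, the settings could be forced at the top of the cron script, with errors also written to a log file. The function names and the log path below are hypothetical. Note that a shutdown function only sees fatal PHP errors; it will not run if the PHP process itself crashes.

```php
<?php
// Hypothetical helper: log the last error at shutdown if it was fatal.
function report_fatal_on_shutdown()
{
    $error = error_get_last();
    $fatal = array(E_ERROR, E_PARSE, E_CORE_ERROR, E_COMPILE_ERROR);
    if ($error !== null && in_array($error['type'], $fatal, true)) {
        error_log(sprintf(
            'Fatal: %s in %s on line %d',
            $error['message'],
            $error['file'],
            $error['line']
        ));
    }
}

// Hypothetical helper: show all errors and also write them to $logFile.
function enable_error_logging($logFile)
{
    error_reporting(E_ALL);
    ini_set('display_errors', '1');
    ini_set('log_errors', '1');
    ini_set('error_log', $logFile);
    register_shutdown_function('report_fatal_on_shutdown');
}

// Assumed log location; any path writable by the PHP process will do.
enable_error_logging(sys_get_temp_dir() . '/xml-cron-errors.log');
```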
You are probably hitting either a character in the file that isn't valid in the encodings you have tried, or a file that the parser doesn't recognize as well-formed XML.
If it's an encoding issue, either replace the offending character(s) before parsing (if there are only one or two), or find an encoding that fully supports your XML file.
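To tell those two cases apart, it can help to print the parser's own error message, which the code in the question does not do (it prints only line and column). Here is a minimal sketch, assuming the mbstring extension is loaded and that the data is expected to be UTF-8. The helper name describe_xml_error and the sample strings are made up for illustration.

```php
<?php
// Hypothetical helper: returns null when $xml parses cleanly, otherwise a
// human-readable description of what went wrong.
function describe_xml_error($xml)
{
    // Diagnostic only: flag byte sequences that are not valid UTF-8.
    if (!mb_check_encoding($xml, 'UTF-8')) {
        return 'input is not valid UTF-8';
    }

    $parser = xml_parser_create('UTF-8');
    $result = null;
    if (!xml_parse($parser, $xml, true)) {
        $result = sprintf(
            '%s at line %d column %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser),
            xml_get_current_column_number($parser)
        );
    }

    // Before PHP 8.0 the parser is a resource that should be freed;
    // from 8.0 on it is an object and is released automatically.
    if (PHP_VERSION_ID < 80000) {
        xml_parser_free($parser);
    }

    return $result;
}

// Usage with made-up sample data.
var_dump(describe_xml_error('<data><title>ok</title></data>'));
var_dump(describe_xml_error('<data><title></data>'));
var_dump(describe_xml_error("<data>G\xE9\xE9n</data>")); // ISO-8859-1 bytes
```

The last call shows how ISO-8859-1 bytes such as the "é" in "Géén" are reported as invalid UTF-8 before the parser even runs.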
tail -f /var/log/apache2/error.log
Refresh the website and watch for "Segmentation fault". That is usually the cause when everything stops responding without any error message.