I am parsing an XML file (source.xml) in PHP and need to identify instances where the <property> node contains the <rent> element.
Once identified, the entire <property> parent node for that entry should be copied to a separate XML file (destination.xml).
On completion of the copy, that <property> node should be removed from the source.xml file.
Here is an example of the source.xml file:
<?xml version="1.0" encoding="utf-8"?>
<root>
<property>
...
<rent>
<term>long</term>
<freq>month</freq>
<price_peak>1234</price_peak>
<price_high>1234</price_high>
<price_medium>1234</price_medium>
<price_low>1234</price_low>
</rent>
...
</property>
</root>
I've tried using DOM with the code below; however, I'm not getting any results at all despite there being hundreds of nodes that match the above requirements. Here is what I have so far:
$destination = new DOMDocument;
$destination->preserveWhiteSpace = true;
$destination->load('destination.xml');
$source = new DOMDocument;
$source->load('source.xml');
$xp = new DOMXPath($source);
foreach ($xp->query('/root/property/rent[term/freq/price_peak/price_high/price_medium/price_low]') as $item) {
$newItem = $destination->documentElement->appendChild(
$destination->createElement('property')
);
foreach (array('term', 'freq', 'price_peak', 'price_high', 'price_medium', 'price_low') as $elementName) {
$newItem->appendChild(
$destination->importNode(
$item->getElementsByTagName($elementName)->property(0),
true
)
);
}
}
$destination->formatOutput = true;
echo $destination->saveXml();
I've only started learning about DOMDocument and its uses, so I'm obviously messing up somewhere. Any help is appreciated. Many thanks.
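The difficulty arises when you're trying to copy a node from one document to another. You can try to re-create the node, copying all of the components across, but this is hard work (and prone to errors). Instead, you can import the node from one document into another using importNode. Passing true as the second parameter copies all child elements as well.
Deleting the element from the original document is then a case of getting the item to 'delete itself from its parent', which sounds odd, but that's how this code works.
Note the XPath as well: /root/property[rent] selects each <property> that has a <rent> child. Your expression selected <rent> elements and required a nested path term/freq/price_peak/... inside them, which doesn't exist, so it matched nothing.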
<?php
error_reporting ( E_ALL );
ini_set ( 'display_errors', 1 );
$destination = new DOMDocument;
$destination->preserveWhiteSpace = true;
$destination->loadXML('<?xml version="1.0" encoding="utf-8"?><root></root>');
$source = new DOMDocument;
$source->load('NewFile.xml');
$xp = new DOMXPath($source);
$destRoot = $destination->getElementsByTagName("root")->item(0);
foreach ($xp->query('/root/property[rent]') as $item) {
$newItem = $destination->importNode($item, true);
$destRoot->appendChild($newItem);
$item->parentNode->removeChild($item);
}
echo "Source:".$source->saveXML();
$destination->formatOutput = true;
echo "destination:".$destination->saveXml();
With the destination, I prime it with the basic <root> element and then add in the contents from there.
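The snippet above only echoes both documents. If the goal is to write the changes back, a minimal sketch along the same lines might look like the following. The helper name moveRentedProperties and its string-in/string-out shape are my own assumptions for illustration; with real files you would swap loadXML()/saveXML() for load('source.xml')/save('source.xml') and the destination equivalents.

```php
<?php
// Hypothetical helper (not from the original answer): moves every <property>
// that has a <rent> child from $sourceXml into $destinationXml.
// Returns [newSourceXml, newDestinationXml].
function moveRentedProperties(string $sourceXml, string $destinationXml): array
{
    $source = new DOMDocument;
    $source->loadXML($sourceXml);

    $destination = new DOMDocument;
    $destination->loadXML($destinationXml);

    $xp = new DOMXPath($source);
    // XPath results are a static snapshot, so removing nodes inside the loop is safe.
    foreach ($xp->query('/root/property[rent]') as $item) {
        // importNode() creates a deep copy owned by the destination document.
        $destination->documentElement->appendChild($destination->importNode($item, true));
        $item->parentNode->removeChild($item);
    }

    return [$source->saveXML(), $destination->saveXML()];
}

// Small in-memory example: property 1 has <rent>, property 2 does not.
$src = '<root><property><id>1</id><rent><term>long</term></rent></property>'
     . '<property><id>2</id></property></root>';
$dst = '<root/>';

[$newSrc, $newDst] = moveRentedProperties($src, $dst);
echo $newSrc, $newDst;
```

Working on strings keeps the logic easy to try out before pointing it at the real files.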
Did you want to obtain something like this? Hope this helps:
$inXmlFile = getcwd() . "/source.xml";
$inXmlString = file_get_contents($inXmlFile);
$outXmlFile = getcwd() . "/destination.xml";
$outXmlString = file_get_contents($outXmlFile);
$sourceDOMDocument = new DOMDocument;
$sourceDOMDocument->loadXML($inXmlString);
$sourceRoot = null;
foreach ($sourceDOMDocument->childNodes as $childNode) {
if(strcmp($childNode->nodeName, "root") == 0) {
$sourceRoot = $childNode;
break;
}
}
$destDOMDocument = new DOMDocument;
$destDOMDocument->loadXML($outXmlString);
$destRoot = null;
foreach ($destDOMDocument->childNodes as $childNode) {
if(strcmp($childNode->nodeName, "root") == 0) {
$destRoot = $childNode;
break;
}
}
$xmlStructure = simplexml_load_string($inXmlString);
$domProperty = dom_import_simplexml($xmlStructure->property);
$rents = $domProperty->getElementsByTagName('rent');
if ($rents->length > 0) {
$destRoot->appendChild($destDOMDocument->importNode($domProperty->cloneNode(true), true));
$destDOMDocument->save($outXmlFile);
$sourceRoot->removeChild($sourceRoot->getElementsByTagName('property')->item(0));
$sourceDOMDocument->save($inXmlFile);
}
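As written, the code above looks only at the first <property>, since that is what $xmlStructure->property resolves to when passed to dom_import_simplexml(). A rough sketch of how the same idea could cover every <property> is shown below. It uses inline XML strings instead of the real files, and the variable names are illustrative assumptions rather than part of the answer.

```php
<?php
$source = new DOMDocument;
$source->loadXML(
    '<root><property><rent/></property><property><rent/></property>'
    . '<property><web_site/></property></root>'
);
$destination = new DOMDocument;
$destination->loadXML('<root/>');

// getElementsByTagName() returns a live list, so copy it to an array first;
// otherwise removing nodes while iterating can skip entries.
$properties = iterator_to_array($source->getElementsByTagName('property'));

foreach ($properties as $property) {
    // Only direct <rent> children count here, mirroring /root/property[rent].
    $hasRent = false;
    foreach ($property->childNodes as $child) {
        if ($child->nodeName === 'rent') {
            $hasRent = true;
            break;
        }
    }
    if ($hasRent) {
        $destination->documentElement->appendChild($destination->importNode($property, true));
        $property->parentNode->removeChild($property);
    }
}

// Count what is left in the source and what arrived in the destination.
$remaining = $source->getElementsByTagName('property')->length;
$moved = $destination->getElementsByTagName('property')->length;
echo "remaining: $remaining, moved: $moved\n";
```

With real files, the two loadXML() calls would become load() calls and each document would be written back once with save() after the loop.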
Consider running two XSLT transformations: one that adds the <property> nodes containing <rent> to the destination and one that removes these nodes from the source. As background, XSLT is a special-purpose language designed to transform XML files; it even provides a document() function to parse external XML files, such as ones in the same folder or a subfolder.
PHP can run XSLT 1.0 scripts with the XSLTProcessor class from its XSL extension (be sure to enable the extension in the .ini file). With this approach no if logic or foreach loops are needed.
XSLT Scripts
PropertyRentAdd.xsl (be sure source.xml and XSLT are in same folder)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- ADD TEMPLATE -->
<xsl:template match="root">
<xsl:copy>
<xsl:copy-of select="*"/>
<xsl:copy-of select="document('source.xml')/root/property[rent]"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
PropertyRentRemove.xsl
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<!-- IDENTITY TRANSFORM -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<!-- REMOVE TEMPLATE -->
<xsl:template match="property[rent]">
</xsl:template>
</xsl:stylesheet>
PHP
// Set current path
$cd = dirname(__FILE__);
// Load the XML and XSLT files
$doc = new DOMDocument();
$doc->load($cd.'/destination.xml');
$xsl = new DOMDocument;
$xsl->load($cd.'/PropertyRentAdd.xsl');
// Transform the destination xml
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$newXml = $proc->transformToXML($doc);
// Save output to file, overwriting original
file_put_contents($cd.'/destination.xml', $newXml);
// Load the XML and XSLT files
$doc = new DOMDocument();
$doc->load($cd.'/source.xml');
$xsl = new DOMDocument;
$xsl->load($cd.'/PropertRentRemove.xsl');
// Transform the source xml
$proc = new XSLTProcessor;
$proc->importStyleSheet($xsl);
$newXml = $proc->transformToXML($doc);
// Save output overwriting original file
file_put_contents($cd.'/source.xml', $newXml);
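Because the script above overwrites both files, it may be worth checking that each transformation actually produced output before saving. The following is only a sketch under that assumption: applyStylesheet is a hypothetical helper name, and the stylesheet and document are inline strings so the example runs without touching source.xml or destination.xml.

```php
<?php
// Hypothetical helper (not part of the original answer): applies a stylesheet
// and returns the result, throwing instead of handing back false or null.
function applyStylesheet(DOMDocument $doc, DOMDocument $xsl): string
{
    if (!extension_loaded('xsl')) {
        throw new RuntimeException('The XSL extension is not enabled.');
    }
    $proc = new XSLTProcessor;
    $proc->importStylesheet($xsl);
    $result = $proc->transformToXml($doc);
    if ($result === false || $result === null) {
        throw new RuntimeException('XSLT transformation failed.');
    }
    return $result;
}

// Identity transform plus an empty template that drops <property> nodes with a <rent> child.
$xsl = new DOMDocument;
$xsl->loadXML('<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="property[rent]"/>
</xsl:stylesheet>');

$doc = new DOMDocument;
$doc->loadXML('<root><property><rent/></property><property><web_site/></property></root>');

try {
    $out = applyStylesheet($doc, $xsl);
    echo $out;
} catch (RuntimeException $e) {
    // Nothing is written anywhere if the transformation could not run.
    $out = null;
    echo $e->getMessage(), "\n";
}
```

In the file-based script, file_put_contents() would then be called only when no exception was thrown.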
Inputs (examples to demonstrate, with other tags to show content is not affected)
source.xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<property>
<rent>
<term>long</term>
<freq>month</freq>
<price_peak>1234</price_peak>
<price_high>1234</price_high>
<price_medium>1234</price_medium>
<price_low>1234</price_low>
</rent>
</property>
<property>
<rent>
<term>short</term>
<freq>month</freq>
<price_peak>7890</price_peak>
<price_high>7890</price_high>
<price_medium>7890</price_medium>
<price_low>7890</price_low>
</rent>
</property>
<property>
<web_site>stackoverflow</web_site>
<general_purpose>php</general_purpose>
</property>
<property>
<web_site>stackoverflow</web_site>
<special_purpose>xsl</special_purpose>
</property>
</root>
destination.xml
<?xml version="1.0" encoding="utf-8"?>
<root>
<original_data>
<test1>ABC</test1>
<test2>123</test2>
</original_data>
<original_data>
<test1>XYZ</test1>
<test2>789</test2>
</original_data>
</root>
Output (after PHP run)
source.xml
<?xml version="1.0"?>
<root>
<property>
<web_site>stackoverflow</web_site>
<general_purpose>php</general_purpose>
</property>
<property>
<web_site>stackoverflow</web_site>
<special_purpose>xsl</special_purpose>
</property>
</root>
destination.xml (new nodes appended at bottom)
<?xml version="1.0"?>
<root>
<original_data>
<test1>ABC</test1>
<test2>123</test2>
</original_data>
<original_data>
<test1>XYZ</test1>
<test2>789</test2>
</original_data>
<property>
<rent>
<term>long</term>
<freq>month</freq>
<price_peak>1234</price_peak>
<price_high>1234</price_high>
<price_medium>1234</price_medium>
<price_low>1234</price_low>
</rent>
</property>
<property>
<rent>
<term>short</term>
<freq>month</freq>
<price_peak>7890</price_peak>
<price_high>7890</price_high>
<price_medium>7890</price_medium>
<price_low>7890</price_low>
</rent>
</property>
</root>