The same problem is described in "xml to array - remove empty array php", but I don't know how to handle this: how can I get an answer to a question that is not mine and was asked more than two years ago? So I'm asking my own question here:
Simple script:
$xml = '<?xml version="1.0"?>
<Envelope>
<foo>
<bar>
<baz>Hello</baz>
<bat/>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
<bat></bat>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
<bat> </bat>
</bar>
</foo>
</Envelope>';
$xml = new \SimpleXMLElement(
$xml,
LIBXML_NOBLANKS | LIBXML_NOEMPTYTAG | LIBXML_NOCDATA
);
$array = json_decode(json_encode((array)$xml), true);
// [
// 'foo' => [
// 0 => [
// 'bar' => [
// 'baz' => 'Hello',
// 'bat' => [], <<-- how to get this to NULL
// ],
// ],
// 1 => [
// 'bar' => [
// 'baz' => 'Hello Again',
// 'bat' => [], <<-- how to get this to NULL
// ],
// ],
// 2 => [
// 'bar' => [
// 'baz' => 'Hello Again',
// 'bat' => [ <<-- how to get this to NULL
// 0 => ' ', or at least to value of " " without array
// ],
// ],
// ],
// ],
// ];
As you can see, there are empty <bat/> tags and, in the last <foo>, a <bat> </bat> tag that contains only whitespace.
I would like all of them to become null in the array.
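For context, the empty arrays appear to come from the JSON round trip itself: an empty SimpleXMLElement is encoded as an empty JSON object, which json_decode(..., true) then turns into an empty array. A minimal sketch that reproduces this on a reduced document (the one-element <bar> snippet is only an illustration, not the real input):

```php
<?php
// Reduced illustration: one parent with a single empty child element.
$xml = new SimpleXMLElement('<bar><bat/></bar>');

// Same conversion as in the script above.
$json  = json_encode((array)$xml);
$array = json_decode($json, true);

// The empty <bat/> element comes back as an empty array, not null.
var_dump($array);
```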
I tried the following, but of course it only works for the first level:
$data = (array)$xml;
foreach ($data as &$item) {
if (
$item instanceof \SimpleXMLElement
and $item->count() === 0
) {
// is a object(SimpleXMLElement)#1 (0) {}
$item = null;
}
}
I tried and failed to do this recursively.
I also tried RecursiveIteratorIterator, without success.
But there must be a way to get those offsets to null.
Has anybody done this before?
EDIT
Solved. See https://stackoverflow.com/a/55733384/3411766
I figured it out myself. It took a while, but it works perfectly.
/**
 * Recursively converts a SimpleXMLElement tree into an array and
 * replaces empty or whitespace-only nodes with null.
 *
 * @param array|\SimpleXMLElement[]|\SimpleXMLElement $data
 *
 * @return array|null
 */
public static function emptyNodesToNull($data)
{
    if ($data instanceof \SimpleXMLElement && $data->count() === 0) {
        // is empty object like
        // SimpleXMLElement::__set_state(array())
        // which was e.g. a <foo/> tag
        // or
        // SimpleXMLElement::__set_state(array(0 => ' ',))
        // which was e.g. a <foo> </foo> (with white space only)
        return null;
    }
    $data = (array)$data;
    foreach ($data as &$value) {
        if (is_array($value) || $value instanceof \SimpleXMLElement) {
            $value = self::emptyNodesToNull($value);
        }
        // Otherwise $value is the actual value of a node.
        // Could do further checks here.
    }
    unset($value); // drop the reference left over from the loop
    return $data;
}
My tests did exactly what I expected, and the method returns, in my opinion, exactly what you can expect from an xmlToArray method.
It does not handle attributes, but that was not a requirement.
Test:
$xml = '<?xml version="1.0"?>
<Envelope>
<a/><!-- expecting null -->
<foo>
<b/><!-- expecting null -->
<bar>
<baz>Hello</baz>
<!-- expecting here an array of 2 x null -->
<c/>
<c/>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
<d> </d><!-- expecting null -->
<item>
<firstname>Foo</firstname>
<email></email><!-- expecting null -->
<telephone/><!-- expecting null -->
<lastname>Bar</lastname>
</item>
<item>
<firstname>Bar</firstname>
<email>0</email><!-- expecting value 0 (zero) -->
<telephone/><!-- expecting null -->
<lastname>Baz</lastname>
</item>
<!-- expecting array of values 1, 2, null, 4 -->
<number>1</number>
<number>2</number>
<number></number>
<number>4</number>
</bar>
</foo>
</Envelope>';
$xml = new \SimpleXMLElement($xml);
$array = $class::emptyNodesToNull($xml);
Returns:
[
'Envelope' => [
'a' => null,
'foo' => [
0 => [
'b' => null,
'bar' => [
'baz' => 'Hello',
'c' => [
0 => null,
1 => null,
],
],
],
1 => [
'bar' => [
'baz' => 'Hello Again',
'd' => null,
'item' => [
0 => [
'firstname' => 'Foo',
'email' => null,
'telephone' => null,
'lastname' => 'Bar',
],
1 => [
'firstname' => 'Bar',
'email' => '0',
'telephone' => null,
'lastname' => 'Baz',
],
],
'number' => [
0 => '1',
1 => '2',
2 => null,
3 => '4',
],
],
],
],
],
];
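For a quick check outside a class, the same recursion can be written as a free function. This is a minimal sketch, not the original method: the function name emptyNodesToNullStandalone is hypothetical, and the XML is a shortened version of the document from the question.

```php
<?php
/**
 * Hypothetical standalone variant of the method above.
 *
 * @param array|SimpleXMLElement $data
 * @return array|null
 */
function emptyNodesToNullStandalone($data)
{
    // An element without child elements reaches this point only when it
    // is empty or whitespace-only; text-only children are already strings.
    if ($data instanceof SimpleXMLElement && $data->count() === 0) {
        return null;
    }
    $data = (array)$data;
    foreach ($data as $key => $value) {
        if (is_array($value) || $value instanceof SimpleXMLElement) {
            $data[$key] = emptyNodesToNullStandalone($value);
        }
    }
    return $data;
}

$xml = new SimpleXMLElement(
    '<Envelope>'
    . '<foo><bar><baz>Hello</baz><bat/></bar></foo>'
    . '<foo><bar><baz>Hello Again</baz><bat> </bat></bar></foo>'
    . '</Envelope>'
);
$array = emptyNodesToNullStandalone($xml);
var_dump($array);
```

Assigning through $data[$key] instead of a by-reference loop variable avoids leaving a dangling reference behind after the loop.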
You can use XPath with the predicate not(node()) to select all elements that do not have child nodes.
<?php
$doc = new DOMDocument;
$doc->preserveWhiteSpace = false;
$doc->loadXML('<?xml version="1.0"?>
<Envelope>
<foo>
<bar>
<baz>Hello</baz>
<bat/>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
<bat></bat>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
<bat></bat>
</bar>
</foo>
</Envelope>');
$xpath = new DOMXPath($doc);
foreach( $xpath->query('//*[not(node())]') as $node ) {
$node->parentNode->removeChild($node);
}
$doc->formatOutput = true;
echo $doc->saveXML();
Prints:
<?xml version="1.0"?>
<Envelope>
<foo>
<bar>
<baz>Hello</baz>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
</bar>
</foo>
<foo>
<bar>
<baz>Hello Again</baz>
</bar>
</foo>
</Envelope>
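Note that not(node()) matches only elements with no child nodes at all, so an element such as <bat> </bat> from the question may survive if its whitespace text node is kept by the parser. A sketch of a variant predicate, assuming that leaf elements containing only whitespace should be removed as well (the shortened XML is only an illustration):

```php
<?php
$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML(
    '<Envelope>'
    . '<foo><bar><baz>Hello</baz><bat/></bar></foo>'
    . '<foo><bar><baz>Hello Again</baz><bat> </bat></bar></foo>'
    . '</Envelope>'
);

$xpath = new DOMXPath($doc);
// not(*)                 : the element has no child elements
// normalize-space() = "" : its text content is empty or whitespace only
$emptyLeaves = $xpath->query('//*[not(*) and normalize-space() = ""]');
foreach ($emptyLeaves as $node) {
    $node->parentNode->removeChild($node);
}

$remainingBat = $doc->getElementsByTagName('bat')->length;
$remainingBaz = $doc->getElementsByTagName('baz')->length;

$doc->formatOutput = true;
echo $doc->saveXML();
```

As in the code above, this removes the elements from the document rather than setting them to null.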
Regards!