Conditional XML restructuring

I'm looking for an efficient way to reorganize parts of an XML document that contain multiple children of a type such as 'SmallCat' or 'BigCat'.

Here are the rules:

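  1. Everything except Habitat nodes should be passed through, attributes and all.
  2. Habitat nodes with fewer than 2 BigCat or SmallCat children in total should be passed through.
  3. Habitat nodes with 2 or more such children should be split into one sub_habitat per cat, with the original Habitat keeping BodyTemp and a Child reference to each sub_habitat.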

The input document looks like:

<Zoo>
  <Habitat HabitatID="habitat.cage.1">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <BigCat AnimalID="Tiger.1">
      <Type>Bengal</Type>
    </BigCat>
    <SmallCat AnimalID="bobcat.1">
      <Type>Bobcat</Type>
    </SmallCat>
    <BodyTemp>endothermic</BodyTemp>
  </Habitat>
  <Habitat HabitatID="cage.2">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <SmallCat AnimalID="tabycat.1">
      <Type>Tabycat</Type>
    </SmallCat>
    <BodyTemp>endothermic</BodyTemp>
  </Habitat>
  <ConsessionStand>
    <Type>PopcornStand</Type>
  </ConsessionStand>
</Zoo>

The output should look like:

<Zoo>
  <Habitat HabitatID="sub_habitat.1.habitat.cage.1">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <BigCat AnimalID="Tiger.1">
      <Type>Bengal</Type>
    </BigCat>
  </Habitat>

  <Habitat HabitatID="sub_habitat.2.habitat.cage.1">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <SmallCat AnimalID="bobcat.1">
      <Type>Bobcat</Type>
    </SmallCat>
  </Habitat>

  <Habitat HabitatID="habitat.cage.1">
    <BodyTemp>endothermic</BodyTemp>
    <Child>
        <HabitatID>sub_habitat.1.habitat.cage.1</HabitatID>
    </Child>
    <Child>
        <HabitatID>sub_habitat.2.habitat.cage.1</HabitatID>
    </Child>
  </Habitat>

  <Habitat HabitatID="cage.2">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <SmallCat AnimalID="tabycat.1">
      <Type>Tabycat</Type>
    </SmallCat>
    <BodyTemp>endothermic</BodyTemp>
  </Habitat>
  <ConsessionStand>
    <Type>PopcornStand</Type>
  </ConsessionStand>
</Zoo>

The ideal solution will use XSLT, but any solution (bash, JavaScript, PHP, Python, Ruby, Go, etc.) that gets the job done is a worthy contender.

Here's an implementation that does ~90% of the work.

This solution does not reconstruct the first Habitat node with references to the new sub_habitat child nodes.

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
     <xsl:strip-space elements="*"/>

    <xsl:template match="@*|node()">
       <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
       </xsl:copy>
    </xsl:template>

    <xsl:template match="Habitat[count(BigCat|SmallCat) &gt; 1]">
        <xsl:for-each select="BigCat|SmallCat">
          <xsl:choose>
            <xsl:when test="self::BigCat">
              <Habitat HabitatID="sub_habitat.{position()}.{../@HabitatID}">
                <xsl:copy-of select="../*[not(self::SmallCat|self::BodyTemp)]"/>
              </Habitat>
            </xsl:when>
            <xsl:when test="self::SmallCat">
              <Habitat HabitatID="sub_habitat.{position()}.{../@HabitatID}">
                <xsl:copy-of select="../*[not(self::BigCat|self::BodyTemp)]"/>
              </Habitat>
            </xsl:when>
          </xsl:choose> 
        </xsl:for-each>
    </xsl:template> 

</xsl:stylesheet>

The resulting output is seen here.

<Zoo>
  <Habitat HabitatID="sub_habitat.1.habitat.cage.1">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <BigCat AnimalID="Tiger.1">
      <Type>Bengal</Type>
    </BigCat>
  </Habitat>
  <Habitat HabitatID="sub_habitat.2.habitat.cage.1">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <SmallCat AnimalID="bobcat.1">
      <Type>Bobcat</Type>
    </SmallCat>
  </Habitat>
  <Habitat HabitatID="cage.2">
    <Type>Cats</Type>
    <Food>Birds</Food>
    <SmallCat AnimalID="tabycat.1">
      <Type>Tabycat</Type>
    </SmallCat>
    <BodyTemp>endothermic</BodyTemp>
  </Habitat>
  <ConsessionStand>
    <Type>PopcornStand</Type>
  </ConsessionStand>
</Zoo>

What have you tried? Each of the rules in your prose description of the problem translates pretty directly into a template rule. For example, the rule:

This experience contains more than 1 element (Audiovisual and Gallery), it will be reorganized as a set of 2 discrete experience children

becomes something like

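<!-- Matches only an Experience holding more than one AudioVisual/Gallery child -->
<xsl:template match="Experience[count(AudioVisual|Gallery) &gt; 1]">
  <xsl:for-each select="AudioVisual|Gallery">
    <Experience ExperienceID="{../@ExperienceID}.ce.{position()}">
      <!-- shared children first, then the current AudioVisual or Gallery -->
      <xsl:copy-of select="../*[not(self::AudioVisual|self::Gallery)]"/>
      <xsl:copy-of select="."/>
    </Experience>
  </xsl:for-each>
</xsl:template>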

Just go through all your rules and write a template rule for each one.
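
Carried over to the Zoo document in the question, the same pattern could look like the sketch that follows. It assumes XSLT 1.0 and the element names from the question's input, treats BodyTemp as staying with the parent (as the desired output suggests), and leaves out rebuilding the parent Habitat with its Child references:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Rules 1 and 2: identity template, copies anything no other template claims -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Habitat with two or more cats: emit one sub-habitat per cat -->
  <xsl:template match="Habitat[count(BigCat|SmallCat) &gt; 1]">
    <xsl:for-each select="BigCat|SmallCat">
      <Habitat HabitatID="sub_habitat.{position()}.{../@HabitatID}">
        <!-- shared children, without any cat and without BodyTemp -->
        <xsl:copy-of select="../*[not(self::BigCat or self::SmallCat or self::BodyTemp)]"/>
        <!-- then only the cat currently being processed -->
        <xsl:copy-of select="."/>
      </Habitat>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because each sub-habitat copies only the current cat via select=".", separate xsl:when branches for BigCat and SmallCat are not needed, and a Habitat holding two cats of the same type would still be split one cat per sub-habitat.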

Consider the following stylesheet:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<xsl:template match="/Zoo">
    <xsl:copy>
        <xsl:apply-templates select="Habitat/BigCat | Habitat/SmallCat"/>
        <xsl:apply-templates/>
    </xsl:copy>
</xsl:template>

<xsl:template match="BigCat| SmallCat">
    <Habitat HabitatID="sub_habitat.{position()}.{../@HabitatID}">
        <xsl:copy-of select="../*[not(self::BigCat or self::SmallCat or self::BodyTemp)]"/>
        <xsl:copy-of select="."/>
    </Habitat>
</xsl:template>

<xsl:template match="Habitat">
    <xsl:copy>
        <xsl:copy-of select="@* | BodyTemp"/>
        <xsl:apply-templates select="BigCat | SmallCat" mode="child"/>
    </xsl:copy>
</xsl:template>

<xsl:template match="BigCat| SmallCat" mode="child">
    <Child>
        <HabitatID>
            <xsl:text>sub_habitat.</xsl:text>
            <xsl:value-of select="position()"/>
            <xsl:text>.</xsl:text>
            <xsl:value-of select="../@HabitatID"/>
        </HabitatID>
    </Child>
</xsl:template>

</xsl:stylesheet>

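Applied to a Zoo that contains only the first Habitat of your input example (the one holding both a BigCat and a SmallCat), the result will be as follows. The stylesheet has no identity template and no count test, so the single-cat Habitat and the ConsessionStand still need rules of their own to pass through unchanged: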

<?xml version="1.0" encoding="UTF-8"?>
<Zoo>
   <Habitat HabitatID="sub_habitat.1.habitat.cage.1">
      <Type>Cats</Type>
      <Food>Birds</Food>
      <BigCat AnimalID="Tiger.1">
         <Type>Bengal</Type>
      </BigCat>
   </Habitat>
   <Habitat HabitatID="sub_habitat.2.habitat.cage.1">
      <Type>Cats</Type>
      <Food>Birds</Food>
      <SmallCat AnimalID="bobcat.1">
         <Type>Bobcat</Type>
      </SmallCat>
   </Habitat>
   <Habitat HabitatID="habitat.cage.1">
      <BodyTemp>endothermic</BodyTemp>
      <Child>
         <HabitatID>sub_habitat.1.habitat.cage.1</HabitatID>
      </Child>
      <Child>
         <HabitatID>sub_habitat.2.habitat.cage.1</HabitatID>
      </Child>
   </Habitat>
</Zoo>
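
To cover the pass-through rules from the question as well, the same templates could be combined with an identity template and restricted to Habitat elements holding two or more cats. This is a sketch under those assumptions, still XSLT 1.0; the mode name "split" is my own choice, and the splitting is moved from the /Zoo template into the Habitat template so that position() restarts at 1 inside each Habitat and agrees between the two modes:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- identity template: passes through everything not matched below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- only Habitat elements with two or more cats are split -->
  <xsl:template match="Habitat[count(BigCat|SmallCat) &gt; 1]">
    <!-- first, one sub-habitat per cat -->
    <xsl:apply-templates select="BigCat|SmallCat" mode="split"/>
    <!-- then the original Habitat, reduced to BodyTemp plus references -->
    <xsl:copy>
      <xsl:copy-of select="@*|BodyTemp"/>
      <xsl:apply-templates select="BigCat|SmallCat" mode="child"/>
    </xsl:copy>
  </xsl:template>

  <!-- sub-habitat: shared children plus the current cat -->
  <xsl:template match="BigCat|SmallCat" mode="split">
    <Habitat HabitatID="sub_habitat.{position()}.{../@HabitatID}">
      <xsl:copy-of select="../*[not(self::BigCat or self::SmallCat or self::BodyTemp)]"/>
      <xsl:copy-of select="."/>
    </Habitat>
  </xsl:template>

  <!-- reference from the parent Habitat to the matching sub-habitat -->
  <xsl:template match="BigCat|SmallCat" mode="child">
    <Child>
      <HabitatID>
        <xsl:value-of select="concat('sub_habitat.', position(), '.', ../@HabitatID)"/>
      </HabitatID>
    </Child>
  </xsl:template>
</xsl:stylesheet>
```

Against the full input in the question, this should give the two sub-habitats, then the reduced habitat.cage.1 with its Child references, then cage.2 and ConsessionStand unchanged, up to indentation. The guarded Habitat template wins over the identity template because a pattern with a predicate has a higher default priority than node().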