I want to extract the data from an XML file into a CSV file, manipulate it, and then write the manipulated data back into the same XML format using a stylesheet.

Input XML

<population>
  <!-- ====================================================================== -->
  <person id="person_203">
    <attributes>
      <attribute name="student" class="java.lang.String">FALSE</attribute>
      <attribute name="subpopulation" class="java.lang.String">person</attribute>
      <attribute name="tbc" class="java.lang.String">0</attribute>
      <attribute name="wards" class="java.lang.String">19100100</attribute>
    </attributes>
    <plan selected="yes">
      <activity type="home" x="-491818.676117819" y="-3702897.1885688" end_time="06:24:13" />
      <leg mode="pt" />
      <activity type="leisure" x="-491101.476702402" y="-3703311.7449867" end_time="11:15:07" />
      <leg mode="pt" />
      <activity type="shopping" x="-494740.864728234" y="-3702228.87495458" end_time="12:59:01" />
      <leg mode="pt" />
      <activity type="home" x="-491818.676117819" y="-3702897.1885688" />
    </plan>
  </person>
  <!-- ====================================================================== -->
  <person id="person_174">
    <attributes>
      <attribute name="student" class="java.lang.String">FALSE</attribute>
      <attribute name="subpopulation" class="java.lang.String">person</attribute>
      <attribute name="tbc" class="java.lang.String">0</attribute>
      <attribute name="wards" class="java.lang.String">19100061</attribute>
    </attributes>
    <plan selected="yes">
      <activity type="home" x="-491592.958893144" y="-3703196.68309401" />
    </plan>
  </person>
  <!-- ====================================================================== -->
</population>

Expected csv output

id,student,subpopulation,tbc,wards,selected,type,x,y,end_time,.....
person_203,FALSE,person,0,19100100,yes,home,-491818.6761,-3702897.189,06:24:13,....
person_174,FALSE,person,0,19100061,yes,home,-491592.9589,-3703196.683......

The plan element differs between person_203 and person_174: one contains several activity and leg entries, the other a single activity. How can I achieve this?

My stylesheet is below:

<?xml version="1.0" encoding="UTF-8" ?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" encoding="UTF-8"/>
    <xsl:template match="/">
        <xsl:text>person,attribute,x, y, &#xA;</xsl:text>
        <xsl:for-each select=".//person">
            <xsl:value-of select="concat(@id,'&#9;')"/>
            <xsl:for-each select=".//attributes">
                <xsl:for-each select=".//attribute">
                    <xsl:value-of select="concat(',',text(),'&#9;')"/>
                </xsl:for-each>
            </xsl:for-each>
            <xsl:for-each select=".//plan">
                <xsl:for-each select=".//activity">
                    <xsl:value-of select="concat(',',',',@x,',',@y,'&#9;')"/>
                </xsl:for-each>
            </xsl:for-each>
        </xsl:for-each>
    </xsl:template>
</xsl:transform>

The current output is:

person,attribute,x, y, 
"person_203 "
" ,FALSE    ,person ,0  ,19100100   ,,-491818.676117819,-3702897.1885688    ,,-491101.476702402,-3703311.7449867    ,,-494740.864728234,-3702228.87495458   ,,-491818.676117819,-3702897.1885688    person_174  "
" ,FALSE    ,person ,0  ,19100061   ,,-491592.958893144,-3703196.68309401   "

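One likely reason the rows run together: the stylesheet above writes a tab after every value and only emits a line break (`&#xA;`) in the header. Below is a minimal XSLT 1.0 sketch of one way to handle the varying plan length. It assumes a "long" layout with one CSV row per activity, rather than the wide layout in the expected output. The column order and the choice to skip `leg` elements are also assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <xsl:template match="/">
    <!-- Header row; this column order is an assumption -->
    <xsl:text>id,student,subpopulation,tbc,wards,selected,type,x,y,end_time&#xA;</xsl:text>

    <!-- One output row per activity, so plans of any length fit the same columns -->
    <xsl:for-each select="population/person/plan/activity">
      <!-- The person that owns this activity (activity -> plan -> person) -->
      <xsl:variable name="p" select="../.."/>

      <!-- Person-level fields, looked up by attribute name rather than position -->
      <xsl:value-of select="concat(
          $p/@id, ',',
          $p/attributes/attribute[@name='student'], ',',
          $p/attributes/attribute[@name='subpopulation'], ',',
          $p/attributes/attribute[@name='tbc'], ',',
          $p/attributes/attribute[@name='wards'], ',',
          ../@selected, ',')"/>

      <!-- Activity-level fields; a missing end_time becomes an empty field -->
      <xsl:value-of select="concat(@type, ',', @x, ',', @y, ',', @end_time)"/>

      <!-- Exactly one line break per row -->
      <xsl:text>&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:transform>
```

With this layout the person attributes repeat on every row, and the final `home` activity of each plan ends with an empty `end_time` field.
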
I have looked at the two solutions below, but their XML formats are slightly different from mine.

Owen
  • Would it work for you to treat plan separately, in its own CSV, and have each plan entry linked to the person via the `person_id`? Kind of like a relational DB? That way you could keep the `type` attribute of each plan. – Zach Young Apr 13 '22 at 16:03
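
The suggestion in the comment above could be sketched as follows in XSLT 1.0. An XSLT 1.0 run writes a single output, so this sketch assumes a hypothetical stylesheet parameter named `table` that selects which of the two CSVs to produce. The `seq` column (the activity's position within its plan) is also an assumption, added so the original order can be rebuilt later. `leg` elements are not exported here.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:transform version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Hypothetical parameter: 'person' (default) or 'activity' picks the CSV to write -->
  <xsl:param name="table" select="'person'"/>

  <xsl:template match="/">
    <xsl:choose>
      <!-- Activity table: one row per activity, linked to its person by person_id -->
      <xsl:when test="$table = 'activity'">
        <xsl:text>person_id,seq,type,x,y,end_time&#xA;</xsl:text>
        <xsl:for-each select="population/person/plan/activity">
          <xsl:value-of select="concat(
              ../../@id, ',',
              count(preceding-sibling::activity) + 1, ',',
              @type, ',', @x, ',', @y, ',', @end_time, '&#xA;')"/>
        </xsl:for-each>
      </xsl:when>

      <!-- Person table: one row per person with the fixed set of attributes -->
      <xsl:otherwise>
        <xsl:text>person_id,student,subpopulation,tbc,wards,selected&#xA;</xsl:text>
        <xsl:for-each select="population/person">
          <xsl:value-of select="concat(
              @id, ',',
              attributes/attribute[@name='student'], ',',
              attributes/attribute[@name='subpopulation'], ',',
              attributes/attribute[@name='tbc'], ',',
              attributes/attribute[@name='wards'], ',',
              plan/@selected, '&#xA;')"/>
        </xsl:for-each>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:transform>
```

Running it once with the default and once with `table` set to `activity` would give two files that join on `person_id`. How a parameter is passed depends on the XSLT processor.
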
