
I want to extract all the text in an XML file with Python.

Is there a way to do this?

I heard that it is possible using xml.etree.ElementTree.

For example:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

I just want to extract: 1 2008 141100 4 2011 and so on...

1 Answer


Hope this helps

from xml.etree.ElementTree import parse

tree = parse('data.xml')  # parse the file (change the name to your file name)

# iter() walks the whole tree; findall('year') would only match direct children of <data>
for e in tree.iter('year'):  # visits every 'year' tag
    print(e.text)
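Since the question asks for all the text rather than only the year values, here is a minimal sketch of one way to do that with itertext(), which yields the text of every element in document order. The XML string is a shortened copy of the sample from the question, and the helper name extract_text is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's sample (assumption: the real file has the same shape)
SAMPLE = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def extract_text(root):
    """Return the non-blank text pieces of every element under root, in document order."""
    # itertext() also yields the whitespace between tags, so blank pieces are dropped
    return [piece.strip() for piece in root.itertext() if piece.strip()]


root = ET.fromstring(SAMPLE)
values = extract_text(root)
print(" ".join(values))  # prints: 1 2008 141100 4 2011 59900
```

To read from a file instead of a string, replace ET.fromstring(SAMPLE) with ET.parse('data.xml').getroot().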

You can also store the values in a dict, with the name of the country as the key and a list of that country's values as the value.
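A rough sketch of the dict idea, assuming "values" means the text of each child tag (rank, year, gdppc) and that every country has a name attribute. The function name texts_by_country is hypothetical, and the XML string is a shortened copy of the question's sample.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the question's sample (assumption: the real file has the same shape)
SAMPLE = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""


def texts_by_country(root):
    """Map each country's name attribute to the text of its child elements."""
    result = {}
    for country in root.iter("country"):
        # Empty tags such as <neighbor .../> have no text and are skipped
        result[country.get("name")] = [
            child.text.strip()
            for child in country
            if child.text and child.text.strip()
        ]
    return result


by_country = texts_by_country(ET.fromstring(SAMPLE))
for name, texts in by_country.items():
    print(name, texts)
```

With the sample above, by_country["Singapore"] is the list ['4', '2011', '59900'].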

Pietro Marsella