Get all the text from an XML file

Question

I want to extract all the text from an XML file with Python.

Is there a way to do this?

I have heard that it is possible using xml.etree.ElementTree.

For example:

<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

I just want to extract: 1 2008 141100 4 2011 and so on...

Pietro Marsella · Answer 1 · 2019-08-07T12:18:52.507

0

Hope this helps

from xml.etree.ElementTree import parse

tree = parse('data.xml')  # parse the file (change the name to your file name)

# iter() walks the whole tree; findall('year') would only match direct children of the root
for e in tree.iter('year'):
    print(e.text)
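
The question asks for every text value (1 2008 141100 4 2011 ...), not just the years. A minimal sketch of one way to get that, assuming the standard library's `Element.itertext()` is acceptable; the XML is a shortened copy of the sample from the question, inlined so the snippet runs without a file, and the names `XML` and `texts` are only for illustration:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question, inlined so the sketch runs on its own
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

# itertext() yields every text node in document order, including the
# whitespace-only indentation between tags, so empty pieces are filtered out.
texts = [t.strip() for t in root.itertext() if t.strip()]

print(" ".join(texts))  # prints: 1 2008 141100 4 2011 59900
```

With a real file, `ET.parse('data.xml').getroot()` should be able to replace the `ET.fromstring(XML)` call. Attribute values such as the neighbor names are not text nodes, so they are not included.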

You can also store the values in a dict, with the name of the country as the key and a list of that country's values as the value.
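
A rough sketch of that dict idea, under the assumption that "values" means the text of each country's child elements (rank, year, gdppc); the inlined XML is a shortened copy of the sample from the question, and `values_by_country` is just an illustrative name:

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML from the question, inlined so the sketch runs on its own
XML = """<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>"""

root = ET.fromstring(XML)

values_by_country = {}
for country in root.iter("country"):
    # Keep only children that carry text; the self-closing <neighbor/> elements have none
    values_by_country[country.get("name")] = [
        child.text for child in country if child.text and child.text.strip()
    ]

print(values_by_country)
```

For the two countries above this gives one list of three strings per country. If the neighbor names are wanted too, they live in attributes and could be read with `child.get("name")`.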

edited Aug 07 '19 at 12:18

answered Aug 07 '19 at 11:00

Pietro Marsella

Does not work. `open` should be `parse`. And there are no `title` elements in the XML document in the question. – mzjn Aug 07 '19 at 11:44
'title' was just a random name I put. In your case you can put rank, year, neighbor, etc. – Pietro Marsella Aug 07 '19 at 11:46
Sorry, I took this from an old script I made a while ago. – Pietro Marsella Aug 07 '19 at 12:17
The code still has at least two syntax errors. Why don't you test it before posting? – mzjn Aug 07 '19 at 12:26
