Imagine I have the following piece of text:

<Data>
    <Country>
       <Name>Portugal</Name>
       <Population>10M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
    <Country>
       <Name>Spain</Name>
       <Population>30M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
</Data>

How can I change the Code from Y to N for the Country named Portugal, without changing the Code of the other countries?

I've tried to use sed:

sed -i '/<Country>Portugal<\/Country>/{s/Y/N/;}' file.xml

but this is not replacing anything.

Can you tell me what I am doing wrong? How can I replace the first occurrence of Y after the line that matches Portugal?

Thanks!

4 Answers

Avoid parsing XML with regex. Use an XML processing tool like xmlstarlet:

$ cat foo.xml
<Data>
  <Country>
    <Name>Portugal</Name>
    <Population>10M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Population>30M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>

$ xmlstarlet edit --update '/Data/Country[Name="Portugal"]/Sub/Code' -v "N" foo.xml
<?xml version="1.0"?>
<Data>
  <Country>
    <Name>Portugal</Name>
    <Population>10M</Population>
    <Sub>
      <Code>N</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Population>30M</Population>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
that other guy
  • Thanks! I was trying regex because I do not have access to xmlstarlet and have no means to install it. However, I do have xmllint. Do you think it would also be possible with that? – CarlosBernardes Jul 10 '18 at 21:25

Use a range match.

sed '/<Name>Portugal</,/<\/Country>/ s/<Code>Y</<Code>N</' file.xml

(Edited to match updated requirements.)
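
To see what the range address actually selects, here is a small sketch. The file names range-demo.xml and range-demo.out.xml are made up for the demonstration, and the sample data is a trimmed copy of the question's input with well-formed closing tags. The first sed call only prints the lines inside the range; the second applies the substitution to those lines alone.

```shell
# Sample input; range-demo.xml is a hypothetical file name for this sketch.
cat > range-demo.xml <<'EOF'
<Data>
  <Country>
    <Name>Portugal</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
EOF

# Print only the lines the range covers: from the Portugal <Name> line
# through the next closing </Country> line.
sed -n '/<Name>Portugal</,/<\/Country>/p' range-demo.xml

# Run the substitution inside that range only; lines outside it pass through untouched.
sed '/<Name>Portugal</,/<\/Country>/ s/<Code>Y</<Code>N</' range-demo.xml > range-demo.out.xml
```

This relies on each tag sitting on its own line, as in the question; if the layout changes, the range may start or stop in the wrong place.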

Paul Hodges
  • Thanks for your help! However, I made a mistake in the question. Your solution would be fine for what I posted, but I forgot that <Country> is the parent and Portugal comes inside the <Name> tags. – CarlosBernardes Jul 10 '18 at 21:03

This might work for you (GNU sed):

sed '/<Country>/{:a;N;/<\/Country>/!ba;/Portugal/s/Y/N/}' file

Starting at each <Country> line, append the following lines to the pattern space until the closing </Country> is reached. If the gathered block contains Portugal, replace the first Y in it with N.
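
The same loop can be spelled out over several lines with comments, which may make the control flow easier to follow. This is only a sketch: the file names countries.xml and countries.new.xml are made up, the sample is a trimmed copy of the question's data, and the final substitution is narrowed to the <Code> element as an extra precaution that is not part of the one-liner above.

```shell
# Sample input; countries.xml is a hypothetical file name for this sketch.
cat > countries.xml <<'EOF'
<Data>
  <Country>
    <Name>Portugal</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
EOF

sed '
/<Country>/ {
  # Label a: append the next input line to the pattern space.
  :a
  N
  # Keep looping until the closing tag has been appended.
  /<\/Country>/!ba
  # The whole block is now one string; edit it only if it names Portugal.
  /<Name>Portugal</ s/<Code>Y</<Code>N</
}
' countries.xml > countries.new.xml
```

Because the whole Country block is held in the pattern space at once, the Portugal test and the substitution see every line of the block together rather than one line at a time.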

potong

If your input is really always exactly that format then all you need is:

$ awk '/<Name>/{f=/Portugal/} f && /<Code>/{sub(/Y/,"N")} 1' file
<Data>
    <Country>
       <Name>Portugal</Name>
       <Population>10M</Population>
       <Sub>
          <Code>N</Code>
       </Sub>
    </Country>
    <Country>
       <Name>Spain</Name>
       <Population>30M</Population>
       <Sub>
          <Code>Y</Code>
       </Sub>
    </Country>
</Data>
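
For anyone unsure how the one-liner works, here is the same program spread over several lines with comments. It is a sketch under the same assumption as above, namely that every tag sits on its own line; the file names awk-demo.xml and awk-demo.out.xml are made up for the demonstration.

```shell
# Sample input; awk-demo.xml is a hypothetical file name for this sketch.
cat > awk-demo.xml <<'EOF'
<Data>
  <Country>
    <Name>Portugal</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
  <Country>
    <Name>Spain</Name>
    <Sub>
      <Code>Y</Code>
    </Sub>
  </Country>
</Data>
EOF

awk '
  # On every <Name> line, set the flag f: 1 if the line mentions Portugal, else 0.
  /<Name>/ { f = /Portugal/ }
  # While the flag is set, change the first Y on a <Code> line to N.
  f && /<Code>/ { sub(/Y/, "N") }
  # Print every line, modified or not.
  { print }
' awk-demo.xml > awk-demo.out.xml
```

The flag is reset by the next <Name> line, so the Spain block is never touched.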
Ed Morton