Match Anything In Between Strings For Linux Grep Command

Question

I have read the post "grep all characters including newline", but I'm not working with XML, so my Linux command is a bit different.

I have the following data:

Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>

Using the command `cat file.txt | grep -o '<tag.*tag>\|^--.*'`, I get:

<tag>Example line 1</tag>

However, I want the output to be:

<tag>Example line 1</tag>
<tag>Example line 2</tag>

How can I match anything between the strings, including the newline?

Note: I need to use <tag and tag> as the delimiter strings because other files can contain multiple tags and text between the lines. I will update the sample data to show that.

score 2 · Accepted Answer · answered Oct 14 '16 at 19:14

This is easier with GNU awk (gawk), using </tag> as the record separator:

awk -v RS='</tag>' 'RT {gsub(/\n/, ""); print $0 RT}' file

<tag>Example line 1</tag>
<tag>Example line 2</tag>
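
RT is a gawk extension, so the command above depends on gawk. As a rough sketch of the same idea in portable awk (not part of the original answer; the function name extract_tags and the temporary sample file are made up for illustration), the lines from <tag up to </tag> can be collected by hand. It assumes at most one tag pair starts per line and that pairs are not nested:

```shell
#!/bin/sh
# Hypothetical helper: print each <tag ... </tag> span on a single line.
# Uses only POSIX awk features (no RS regex, no RT).
extract_tags() {
  awk '
    {
      line = $0
      if (!inside) {
        # Look for the opening string; skip lines that do not have it.
        start = index(line, "<tag")
        if (start == 0) next
        line = substr(line, start)   # drop any text in front of <tag
        inside = 1
        buf = ""
      }
      # Look for the closing string on the current line.
      end = index(line, "</tag>")
      if (end == 0) {
        buf = buf line               # still inside: join without newline
        next
      }
      # Emit everything up to and including </tag>, then reset.
      print buf substr(line, 1, end + length("</tag>") - 1)
      inside = 0
    }
  ' "$1"
}

# Demo on the sample data from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>
EOF
extract_tags "$sample"
rm -f "$sample"
```

On the sample data this should print the two <tag>...</tag> lines shown above. Anything after </tag> on the closing line is discarded, including a second tag pair that opens on that same line, so treat it as a starting point rather than a general parser.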

anubhava

Need to use `<tag` and `tag>` to get between them. Updating sample data. Sorry :-/ – DomainsFeatured Oct 14 '16 at 19:33
ok try this: `awk -v RS='</tag>' 'RT {gsub(/.*?<tag>|\n/, ""); print "<tag>" $0 RT}' file` – anubhava Oct 14 '16 at 19:38

Hey Anubhava, this works! I'm going to make another question to build on this. Thank you :-) – DomainsFeatured Oct 14 '16 at 21:37

Hey Anubhava, any chance you could tell me how to also match for lines starting with "Example" too? If not, I made a question on it: http://stackoverflow.com/questions/40052458/match-multiple-strings-in-awk-command-using-rs-and-rt – DomainsFeatured Oct 14 '16 at 21:55

John1024 · Answer 2 · score 0 · 2016-10-14T19:54:42.490

Consider this test file:

$ cat file2
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>

This produces the output that you want (requires GNU sed):

$ sed -z 's|\n||g; s|</tag>|&\n|g; s|[^\n]*<tag>|<tag>|; s|\n[^\n]*<tag>|\n<tag>|g; s|\n[^\n]*$|\n|' file2
<tag>Example line 1</tag>
<tag>Example line 2</tag>
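
To make the one-liner easier to follow, here is the same set of substitutions written one per line with sed comments. This is only an annotated sketch of the command above, not a different method; the function name join_tags and the temporary file are invented for the demo, and it still requires GNU sed for -z:

```shell
#!/bin/sh
# -z makes GNU sed split input on NUL bytes instead of newlines, so a
# text file without NULs lands in the pattern space as one record.
join_tags() {
  sed -z '
    # 1. Delete every newline so each tag pair sits on one line.
    s|\n||g
    # 2. Put a newline back after each closing tag.
    s|</tag>|&\n|g
    # 3. On the first line, drop the text in front of <tag>.
    s|[^\n]*<tag>|<tag>|
    # 4. Do the same on every later line.
    s|\n[^\n]*<tag>|\n<tag>|g
    # 5. Drop whatever follows the last closing tag.
    s|\n[^\n]*$|\n|
  ' "$1"
}

# Demo on the test file from this answer.
file2=$(mktemp)
cat > "$file2" <<'EOF'
Example line 0</span>
<tag>Example line 1</tag>
<span>Example line 1.5</span>
<tag>
Example line 2
</tag>
Example line 3
<span>Example line 4</span>
EOF
join_tags "$file2"
rm -f "$file2"
```

Because step 1 pulls the whole file into memory and removes all line structure, this is best suited to small files where the tags are not nested.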

Limitation: Note that processing XML-like text with non-specialized tools can be quite fragile.

edited Oct 14 '16 at 19:54

answered Oct 14 '16 at 19:07


Hey John, sorry, the data does have other tags. My example was too minimalist. I have updated it a bit. – DomainsFeatured Oct 14 '16 at 19:38
@DomainsFeatured See updated answer for code that handles the revised input file. – John1024 Oct 14 '16 at 19:55
