1

I'm trying to extract data such as URLs (in this case, someurl) from a file where they are enclosed in some tag, e.g.:

xyz>someurl>xyz

I don't mind using either awk or sed.

fedorqui
L P

3 Answers

9

I think the simplest way is with cut:

$ echo "xyz>someurl>xyz" | cut -d'>' -f2
someurl
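
When the input is a whole file rather than one echoed string, keep in mind that cut passes lines without the delimiter through unchanged. A minimal sketch, assuming the -s option (specified by POSIX for cut) is available; the helper name extract_with_cut is made up for illustration:

```shell
# Hypothetical helper: print the second '>'-separated field of each line on stdin.
extract_with_cut() {
  # -s suppresses lines that contain no delimiter at all,
  # which cut would otherwise print unchanged.
  cut -s -d'>' -f2
}

# Sample input: one tagged line and one line without any '>'.
printf '%s\n' 'xyz>someurl>xyz' 'plain line' | extract_with_cut
# prints: someurl
```

To run it on a file, redirect the file into the function, e.g. extract_with_cut < yourfile.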

With awk it can be done like this:

$ echo "xyz>someurl>xyz" | awk  'BEGIN { FS = ">" } ; { print $2 }'
someurl

With sed it is a little trickier:

$ echo "xyz>someurl>xyz" | sed 's/\(.*\)>\(.*\)>\(.*\)/\2/g'
someurl

The pattern captures three blocks, something1>something2>something3, and prints the second one.
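
Because .* is greedy, the sed command above picks the second-to-last block when a line has more than two > characters (for a>b>c>d it yields c). A possible variant, offered as a sketch rather than a definitive fix, uses [^>]* so a capture group cannot cross a delimiter; the function name extract_second_field is made up for illustration:

```shell
# Hypothetical helper: print the text between the first and second '>' of each line.
extract_second_field() {
  # [^>]* matches any run of characters except '>', so \1 is always field two.
  # -n together with the trailing p prints only lines where the pattern matched.
  sed -n 's/^[^>]*>\([^>]*\)>.*/\1/p'
}

printf '%s\n' 'xyz>someurl>xyz' 'a>b>c>d' 'no delimiters here' | extract_second_field
# prints:
# someurl
# b
```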

fedorqui
0

grep was born to extract things:

kent$  echo "xyz>someurl>xyz"|grep -Po '>\K[^>]*(?=>)'
someurl

Of course, you could also kill a fly with a bomb:

kent$  echo "xyz>someurl>xyz"|awk -F\> '$0=$2'
someurl
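
One caveat with the terse '$0=$2' form: the assignment doubles as the pattern, so a line is printed only when the second field is non-empty and not numerically zero. A more explicit sketch, assuming the goal is to print the second field whenever a line has at least three fields; the name extract_second_awk is hypothetical:

```shell
# Hypothetical helper: print field two of every line that has at least three fields.
extract_second_awk() {
  # NF >= 3 checks the field count instead of the truthiness of $2,
  # so a value such as "0" is still printed.
  awk -F'>' 'NF >= 3 { print $2 }'
}

printf '%s\n' 'xyz>someurl>xyz' 'xyz>0>xyz' 'no delimiter' | extract_second_awk
# prints:
# someurl
# 0
```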
Kent
0

If your grep supports the -P option (Perl-compatible regular expressions), you can use lookbehind and lookahead assertions to identify the URL.

$ echo "xyz>someurl>xyz" | grep -oP '(?<=xyz>).*(?=>xyz)'
someurl

This is just a sample to get you started, not the final answer.

jaypal singh