I am working on a project that involves scraping data from a social media service, Instagram. I have already managed to retrieve the page source using:

    // Route every cross-domain request through a CORS proxy
    $.ajaxPrefilter(function (options) {
      if (options.crossDomain && jQuery.support.cors) {
        // Use the same protocol as the current page
        var http = (window.location.protocol === 'http:' ? 'http:' : 'https:');
        options.url = http + '//cors-anywhere.herokuapp.com/' + options.url;
        //options.url = "http://cors.corsproxy.io/url=" + options.url;
      }
    });

and a call to $.get('https://www.instagram.com/manchesterunited', ...).

I have the page source and I need to extract information such as the title, description, and image link. Everything I need is in this line: http://pastebin.com/Zh0mPGtu

I searched for how to retrieve the text between two strings, but I can't get it to work. For example, the profile's full name sits between "full_name": " and ",. I have tried many approaches and they all failed. Here is one of them: var title = data.match('"full_name": "(.*)",');.

worldofjr
  • Data on this page is in JSON format. It will be a lot easier for you to parse JSON than to handle it with regex. Take a look at http://stackoverflow.com/questions/8951810/how-to-parse-json-data-with-jquery-javascript – user3009344 Sep 01 '16 at 23:28
  • Thanks for the help. I am struggling to write a regex for this: I need to get the whole of line 247: http://pastebin.com/bgHNqxXS – Krzysztof Wurst Sep 01 '16 at 23:51
  • – user3009344 Sep 02 '16 at 11:19
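
The JSON approach suggested in the comments could look roughly like the sketch below. The sample string and its field layout are hypothetical stand-ins, not the real Instagram payload, and the step of cutting the JSON text out of the surrounding page source is assumed to have been done already.

```javascript
// Hypothetical JSON fragment standing in for the data embedded in the page.
// The nesting ("user" -> "full_name") is an assumption for illustration only.
const jsonText = '{"user": {"full_name": "Manchester United", "biography": "Example bio"}}';

// JSON.parse turns the text into a plain object, so fields are read by name
// instead of being matched with a regular expression.
const profile = JSON.parse(jsonText);
const fullName = profile.user.full_name;
const biography = profile.user.biography;

console.log(fullName);
console.log(biography);
```

Reading properties by name avoids the greedy-matching pitfalls of a regex, at the cost of first having to isolate valid JSON text.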

1 Answer

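The pattern '"full_name": "(.*)",' is greedy: (.*) matches as much as it can, so the capture runs all the way to the very last ", on the line.

What you need is the non-greedy quantifier (*?): '"full_name": "(.*?)",'. That should solve your issue. A question mark placed after * or + makes it non-greedy, so it stops at the first possible match.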
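
To make the difference concrete, here is a minimal sketch run against a made-up sample string (the field values are hypothetical, not taken from the real page). Note also that match() returns an array, so the captured text is at index 1 rather than in the return value itself.

```javascript
// Hypothetical sample with the same "key": "value", shape as the page source.
const sample = '"full_name": "Manchester United", "biography": "Example bio", "id": "123"';

// Greedy: (.*) expands as far as possible, up to the LAST '",' in the string.
const greedyMatch = sample.match(/"full_name": "(.*)",/);
const greedy = greedyMatch ? greedyMatch[1] : null;

// Non-greedy: (.*?) stops at the FIRST '",' after the opening quote.
const lazyMatch = sample.match(/"full_name": "(.*?)",/);
const lazy = lazyMatch ? lazyMatch[1] : null;

console.log(greedy); // Manchester United", "biography": "Example bio
console.log(lazy);   // Manchester United
```

The null check matters because match() returns null when the pattern is not found, and indexing into null would throw.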

A. L