I want to capture the string between the braces that follow a specific selector. For example, I have a string like:

<div id="text-module-container-eb7272147" class="text-module-container"><style>#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;} #text-module-container-eb7272147 p{color:#123444;} </style>

Now, if I give the selector `#123-module-container-eb7272147 p`, it should return `text-color:#211E22;bgcolor:test;`.

I am able to get the data between the braces, but not for a specific selector. This is the code I tried: https://regex101.com/r/AESL8q/1

Rikesh
  • Try `/#123-module-container-eb7272147\s+p{([^}]+)}/i` – anubhava Mar 08 '22 at 07:22
  • "*I give selector #123-module-container-eb7272147 p it*" - "*it*" being? As tagged [tag:jquery] if you give that as a selector to jquery, it won't return any elements (given *only* the provided string) as there's no `p` element. – freedomn-m Mar 08 '22 at 07:25
  • Slightly simplified regex, but you will need to use groups, not matches: `#123-module-container-eb7272147 p{(.+?)}` https://regex101.com/r/obIoHN/1 – freedomn-m Mar 08 '22 at 07:29
  • @freedomn-m: It is almost the same regex that I had posted in my comment. It is just that `[^}]+` is a lot more efficient than `.+?`. – anubhava Mar 08 '22 at 07:39
  • @anubhava sorry, no idea about efficiency (unlikely to matter unless 10mill+ searches), just *slightly* shorter and with a regex101 link – freedomn-m Mar 08 '22 at 07:52

1 Answer

You can use a positive lookbehind containing your selector and the opening brace, then match every character that is not a closing brace. A positive lookahead for the closing brace can follow, but it is optional:

/(?<=#123-module-container-eb7272147 p\{)[^}]+(?=\})/

  • The positive lookbehind is done with (?<= ).
  • In the selector, you have to escape the characters that have a special meaning in a regular expression. For a class selector, for example, the dot must be escaped. The opening brace should be escaped as well.
  • The part you want between the braces is matched by [^}]+, meaning any character except the closing brace, one or more times. Adding a question mark after it would make it lazy (ungreedy), but that isn't necessary here because the negated class already stops at the first closing brace. It would be needed if you used the dot to match anything.
  • The positive lookahead is done with (?= ).
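
As a quick check of the points above, here is a minimal sketch that runs the pattern against the sample string from the question. The variable names are only illustrative, and it assumes a JavaScript engine with lookbehind support (ES2018 or later):

```javascript
// Sample input taken from the question.
const html = '<div id="text-module-container-eb7272147" class="text-module-container"><style>#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;} #text-module-container-eb7272147 p{color:#123444;} </style>';

// The lookbehind anchors on the selector plus "{", the lookahead on "}".
// Neither is part of the match, so match[0] holds only the declarations.
const pattern = /(?<=#123-module-container-eb7272147 p\{)[^}]+(?=\})/;

const match = html.match(pattern);
const rules = match ? match[0] : null;
console.log(rules); // text-color:#211E22;bgcolor:test;
```

Because the lookarounds consume nothing, no capture group is needed: the whole match is already the text between the braces.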

You can test it here:

/**
 * Escape characters which have a meaning in a regular expression.
 * 
 * @param string The string you need to escape.
 * @returns The escaped string.
 */
function escapeRegExp(string) {
    return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

let button = document.querySelector('#extract');

button.addEventListener('click', function(event) {
  let html = document.querySelector('#html').value;
  let selector = document.querySelector('#selector').value;
  // Backslashes must be doubled inside a string literal, otherwise '\s' becomes a plain 's'.
  let pattern = new RegExp('(?<=' + escapeRegExp(selector) + '\\s*\\{)[^}]+(?=\\})');
  let matches = pattern.exec(html);
  if (matches) {
    alert("The extracted CSS rules:\n\n" + matches[0]);
  }
  event.preventDefault();
});
html, body {
  font-family: Arial, sans-serif;
  font-size: 14px;
}

fieldset {
  min-width: 30em;
  padding: 0;
  margin: 1em 0;
  border: none;
  display: flex;
}

label {
  margin-right: 1em;
  width: 6em;
}

input[type="text"],
textarea {
  width: calc(100% - 7em);
  min-width: 20em;
  margin: 0;
  padding: .25em .5em;
}

input[type="submit"] {
  margin-left:  7.1em;
  padding: .2em 1em;
}
<form action="#">
  <fieldset>
    <label for="selector">Selector: </label>
    <input type="text" id="selector" name="selector"
           value="#123-module-container-eb7272147 p">
  </fieldset>
  <fieldset>
    <label for="html">HTML code:</label>
    <textarea id="html" name="html" cols="30" rows="10">&lt;div id=&quot;text-module-container-eb7272147&quot; class=&quot;text-module-container&quot;&gt;&lt;style&gt;#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;} #text-module-container-eb7272147 p{color:#123444;} &lt;/style&gt;&lt;div style=&quot;background-color: rgb(168, 27, 219); color: rgb(33, 30, 30);&quot;&gt;&lt;span style=&quot;color:#3498db;&quot;&gt;Click the edit button to replace this conte&lt;/span&gt;nt with your own.&lt;/div&gt;&lt;/div&gt;</textarea>
  </fieldset>
  <fieldset>
    <input type="submit" id="extract" value="Extract the CSS rules">
  </fieldset>
</form>

Or play with it here: https://regex101.com/r/N5cVKq/1
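
If you need this outside of a page, or in an engine without lookbehind support, the same idea can be sketched with a capture group, as suggested in the comments on the question. The function name `extractCssRules` and the shortened sample string are made up for this illustration. Treat it as a starting point rather than a full CSS parser: it does not handle nested blocks such as `@media`.

```javascript
// Escape characters which have a meaning in a regular expression.
function escapeRegExp(string) {
  return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: returns the declarations for `selector`, or null.
function extractCssRules(html, selector) {
  // Backslashes are doubled because the pattern is built from a string.
  const pattern = new RegExp(escapeRegExp(selector) + '\\s*\\{([^}]+)\\}');
  const match = pattern.exec(html);
  // Group 1 holds the text between the braces.
  return match ? match[1] : null;
}

// Shortened sample based on the string in the question.
const sample = '<style>#123-module-container-eb7272147 p{text-color:#211E22;bgcolor:test;} #text-module-container-eb7272147 p{color:#211E1E;}</style>';

console.log(extractCssRules(sample, '#123-module-container-eb7272147 p'));
console.log(extractCssRules(sample, '.missing'));
```

The first call should print the declarations of the first rule, and the second should print null because no such selector exists in the sample.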

Patrick Janser
  • Thanks Patrick, that works like a charm. Is there any way I can dynamically replace the `#123-module-container-eb7272147 p` part in the pattern? – Rikesh Mar 08 '22 at 08:09
  • You can build your pattern with `let pattern = new RegExp(...)` and quickly create a function to escape your selector. See the solution here: https://stackoverflow.com/a/6969486/653182 – Patrick Janser Mar 08 '22 at 08:14
  • @Rikesh I've updated the answer with the solution of the dynamic regular expression. If it's fine for you then you can mark the answer as correct so that your question can be solved and closed. Best regards. – Patrick Janser Mar 08 '22 at 10:17