
I'm trying to remove a script that contains malware from my database.
It was injected into many rows of my table.

The script starts with a <script> tag and ends with a </script> tag.

I'm using the following code to find and replace it:

$content = $post->post_content;
$new_content = preg_replace('/(<script>.+?)+(<\/script>)/i', '', $content);

I've tested it on regex101.com and it works fine there, but it doesn't work in my code.

Does anyone know what's wrong?

fackz

1 Answer


Here is my go-to regex for <script>...</script> tags with their contents:

(\<script\>)([\s\S]*?)(<\/script>)

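Your pattern is not capturing everything that could be in the contents of the tags: the dot (.) does not match newlines unless the s modifier is set, so a script that spans several lines is never matched. Escaping < and > as I do above is optional in PCRE; the part that matters is [\s\S].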

Here is an explanation of the content capturing group:

\s matches any whitespace character
\S matches any non-whitespace character
*? matches between zero and unlimited times, as few times as possible, expanding as needed
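
To make the difference concrete, here is a minimal sketch that runs both patterns through preg_replace. The sample string is made up for illustration, and it assumes the injected script spans several lines, which the question does not confirm.

```php
<?php
// Hypothetical sample; in the question this would come from $post->post_content.
$content = "<p>Hello</p>\n<script>\nalert('injected');\n</script>\n<p>World</p>";

// Pattern from the question: "." stops at newlines, so nothing matches here
// and preg_replace returns the input untouched.
$unchanged = preg_replace('/(<script>.+?)+(<\/script>)/i', '', $content);

// [\s\S]*? matches any character, newlines included, as few times as possible.
// The i modifier makes the tag names case-insensitive.
$new_content = preg_replace('/<script>[\s\S]*?<\/script>/i', '', $content);

var_dump($unchanged === $content); // the original pattern removed nothing
echo $new_content;
```

Note that this still only matches a bare <script> tag. An opening tag that carries attributes, such as type or src, would need a looser pattern.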

As I stated before, you really shouldn't do this. You should use a PHP DOM parser instead.
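
For comparison, a DOM-based version might look roughly like the sketch below. It uses PHP's built-in DOMDocument; the function name strip_script_tags and the sample markup are hypothetical, and the sketch assumes ASCII content (other encodings may need explicit handling before loading).

```php
<?php
// Hypothetical helper, not from the original post.
function strip_script_tags(string $html): string
{
    $doc = new DOMDocument();

    // Real-world markup is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in <body> so every node ends up in a known container.
    $doc->loadHTML('<body>' . $html . '</body>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so copy it to an array before removing nodes.
    foreach (iterator_to_array($doc->getElementsByTagName('script')) as $script) {
        $script->parentNode->removeChild($script);
    }

    // Serialize only the children of <body>, not the implied <html> wrapper.
    $body = $doc->getElementsByTagName('body')->item(0);
    $out = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $out .= $doc->saveHTML($child);
        }
    }
    return $out;
}

$dirty = '<p>Hello</p><script type="text/javascript">alert(1);</script><p>World</p>';
$clean = strip_script_tags($dirty);
echo $clean;
```

Unlike the regex, this also removes script tags that carry attributes, because the parser identifies elements by name rather than by exact text.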

Funk Forty Niner
Jay Blanchard