1

I have multiple tables with corrupted data like this:

Good data</title>and some random HTML<p></p><

What I would like to do is only keep "Good data" in the cell and remove everything else. I thought I can just create a small project with EF and clean the db with C# but there might be a quicker solution with SQL only? Can you use regex or some kind of substring function in SQL?

I will manually look at the table and select the field that needs to run through the code, there is no need to automate that at this point.

Aaron Bertrand
  • 261,961
  • 36
  • 448
  • 471
bobek
  • 7,903
  • 8
  • 35
  • 75

2 Answers2

4
UPDATE dbo.SQLInjectionVictimTableName
  SET UnprotectedColumn = LEFT(UnprotectedColumn, CHARINDEX('<', UnprotectedColumn) - 1)
  WHERE UnprotectedColumn LIKE '%<%';
Aaron Bertrand
  • 261,961
  • 36
  • 448
  • 471
0

If good data is consistently followed by </, you could use:

UPDATE YourTable
SET BadField = LEFT(Badfield,CHARINDEX('</',BadField)-1)
WHERE CHARINDEX('</',Badfield) > 0
Hart CO
  • 32,944
  • 5
  • 44
  • 59