
I've got a table with 200k rows which lives on an SSD. I'm doing a lookup using a comparison on a varchar field where ~20% of the entries are null and the rest are 10 numeric characters with a random distribution.

I expect a search to take a bit of time without an index, but not the 13s it takes now. The execution plan attributes all 13s to the "Clustered Index Scan". What could be going wrong here? This amount of data could be read and grepped in less than a second, so I'm surprised to see SQL Server take more than 10x as long.

The profiler says: CPU 359, reads 11143.

Different slow run (3m 21s) plan: https://www.brentozar.com/pastetheplan/?id=rkNF4LF6K

Edit: After compressing the table at the Page level I'm down to sub-second results. After turning off compression I'm still at sub-second. After a few hours, I'm back to very long delays, this time >1 min.

viraptor
  • What's your execution plan say?...perhaps you have a cardinality estimate issue which would account for the performance problems. Please upload your actual execution plan to Paste The Plan and link it in your post so we can help you. – J.D. Jan 22 '22 at 04:49
  • @J.D. Here's the plan (from the quick run, but it was the same on the slow one) https://www.brentozar.com/pastetheplan/?id=SJsPAbY6t But I don't see how cardinality would affect a full scan? – viraptor Jan 22 '22 at 04:56
  • I don't understand what you mean by quick vs slow run, your post only mentions it's slow at roughly 13s? Also please add the link to your post. – J.D. Jan 22 '22 at 04:58
  • See the edit - compressing and decompressing brought it to sub-second duration. – viraptor Jan 22 '22 at 05:00
  • Doing such an action may have affected/fixed your execution plan, but it's hard to say without observing it prior to you doing so. You should use the execution plan as your go-to starting point for performance problems. Also, you shouldn't normally have to change the compression settings to fix performance in this case, but it ended up getting you across the finish line as a one-off coincidence. – J.D. Jan 22 '22 at 05:04
  • The times were different, but the plan itself (as in operations) was the same as posted. I.e. the nodes have not changed: clustered index scan -> gather streams -> top. – viraptor Jan 22 '22 at 05:06
  • If it was only spending 359ms CPU time that would imply it was waiting for the rest of the time. Did you collect the actual execution plan for a slow run? What wait stats are shown in that? – Martin Smith Jan 22 '22 at 07:28
  • @MartinSmith Things got slow again, so I added the plan of the slow run to the question – viraptor Jan 22 '22 at 09:56
  • So the query spent 200 seconds waiting for a memory grant and the memory grant of that size wasn't needed – Martin Smith Jan 22 '22 at 10:14
  • @MartinSmith thanks, wanna submit that as an answer? I raised a separate question for how to deal with that in my case: https://dba.stackexchange.com/questions/306428/reducing-memory-grant-waits-in-development-environment – viraptor Jan 22 '22 at 11:13
  • I would question why it took 200 seconds to do a memory grant of just 50 MB! – Tibor Karaszi Jan 22 '22 at 16:46

1 Answer


So as I mentioned in the comments, you appear to have a cardinality estimate issue, which is causing an unnecessarily large memory grant (as Martin pointed out). You should always compare the actual execution plans, even when they contain the same operators, because the runtime statistics can differ between the slow and fast runs, which appears to be the case here.

You can see the cardinality estimate issue by looking at the Clustered Index Scan operator on the right side of your execution plan: the Actual Number of Rows is 5, but the Estimated Number of Rows is over 7,000. That significant overestimate causes the query to request, and wait for, much more memory than it needs. (Sometimes you'll see the opposite, where an underestimate under-requests memory and/or produces a less efficient execution plan.)

Your issue likely results from non-sargable, hard-to-optimize predicates in your WHERE clause, specifically WHERE (REPLACE(MEDICARE_NO, ' ', '') LIKE '111111111%') AND (STATUS_CODE NOT IN ('D', 'X', 'I', 'Z')). Wrapping a column in a function inside a predicate can cause cardinality estimate issues like the one you're seeing, so replacing the REPLACE() call with a different approach would probably help. You can either store the data with the spaces already stripped, or add a computed column to your table that applies the REPLACE() function. You can even make it a persisted computed column and index it, since REPLACE() is deterministic.
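For example, something along these lines (a sketch only; I'm making up the table and index names since they're not in your post):

    -- Sketch: dbo.Patients and the index name are made up; only MEDICARE_NO comes from your query.
    ALTER TABLE dbo.Patients
        ADD MEDICARE_NO_CLEAN AS REPLACE(MEDICARE_NO, ' ', '') PERSISTED;

    CREATE NONCLUSTERED INDEX IX_Patients_MEDICARE_NO_CLEAN
        ON dbo.Patients (MEDICARE_NO_CLEAN);

    -- The predicate then becomes sargable:
    -- WHERE MEDICARE_NO_CLEAN LIKE '111111111%'

If the expression in the existing query matches the computed column's definition exactly, SQL Server can sometimes match it to the computed column (and the index on it) without the query text changing, which helps if you can't modify the query itself.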

Your other predicate, AND STATUS_CODE NOT IN ('D', 'X', 'I', 'Z'), could also be rewritten for better performance. NOT IN is an inequality predicate, which is harder for the engine to optimize. You're better off using IN and listing the values that are valid, which turns it into an equality operation. Additionally, an IN clause is just syntactic sugar for multiple OR conditions; sometimes that also throws off the optimizer, and you can try replacing it with a UNION instead, one branch per value from the IN list. Of course that requires repeating the query, which hurts maintainability and readability a little, but most of the time you can refactor the core of the query into a CTE first and then UNION against that CTE for each value from the IN list.
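As a rough sketch of the CTE + UNION idea (the column list and the "valid" status codes 'A', 'B', 'C' are made up here; you'd use whatever the real valid codes are):

    -- Sketch: table name, column list, and the valid status codes are hypothetical.
    WITH Filtered AS
    (
        SELECT MEDICARE_NO, STATUS_CODE   -- plus whatever other columns you need
        FROM dbo.Patients                 -- made-up table name
        WHERE REPLACE(MEDICARE_NO, ' ', '') LIKE '111111111%'
    )
    SELECT * FROM Filtered WHERE STATUS_CODE = 'A'
    UNION ALL
    SELECT * FROM Filtered WHERE STATUS_CODE = 'B'
    UNION ALL
    SELECT * FROM Filtered WHERE STATUS_CODE = 'C';

Since each branch filters on a single distinct value, UNION ALL is safe here and avoids the de-duplication work a plain UNION would add.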

But I'd recommend focusing on eliminating the REPLACE() function from your WHERE clause first; my instinct says that's the bigger issue here. You can quickly prove it out by selecting your data into a temp table first, with REPLACE(MEDICARE_NO, ' ', '') AS SomeColumn as one of the columns you select into it, and then using that temp table in your main query so the WHERE clause becomes WHERE SomeColumn LIKE '111111111%'.
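A quick sketch of that test (again, the table name is made up; the temp table only needs the columns your query actually touches):

    -- Sketch: dbo.Patients is a made-up table name.
    SELECT
        REPLACE(MEDICARE_NO, ' ', '') AS SomeColumn,
        STATUS_CODE
        -- , any other columns the query needs
    INTO #Staged
    FROM dbo.Patients;

    SELECT *
    FROM #Staged
    WHERE SomeColumn LIKE '111111111%'
      AND STATUS_CODE NOT IN ('D', 'X', 'I', 'Z');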

J.D.
  • Oh, I could improve the query in lots of ways, but this is 3rd-party software that I can't touch in any meaningful way. I was just trying to understand exactly how things fail. – viraptor Jan 22 '22 at 20:39
  • @viraptor Yeah, it depends on how you're using the data. If it's slow when your 3rd-party software runs it, then you should pass the above information along to the vendor's technical team, and if you're lucky maybe they'll fix it. Unfortunately there's not much you can do to make the software faster yourself without messing with their code. Otherwise, if you're running this same query yourself against the 3rd-party software's database, then you can stage the data from their database into your own database in a way that's more conducive to performance (as per my answer) and then change the query. – J.D. Jan 23 '22 at 01:38