6

Maybe a duplicate, but I believe my case is a bit different. From one of the answers I got to this post on SQL Server Central that also comes handy too but is not quite the same scenario: 9 Things to Do When You Inherit a Database

Couple of weeks ago started on a new job. I'm supposed to be working as BI analyst and SQL Developer. But right on the firsts assignments noticed in general everything was taking long to execute. Asked the person that is guiding me on the first days, my supervisor you could say, and he told me that they know the databases are a mess. Asked if I could take a look and see what could be done, got a yes as answer.

So I began to digg in, using several really handy scripts, like for example:

What I've found is a big mess as they told me. As an example, blitzindex procedure returns almost 2000 rows with lot of duplicate indexes, NC indexes including all the columns from a table, lot of heap tables, really wide indexes and more. As for backups, none is done since several weeks, asked about it and IT guys just copy the databases each night to a different server. Couple of databases are over 100Gb and several others are close to that size too. Statistics are updated everyday for every table. There are reports that take more than hour to finish, on not so big tables (just couple of millions of rows). And so on.

As a test I spent couple of days tuning couple of big tables and different procedures and queries that use them. Prepared a baseline, using the profiler. Then made few changes and ran again the test queries. As expected, a report that was taking about 8min now is running in around a minute and a couple of other queries also now take less than half the time. All these changes are done on a test server, we still have a reporting server and a production server.

Taking into consideration that I'm supposed to be a BI and sql developer with limited rights new on the office and not a DBA. What other actions do you recommend me to do in order to approach this scenario? There is an appointed DBA but seems to be just an sql developer doing some dba tasks. There was DBA but he left around half year ago they told me. Should I forget about these issues? Or as someone that is heavily using the database I must point out the problems and propose solutions? Has someone been on the same scenario?

EDIT

Thanks everyone for the comments and answer. It's been one month since I posted the question. So now I can point out more accurate issues.

I believe indexes is one of the main concerns, but deleting indexes is completely forbidden, at least for now. Even when there are critical cases with several identical indexes, as on one of the main tables, with around 30million rows and growing, there are 3 NC indexes with 7 columns indexed and including all the columns. I can create indexes under very strict supervision but nothing more.

There are several disabled indexes, do they affect the performance on CRUD operations?

The "Uber view" syndrome. Lots of views inside another view inside another one. Some advices regarding this point?

Lot of sp that execute fast on SMSS but when calling from SSRS take several minutes to execute. Reading about it found that seems that is an issue related to parameters sniffing and their use on the stored procedures. One recommendation is to use variables on the sp. I already tried this on couple of them, with successful results. Also some recommend using WITH RECOMPILE. But as I understand, this can be counterproductive as each time a new execution plan needs to be created on recompilation. As always, I guess it "depends", must test and see where it can be helpful and where not. Any other suggestion?

Backups, logs, etc, is out of my hand for now. So I must center on performance and optimization issues.

2nd EDIT

There are also lot of duplicated sp's, udf's tables and some other objects. I guess that where created at some point on some schema, usually dbo. But then, after a while, a new schema was created to accommodate some of the objects but the old ones on dbo where not deleted. So now when modifying things I'm encountering lot of duplicated (some times triplicated) objects on the database.

Is there an easy way (checking dependencies maybe?) to check which object are still on use so I can make a report of which ones should be deleted?

Aaron Bertrand
  • 180,303
  • 28
  • 400
  • 614
Yaroslav
  • 2,826
  • 4
  • 26
  • 40
  • 2
    As written this almost sounds like career advice - but there is a really good technical question in there I think. Don't worry about job title in this and focus on what needs to be done to fix the issues. – JNK Mar 25 '13 at 20:10
  • Yes, I've read it again and indeed sounds like a career advice. You point out that there could be a good technical question, can you elaborate a bit more? Maybe I can edit my question and make it more specific. – Yaroslav Mar 25 '13 at 20:27
  • 2
    Focus on the issues, what you have done so far, etc. If structured as a "How do I start to tackle this large structural issue" question it would be a lot better than "How do I navigate political issues at a new job to fix this without stepping on toes?". Basically, what technical steps do you need to take to fix the problems, irregardless of title/job stuff. – JNK Mar 25 '13 at 20:28
  • Got it! When re-reading the question after your first comment I understood that it could be more on the "political issues" on how to do my job instead of the technical poing of view. Need to rewrite it indeed. I'm at home now. Tomorrow will collect some accurate information at work and edit my question to make it more suitable. – Yaroslav Mar 25 '13 at 20:46
  • 4
    Honestly, it might be worth it to bring in a DBA consultant for a couple of days to clean things on the box up and focus on fixing the performance problems. (Disclaimer I'm a consultant, no I'm not telling you to hire me.) If you don't have people in house who know how to dive into this stuff, bring in someone that does and get some training from them as part of the deal. – mrdenny Mar 25 '13 at 20:59
  • I think too that this should be done at some point. But it's not my call @mrdenny. And we are in Málaga, Spain, maybe too far from you. Indeed these consultant tasks could be done remotely but I think it works better onsite, or not? – Yaroslav Mar 25 '13 at 21:44
  • Remote is the same, thinking of the technological point of view. Onsite is only better if you really want face to face communication between the consultant and the team. But most consultant firms offer first a short (hours, days) touch up online, and only then, if required, a face to face encounter. But this goes in the higher $ range than the remote consult, anyway. – Marian Mar 25 '13 at 22:54
  • 2
    @MrDenny shall we organise a community advert for your services, so you don't have to drop your self-promoting comments/answers in occasionally? – Mark Storey-Smith Mar 25 '13 at 23:32
  • 1
    @Yaroslav Your question is somewhat open ended, which makes it difficult to provide a factual answer. The site chat room might be a better place to discuss your options. – Mark Storey-Smith Mar 25 '13 at 23:35
  • 7
    I'd make a list of the issues and ask your boss where they want you to start first. If you have no DBA, then I'd urge you to focus on making certain you can recover data and prevent any data loss. After you have that done, then focus on the most pressing performance issues. – SQLRockstar Mar 26 '13 at 02:12
  • 1
    @SQLRockstar: That sounds like a good answer for this question. Perhaps with a few notes about how to help management in prioritizing the fixes, as it sounds like they may not know. – Jon of All Trades Apr 19 '13 at 17:04
  • I'm making up some kind of list like that, but sounds to me that my efforts will go end up on the bottom of some other pile of papers. I will try and update my question maybe tomorrow with some more detailed and accurrate questions, as I'm already been here for a month and a half and have better understanding of what is under the hood. – Yaroslav Apr 19 '13 at 17:26
  • 1
    @SQLRockstar +1000 upvotes if i could give them to you. being able to "recover data and prevent data loss" should be the number one priority for the accidental DBA. Bad indexes, heaps, performance problems and tuning queries are interesting problems. Testing backups and restores is boring. Be boring for a couple weeks. If you can say "We've got a hole in our backups. We could lose XX amount of data, maybe more b/c we don't test them. I've got a script ready that does YY and ZZ to fix it...." then you'll gain respect and it will help grease the wheels for the interesting problems. – StrayCatDBA Apr 21 '13 at 06:46
  • Backups, recovering, prevent data loss and similar issues are out of my hand for now. I must center on performance tuning, query optimization, creating new reports, etc. – Yaroslav Apr 26 '13 at 15:31

1 Answers1

3

Asked if I could take a look and see what could be done, got a yes as answer.

That's very encouraging: you're in an excellent position to advance your career and learn something!

Some of the things you mentioned are problems, some not. I'll answer them briefly and follow with some broader advice.

  • procedure returns almost 2000 rows with lot of duplicate indexes. If they're really duplicates -- type, order, etc. -- extras can be dropped without loss and without fear. The DBMS can use only one, and it doesn't care which. Insert performance improves whenever an index goes bye-bye.

  • NC indexes including all the columns from a table. Normal. These are known as "covering indexes". They're similar to materialized views.

  • lots of heap tables. Bad. Every table needs a key. Sometimes you'll be told, "it has no key". If true, the whole row can be defined as the primary key, except that an extra column is added (not in the key) as a count. "Adding" a row becomes a matter of adding 1 to the count.

  • really wide indexes might or might not help, cf. covering indexes, above.

  • databases are over 100Gb. Normal, databases grow with data.

  • Statistics are updated everyday for every table. Good.

  • There are reports that take more than hour to finish. Every problem is an opportunity.

My advice is to do your homework by reading four books, and to start small with something dramatic, by fixing the longest-running procedure (or the one you boss thinks is the biggest bugaboo).

When you suggest a change, you want to be able to speak confidently and authoritatively in the domain you're addressing yourself to: relational theory and database design, and SQL Server.

For theory, I recommend CJ Date's An Introduction to Database Systems, SQL and Relational Theory, and Database Design and Relational Theory, in that order. That will give you the information you need, and intellectual reference points when you're asked about why something should be done a certain way. For SQL Server, I can recommend Inside Microsoft SQL Server 2008: T-SQL Querying by Itzik Ben-Gan.

Reading the books is unfortunately the easy part; the hard part is applying them to the database. You may find it difficult to get a list of all the queries used with a table. Not only will make it hard to know which indexes are needed, it means splitting one table into two will require arduous testing to make sure nothing breaks (and application changes, too, potentially). The guy who wants to normalize the database has few friends because any change, even one that's correct and necessary, may create work and disruption for others.

I'd look at your bad boy, the long-running report or the one that seems to be wrong a lot of the time. Examine the query plan and think about the tables in terms of BCNF. If you're lucky, the answer might be adding an index, or possibly removing some. If not, you've got a design problem on your hands, something that's bound to involve others.

Good luck. You have your work cut out for you. The good news is you've got a place to learn some things, and permission to work smarter instead of harder. If you keep at it, one of these days you'll reduce query time not by 30% or 50%, but 99% or more. When that happens, you'll be looking at the new DBA in the mirror, so be careful what you wish for. :-)

James K. Lowden
  • 602
  • 3
  • 4
  • Thanks James. As for the size of the databases, I was just pointing out that they where big, not just a couple of gigs. I agree about the normalization issue and making friends, but for now that is not on the plan. As for all other issues, I'm updating the question. – Yaroslav Apr 26 '13 at 15:11