Ben Vegiard
So, you want help writing a query in sql? You've got it! You are about to learn some best practices you can start using within seconds that will reward you for years to come.
OK, I admit it, I am a programmer too. So, I got my MCDBA to show off that I knew a thing or two about a database. So, lately I have been doing more DBA work than programming work. Yes, I can hear you crying “Traitor!” I have gone over to the dark-side; the “Back-end.”
The good news here is that I can (to coin a well overused phrase”) feel your pain. Many developers view database as some sort of necessary evil. In my experience, the average developer knows just enough SQL to insert, update, and delete a single record from a database. Let the DBA’s handle the rough stuff. Unfortunately, many developers don’t have access to a DBA (at least not one willing to work with them) and so have to work out the SQL on their own. As well, imagine the peer respect, lavish salary increases you will gain, and nominations for sainthood you will receive when you’re able to make the “stupid thing go faster.”
So, here are the three biggies that I have found that the average “Joe Programmer” can put into effect to increase their Query’s response time.
One: Everything Works in Dev!
To start off, let’s think outside the box a bit. I know I said I was going to give tips on how to make your query faster… and I am. However, this may be a bit off the beaten path. But doesn’t that make for a great tip? Something you haven’t already heard a hundred times?
Here’s the backdrop: The app is in the can, working great, and the roll out begins. Within a few days, you start hearing murmurs that parts of the app are slow. These murmurings soon become help tickets. These help tickets soon become meetings with the boss and the users. These meeting often turn into alcoholism…. You get the point.
This unfortunate situation arises all too frequently, and there is a relatively simple solution to it. You see, everything always performs well in your development environment because you have relatively few records in your development database! Sure, if I only had 30 records in my main table, my application will work wonderfully too! Once it gets into the real world, and a few hundred thousand or millions of records are input or imported, the need for performance tuning becomes a critical apparentness.
The solution is simple; create the same number of records in your development environment, as you’d expect to have in the first year of operation. If you have importable data (say, from legacy systems or third party feeds) them pump them in. If you do not, then there are many open source utilities to create mock data that would closely conform to your actual data. Now, as you're first writing your queries, you’ll have a really good idea if you need to concentrate on a particular query to avoid future headaches and counseling bills.
B) Know your Indexes
Many developers ignore indexes altogether. Many do not understand them. An index is merely a Table of Contents to the data in the database table. While developing an application, print out the schema of the database, including indexes, so you know what they are while writing your queries. If an appropriate one is missing, have someone experienced with indexes create it (whether that be you, the developer on the other side of the wall, or a DBA)
Here’s the impact. Say you are writing a contact management system. You would probably have a table with first name, last name, etc. For simplicity, we’ll keep this all in one big table with obvious field names. You’d probably write a query that looks something like:
Select Person_ID, Name_First, Name_Last
From Person_Data
Where Name_Last = @NameLast
And Name_First = @NameFirst
With 100 rows, this will work instantly. However, as your database grows to several thousand records, the query will slow down, eventually hitting the seconds range. If you merely add an index with the columns Name_Last and Name_First, you will have basically instant return time for the life of the application.
III) Function Calls are Evil Incarnate!
The following queries look extremely innocent. However, they are all application killers once the table grows to more than a couple thousand rows:
Select Person_Data_ID, Name_First, Name_Last
From Person_Data
Where Next_Contact_Date > GetDate()
Select Count(*)
From Person_Data
Where Upper(Name_Last) = ‘SMITH’
Select Avg(Collar_Size)
From Person_Data
Where dbo.MyFunc(Person_Data) > dbo.MyOtherFunc(22)
For each of these statements, a table scan will be used to try and fetch data. A table scan is simply an operation where the database looks at each and every record in the table to see which ones meet it’s criteria in the Where clause. We would much rather the database use the index and only fetch the 20 records it needs rather than it reading 200,000 records in a table scan for the 20 records it needs. However, the use of the functions (GetDate(), Upper(), MyFunc(), MyOtherFunc(), etc) keep it from doing that.
Why do functions keep the database from using indexes? It’s simply because the database does not know the result of the function. Each time GetDate() is called, a different answer could come back. Each time MyFunc() is called it could come back with a different result since the function could potentially be based on the input value, the time of day, or data that is being modified elsewhere in the database. The only way the database can know for certain the answer to the function call is to make the call and see what the answer is. Since the database doesn’t know ahead of time what the answer is, it cannot use the index to go to the corresponding entry.
So, whenever possible, avoid the use of functions. A great solution to these problems are (in order):
Select Person_Data_ID, Name_First, Name_Last
From Person_Data
Where Next_Contact_Date > GetDate()
Becomes
Declare @DT DateTime
Set @DT = GetDate()
Select Person_Data_ID, Name_First, Name_Last
From Person_Data
Where Next_Contact_Date > @DT
Vola, the database gets the value of GetDate() ahead of time. Since the value of @DT will not change while the query is running, it can use an appropriate index.
Select Count(*)
From Person_Data
Where Upper(Name_Last) = ‘SMITH’
Becomes
Select Count(*)
From Person_Data
Where Name_Last = ‘SMITH’
“Whoa?” I hear you cry. All you did was take the Upper() function out. What about case sensitivity?!? Actually, I’m cheating a bit here. With Microsoft SQL Server, the default index collations are case insensitive. In plain terms, that means the searches are case insensitive by default. With Oracle and other database engines, you can choose to make your indexes case insensitive, so why not? For a simple name search, it’s a good idea. And, it avoids the dreaded Table Scan.
The third example is a bit tougher. Essentially, if your function is extremely complex, you may not have a choice but to use it and let the user know to expect poor performance (a please wait screen goes a long way here, guys). However, if the function is merely a simple multiplication (i.e. Unit_Price * Quantity) then simply do the math in the statement and avoid the function. Please note: Many developers have it beaten into them to make reusable code so changes can be made in one central place. The fact of the matter here is that a SQL statement is not the same as a C program!! You should adjust accordingly. One of the adjustments is to give up reusable code in this context in favor of better performance.
Conclusion
Well, there ya have it. Three tips to start you on the road to better performance. Incorporate these into your daily routines and you’ll get results without dieting, without exercising, and without any boring calorie counting.
Ben Vegiard is a Professional IT Consultant currently working on Government database projects. Ben and his colleague have recently started a blog providing free sql notes and tutorials covering a variety of the issues faced by todays database-driven system developers. Please feel free to check out SQL Coach for more information.
Article Source: http://EzineArticles.com/?expert=Ben_Vegiard
0 ความคิดเห็น: Responses to “ Three Ways to Improve Query Performance (help for sql server time outs) ”