I’ve got a live table with 98 million rows that I’m normalizing. Unfortunately, full batch updates tend to lock the system for a very long time, so I wrote a VB.Net program to perform the updates in smaller batches. The program pulls 10,000 records that haven’t been updated yet using the standard SELECT TOP 10000… syntax. Unfortunately, the WHERE portion of this query takes a really long time to run.
I started digging into things and tried just SELECT TOP 1…, which returned results immediately. I then tried SELECT TOP 10… and that returned immediately, too. After playing around I found that SELECT TOP 63… was the highest I could go before the query took a long time to execute. I have no idea what’s so special about 63, but it was bugging me. So I ran SELECT TOP 63… and SELECT TOP 64… side by side in SSMS with the execution plan turned on and found that the 63 version was using an Index Seek on the index on the column in my WHERE clause. This is a good thing. The 64 version, however, was performing a Clustered Index Scan on the primary key, which had nothing to do with my WHERE clause.
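For reference, this is roughly the shape of the comparison I ran (the table and column names are placeholders matching the example below, not my real schema). In SSMS, turn on “Include Actual Execution Plan” and run the two queries together to see which operator the optimizer picks for each:
-- Placeholder names (YourTable, Col1, Col2, Col3); the point is only to compare the two plans.
SET STATISTICS IO ON;

SELECT TOP 63 Col1, Col2
FROM YourTable
WHERE Col3 IS NULL;   -- in my case: Index Seek on the Col3 index

SELECT TOP 64 Col1, Col2
FROM YourTable
WHERE Col3 IS NULL;   -- in my case: Clustered Index Scan on the primary key

SET STATISTICS IO OFF;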
The fix is to force SQL Server to use the index of your choice, and it’s easy to do. In your query, just before the WHERE clause, add WITH (INDEX(INDEX_NAME)).
So if this is your original query:
SELECT TOP 10000 Col1, Col2 FROM YourTable WHERE Col3 IS NULL
And you have an index on Col3 called IDX__YourTable__Col3, you’d execute this query instead:
SELECT TOP 10000 Col1, Col2 FROM YourTable WITH (INDEX(IDX__YourTable__Col3)) WHERE Col3 IS NULL
Normally you shouldn’t have to do this, but if you’re doing one-off things like I am, it comes in handy.
Update:
The forced index is running so fast that I’ve actually changed my batch size from 10,000 to 100,000. The prior 10,000 batch was processing about 400 records/sec, with most of the time spent waiting on the SELECT TOP… to return results, but with the forced index and the 100,000 batch size I’m now averaging 3,500 records/sec. Nice.
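For what it’s worth, if you wanted to keep the whole loop in T-SQL instead of an external program, the same batching idea looks roughly like this. It’s only a sketch using the placeholder names from above, and the SET line stands in for the real normalization logic; my actual updates run from the VB.Net program:
-- Sketch only: batch the update itself, forcing the Col3 index, until no un-normalized rows remain.
DECLARE @Done BIT = 0;

WHILE @Done = 0
BEGIN
    UPDATE TOP (100000) YourTable WITH (INDEX(IDX__YourTable__Col3))
    SET Col3 = Col1                  -- placeholder: whatever makes the row stop matching Col3 IS NULL
    WHERE Col3 IS NULL;

    IF @@ROWCOUNT = 0
        SET @Done = 1;               -- stop when a batch updates nothing
END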