You are a database developer. You plan to design a database solution by using SQL Server 2008.
The database contains a large table that has 20 million rows. The table contains the following columns:
CustomerNumber
• CompanyName
• ContactFirstName
• ContactLastName
The table currently has single-column nonclustered indexes on the CustomerNumber, CompanyName, and ContactFirstName columns.
An application uses data from this table. The user interface of the application allows the usage of any one filter from the following list of filters:
• CustomerNumber and CompanyName
• CompanyName
• ContactLastName
• ContactLastName and ContactFirstName
In all cases, the listed order of the columns is the order in which they will appear in the WHERE clause that is generated.
You need to design an indexing strategy for this table, so that the query optimizer can quickly perform an index seek when searching through the table data.
What should you recommend?
A. • Drop all existing indexes.
• Create a new index on ContactLastName and ContactFirstName.
• Create a multicolumn index on CustomerNumber and CompanyName.
B. • Drop all existing indexes.
• Create two multicolumn indexes, one on CustomerNumber and CompanyName and the other on ContactLastName and ContactFirstName.
C. • Drop the indexes on CustomerNumber and ContactFirstName.
• Create two multicolumn indexes, one on CustomerNumber and CompanyName and the other on ContactLastName and ContactFirstName.
D. • Drop all existing indexes.
• Create two multicolumn indexes, one on CompanyName and CustomerNumber and the other on ContactLastName and ContactFirstName.
Correct Answer: D
Explanation/Reference:
First of all, the order of the columns in the WHERE clause does NOT matter. That is the following 2 statements will have the same execution plan:
SELECT * FROM AppTable
WHERE CustomerNumber = 25000 AND CompanyName = ‘Company Name 3’
SELECT * FROM AppTable
WHERE CompanyName = ‘Company Name 3’ AND CustomerNumber = 25000
Second, the order of the index keys matters, because it determines the sort order of the index and it affects the set of seek predicates that SQL Server can evaluate using the index. That is the following 2 indexes are absolutely different:
CREATE INDEX IX_AppTable_CompanyName_CustomerNumber ON dbo.AppTable (CompanyName, CustomerNumber);
CREATE INDEX IX_AppTable_CustomerNumber_CompanyName ON dbo.AppTable (CustomerNumber, CompanyName);
In the current scenario, CompanyName can be specified either alone or with CustomerNumber, i.e. we will have either
WHERE CompanyName = ‘Company Name 3’ or
WHERE CompanyName = ‘Company Name 3’ AND CustomerNumber = 25000
So if we create the following index:
CREATE INDEX IX_AppTable_CompanyName_CustomerNumber ON dbo.AppTable (CompanyName, CustomerNumber)
GO
it will work for both quiries:
This index will not be efficient only for Customer Number alone:
But this filter is not included in the list, so we don’t care.
For the last 2 filters the following index will work:
CREATE INDEX IX_AppTable_ContactLastName_ContactFirstName ON dbo.AppTable (ContactLastName, ContactFirstName)
GO
It is worth noting that in terms of performance for the followng 2 filters:
• CompanyName
• ContactLastName
it is better to have separate indexes:
CREATE INDEX IX_AppTable_CompanyName ON dbo.AppTable (CompanyName)
CREATE INDEX IX_AppTable_ContactLastName ON dbo.AppTable (ContactLastName)
However, the performance gain will not be very high in comparison with the corresponding multicolumn indexes. Taking into account that the table has 20 million rows, we should have only 2 multicolumn indexes instead of 4 indexes. It will save a lot of space.