Monday, August 8, 2011

Difference between Truncate and Delete


TRUNCATE and DELETE are both used to delete data from a table. Both commands delete only the data of the specified table; they cannot remove the whole table, data along with structure. So while both SQL statements delete only the data from the table, they differ from each other in many aspects such as syntax, performance, and resource usage. Let's take a look at both of these commands.

Truncate

The TRUNCATE command in SQL removes all rows from a table without logging the individual row deletions in the transaction log. The TRUNCATE statement has the same functionality as the DELETE statement in that it deletes the data from the table without modifying or deleting the structure of the table; however, you can't use a WHERE clause with the TRUNCATE statement.
The Syntax for this statement is:
TRUNCATE TABLE [ { database_name.[ schema_name ]. | schema_name . } ] table_name
Table_name: The name of the table to truncate, or from which all rows are removed.

To execute it, use a query like the one below:
TRUNCATE TABLE authors
The above command will delete all data from the authors table.

Delete

The DELETE command in SQL also removes rows from a table, but it logs the individual row deletions in the transaction log. You can also use the WHERE Clause with the DELETE statement to qualify which rows are to be deleted.
Here I am showing just the simple syntax. For a more detailed explanation of the DELETE syntax, visit this link: DELETE (Transact-SQL)
Syntax:
DELETE FROM [ { database_name.[ schema_name ]. | schema_name . } ] table_name
Database_name: The name of the database in which the table exists. This is optional. If it is not included, the current database context is assumed.
Schema_name: The name of the schema in which the table exists. This is optional. If it is not included, the default schema of the current user is assumed.
Table_name: The name of the table from which the rows are to be deleted.

A simple command looks like this query:
DELETE FROM authors
The above query will delete all data from the authors table.
In DELETE statements you can limit your DELETE query using the WHERE clause to delete only the particular records that fulfill the condition of the WHERE clause. In this case, only those records matching the WHERE clause will be deleted, not all records. A limited DELETE query is shown below.
DELETE FROM authors Where AuthorId IN (1,2,3)
This statement only deletes rows from the authors table which have the author ids of 1, 2, or 3.

The Differences between Truncate and Delete

Above you have seen the DELETE and TRUNCATE statements. The two statements are similar, but many differences exist between them. Those similarities and differences are explained below:
TRUNCATE and DELETE remove the data, not the structure
Both commands remove rows from a table, but the table structure and its columns, constraints, indexes, and so on remain. To remove the table definition in addition to its data, use the DROP TABLE statement.
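For completeness, that looks like the following (using the authors table from the examples above):
DROP TABLE authors -- removes the data AND the table definition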
Conditional deletion of data
Conditional deletion of data means that not all rows are deleted. Let's suppose I have a table authors, and from this table I want to delete the authors that live in Australia. Let's examine our options for doing this with each command (a short sketch follows the list).
  • TRUNCATE - With the TRUNCATE command we can't perform conditional deletion, because no WHERE clause is allowed with this command.
  • DELETE - The DELETE command provides conditional deletion of data from the table using the WHERE clause.
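As a minimal sketch of that scenario (assuming the authors table has a Country column, which is my assumption; it wasn't shown earlier):
-- Works: DELETE accepts a WHERE clause (Country is an assumed column)
DELETE FROM authors WHERE Country = 'Australia'
-- Can't be done conditionally: TRUNCATE has no WHERE clause, it always removes every row
TRUNCATE TABLE authors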
Delete and Truncate both are logged operations:
In most of the articles I have read on the Internet, I have seen this written: "delete is a logged operation and truncate is not a logged operation", meaning that when we run the DELETE command it logs (records) information about the deleted rows, while the TRUNCATE command logs nothing. But this is not true; TRUNCATE is also a logged operation, just in a different way. It uses fewer system and transaction log resources than DELETE. The TRUNCATE command uses minimal logging, which is why it is faster than DELETE. So both DELETE and TRUNCATE are logged operations, but they work differently, as shown below.
  • DELETE is a logged operation on a per-row basis. The DELETE statement removes rows one at a time and records an entry in the transaction log for each deleted row. So, if you are deleting a huge number of records, it can cause your transaction log to grow rapidly, and the deletion will use more server resources because each and every deleted row is logged. Since the DELETE statement records each deleted row, it is also slow. Some people ask: if this is done for each row, then why doesn't Microsoft modify the DELETE statement to not record each deleted row? The answer is that when you run your databases in full recovery mode, detailed logging is necessary for SQL Server to be able to recover your database to the most recent state.
  • TRUNCATE logs the deallocation of the data pages in which the data exists. TRUNCATE is faster than DELETE due to the way it "removes" rows from the table: it doesn't log the deletion of each row; instead it deallocates the data pages used to store the table data and records only those page deallocations in the transaction log. Actually, TRUNCATE does not remove data at all, but rather deallocates whole data pages and removes the pointers to indexes. The data still exists until it is overwritten or the database is shrunk. This action does not require a lot of resources and is therefore very fast. It is a common mistake to think that TRUNCATE is not logged; this is wrong. The deallocation of the data pages is recorded in the log file, which is why Books Online (BOL) refers to TRUNCATE operations as "minimally logged" operations. You can use TRUNCATE within a transaction, and when this transaction is rolled back, the data pages are reallocated and the database is again in its original, consistent state.
To be more specific, let's take a look at an example, which shows that TRUNCATE is also a logged operation.
/*Create a dummy table in a non-production(dummy) database.*/
CREATE TABLE tranTest
(
Id int Identity(1,1),
Name Varchar(100)
)
/*Now insert the records in the tranTest table.*/
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')
/*Then, as you know, DELETE is a logged operation, which means 
that in a transaction, if we roll back the transaction after deleting 
the records from the table, it will restore all deleted records.*/
BEGIN TRAN
DELETE FROM tranTest
SELECT * FROM tranTest
ROLLBACK
SELECT * FROM tranTest
/*Now delete all records from the table.*/
DELETE FROM tranTest
/*And Insert new fresh records in the table.*/
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')
/*Then, as you now know, TRUNCATE is also a logged 
operation, so it must restore all records that were deleted from 
the tranTest table in the transaction below.*/
BEGIN TRAN
TRUNCATE TABLE tranTest
SELECT * FROM tranTest
ROLLBACK
SELECT * FROM tranTest
When we run the above commands, TRUNCATE also restores all the records, which means the TRUNCATE operation is being logged somewhere as well, just with minimal resources. Note that the rollback works only while the transaction is still open: once a transaction has been committed, neither a TRUNCATE nor a DELETE can be undone without restoring from backups.
Behavior of Delete and Truncate for identity columns
OK, now the case of identity columns. The TRUNCATE and DELETE commands behave differently against identity columns. When we use TRUNCATE, it resets the counter used by an identity column for new rows to the seed value defined for the column. DELETE, in contrast, does not reset the counter of your identity column; it keeps counting from where it left off for new rows. In both cases, if no seed was defined, the default value 1 is used. Because TRUNCATE resets the identity counter, use DELETE instead of TRUNCATE when you want to retain the counter.
Why do these two behave differently? I don't know, but T-SQL gives you two options to use as needed. If you want all the data deleted and the counter restarted from 1, TRUNCATE can help you. If you want to delete all the records but don't want to reset your counter, then DELETE is there for you.
An example is shown below:
/* First if you already created the tranTest table then delete it from your database.*/
DROP Table tranTest
/* Then Create a dummy table in a non-production(dummy) database with an Identity column
*/
CREATE TABLE tranTest
(
Id int Identity(1,1),
Name Varchar(100)
)
/* Now insert the records in the tranTest table. */
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')

/* If you run the below query you'll see that the 
   max value of the identity column is 5 because we have 
   inserted only five records in this table. */
SELECT * FROM tranTest

/* Now delete all the records from the table using the delete command. */
DELETE FROM tranTest

/* Now by running the above command your table is empty, 
   so insert 5 new records in the table to see whether, after 
   using the delete command, the identity starts from 1 or from 6. */
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')

/* After running the below query you'll see that your identity now 
   starts from 6 because DELETE does not reset the counter of your 
   identity column */
SELECT * FROM tranTest

/* Now drop your table again and create it again. */
DROP TABLE tranTest
 
/* Create a dummy table again in a non-production(dummy) database 
   to see the effect of truncate command on identity columns */
CREATE TABLE tranTest
(
Id int Identity(1,1),
Name Varchar(100)
)

/* And insert fresh new records in the table. */
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')

/* Now at this point the counter of this table's 
   identity column is 5, which is the max value of the Id column */
SELECT * FROM tranTest

/* Then truncate the table. */
TRUNCATE TABLE tranTest

/* and insert new records */
INSERT INTO tranTest(Name) VALUES('prashant')
INSERT INTO tranTest(Name) VALUES('prateek')
INSERT INTO tranTest(Name) VALUES('praveen')
INSERT INTO tranTest(Name) VALUES('prakash')
INSERT INTO tranTest(Name) VALUES('prabhat')

/* Now you'll see that after truncating the table the 
  identity is reset to its seed value. */
SELECT * FROM tranTest

/* So this example explains the behaviour of both of these commands for identity columns. */
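As an aside, if you used DELETE to keep the counter but later decide you want it reset after all, you can reseed it manually. A minimal sketch, reseeding tranTest to 0 so the next insert gets 1:
DBCC CHECKIDENT ('tranTest', RESEED, 0)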
TRUNCATE is a DDL command whereas DELETE is a DML command
This is also a common difference you might have read about in many articles: TRUNCATE is a DDL (data definition language) operation and DELETE is a DML (data manipulation language) operation. Yes, according to SQL Server that's true. But why is it so? Why is TRUNCATE DDL and DELETE DML? Let's look at this.
When we run the TRUNCATE command it puts a "Schema modification (Sch-M)" lock on the table. What is "schema modification (Sch-M)"?
The Database Engine uses schema modification (Sch-M) locks during a table data definition language (DDL) operation, such as adding a column or dropping a table. While it is held, the Sch-M lock prevents concurrent access to the table; it blocks all outside operations until the lock is released. Now you might ask how it can block any modification to the table when TRUNCATE is itself performing a modification by deleting data. But deleting the data is only one side of the coin. As you read above, TRUNCATE doesn't really remove the data; rather, it deallocates the data pages. Because TRUNCATE doesn't perform any data modification in the table, the DELETE trigger is not called. We are not modifying the data of the table, but as you know, TRUNCATE resets the identity counter of the column in the table, which means TRUNCATE is modifying the table definition or structure, and that comes under DDL operations. Also, while you are truncating a table, you can't modify or add any data to the table. So, to qualify as a DDL operation, a command has to fulfill the conditions written below:
  • Modifying a table structure or definition comes under DDL operations, and
  • When you are modifying the table structure, you can't access the table to do any data modification.
Since TRUNCATE does all of the activities above, that shows TRUNCATE is a DDL operation.
Now we move to the DELETE command. In the case of the DELETE command I am not sure which lock is used, but as you can read above, the DELETE command deletes rows one by one. It modifies the data by deleting it from the table, and because DELETE performs data modifications, the DELETE trigger is called. The DELETE command does not modify the table structure in any manner, unlike TRUNCATE, which modifies the identity column by resetting its value.
To become a DML operation you have to fulfill some of the conditions written below:
  • Modifying the table data.
  • While you are modifying the table data, you can't perform any table structure modification on the table.
Here the DELETE command is modifying the data of the table, and while a DELETE statement is running you can't modify the table structure. So we can say that DELETE is a DML operation.
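If you want to observe the Sch-M lock yourself, one rough way is to query sys.dm_tran_locks while a TRUNCATE is held open inside a transaction. This is just a sketch, reusing the tranTest table from earlier (querying the DMV requires VIEW SERVER STATE permission):
BEGIN TRAN
TRUNCATE TABLE tranTest
-- Locks held by this session; a request_mode of Sch-M should appear for the table
SELECT resource_type, request_mode
FROM sys.dm_tran_locks
WHERE request_session_id = @@SPID
ROLLBACK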

Behavior of Truncate and Delete for Triggers
Triggers are an important topic in SQL Server, and here I am talking about how TRUNCATE and DELETE behave differently with respect to triggers. As you all know, triggers fire whenever a data modification happens in a table. Since we are removing rows here, the trigger of interest is the DELETE trigger (including its INSTEAD OF and AFTER variants); INSERT and UPDATE triggers will not be fired. As you'll see below, however, only DELETE actually fires it.
So let's take both one by one:
  • TRUNCATE - When we run the TRUNCATE command to delete all the rows of a table, it actually doesn't remove any row; rather, it deallocates the data pages. So in the case of the TRUNCATE command, triggers will not be fired, because no data modification takes place. As we know, TRUNCATE is a DDL command, and DDL commands don't modify your data; instead they modify your table structure and definition.
  • DELETE - In the case of DELETE, the DELETE trigger will be fired if present, including any INSTEAD OF and AFTER triggers defined for DELETE. As we know, DELETE is a DML command that deletes data on a row-by-row basis, so it is modifying the data by deleting it from the table, and a trigger fires whenever a DML operation takes place on the table. A short sketch follows.
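Here is a minimal sketch demonstrating the difference, reusing the tranTest table from earlier (the trigger name is mine):
-- A simple AFTER DELETE trigger that just prints a message
CREATE TRIGGER trg_tranTest_delete ON tranTest
AFTER DELETE
AS PRINT 'DELETE trigger fired'
GO
DELETE FROM tranTest    -- prints 'DELETE trigger fired'
TRUNCATE TABLE tranTest -- prints nothing: no trigger fires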
Where we can use these statements
There are some restrictions on the use of both of these statements as follows:
For Delete
  • The DELETE statement may fail if it violates a trigger or tries to remove a row referenced by data in another table with a FOREIGN KEY constraint. If the DELETE removes multiple rows, and any one of the removed rows violates a trigger or constraint, the statement is canceled, an error is returned, and no rows are removed.
For Truncate
You cannot use TRUNCATE TABLE on tables that:
  • Are referenced by a FOREIGN KEY constraint (see the sketch after this list).
  • Participate in an indexed view.
  • Are published using transactional replication or merge replication.
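A quick sketch of the FOREIGN KEY restriction, using two hypothetical tables:
CREATE TABLE dbo.Orders (OrderId int PRIMARY KEY)
CREATE TABLE dbo.OrderLines (
  LineId int PRIMARY KEY,
  OrderId int FOREIGN KEY REFERENCES dbo.Orders(OrderId))
TRUNCATE TABLE dbo.Orders -- fails: Orders is referenced by a FOREIGN KEY constraint
DELETE FROM dbo.Orders    -- allowed (and succeeds here, since OrderLines has no rows)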
Permissions for performing a TRUNCATE or DELETE operation
To use either statement you need certain permissions on the server, which determine whether you can perform the DELETE or TRUNCATE action on the table. To truncate a table, you need at least ALTER permission on the table (granted by default to the table owner, members of the sysadmin fixed server role, and the db_owner and db_ddladmin fixed database roles). To delete rows using the DELETE statement, you only need DELETE permission.
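As a sketch, the minimum grants look like this (SomeUser is a hypothetical database principal):
GRANT DELETE ON dbo.authors TO SomeUser -- enough to run DELETE
GRANT ALTER ON dbo.authors TO SomeUser  -- needed to run TRUNCATE TABLE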

Reference: http://www.sqlservercentral.com/articles/delete/61387/ and SQL Server Books Online.

Tuesday, August 2, 2011

Missing Index

Just as using an index in a book to find a particular bit of information is often much faster than reading all pages, so SQL Server indexes can make finding a particular row in a table dramatically faster by cutting down the number of read operations.
This section first discusses the two types of indexes supported by SQL Server, clustered and non-clustered. It also goes into included columns, a feature of non-clustered indexes. After that, we'll look at when to use each type of index.

Clustered Index

Take the following table:
CREATE TABLE [dbo].[Book](
 [BookId] [int] IDENTITY(1,1) NOT NULL,
 [Title] [nvarchar](50) NULL,
 [Author] [nvarchar](50) NULL,
 [Price] [decimal](4, 2) NULL)
Because this table has no clustered index, it is called a heap table. Its records are unordered, and to get all books with a given title, you have to read all the records, which is just not efficient. It has a very simple structure:
[Diagram: heap table - unordered rows in data pages]
Let's see how long it takes to locate a record in this table. That way, we can compare against the performance of a table with an index. To do that in a meaningful way, first insert a million records into the table (code to do this is in missingindexes.sql). Tell SQL Server to show I/O and timing details of each query we run:
SET STATISTICS IO ON
SET STATISTICS TIME ON
Also, before each query, flush the SQL Server memory cache:
CHECKPOINT
DBCC DROPCLEANBUFFERS
Now run the query below with a million records in the Book table:
SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000
The results on my machine - reads: 9564, CPU time: 109 ms, elapsed time: 808 ms. SQL Server stores all data in 8KB pages; this shows it read 9564 pages - the entire table. Now add a clustered index:
ALTER TABLE Book ADD CONSTRAINT [PK_Book] PRIMARY KEY CLUSTERED ([BookId] ASC)
This puts the index on column BookId, making WHERE and JOIN statements on BookId faster. It sorts the table by BookId, and adds a structure called a B-tree to speed up access:
[Diagram: clustered index - a B-tree on BookId over the sorted data pages]
BookId is now used the same way as a page number in a book. Because the pages in a book are sorted by page number, finding a page by page number is very fast.
Now run the same query again to see the difference:
SELECT Title, Author, Price FROM dbo.Book WHERE BookId = 5000
The results - reads: 2, CPU time: 0 ms, elapsed time: 32 ms. The number of reads of 8KB pages has gone from 9564 to 2, CPU time from 109ms to less than 1 ms, and elapsed time from 808 ms to 32 ms. That's a dramatic improvement.

Non-clustered Index

Now let's select by Title instead of BookId:
SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'
The results - reads: 9146, CPU time: 156 ms, elapsed time: 1653 ms. These results are pretty similar to what we got with the heap table, which is no wonder, seeing that there is no index on Title.
The solution obviously is to put an index on Title. However, because a clustered index involves sorting the table records on the index field, there can be only one clustered index. We've already sorted on BookId, and the table can't be sorted on Title at the same time.
The solution is to create a non-clustered index. This is essentially a duplicate of the table records, this time sorted by Title. To save space, SQL Server leaves out the other columns, such as Author and Price. You can have up to 999 non-clustered indexes on a table in SQL Server 2008 and later (249 in earlier versions).
Because we still want to access those other columns in queries though, we need a way to get from the non-clustered index records to the actual table records. The solution is to add the BookId to the non-clustered records. Because BookId has the clustered index, once we have found a BookId via the non-clustered index, we can use the clustered index to get to the actual table record. This second step is called a key lookup.
[Diagram: non-clustered index on Title, pointing to the clustered index via BookId (key lookup)]
Why go through the clustered index? Why not put the physical address of the table record in the non-clustered index record? The answer is that when you update a table record, it may get bigger, causing SQL Server to move subsequent records to make space. If non-clustered indexes contained physical addresses, they would all have to be updated when this happens. It's a trade-off between slightly slower reads and much slower updates. If there is no clustered index, or if it is not unique, then non-clustered index records do have the physical address.
To see what a non-clustered index will do for us, first create it:
CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC)
Now run the same query again:
SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'
The results - reads: 4, CPU time: 0 ms, elapsed time: 46 ms. The number of reads has gone from 9146 to 4, CPU time from 156 ms to less than 1 ms, and elapsed time from 1653 ms to 46 ms. This means that a non-clustered index is not quite as good as a clustered index, but still dramatically better than no index at all.
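If you want to confirm the key lookup described earlier is really happening, one way is to look at the query plan. A minimal sketch:
SET SHOWPLAN_TEXT ON
GO
SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'
GO
SET SHOWPLAN_TEXT OFF
GO
The text plan should show a seek on IX_Title followed by a lookup into the clustered index (PK_Book).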

Included Columns

You can squeeze a bit more performance out of a non-clustered index by cutting out the key lookup - the second step where SQL Server uses the clustered index to find the actual record.
Have another look at the test query - it simply returns Title and Author. Title is already in the non-clustered index record. If you were to add Author to the non-clustered index record as well, there would no longer be any need for SQL Server to access the table record, enabling it to skip the key lookup. It would look like this:
[Diagram: non-clustered index on Title with Author as an included column - no key lookup needed]
This can be done by including Author in the non-clustered index:
CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC) 
INCLUDE(Author) WITH (DROP_EXISTING = ON)
Now run the query again:
SELECT Title, Author FROM dbo.Book WHERE Title = 'Don Quixote'
The results - reads: 2, CPU time: 0 ms, elapsed time: 26 ms. The number of reads has gone from 4 to 2, and elapsed time from 46 ms to 26 ms. That's almost a 50% improvement. In absolute terms, the gain isn't all that great, but for a query that is executed very frequently this may be worthwhile. Don't overdo this - the bigger you make the non-clustered index records, the fewer fit on an 8KB page, forcing SQL Server to read more pages.
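If you want to double-check which columns ended up as index keys versus included columns, you can query the catalog views. A sketch:
SELECT c.name AS column_name, ic.is_included_column
FROM sys.indexes i
JOIN sys.index_columns ic
  ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns c
  ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.name = 'IX_Title'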

Selecting columns to give an index

Because indexes do create overhead, you want to carefully select the columns to give indexes. Before starting the selection process, keep in mind that:
  • Putting a Primary Key on a column has the effect of giving it a clustered index. So you may already have many columns in your database with an index. As you'll see later in the section "When to use a clustered index", putting the clustered index on the ID column of a record is almost always a good idea.
  • Putting an index on a table column affects all queries that use that table. Don't focus on just one query.
  • Before introducing an index on your live database, test the index in development to make sure it really does improve performance.
Let's look at when and when not to use an index, and when to use a clustered index.
When to use an index
You can follow this decision process when selecting columns to give an index:
  • Start by looking at the most expensive queries. You identified those in section "Pinpointing bottlenecks" subsection "Missing Indexes and Expensive Queries". There you also saw indexing suggestions generated by Database Engine Tuning Advisor.
  • Look at putting an index on at least one column involved in every JOIN. If you join two tables with 500 rows each, this potentially creates a set of 500 * 500 = 250000 rows - so an index on a JOIN can make a big difference.
  • Consider columns used in ORDER BY and GROUP BY clauses. If there is an index on such a column, then SQL Server doesn't have to sort the column again, because the index already keeps the column values in sorted order.
  • Consider columns used in WHERE clauses, especially if the WHERE will select a small number of records.
  • Consider columns that have a UNIQUE constraint. Having an index on the column makes it easier for SQL Server to check whether a new value is unique.
  • The MIN and MAX functions benefit from working on a column with an index. Because the values are sorted, there is no need to go through the entire table to find the minimum or maximum.
  • Think twice before putting an index on a column that takes a lot of space. If you use a non-clustered index, the column values will be duplicated in the index. If you use a clustered index, the column values will be used in all non-clustered indexes. The increased sizes of the index records means fewer fit in each 8KB page, forcing SQL Server to read more pages. The same applies to including columns in non-clustered indexes.
  • A WHERE clause that applies a function to the column value can't use an index on that column, because the output of the function is not in the index. Take for example:
               SELECT Title, Author FROM dbo.Book WHERE LEFT(Title, 3) = 'Don'
               
    Putting an index on the Title column won't make this query any faster.
  • Likewise, SQL Server can't use an index if you use LIKE in a WHERE clause with a wild card at the start of the search string, such as this:
              SELECT Title, Author FROM dbo.Book WHERE Title LIKE '%Quixote'
              
    However, if the search string starts with constant text instead of a wild card, an index can be used:
              SELECT Title, Author FROM dbo.Book WHERE Title LIKE 'Don%'
              
When not to use an index
Having too many indexes can actually hurt performance. Here are the main reasons not to use an index on a column:
  • The column gets updated often.
  • The column has low specificity - meaning it has lots of duplicate values.
Let's look at each reason in turn.
Column Updated Often
When you update a column without an index, SQL Server needs to write one 8KB page to disk - provided there are no page splits. However, if the column has a non-clustered index, or if it is included in a non-clustered index, SQL Server needs to update the index as well - so it has to write at least one additional page to disk. It also has to update the B tree structure used in the index, potentially leading to more page writes.
If you update a column with a clustered index, the non-clustered index records that use the old value need to be updated too, because the clustered index key is used in the non-clustered indexes to navigate to the actual table records. Secondly, remember that the table records themselves are sorted based on the clustered index - if the update causes the sort order of a record to change, that may mean more writes. Finally, the clustered index needs to keep its B-tree up to date.
This doesn't mean you cannot have indexes on columns that get updated - just be aware that indexes slow down updates. Test the effect of any indexes you add.
If an index is critical but rarely used, for example only for overnight report generation, consider dropping the index and recreating it when it is needed.
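A sketch of that drop-and-recreate approach, using the IX_Title index from the earlier examples:
DROP INDEX [IX_Title] ON [dbo].[Book]
-- ... daytime updates now run without the index overhead ...
CREATE NONCLUSTERED INDEX [IX_Title] ON [dbo].[Book]([Title] ASC) INCLUDE(Author)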
Low Specificity
Even if there is an index on a column, the query optimizer won't always use it. Think of the index in this book - great if you are trying to find a word that is used on only a few pages, but not so great if you're trying to find all occurrences of a commonly used word such as "query". You'd be better off going through each page, rather than going back and forth to the index. In this context, it is said that "query" has low specificity.
You can use a simple query to determine the average selectivity of the values in a column. For example, to find the average selectivity of the Price column in the Book table, use:
SELECT 
 COUNT(DISTINCT Price) AS 'Unique prices',
 COUNT(*) AS 'Number of rows',
 CAST((100 * COUNT(DISTINCT Price) / CAST(COUNT(*) AS REAL)) 
 AS nvarchar(10)) + '%' AS 'Selectivity'
FROM Book
If every book has a unique price, selectivity will be 100%. However, if half the books are $20 and the other half $30, there are only two distinct prices among all the rows, so in a large table the average selectivity will be close to 0%. If the selectivity is 85% or less, an index is likely to incur more overhead than it would save.
Some prices may occur a lot more often than other prices. To see the specificity of each individual price, you would run (missingindexes.sql in downloads):
DECLARE @c real
SELECT @c = CAST(COUNT(*) AS real) FROM Book
SELECT 
 Price, 
 COUNT(BookId) AS 'Number of rows',
 CAST((100 * (1 - COUNT(BookId) / @c)) 
 AS nvarchar(20)) + '%' AS 'Selectivity'
FROM Book
GROUP BY Price
ORDER BY COUNT(BookId)
The query optimizer is unlikely to use a non-clustered index for a price whose specificity is below 85%. It figures out the specificity of each price by keeping statistics on the values in the table.
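If you're curious what those statistics look like, you can inspect them directly. A sketch for the IX_Title index from earlier:
DBCC SHOW_STATISTICS ('dbo.Book', 'IX_Title')
This returns the statistics header, a density vector, and the histogram the optimizer consults.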

When to use a clustered index

You saw that there are two types of indexes, clustered and non-clustered. And that you can have only one clustered index. How do you determine the lucky column that will have the clustered index?
To work this out, let's first look at the characteristics of a clustered index against a non-clustered index.
Here is how a clustered index compares to a non-clustered index, characteristic by characteristic:
  • Reading - Faster, because there is no need for key lookups. No difference if all required columns are included in the non-clustered index.
  • Updating - Slower. Not only the table record, but also all non-clustered index records potentially need to be updated.
  • Inserting / Deleting - Faster. With a non-clustered index, inserting a new record in the table means inserting a new record in the non-clustered index as well. With a clustered index, the table is effectively part of the index, so there is no need for the second insert. The same goes for deleting a record. Note that when the record is inserted at the end of the table, there won't be a page split; but when the record is inserted anywhere other than the very end, the insert may cause a page split, where half the content of the 8KB page is moved to another page. A page split in a non-clustered index is less likely, because its records are smaller (they normally don't have all the columns a table record has), so more records fit on a page.
  • Column Size - Needs to be kept short and fast. Every non-clustered index contains the clustered index value, to do the key lookup. Every access via a non-clustered index has to use that value, so you want it to be fast for the server to process. That makes a column of type int a lot better for a clustered index than a column of type nvarchar(50).
If only one column requires an index, this comparison shows that you'll want to always give it the clustered index. If multiple columns need indexes, you'll probably want to put the clustered index on the primary key column:
  • Reading - The primary key tends to be involved in a lot of WHERE and JOIN clauses, making read performance important.
  • Updating - The primary key should never or rarely get updated, because that would mean changing referring foreign keys as well.
  • Inserting / Deleting - Most often you'll make the primary key an IDENTITY column, so each new record is assigned a unique, ever increasing number. This means that if you put the clustered index on the primary key, new records are always added at the end of the table without page splits.
  • Size - Most often the primary key is of type int - which is short and fast.
Indeed, when you set the Primary Key on a column in the SSMS table designer, SSMS gives that column the clustered index unless another column already has the clustered index.

Maintaining Indexes

Do the following to keep your indexes working efficiently:
  • Defragment indexes. Repeated updates cause indexes and tables to become fragmented, decreasing performance. To measure the level of fragmentation and to see how to defragment indexes, refer to parts 2 and 7; a sketch of the basic commands follows this list.
  • Keep statistics updated. SQL Server maintains statistics to figure out whether to use an index for a given query. These statistics are normally kept up to date automatically, but this can be switched off. If you did, make sure statistics are kept up to date.
  • Remove unused indexes. As you saw, indexes speed up read access, but slow down updates. To see how to identify unused indexes, see section "Pinpointing bottlenecks", subsection "Missing Indexes and Expensive Queries".
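For reference, the basic maintenance statements look like this; a sketch using the Book table and IX_Title index from the earlier examples:
ALTER INDEX [IX_Title] ON [dbo].[Book] REORGANIZE -- lightweight defragmentation
ALTER INDEX [IX_Title] ON [dbo].[Book] REBUILD    -- full rebuild, which also refreshes its statistics
UPDATE STATISTICS [dbo].[Book]                    -- manually refresh statistics on the table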

Reference: http://www.sqlservercentral.com/articles/Indexing/74513/