Thursday, March 22, 2012

Book recommendation on performance

I have the Transact-SQL Programming book from O'Reilly. It was
published in 1999. It states that "SELECT ... INTO" statements end up
locking the entire database of the target table. Since the tempdb is
also involved (in many cases), this creates major deadlocks for the
entire database and all users. It suggests using the "INSERT ...
SELECT" form instead.

Considering that the book is somehow dated, is this recommendation
still valid, especially on target sizes of up to 5 million records?php newbie (newtophp2000@.yahoo.com) writes:
> I have the Transact-SQL Programming book from O'Reilly. It was
> published in 1999. It states that "SELECT ... INTO" statements end up
> locking the entire database of the target table. Since the tempdb is
> also involved (in many cases), this creates major deadlocks for the
> entire database and all users. It suggests using the "INSERT ...
> SELECT" form instead.
> Considering that the book is somehow dated, is this recommendation
> still valid, especially on target sizes of up to 5 million records?

To a large extent, no. The author seems to have had SQL 6.5 in mind, where
SELECT INTO a temptable, indeed to bring a server to a stand still. The
problem is that SELECT INTO creates the temp table, and then goes on to
fill it with data. Since the query is one transaction, it keeps a lock on
the system tables until the query has completed. This was fatal in SQL 6.5
which only had page locks. In SQL7 and SQL2000 where you have row locks,
the impact on other processes is much smaller.

The advantage of SELECT INTO is that is that is minimally logged, so
SELECT INTO #tmp for five million rows can be faster and take less
toll on the server than CREATE TABLE INSERT INTO.

Nevertheless, there are contexts where SELECT INTO is an inferior choice
over CREATE TABLE not talking about a table variable. Say that you already
have a transaction in progress, and you creating many small tables
repeatedly within this transaction. SELECT INTO takes out more locks on
than CREATE TABLE, so you acquire a whole lot more locks with SELECT
INTO, and this can have an impact on performance. Case in point: I had
a procedure which suddently started to perform much slower than before.
This procedure performs some complex data updating in an iterative
fashion. My profiling pointed to a seeminginly innocent query which
appeared to take longer and longer time as the procedure proceeded.
Eventually I found the answer in a trigger (which was not affected by
this query). I had replaced direct deferences to "inserted" with temp
table created through "SELECT * INTO #inserted FROM inserted".

To summarize: SELECT INTO is fine for single-time queries on large
tables. It is bad to have in triggers in tables which maninly are
updated one row at a time.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

No comments:

Post a Comment