This morning, we discovered that our application was timing out on several
 pages. We tracked it down to a table that couldn't be read without timing
 out. In Enterprise Manager, we found a table (TAB) lock on the table that
 was of mode IS that was blocking other spids. The text property of the lock
 showed this lock to be on a simple reporting stored procedure, which just
 did a SELECT on a couple of tables that should have only taken a second or
 two. We tried to debug the problem for several minutes, but to no avail.
 Finally, we killed the lock and the database problem was immediately solved.
 A look at SQL Profiler (which we run continuously) showed that a query that
 matched the text property of the lock and had the same spid as the lock had
 been started last Tuesday and ended at roughly the same time as we killed
 the lock. The index name listed with the lock in Enterprise Manager also
 was strange, since the index listed was not used by the stored procedure
 listed in the properties.
 We have had several similar problems with our database in the past, but this
 is the first time we didn't resort to just a reboot. Why would a simple
 stored procedure executing a select cause such problems? Why would this
 procedure be allowed to run for a week? Why would we experience no problems
 until days after (the table being locked was core to almost every page in
 the system and was fine until this morning)? Can the index listed in the
 lock information be used to debug the problem?
 An even bigger question is how to handle such a problem after it occurs?
 Killing the spid seems to have caused some problems in the .NET
 application and forced me to restart the app. Is there a more graceful
 method of rolling back the offending process?

The spid you found did not actually hold a lock on the whole table. It held an
intent shared (IS) lock, which indicates that the process holds, or intends to
take, shared locks lower down (on pages or rows within the table). An IS lock
on a table conflicts only with a request for an exclusive (X) lock on that
table. What version of SQL Server are you running, and which service pack? If
you are on SQL Server 2000 with SP3, ::fn_get_sql can be useful in the future
to show the SQL that a particular spid is currently executing.
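
As a rough sketch of how ::fn_get_sql is typically used on SQL Server 2000
SP3 or later: read the sql_handle for the spid from sysprocesses and pass it
to the function. The spid value 52 is a placeholder, not a value from this
thread.

```sql
-- Assumes SQL Server 2000 SP3 or later, where sysprocesses exposes sql_handle.
-- 52 is a placeholder spid; substitute the spid reported as the blocker.
DECLARE @handle binary(20)

SELECT @handle = sql_handle
FROM master.dbo.sysprocesses
WHERE spid = 52

-- Returns the text of the batch or procedure the spid is executing.
SELECT [text]
FROM ::fn_get_sql(@handle)
```

Run this while the spid is still active; once the process has finished or
been killed, there is no handle left to look up.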
I would also switch on trace flag 1204 on this server (startup parameter
-T1204) so that the full details of any deadlock are captured in the SQL
Server error log. Note that 1204 reports deadlocks; it does not log ordinary
blocking.
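
A minimal sketch of enabling the flag without restarting the service,
assuming you have sysadmin rights. Trace flag 3605 is an addition not
mentioned above; it directs trace output to the error log.

```sql
-- Enable deadlock reporting (1204) for all connections (-1) and
-- send the output to the SQL Server error log (3605).
DBCC TRACEON (1204, 3605, -1)

-- List the trace flags that are currently enabled globally.
DBCC TRACESTATUS (-1)
```

Flags set with DBCC TRACEON are lost when the service restarts, which is why
the -T1204 startup parameter is the more durable choice.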
I would also look at your current setting for the query governor cost limit
and possibly reduce it to a value that is in line with your longest-running
query (LRQ). If you have Profiler running regularly, you should be able to
determine what your longest-running legitimate query is and set the query
governor cost limit accordingly, which helps prevent such problems. Bear in
mind that the limit is compared with the optimizer's estimated cost, so it
stops an expensive query from starting rather than cancelling one that is
already running.
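
One way to set the limit, sketched under the assumption that 300 is a
suitable ceiling. That number is purely illustrative; derive your own from
the Profiler data.

```sql
-- 'query governor cost limit' is an advanced option, so expose those first.
EXEC sp_configure 'show advanced options', 1
RECONFIGURE

-- 300 is a placeholder value. 0 (the default) means no limit.
EXEC sp_configure 'query governor cost limit', 300
RECONFIGURE

-- Alternatively, apply a limit to the current connection only.
SET QUERY_GOVERNOR_COST_LIMIT 300
```

The connection-level SET is a lower-risk way to trial a value on the
reporting connection before changing the server-wide setting.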
Re: stopping the spid. The only option is to kill the spid (or to find the
application that started the process and close that application).
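
For reference, a short sketch of the KILL syntax, again using 52 as a
placeholder spid. KILL rolls back whatever transaction the spid has open,
which can take a while for long-running work.

```sql
-- 52 is a placeholder; use the spid identified as the blocker.
KILL 52

-- Report the progress of a rollback that is already under way,
-- without issuing another kill.
KILL 52 WITH STATUSONLY
```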
The next time it happens, it will be useful to run SELECT * FROM
master.dbo.sysprocesses (NOLOCK) to identify the waittype, waitresource and
waittime for the query in question, as a means to establish whether there
are other hidden issues within your system.
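
A narrower version of that query, as a sketch: it lists only the spids that
are blocked and the spids blocking them. The column choice is a suggestion,
not a fixed recipe.

```sql
-- Show blocked spids together with the spids that are blocking them.
SELECT spid, blocked, waittype, waittime, lastwaittype, waitresource,
       open_tran, status, cmd, hostname, program_name, loginame, last_batch
FROM master.dbo.sysprocesses (NOLOCK)
WHERE blocked <> 0
   OR spid IN (SELECT blocked
               FROM master.dbo.sysprocesses (NOLOCK)
               WHERE blocked <> 0)
ORDER BY blocked, spid
```

A row with blocked = 0 in this output is the head of a blocking chain. If it
shows open_tran > 0 with an old last_batch, one possible explanation is a
connection that was left with a transaction open.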
HTH
Olu Adedeji
"Stephen Brown" <nospam@.telusplanet.net> wrote in message
news:cin2ht$rbf$1@.utornnr1pp.grouptelecom.net...