Intro
Since it’s Halloween this week I thought I would cover a true SQL Server horror story in this post, zombie spids. What are zombie spids? Well they are basically spids that continue to run and however you try to kill them whilst SQL Server is running they will not die. To avoid any confusion when I say spids I mean sessions in SQL Server.
First encounter
My first encountered was years ago during a migration to SQL Server 2008. I encountered an issue running a command across a linked server. It had a wait type of sos_scheduler_yield and no matter what I tried I could not kill of the spid.
We tried numerous times to kill it in various ways whilst SQL Server was running. We even had Microsoft involved in the investigation. In the end we ended up restarting the SQL Service. It turned out to be a bug whilst running a query to older SQL Servers through Linked Servers so Microsoft kindly gave us back the support time we had used.
Sequels
Now all these years later you would think zombie spids would be a thing of the past. But like the sequels of the masked killer in the horror movies they keep coming back. Even recently I’ve seen one occur on a SQL Server 2016 server whilst trying to cancel a CheckDB. Before anybody panics that there’s an issue with CheckDB in SQL Server 2016 I will stress that this was on a read only database terabytes in size.
Killing a zombie spid
As I mentioned above the only way to kill a zombie spid is to either restart the SQL Service on the server or restart the entire server instead. Make sure you are absolutely certain it is a zombie spid before you do so though.
One giveaway sign is that it’s been running for a lot longer than it was before you tried to kill it. Another sign is that the wait type is not one you would be expecting and it never changes like sos_scheduler_yield or execsync.
You have to be aware of some downtime if it’s a physical server you are restarting. In this case make sure you have a secondary server on which you can failover to before you restart. If not and it’s a production server I seriously recommend you get a secondary server.
Final word
Well I hope you’ve all enjoyed my Halloween special. Feel free to comment if needed.
[…] Which I discussed in my last Halloween themed post last year, and you can read in detail here. […]
[…] also helps reduce the number of potential SQL Server issues you have to deal with. Like the zombie spid one I posted about […]
I have had the same experience on SQL Server 2016 with an ODBC Linked Server to a Rocket UniVerse database (file based multivalued database on AIX). Happened a few years ago when rebooting the machine was the only thing that worked.
I’m here on this page today because it just happened again. (Zombie was the word that got me here with Google.)
It is triggered by query to too much data. A file had grown enormously since last time I ran the query a few years ago when it was working.