Urgent: SQL Server PID Blocking

This forum is now locked, since Gold Support is no longer offered.

Moderator: SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Urgent: SQL Server PID Blocking

Post by asills » Mon Jul 11, 2005 9:39 am

So this started some time over the weekend... People came in this morning and all of a sudden the Vault server can't retrieve the repository structure from the server.

Anyway, I've basically narrowed it down:

I clean the server (restart SQL, restart IIS, etc) and connect and login and get a normal set of entries in the log file (GetUserOptions, GetuserList, "Getting repository Structure", etc).

Then I attempt to check out and get a successful log message:

----7/11/2005 10:18:16 AM asills--s03cms01.globeranger.com.pvt(127.0.0.1)--SSL Disabled Checking out Files
----7/11/2005 10:18:16 AM asills--s03cms01.globeranger.com.pvt(127.0.0.1)--SSL Disabled CheckOutFiles returned:Success

Then naturally it attempts to refresh its view and I get this message:

----7/11/2005 10:18:16 AM asills--s03cms01.globeranger.com.pvt(127.0.0.1)--SSL Disabled Getting list of checkout changes.

Then the entire server stops responding. Going through the active process IDs in SQL I have on PID blocking another PID. The blocking PID's last command was "dbo.spdeletecheckoutlists;1" and it has tblcheckoutlistitems, tblcheckoutlists, tblcheckoutlocks, tempdb all actively locked. The blocked PID's last command was "dbo.spgetlockedfilechanges;1".

The blocked PID is actively in a SLEEPING state as is the blocking PID, which most definitely would be a deadlock (the blocked PID is doing an INSERT and the blocking PID is doing a delete).

Going through SQL Profiler I really don't see anything spectacular. I see the logon, I see the checkout, I see the repository refresh then it stops.

Vault is effectively useless at the moment on our server, do you have any suggestions???
Attachments
sqlprofiler.xls
(29 KiB) Downloaded 808 times

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 9:40 am

By "the entire server stops responding" I really meant "users can no longer get latest, etc as they get blocked whenever Vault tries to refresh the repository info.

Also, I turned folder security off to see if that would have any affect (and it didn't, but it's still off).

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 9:46 am

If this helps any:
Windows 2003
Vault 3.0.6

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 9:47 am

Hm, looking at Vault 3.0.7, I see this:

Server Only
· Allow better concurrency on refresh and checkout commands

Maybe I should try 3.0.7 first?

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 10:12 am

You may want to check your statistics / indices for tbltreerevisionfolders and tblcheckoutlistitems. Blocking does not equal deadlock, but it will prevent users from selecting on a table which may be locked. After updating statistics or defragging or rebuilding the indices, you should try again.

I would also recommend upgrading to Vault 3.0.7 or Vault 3.1 Beta 3 server. There is a race condition in previous versions of Vault server which can cause problems.

As for deadlocks altogether, Vault 3.1 Beta 3 would be your best bet. We've gone through some extensive testing to remove deadlocks from this version of the server.
Jeff Clausius
SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 10:17 am

The server has a nightly maintenance schedule that updates the vault database. The last log file I received for it (last night at 1am) reported everything in tip top shape.

So you suggest a beta version? What will that do to the database? What if the beta version is worse? Can I revert back to an earlier version? Will every client have to be updated to the beta version as well?

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 10:35 am

asills wrote:The server has a nightly maintenance schedule that updates the vault database. The last log file I received for it (last night at 1am) reported everything in tip top shape.
Do you know what command was run?
asills wrote:So you suggest a beta version?
Yes. We've been using it for months.
asills wrote:What will that do to the database? What if the beta version is worse? Can I revert back to an earlier version?
As with every DB schema change, it is a one way street. There is no way to get back. As for being "worse" than Vault 3.0.7 , its not possible. Vault 3.1 Server started with the Vault 3.0.7 code base, and had deadlock / concurrency / memory issues addressed.

However, if you're not comfortable with Vault 3.1, then at least get up to Vault 3.0.7 so you do not encounter the race condition problem. However, as I mentioned, we are supporting Vault 3.1 Beta 3, so people should not be discouraged from its use. The biggest thing to be aware of right now is the schema change. On repositories with large folder counts, upgrading may take some time. Most people I've talked to, ran a test upgrade first.
asills wrote:Will every client have to be updated to the beta version as well?
No.
Jeff Clausius
SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 10:41 am

jclausius wrote:Do you know what command was run?
Yes, it's your suggested script for defragging, reindexing and updating statistics in your KB. DBCC REINDEX, UPDATE STATISTICS, DBCC INDEXDEFRAG.
jclausius wrote:However, if you're not comfortable with Vault 3.1, then at least get up to Vault 3.0.7 so you do not encounter the race condition problem. However, as I mentioned, we are supporting Vault 3.1 Beta 3, so people should not be discouraged from its use. The biggest thing to be aware of right now is the schema change. On repositories with large folder counts, upgrading may take some time. Most people I've talked to, ran a test upgrade first.
Okay, 3.0.7 didn't help at all. If nothing else, I can upgrade to the beta and if it doesn't help, I can simply restore last night's full backup since nobody's been able to do anything on it all day today anyway.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 10:51 am

You're still blocked?

Try this - Kill the blocking process OR you can kill ALL Vault connections by running spkillusers 'sgvault' from your master database.
Jeff Clausius
SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 10:53 am

Yes I can manually kill the processes which I've done already (had to to get the Vault server to respond to even a login), but the exact same set of repro (checkin/checkout/undo checkout) causes the blockage again (every time and on any single file or folder).

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 10:58 am

I'd like to suggest running some maintenance commands by hand.

Let's see what this does:

UPDATE STATISTICS sgvault.dbo.tblcheckoutlistitems
UPDATE STATISTICS sgvault.dbo.tbltreerevisionfolders

Next:

SELECT cli.* FROM sgvault.dbo.tblcheckoutlistitems cli
INNER JOIN sgvault.dbo.tblrepositories r ON r.currentcolistid = cli.colistid
ORDER BY r.repid

Does this second query finish?
Jeff Clausius
SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 11:00 am

Ran the update statistics, only took a few seconds (19 actually). The second command finished in <1 second.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 11:09 am

OK.

Run this query:
SELECT repid, name FROM sgvault.dbo.tblrepositories,


Now, using the repid from above (I used 2) in my sample:

Code: Select all

DECLARE @repid [int], @colistid [bigint];

SET @repid = 2;

SELECT @colistid = currentcolistid FROM sgvault.dbo.tblrepositories WHERE repid = @repid;
exec sgvault.dbo.spgetlockedfilechanges @repid, @basecolistid = 32492, @targetcolistid = @colistid, @refreshlist = 1
Does this return?


What about this query:
EXEC sgvault.dbo.spdeletecheckoutlists @repid = 2, @daysold = 10
Jeff Clausius
SourceGear

asills
Posts: 95
Joined: Tue Jan 25, 2005 5:05 pm

Post by asills » Mon Jul 11, 2005 11:14 am

jclausius wrote:OK.

Run this query:
SELECT repid, name FROM sgvault.dbo.tblrepositories,


Now, using the repid from above (I used 2) in my sample:

Code: Select all

DECLARE @repid [int], @colistid [bigint];

SET @repid = 2;

SELECT @colistid = currentcolistid FROM sgvault.dbo.tblrepositories WHERE repid = @repid;
exec sgvault.dbo.spgetlockedfilechanges @repid, @basecolistid = 32492, @targetcolistid = @colistid, @refreshlist = 1
Does this return?
Yes, it took 28 seconds (repid 2 is my main repository we're having problems with).
jclausius wrote: What about this query:
EXEC sgvault.dbo.spdeletecheckoutlists @repid = 2, @daysold = 10
Finished <1 second.

I re-ran the above spgetlockedfilechanges query after the spdeletecheckoutlists query and it only took 2 seconds this time.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jul 11, 2005 11:19 am

Let's narrow things down a bit.

Try:

EXEC sgvault.dbo.spdeletecheckoutlists @repid = 2, @daysold = 5

Then re-try the get lock changes from above.
Jeff Clausius
SourceGear

Locked