Vault server loses DB connection

This forum is now locked, since Gold Support is no longer offered.

Moderator: SourceGear

Locked
mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Vault server loses DB connection

Post by mlippert » Fri Jun 16, 2006 9:57 am

This just started happening a few days ago. I think last night was the 3rd or 4th time this week.

I don't know of any process that has changed recently. The only other datapoint I found looking at the event log was that the Vault database backup finished at 10:33pm.

The Vault server and SQL server are both running on the same machine.

This has all been running fine for several months.

This is running "Version Check: Your Vault server is version 3.1.6.3658"

Any ideas what might be causing this?

We keep "fixing" it in the morning by rebooting the machine.

Mike
----6/15/2006 10:42:38 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Login
----6/15/2006 10:42:38 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled GetUserOptions returned: Success
----6/15/2006 10:42:39 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled GetRepositories returned: Success
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Getting repository Structure-> Rep ID: 1 Base: 188900 Target: -1
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled TreeManager: cache matches repository revision and folder security hasn't changed. Returning cached tree, revID 188900
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled VaultServiceAPI::GetRepositoryTreeDelta() UserID:23 RepID:1 Base:188900 Target:188900 Calling VaultRepUtil.DiffRepTrees() - in-memory tree diff.
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled GetRepositoryStructure returned: Success
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Getting list of checkout changes.
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled GetCheckOutListChanges returned: Success
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled GetUserEmailOptions returned Success
----6/15/2006 10:42:40 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled TreeManager: cache matches repository revision and folder security hasn't changed. Returning cached tree, revID 188900
----6/15/2006 10:42:47 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled ListWatchers returned Success
----6/15/2006 10:43:02 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Vault Failed to establish a connection to the database.
----6/15/2006 10:43:02 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
at System.Data.SqlClient.SqlConnection.Open()
at VaultServiceSQL.VaultSqlConn.OpenConn()
at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn) at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
at System.Data.SqlClient.SqlConnection.Open()
at VaultServiceSQL.VaultSqlConn.OpenConn()
at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)
----6/15/2006 10:43:02 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled ListCloakedObjects returned: FailDBConn
----6/15/2006 10:43:02 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Logout
----6/15/2006 10:43:17 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled Vault Failed to establish a connection to the database.
----6/15/2006 10:43:17 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled System.InvalidOperationException: Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.
at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
at System.Data.SqlClient.SqlConnection.Open()
at VaultServiceSQL.VaultSqlConn.OpenConn()
at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn) at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
at System.Data.SqlClient.SqlConnection.Open()
at VaultServiceSQL.VaultSqlConn.OpenConn()
at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)
----6/15/2006 10:43:17 PM RelEng--idol.mathsoft.com(10.10.10.145)--SSL Disabled The connection to the database failed for VaultServiceAPI.Logout: FailDBConn

lbauer
Posts: 9736
Joined: Tue Dec 16, 2003 1:25 pm
Location: SourceGear

Post by lbauer » Fri Jun 16, 2006 10:00 am

Similar problem reported here:

http://support.sourcegear.com/viewtopic.php?t=5172

See if any of the suggestions work for you.
Linda Bauer
SourceGear
Technical Support Manager

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Fri Jun 16, 2006 11:55 am

Thanks Linda,

I'll check that out. Meantime I looked at our web.config file and
<add key="ConnectString" value="Application Name='SourceGear Vault Server'; Connection Reset='true'; Server=(local); Database=sgvault; User ID=sgvaultuser; pwd=xxxxxxxx=" />
Note that there is no Max Pool Size set to anything. I assume that means that the default is in affect. I looked at the properties of the IIS|Application Pools|VaultAppPool, but didn't see any attribute like this, so I'm not sure what else to look at.

I don't want to just add Max Pool Size without understanding what the value is that I set. (It seems that each user would want 1, but if someone is running 2 copies of Visual Studio, 1 editor, and the Vault GUI, would that be using 4 from the pool?)

Mike

lbauer
Posts: 9736
Joined: Tue Dec 16, 2003 1:25 pm
Location: SourceGear

Post by lbauer » Fri Jun 16, 2006 3:49 pm

The "pool size" refers to SQL Server connections and does not involve IIS.

However an IIS re-set might help, if the Vault Server has lost contact with the SQL Server machine.

If that doesn't help, I'd suggest adding "Max Pool Size=x" to your web.config in the Connect String line. Example:

<add key="ConnectString" value="Application Name='SourceGear Vault Server'; Connection Reset='true'; Server=(local); Database=sgvault; Max Pool Size=x; Integrated Security='true'" />

"x" could be the total number of connections you expect your team would have; maybe add a few more just in case.

If a user is running multiple client instances, then each would be a connection (not sure about the editor).
Linda Bauer
SourceGear
Technical Support Manager

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Fri Jun 16, 2006 4:03 pm

Thanks Linda, that answers my question.
I'll have to wait and see whether the problem persists, and if it does I'll give the Max Pool Size attribute a shot. I tossed the 1 editor into my example because my editor (Codewright) is configured to connect to Vault (it uses SCC, the same as Visual Studio) so that I can check out files from within the editor.

It means that 4 times the number of users is probably a reasonable estimate for me to use, which in my case would be 220 for our 55 users.

Mike

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Fri Jun 30, 2006 3:40 pm

Just as a followup for anyone who runs into the same problem.
I added the Max Pool Size attribute with a value of 220 (see my previous post for how I chose that value) over a week ago and the problem hasn't manifested itself since :).

Thanks Linda,
Mike

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Thu Jul 06, 2006 11:24 am

I may have spoken too soon, although things are still way better than they were. We had another failure this past Sunday 7/2.

And for some reason the Vault server log for 7/2 was almost completely empty. We still have server debug logging turned on.

The last entry in the previous log file was:

Code: Select all

----7/1/2006 11:59:58 PM     SGherbi--picasso.mathsoft.com(10.210.112.69)--SSL Disabled	GetCheckOutListChanges returned: Success 
while the 7/2 log file contained only:

Code: Select all

# SourceGear Vault Server
----7/2/2006 11:59:50 PM     releng--idol.mathsoft.com(10.210.112.145)--SSL Disabled	System.InvalidOperationException: Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.
   at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
   at System.Data.SqlClient.SqlConnection.Open()
   at VaultServiceSQL.VaultSqlConn.OpenConn()
   at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)    at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
   at System.Data.SqlClient.SqlConnection.Open()
   at VaultServiceSQL.VaultSqlConn.OpenConn()
   at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)
And then it started a new (and the current) log file
which starts:

Code: Select all

# SourceGear Vault Server
----7/2/2006 11:59:59 PM     SGherbi--picasso.mathsoft.com(10.210.112.69)--SSL Disabled	Getting repository Structure-> Rep ID: 5 Base: 0 Target: -1 
----7/3/2006 12:00:01 AM     ECasey--bigpapi.mathsoft.com(10.210.112.109)--SSL Disabled	Vault Failed to establish a connection to the database. 
----7/3/2006 12:00:01 AM     ECasey--bigpapi.mathsoft.com(10.210.112.109)--SSL Disabled	System.InvalidOperationException: Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.
   at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
   at System.Data.SqlClient.SqlConnection.Open()
   at VaultServiceSQL.VaultSqlConn.OpenConn()
   at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)    at System.Data.SqlClient.SqlConnectionPoolManager.GetPooledConnection(SqlConnectionString options, Boolean& isInTransaction)
   at System.Data.SqlClient.SqlConnection.Open()
   at VaultServiceSQL.VaultSqlConn.OpenConn()
   at VaultServiceAPILib.VaultServiceAPISystem.GetOpenSqlConn(HttpApplicationState has, VaultLoginInfo vli, VaultSqlConn& conn)
I don't know if this log file behavior signifies anything, but I thought I'd mention it.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Thu Jul 06, 2006 12:50 pm

Do you have any kind of automated procedures also accessing the Vault server? Is it possible there are too many logins to the Vault server using all available DB connections in the pool?
Jeff Clausius
SourceGear

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Thu Jul 06, 2006 1:21 pm

Jeff,
The only automated procedures that I know of is our nightly build (which I'm pretty sure doesn't run on the weekend nights). Also this past weekend would have had almost no one actually logging in to Vault (although many people leave their machine running along w/ the Vault GUI and/or Visual Studio).

There is one other person I can think of who may have written some process that automatically accesses Vault, he's away this week, I'll check with him next week.

Do you know of any way I can track the allocation and release of DB connections? Could/does the Vault server keep track of this in its log in debug mode?

As I said above, I set the DB pool to 220, which should be way more than we'd ever need at 1 time.

Thanks,
Mike

ps I know that 3.5 is supposed to come out soon, and I'll try to get our server upgraded when it does, I'm not bothering to spend the time to upgrade from 3.1.6 to 3.1.9, because I'm expecting 3.5.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Thu Jul 06, 2006 3:07 pm

No, but we can take a look at the Server's log. You'll need to place the Server in Debug Log mode for this information. The resulting server log should give us an indication of how many requests are coming in.

Also, looking at SQL Server's activity information will help determine who has an active connection to SQL Server. If you are using MSDE or prefer text based info the stored procedure sp_who2 within a query tool should help determine what is going on within the database.
Jeff Clausius
SourceGear

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Thu Jul 06, 2006 4:26 pm

Jeff,
Our Server has been in Debug Log mode for months now. That's what I was trying to say above, also remarking on the lack of log activity for the day the problem was reported (7/2).

As a basepoint I just ran the sp_who2 stored procedure and got a result of 28 rows (and I ran it again a couple of minutes later and got 31 rows).

So I can zip the 060625sgvault.log file (it's 200MB), I've posted the 060702sgvault.log contents above, and I can zip the sgvault.log (70MB) and ftp both of those log files to you if you'd like.

I think the 1st 94000 lines in sgvault.log are "Vault Failed to establish a connection to the database" & related messages.

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Fri Jul 07, 2006 8:01 am

Mike:

I don't think the logs would be helpful, unless you could cross-reference the info against an sp_who2.

For right now, let's keep an eye on sp_who2. If the list grows, we will need to keep a log of when that was run, so we can then use that info with the sgvault.log file to see how many actual connections are opened to the database.
Jeff Clausius
SourceGear

mlippert
Posts: 252
Joined: Wed Oct 06, 2004 10:49 am
Location: Cambridge, MA

Post by mlippert » Fri Jul 07, 2006 9:25 am

Sounds reasonable Jeff.
Could you tell me what the info that sp_who2 is producing means (just an overview would be fine, ie look at a value of X in such and such a column which means Y, and maybe what each row returned represents, ie when is a row created and when does it go away).

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Fri Jul 07, 2006 9:52 am

From SQL Server Books online, sp_who and sp_who2 provides information about current users and processes in an instance of the Microsoft SQL Server Database Engine.

What we're interested in is the login, program name, and dbname columns. What I'm specifically looking for is to see if the number of connections to the sgvault when your log file hits the connection limit problem. From that list, we'll need to take a look at the login to determine who is actually accessing the database.
Jeff Clausius
SourceGear

Locked