Retrieving repository structure .... 9MB

If you are having a problem using Vault, post a message here.

Moderator: SourceGear

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Retrieving repository structure .... 9MB

Post by Thomas Linder Puls » Mon Aug 16, 2004 2:54 pm

My vault client requests around 9MB data during the "Retrieving repository structure ...".

This is an awful lot of data just to connect.

One of our directories contains >14,000 files (master results for our test suite), organized in numerous sub-directories. I have this directory "cloaked", and I only have a fraction of the files on my own computer.

Does Vault have performance problems with that many files?

Will it help if I have the files on my computer?

What can I do to speed up the connecting?
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

jeremy_sg
Posts: 1821
Joined: Thu Dec 18, 2003 11:39 am
Location: Sourcegear
Contact:

Post by jeremy_sg » Mon Aug 16, 2004 9:03 pm

There are a few different types of things that the client fetches from the server. Some are stored on the client side in the cache files (http://support.sourcegear.com/viewtopic ... ight=cache), and some are fetched fresh every time for the first refresh.

The big one that is stored in the cache files is the tree structure. This is probably being sent as a relatively small delta that is applied against the client cache file. Sometimes the tree needs to be requested in full. To test whether your setup is downloading the full tree every time, delete your repository cache file. If you are loading the full tree every time, there will be no difference in the amount of data that is sent. If you normally see tree deltas, then this time you will see a drastically larger download.
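The delta-or-full-tree behaviour described above could be sketched roughly as follows. This is a hypothetical illustration, not Vault's actual code: the file name `CacheMember_Repository` comes from this thread, but the cache format, the `get_tree_delta`/`get_full_tree` calls, and the delta representation are all invented for the example.

```python
import os
import pickle

CACHE_FILE = "CacheMember_Repository"  # name from the thread; format assumed

def apply_delta(tree, delta):
    """A delta is sketched as a dict of path -> entry (None = deleted)."""
    merged = dict(tree)
    for path, entry in delta.items():
        if entry is None:
            merged.pop(path, None)
        else:
            merged[path] = entry
    return merged

def get_repository_tree(server):
    """Prefer a small delta against the cached tree; fall back to the full tree."""
    tree = None
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "rb") as f:
            cached_tree, revision = pickle.load(f)
        delta = server.get_tree_delta(since=revision)  # hypothetical API
        if delta is not None:
            tree = apply_delta(cached_tree, delta)
    if tree is None:
        tree = server.get_full_tree()  # no cache, or delta unavailable
    with open(CACHE_FILE, "wb") as f:
        pickle.dump((tree, server.current_revision()), f)
    return tree
```

Deleting the cache file forces the `get_full_tree` branch, which is exactly the test suggested above: if the download size doesn't change, you were getting the full tree all along.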

Another possibility is that you have large numbers of shared items or checked out items. These both are communicated to the client at login time.

Does that help at all?

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Tue Aug 17, 2004 3:56 pm

My cache directory was 196MB. By far the most of it was in a subfolder _sgvault, which is not mentioned in the post about the cache. I think this is quite a lot of data; are you sure that you clean up properly?

I deleted everything and started the Vault GUI client. Result: 9MB was transferred, and the cache directory grew to 1.5MB.

I started another Vault GUI client. Result: an additional 9MB was transferred; the cache directory stayed at 1.5MB.

I closed both clients and started a third one. Result: once more, 9MB was transferred, and the cache directory stayed at 1.5MB.

So I guess that the tree structure is transferred each time (and that it is transferred with a significant overhead).

Obviously this did not help, because 9MB is still transferred each time.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

jeremy_sg
Posts: 1821
Joined: Thu Dec 18, 2003 11:39 am
Location: Sourcegear
Contact:

Post by jeremy_sg » Tue Aug 17, 2004 8:06 pm

The main size that I'm interested in is the CacheMember_Repository file. The _sgvault folder holds the baseline files for all of your working folders. For a little bit of info on the _sgvault directory, see http://support.sourcegear.com/viewtopic.php?t=562

How are you arriving at your 9MB figure? Do you have lots of shared files or lots of checked-out files?

If bandwidth is an issue, and you suspect that the server is sending a full tree delta for every login, you can set the TreeDeltaCompressionThreshhold to a lower value (under the size of your full tree). See http://support.sourcegear.com/viewtopic.php?t=1450 for more on this option.
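As a rough mental model of a size-gated compression option like the one mentioned above (the real semantics and default value of TreeDeltaCompressionThreshhold may differ; the 500 KB figure below is an arbitrary placeholder, not Vault's default):

```python
import zlib

# Assumed behaviour: payloads above the threshold are compressed before sending.
# 500 KB is an invented placeholder value for illustration only.
TREE_DELTA_COMPRESSION_THRESHOLD = 500_000  # bytes

def prepare_payload(data: bytes) -> tuple[bytes, bool]:
    """Return the wire payload and whether it was compressed."""
    if len(data) > TREE_DELTA_COMPRESSION_THRESHOLD:
        return zlib.compress(data), True
    return data, False
```

Lowering the threshold below the size of your full tree would, under this model, ensure the tree is always compressed before it goes over the wire.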

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Wed Aug 18, 2004 6:30 am

Bandwidth is an issue (well actually it is my waiting time that is the issue).

The figure 9MB is found on my VPN status dialog, by simple subtraction of the before and after numbers (I would have uploaded a picture, but I got an error message: "Upload Error: Could not upload Attachment to ./files/vpnstatus.png").

The CacheMember_Repository file is 1.5MB (the other files are nearly zero).

We do not have a lot of shared files and there are nearly no files that are checked out.

I cannot see how TreeDeltaCompressionThreshhold would have any effect on my problem.

To me the problem seems to be that the complete tree is sent rather than a delta. Since my data is very large, it should be compressed (given the default setting of TreeDeltaCompressionThreshhold), so the 9MB would be the compressed full tree.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

jeremy_sg
Posts: 1821
Joined: Thu Dec 18, 2003 11:39 am
Location: Sourcegear
Contact:

Post by jeremy_sg » Wed Aug 18, 2004 7:23 am

All right, I need even more info.

1. I'm assuming that you're connecting with the GUI client. If you're connecting with the IDE, make sure that the option to get the latest version on startup isn't on. If you are using the IDE, try it with the GUI client.

2. I just used Ethereal (http://www.ethereal.com) to watch as I connected to get a full tree for our 33,000 node tree, and the entire connection dialog was only 1M (with SSL on, even), and the CacheMember_Repository file afterwards was only 3M. Something is going on, but it is likely that the tree size is a red herring. Here's all the stuff that we request from the server with the repository structure.

RepositoryStructure (First ask for a delta based on the cache, if that fails, request the full tree)
CheckOutList
CloakedObjectsList (only if there was a change to the tree, such as an object being moved, renamed, or deleted).

Perhaps if you use a network monitor like Ethereal, you can determine which call is generating so much bandwidth usage.
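The three connection-time requests listed above, with their conditions, amount to something like the following sketch. All method names are hypothetical; only the request names and their ordering come from the post:

```python
def login_refresh(server, cache):
    """Connection-time requests: tree (delta or full), checkouts, cloaked list."""
    # 1. RepositoryStructure: ask for a delta based on the cache;
    #    if that fails, request the full tree.
    delta = server.get_repository_structure_delta(cache.revision)
    if delta is not None:
        tree, tree_changed = cache.apply(delta), bool(delta)
    else:
        tree, tree_changed = server.get_full_repository_structure(), True
    # 2. CheckOutList: fetched at every login.
    checkouts = server.get_checkout_list()
    # 3. CloakedObjectsList: only if the tree changed
    #    (an object was moved, renamed, or deleted).
    cloaked = server.get_cloaked_objects_list() if tree_changed else None
    return tree, checkouts, cloaked
```

A packet capture would let you attribute the observed 9MB to one of these three calls.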

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Wed Aug 18, 2004 8:30 am

Well, it turned out that one of my colleagues did actually have a lot of files checked out.

During the period when he had all these files checked out, all of us experienced that Vault was very slow. Connecting was only one of the places where I noticed problems; in fact, all operations seemed extremely slow.

Currently, we are only 5 people on our Vault database, but we have plans to grow to 50. It would be very bad if performance suffered for all 50 people whenever the number of checked-out files is high. 50 people will check out many files. In particular, you sometimes need to make a global update to your source files, and therefore you recursively check out a complete directory.

I do not think we can tolerate bad performance just because another person has many irrelevant files checked out.

It seems that this sets a limit on the number of people who can work with a single repository. With many people, you would have to split into several repositories, but then you cannot share code as easily.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

ericsink
Posts: 346
Joined: Mon Dec 15, 2003 1:52 pm
Location: SourceGear
Contact:

Post by ericsink » Wed Aug 18, 2004 8:56 am

Thomas Linder Puls wrote: Well, it turned out that one of my colleagues did actually have a lot of files checked out.
How many is "a lot" ?
Eric Sink
Software Craftsman
SourceGear

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Wed Aug 18, 2004 2:00 pm

As mentioned, our test suite has 14,000 tests, and this directory contains the "expected results": 14,000 files. They needed a "global" update, due to some globally affecting change. At some point in time I believe all 14,000 files were checked out.

I can give a better answer tomorrow: yes, all 14,000 files were checked out.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Fri Sep 17, 2004 7:06 pm

Is there any news on this for the next release?

We are planning to use Vault for 50 people, and then there will without doubt be many files checked out from time to time. It would be a killer if all 50 people had long delays in such situations.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Mon Jan 17, 2005 3:15 am

We have just experienced long delays again, because somebody needed to check out 10,000+ files.

Do you have any idea when this problem will be solved?
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jan 17, 2005 8:19 am

Thomas:

This particular case has been somewhat improved in Vault 3.0.

The Vault 3.0.x client now caches the checkout list, so at least on startup the server won't send the entire locked list each time. Note that you will still see a large download if a client has not yet received the updated checkout list; it is a one-time hit until the information has been retrieved.
Jeff Clausius
SourceGear

Thomas Linder Puls
Posts: 153
Joined: Tue Jan 20, 2004 2:28 am
Location: PDC, Copenhagen Denmark
Contact:

Post by Thomas Linder Puls » Mon Jan 17, 2005 11:50 am

We are using Vault 3.0, so we feel the problem has not been solved.

I am not sure that the problem is only the downloading of the information.

I also believe that a lot of operations are much slower because of lookups in the cached information. Maybe the cache is not as efficient as it should be, or maybe you simply look it up too many times.
Thomas Linder Puls
Visual Prolog www.visual-prolog.com

jclausius
Posts: 3706
Joined: Tue Dec 16, 2003 1:17 pm
Location: SourceGear
Contact:

Post by jclausius » Mon Jan 17, 2005 1:17 pm

As mentioned, Vault 3.0 has improved for startup, but the problem of transmitting the *new* 10,000 file list has not gone away.

We have discussed ways of tackling this problem (sending the newly added locks in one big chunk), but a change of that nature will require a large redesign of both client and server.

As for the caching, nothing has changed between Vault 2 and Vault 3; the caching was the same in both versions. What did change is that Vault 3 now saves the cached list, rather than requesting a NEW list on every Vault startup (hence the 9MB download on each startup of a Vault client).
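The Vault 2 versus Vault 3 startup difference described above could be sketched like this (hypothetical class and method names, not Vault's code):

```python
class Vault2Client:
    """Sketch: requests a fresh checkout list on every startup."""
    def startup(self, server):
        self.checkouts = server.fetch_checkout_list()  # full download each time
        return self.checkouts

class Vault3Client:
    """Sketch: persists the list and pays the full download only once."""
    def __init__(self, saved_checkouts=None):
        # In the real client this would be loaded from disk between sessions.
        self.checkouts = saved_checkouts
    def startup(self, server):
        if self.checkouts is None:
            self.checkouts = server.fetch_checkout_list()  # one-time hit
        return self.checkouts
```

In this model, a client that has never seen the current checkout list still downloads it in full once, which matches the "one-time hit" described above.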



With that said, once the list has been transferred, the client machine should respond relatively quickly if the checkout list has not changed. In other words, upon startup and loading of the repository, subsequent refreshes (when the server has had no activity) should be relatively fast, depending on the type of connection between the Vault client and server. Are you not seeing this type of behavior?

Also, do you run your repository with folder security enabled? Does temporarily disabling folder security speed things up during the same operations?
Jeff Clausius
SourceGear
