We're in the process of moving from VSS to Vault. The documentation helpfully tells me that I should partition my archives into repositories, where a repository contains a set of potentially related projects. It doesn't really tell me why I should do that.
I've seen notes in the knowledgebase and/or docs that indicate that sharing can't span repositories, and that there's no way to move a project from one repository to another.
Question: What benefits do I get from using multiple repositories? Is it an issue of performance, or of convenience in scoping the data? At the moment, it seems that when in doubt I should leave things in the same repository, since it's unclear what benefit I'd get from splitting them, and the hard partition between repositories has real costs.
FWIW - We're using Vault for a team of 20 or so developers doing internal support for our company on a variety of projects. There are some core shared infrastructure components, and a number of standalone deliverables that may or may not use elements of the shared architecture.
Repository setup guidance
Moderator: SourceGear
The main reason to split up projects into different repositories is performance. The size of the tree can bog the client down if it gets into hundreds of thousands of nodes. With smaller repositories, the client also scans fewer working folders per repository, which can be noticeable on large trees. Server performance is better as well when multiple users are accessing smaller trees.
There is a quick overview of this at: http://support.sourcegear.com/viewtopic.php?t=49 which probably needs to be moved to the knowledge base.
Can you give me a ballpark idea of how many files you need to have in a repository before the performance becomes an issue?
There are no hard and fast numbers to give you, unfortunately, since it all depends on your tree, working model, and hardware.
Some factors include:
- Client machine speed and disk speed affect how efficiently the client processes large trees. I think if you have 100K nodes or more (files and folders), you'd want to be careful about how many of those folders have actual working folders set, and you'd probably want to have "Require Checkout before Checkin" turned on (which it is by default), so the client doesn't have to scan working folders as often to determine what has changed.
- If you have lots of users constantly checking in or modifying the same repository, you probably want to make sure the repository has only files that are related. I don't like giving an actual number of users here, since it really depends on how many users are actually making modifications at the same time. This probably only becomes an issue around 50 users, but again, it depends on your work environment. This isn't to say that performance will be terrible, but that you will always get better performance for concurrent users if those users are using different repositories.
Sorry for the weasel-y answers, but it is true that everyone will experience slightly different results based on their own setup, and what works in your environment is the only thing that will matter to you. It is also true that breaking up separate projects into different repositories will always help performance.
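To see roughly where a tree stands against the 100K-node figure mentioned above, a quick count of files and folders can help. This is only a rough sketch: it counts a local copy of the tree on disk, not the repository itself, so the number is an approximation of what the client has to process. The function name count_nodes is made up for illustration and is not part of Vault.

```python
import os

def count_nodes(root):
    """Return (folders, files) found under root, not counting root itself."""
    folders = 0
    files = 0
    # os.walk visits every directory below root; symlinked folders are
    # counted as folders but not descended into (the os.walk default).
    for _dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)
        files += len(filenames)
    return folders, files
```

Calling count_nodes on the top-level folder of a local copy and adding the two numbers gives the node count to compare against the rough threshold. Treat the result as a ballpark only, since the threshold itself depends on hardware and working model.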
Thought I'd throw in my recent experience...
We just went through the VSS to Vault conversion a few weeks ago, and decided NOT to partition the projects into different repositories.
Our project tree has about 30 nodes, with LOTS of sub-folders. There are approx 10,000 files (>500MB) now managed by our single repository. The Vault server is running on a single 3 GHz processor with 2GB of memory, and mirrored 10K SATA drives (WD Raptors).
I just connected through a VPN tunnel from home (via cable modem), and the client was up in seven seconds. My home machine has a 2.2 GHz processor. Local activation is comparable to starting a local VSS client.
We might create new repositories for future projects, but we didn't want to change our workflow procedures at the same time we were trying to get everyone moved onto Vault.
Don