Repository Size Considerations?

mover
Posts: 8
Joined: Wed Dec 17, 2008 3:36 pm

Repository Size Considerations?

Post by mover » Wed Dec 17, 2008 3:54 pm

We are preparing to import a large amount of source code from VSS into Vault and want to take the opportunity to reorganize our source libraries. Can anyone tell me if there are any best practices or recommendations for how to organize the repository(s)? I've found lots of good information regarding the process of importing.

One specific concern that I have is whether the size of a repository has any impact on performance. We have over 20GB of VSS history and thousands of projects in about 15 separate repositories - the largest repository is over 10GB. We plan to do a lot of cleanup and archive older & no longer used projects but I'm still interested in any information about the relative pros & cons of one large repository vs. a number of smaller ones. It seems to me that security / permissions can be more granular with multiple repositories but that limits your ability to share code as broadly. What am I missing?

Thanks!

Mike

Beth
Posts: 8550
Joined: Wed Jun 21, 2006 8:24 pm
Location: SourceGear

Re: Repository Size Considerations?

Post by Beth » Thu Dec 18, 2008 4:09 pm

mover wrote: "We are preparing to import a large amount of source code from VSS into Vault and want to take the opportunity to reorganize our source libraries. Can anyone tell me if there are any best practices or recommendations for how to organize the repository(s)? I've found lots of good information regarding the process of importing."
Every company is a little different in their organization. There is no set rule for this. I have a few suggestions that might help.
    1) Use multiple repositories.
    2) Information cannot be shared across repositories, so organize the code by which parts are shared and which parts or projects are entirely separate from everything else.
    3) Balance: 20 repositories isn’t bad at all. 120 repositories will mean that the repository list will be slow.
    4) All best practices articles are posted in our KB articles area.
    5) Don’t duplicate information.
    6) Don’t place multiple solution files together in one folder.
    7) Don’t have paths that exceed 250 characters. To avoid that, either minimize name lengths or limit subfolders.
We can look more closely at your situation if you want to post some specific examples or you can always email support with details.
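For the path-length item in the list above, one way to spot trouble before an import is to scan a local copy of the source tree. This is a minimal Python sketch, not a Vault tool: the `$/` prefix mapping, the function names, and the assumption that the 250-character limit is measured on the repository-style path are all illustrative, so adjust them to how your installation actually counts.

```python
import os

MAX_PATH_CHARS = 250  # limit quoted in the list above


def repo_path(root, full_path):
    """Map a local path under root to a hypothetical repository-style path ($/a/b.c)."""
    rel = os.path.relpath(full_path, root).replace(os.sep, "/")
    return "$/" + rel


def find_long_paths(root, limit=MAX_PATH_CHARS):
    """Return (length, path) pairs, longest first, for files whose path exceeds limit."""
    offenders = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = repo_path(root, os.path.join(dirpath, name))
            if len(path) > limit:
                offenders.append((len(path), path))
    return sorted(offenders, reverse=True)
```

Calling `find_long_paths("C:/work/legacy")` (a made-up folder) returns an empty list when nothing is over the limit. Otherwise the longest paths come first, which shows whether deep nesting or long names are the cause.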
mover wrote: "One specific concern that I have is whether the size of a repository has any impact on performance."
The number of nodes in a repository can affect performance. Every folder and every file constitutes a node. Dividing what you have into several repositories improves performance, but you don’t want to make divisions between parts of code that need some sort of link. If a division would mean adding a duplicate copy of the code to other repositories, then that might not be the best place for it. If there are good break points in security, then separate repositories can make for easier security configuration.
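Since every folder and every file counts as a node, a rough node count per top-level folder of a local copy can hint at where a repository split would shrink things most. The sketch below is plain Python over the file system, not anything from Vault itself. The function names are made up, and counting the root folder as a node is an assumption.

```python
import os


def count_nodes(root):
    """Count folders and files under root; root itself is counted as one node (assumption)."""
    total = 1
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def nodes_by_top_level_folder(root):
    """Return {entry name: node count} for each immediate child of root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        counts[entry] = count_nodes(path) if os.path.isdir(path) else 1
    return counts
```

Top-level folders with very large counts that share no code with their neighbours are the natural candidates for their own repository.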

We have some recommendations that will help you get the best performance you can from Vault:
Recommendations for optimal Vault performance
Maintenance: The Vault Server database


When you say “share code as broadly,” do you mean sharing among users or among projects?
Beth Kieler
SourceGear Technical Support

mover
Posts: 8
Joined: Wed Dec 17, 2008 3:36 pm

Re: Repository Size Considerations?

Post by mover » Fri Dec 19, 2008 10:54 am

Thanks for all the food for thought! I'm working on digesting and incorporating it into our plan.

My comment about sharing code broadly had to do with sharing code between projects - something that we don't do much of currently but there is some interest in it. Even with a larger number of repositories we will be able to share functionality by compiling it into a common class library and using it within projects stored in different repositories which is the way that we'll ultimately want to go (IMHO).

A couple of clarification questions:
1) Regarding your statement about long paths (over 250 characters) - you're talking about the folder structure within a single repository, correct? For example in RepositoryA the path to a source file would be:
$/folderName/subfolderName/../fileName.ext
so keeping the path short involves both number of nested folders and length of folder / file names.

2) You mentioned a performance impact to enabling Folder Security - is this tied entirely to whether this feature is enabled, or is it also influenced by the extent to which folder-level security is used? Suppose I had two otherwise identical repositories, both with Folder Security ON, but RepositoryB has security set on only one folder in the tree while RepositoryC has security set on 500 folders. Will performance be different between them? Does the number of security groups also play into this?

Thanks again!

Beth
Posts: 8550
Joined: Wed Jun 21, 2006 8:24 pm
Location: SourceGear

Re: Repository Size Considerations?

Post by Beth » Fri Dec 19, 2008 11:59 am

Here is a KB article you should look at too: Best practices for setting up repositories
mover wrote: "Even with a larger number of repositories we will be able to share functionality by compiling it into a common class library and using it within projects stored in different repositories which is the way that we'll ultimately want to go (IMHO)."
You could keep the projects that use that library together in one repository, or you could copy the library into each repository. If the library is small, I think you'd get better performance with the second option.
mover wrote: "so keeping the path short involves both number of nested folders and length of folder / file names"
Correct. I've only ever run into one person who hit a path size limit.
mover wrote: "You mentioned a performance impact to enabling Folder Security"
Just enabling it means Vault has to look up the security details. I'm unsure whether the amount of configuration makes a difference. I'll see if I can find information on that.
Beth Kieler
SourceGear Technical Support

mover
Posts: 8
Joined: Wed Dec 17, 2008 3:36 pm

Re: Repository Size Considerations?

Post by mover » Fri Dec 19, 2008 2:45 pm

Actually, my comment about creating a common class library was referring to a DLL / API that can be used without accessing the code.

How significant is the performance impact of Folder Security?

If we hire a consultant to work on code that's in a repository that does not have Folder Security enabled, can we turn it on, grant the consultant access to specific folders, and turn Folder Security back off once they have completed their work without causing long-term problems? In other words, is performance completely restored to its prior level when Folder Security is turned off?

Beth
Posts: 8550
Joined: Wed Jun 21, 2006 8:24 pm
Location: SourceGear

Re: Repository Size Considerations?

Post by Beth » Fri Dec 19, 2008 3:02 pm

mover wrote: "Actually, my comment about creating a common class library was referring to a DLL / API that can be used without accessing the code."
I don't think I understand what you mean.

With folder security, the more folder security settings you have, the more things will slow down. That should only be the case while it's turned on, though. When it's turned off, there's no reason for Vault to access what was configured there.
Beth Kieler
SourceGear Technical Support
