So my last post focused on the modifications my shop has made to our implementation of Scrum without a lot of details about how we manage our code changes. This post is intended to explain how we set up source control to support the Scrum process.
Source control is one of those poorly-defined practices; it’s something we know we should be doing as developers, but it’s not always done by DBAs. In fact, I would guess that many shops that have DBAs doing most of their reporting and data integration tasks don’t practice source control methods at all; if they do, it’s typically in the form of scripting out the entire database or relying on differential backups. Separating the data structure from the data itself is often difficult to do, so I am sure that many teams don’t do it at all.
We’re currently using Visual Studio Team System with Team Foundation Server as our source control repository; it’s been a steep learning curve for those of us who came from a Visual SourceSafe background, and applying that knowledge to the unfamiliar context of a new UI (VSTS:DB Pro) and concept (source control for databases) has been more than challenging, particularly since we’ve adopted a new development method (Scrum) as well. It’s taken a lot of discussion (sometimes quite heated) to get where we are today, especially since there’s not a lot of “best practices” discussion for source controlling SQL Server out there.
The biggest challenge has been the concept of branching and merging; TFS recognizes that you may have multiple development lines going on, especially among multiple developers. When do you split code out to work on it? When do you put it back together? How does this affect deployment of new features, vs. the release of patches to fix the currently deployed line of code?
We decided to handle our database source control as follows:
- We set up the initial databases in source control in our Main branch. The Main branch is supposed to represent the heart of our code; it’s the final resting place of changes before they go to QA.
- From Main, we branch off a line of code for each PBI (Product Backlog Item; see my first post) that we’re working on. If a PBI spans multiple databases, we include a copy of each database under that PBI. Our cookie trail in TFS looks something like this:
  - Main…Database1
  - Main…Database2
  - Development…PBI 1…Database1
  - Development…PBI 1…Database2
  - Development…PBI 2…Database1
- Once a PBI is ready to be deployed, we merge the databases for that PBI back to their original sources in Main; if another developer has already merged in changes that conflict with ours, those conflicts have to be resolved before we allow the final merge (this means that if Dev1 working on PBI 1 has changes that affect Dev2 working on PBI 2, the conflict is discovered when the second of the two merges back).
- We also have a line of code that represents our Production deployment; at the beginning of each Sprint, we branch from the Main line into Production and then do a schema comparison against the production database to ensure that the Production line actually resembles Production. In case of a needed patch, we patch the Production line, deploy it, and then merge those changes back into the Main line. If there are conflicts, we have to find them and resolve them.
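To make the "discovered at the second merge" behavior concrete, here is a toy sketch in Python of the branch-per-PBI flow described above. It is not TFS or its API; the `Branch` class, the `merge` function, and the object names (`dbo.Orders`, `dbo.GetOrders`, `dbo.Customers`) are all hypothetical, and each database is reduced to a dictionary of object name to script text. It only illustrates the idea of a three-way merge against the snapshot taken at branch time.

```python
from dataclasses import dataclass, field


class MergeConflict(Exception):
    """Raised when source and target changed the same object differently."""


@dataclass
class Branch:
    # Hypothetical stand-in for a source-controlled database project.
    name: str
    objects: dict = field(default_factory=dict)  # object name -> script text
    base: dict = field(default_factory=dict)     # snapshot taken at branch time

    def branch(self, name):
        # The child starts as a copy of the parent and remembers that snapshot.
        return Branch(name, dict(self.objects), dict(self.objects))


def merge(source, target):
    """Three-way merge of source into target, using source.base as the ancestor."""
    merged = dict(target.objects)
    conflicts = []
    for obj in set(source.objects) | set(source.base):
        base = source.base.get(obj)
        theirs = source.objects.get(obj)
        ours = target.objects.get(obj)
        if theirs == base:
            continue  # the source branch never touched this object
        if ours == base or ours == theirs:
            # Only the source changed it (or both made the identical change).
            if theirs is None:
                merged.pop(obj, None)
            else:
                merged[obj] = theirs
        else:
            conflicts.append(obj)  # both sides changed it, in different ways
    if conflicts:
        raise MergeConflict(sorted(conflicts))
    target.objects = merged


# Main holds the databases; each PBI gets its own branch.
main = Branch("Main", {"dbo.Orders": "table v1", "dbo.GetOrders": "proc v1"})
pbi1 = main.branch("PBI 1")
pbi2 = main.branch("PBI 2")

pbi1.objects["dbo.GetOrders"] = "proc v2 (Dev1)"
pbi2.objects["dbo.GetOrders"] = "proc v2 (Dev2)"
pbi2.objects["dbo.Customers"] = "table v1"

merge(pbi1, main)  # first merge goes through cleanly
try:
    merge(pbi2, main)  # second merge surfaces the overlap
except MergeConflict as exc:
    print("Resolve before merging:", exc.args[0])
```

In this sketch the first PBI merges without complaint, and the clash on the shared proc only appears when the second PBI comes back, which mirrors what we see in practice.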
There are some drawbacks; merging a patch into Main doesn’t always go smoothly, since we have to track down who is responsible for what changes. Documentation in the procs helps, but we need to do a better job (especially when those changes involve objects that are not easily commented, like tables). Furthermore, when it comes time to deploy from the Main branch, if QA decides that a feature isn’t ripe yet, then we have to do some legwork to back it out. All in all, however, it works, despite the bumps along the way.
I’d be interested in hearing how others are doing source control for their database scripts; please feel free to comment below.