May 2009

SQL Saturday #7 wrap-up

Finally back in my hotel room; I’d forgotten how exhausting presenting actually is these days.  John Baldwin and the rest of the Steel City SQL user group did an awesome job pulling this together.  These guys are a lot of fun to be around, and they really went out of their way to make sure that speakers felt welcome.

Some highlights:

  • The facility was awesome.  The event was held in a new building on the Jefferson State Community College campus; every room was equipped with a projector and audio equipment, and the seating was comfortable.
  • There were a few familiar faces, but I also had the chance to meet other Southeastern SQL geeks.  I’m hoping that we can build on this relationship, and encourage new growth for all of our user groups.
  • I particularly enjoyed Andy Warren’s session on statistics; even though I don’t do a lot of performance tuning anymore, it was useful for me as a developer to understand what can cause bottlenecks on our servers.
  • My own session (Confessions of a Data Integrator) went well; I need to trim it a bit and sharpen the focus, but I got a lot of positive feedback from attendees.  I also think I can revamp it for SQL Saturday #17 (in Baton Rouge).

The only snafu I noticed was that the schedule had some odd time slots; some sessions were 60 minutes long and others were 75.  On at least two occasions, a 75-minute session was cut short because the room volunteer misread the session length.  Overall, it was a great day, and I’m looking forward to future events.

Announcing AtlantaMDF Reviews

Just a quick post to tout a new sister site that I’ve started (with the help of several other AtlantaMDF members): AtlantaMDF Reviews.  Our goal is to provide a forum for book and tool reviews of interest to our local PASS user group.

The site is brand new, with only two brief reviews as of this date.  However, we’ve got several people working behind the scenes reading and reviewing new material, so we should begin posting several articles soon (BTW, if you’re an AtlantaMDF attendee and would like to contribute, please find me and let me know).

Anyway, if you’re a Google Reader user, you can add the feed to your collection there; if not, the FeedBurner URL is below.  Keep your eyes open for new material soon:


http://feeds2.feedburner.com/AtlantaMDFReviews

SCRUM, Source Control, and the SQL Server Developer (Part 2)

So my last post focused on the modifications my shop has made to our implementation of Scrum, but didn’t go into much detail about how we manage our code changes.  This post explains how we set up source control to support the Scrum process.

Source control is one of those poorly-defined practices; it’s something we know we should be doing as developers, but it’s not always done by DBAs.  In fact, I would guess that many shops where DBAs do most of the reporting and data integration tasks don’t practice source control at all; if they do, it’s typically in the form of scripting out the entire database or relying on differential backups.  Separating the data structure from the data itself is often difficult, so I’m sure many teams simply skip it.

We’re currently using Visual Studio Team System with Team Foundation Server as our source control repository; it’s been a steep learning curve for those of us who came from a Visual SourceSafe background, and applying that knowledge to an unfamiliar UI (VSTS:DB Pro) and concept (source control for databases) has been more than challenging, particularly since we’ve adopted a new development method (Scrum) as well.  It’s taken a lot of discussion (sometimes quite heated) to get where we are today, especially since there’s not much “best practices” guidance out there for source-controlling SQL Server.

The biggest challenge has been the concept of branching and merging; TFS recognizes that you may have multiple lines of development going on, especially among multiple developers.  When do you split code out to work on it?  When do you put it back together?  How does this affect deployment of new features versus the release of patches to fix the currently deployed line of code?

For us, we decided to structure our database source control as follows:

  1. We set up the initial databases in source control in our Main branch.  The Main branch is supposed to represent the heart of our code; it’s the final resting place of changes before they go to QA. 
  2. From Main, we branch off a line of code for each PBI (Product Backlog Item; see my first post) that we’re working on.  If a PBI spans multiple databases, we include a copy of each database under that PBI.  Our cookie trail in TFS looks something like this:
    • Main…Database1
    • Main…Database2
    • Development…PBI 1…Database1
    • Development…PBI 1…Database2
    • Development…PBI 2…Database1
  3. Once a PBI is ready to be deployed, we merge the databases for that PBI back to their original sources; if another developer has already merged in changes that conflict, those conflicts have to be resolved before we allow the final merge (this means that if Dev1, working on PBI 1, has changes that would affect Dev2, working on PBI 2, those changes are discovered at the second merge).
  4. We also have a line of code that represents Production deployment; at the beginning of each Sprint, we branch from the Main line into Production and then do a schema comparison against production to ensure that the Production line actually resembles Production.  In case of a needed patch, we patch the Production line, deploy it, and then merge those changes back into the Main line, resolving any conflicts we find.  (A rough command-line sketch of this branch-and-merge flow follows this list.)
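For those more comfortable at a command line than in Source Control Explorer, the flow above can be sketched with tf.exe; the server paths here are illustrative, not our actual layout:

    tf branch $/Main/Database1 $/Development/PBI1/Database1
    tf checkin /comment:"Branch Database1 for PBI 1"

    rem ... develop against the PBI branch, then fold it back in ...
    tf merge $/Development/PBI1/Database1 $/Main/Database1 /recursive
    tf checkin /comment:"Merge PBI 1 changes back to Main"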

There are some drawbacks; merging a patch into Main doesn’t always go smoothly, since we have to track down who is responsible for which changes.  Documentation in the procs helps, but we need to do a better job (especially when those changes involve objects that are not easily commented, like tables).  Furthermore, when it comes time to deploy from the Main branch, if QA decides that a feature isn’t ripe yet, we have to do some legwork to back it out.  All in all, however, it works, despite the bumps along the way.

I’d be interested in hearing how others are doing source control for their database scripts; please feel free to comment below.

SCRUM, Source Control and the SQL Server Developer (Part 1)

(If you’re just visiting me from SQLServerPedia, thanks for stopping by!)

I recently responded to TJay Belt’s post about his Release Management process, and thought it would be better to fully explain our development cycle here.   Although I come from a DBA background, I currently work on the development side of the house in our enterprise.  About 6 months ago, we made a decision to adopt Scrum as our development process, and for the most part, it has greatly simplified our release cycle.  I’m planning on this being a two-part post; this part will focus mostly on what Scrum is and how we use it in our shop to release database changes, and part two will focus on how we use source control within the Scrum methodology.

You can read the Wikipedia article about Scrum, which sums things up far better than I can; what I want to contribute to the discussion are some guidelines for implementing the Scrum process.  There are two basic goals to Scrum: first, you want to increase visibility into the development process so that customers (the product users) know what to expect and when to expect it; second, you want to organize your deliverables around measurable increments (sprints have daily scrums, which identify progress and impediments).  It’s important to understand these two goals in order to understand why we implemented the rules we did.  Please note that these are our rules as of this date; they may change, and your shop is free to adapt them as you need.

  1. The Sprint will last 4-5 weeks.  Most Scrum write-ups talk about sprints lasting a fixed number of days, but we broke it down into weeks because of the following constraints:
    • We have certain processes that begin at the beginning of the month, and some that begin at the end of the month.  We wanted to be sure that our Release Date always fell in the middle of the month, and
    • We wanted to have a consistent day of release.  We’re a 24x7x365 shop; for our business, most installation is best done in the middle of the week (so we can quickly identify any adverse effects).
  2. We wanted to minimize the number of releases.  Most issues can wait 4-5 weeks for complete development, even though the Product Users obviously want them NOW!!!  To accomplish this, we had to deal with the issues of units of work and emergency patching:
    • If an issue is an emergency, and cannot wait until the end of the Sprint for release, we will release a patch on Wednesdays.  In our model, patches are discouraged, but if they have to happen, they at least happen on the same day of the week as the regular sprint release.  This ensures that everyone affected knows when patches may be implemented (and when to expect downtime).
    • In the Scrum model, a Product Backlog Item (PBI) is a prioritized unit of work; a Sprint Backlog Item (SBI) is a unit of work that is accomplished during the sprint.  We enforce a hierarchy in our shop: PBIs are composed of SBIs.  We do not release partial PBIs; all of the child tasks (SBIs) for a given Product Backlog Item MUST be completed before we release the code into production.  If some subset of SBIs is required for a given release, we may split the PBI into two in order to accomplish this goal.

As a database developer, most of my work is at the core of many of the changes; there are few PBIs that do not involve some sort of change to the database.  What we’ve noticed is that I’m often in demand at the beginning of the sprint, but my time consistently frees up toward the end; to help shift the burden a bit, we often stub out a dataset for the application (i.e., the app developers will dummy up a local dataset as opposed to waiting on a stored proc).  This is useful for two reasons: it delays their need for my services until later in the sprint, and it also defines what they want the output of a stored proc to look like (a rough sketch of the idea follows).
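To make the contract-first idea concrete, here’s a minimal sketch of what such a stub might look like if it lived on the database side; the proc name and columns are hypothetical, and in practice our app developers usually fake the same shape in a local dataset instead:

    -- Hypothetical stub: hard-coded rows matching the agreed output
    -- contract; the real query replaces this body later in the sprint.
    CREATE PROCEDURE dbo.GetClientDevices
        @ClientID INT
    AS
    BEGIN
        SET NOCOUNT ON;
        SELECT 999 AS ClientID, N'999 Sample Server' AS DeviceName
        UNION ALL
        SELECT 999 AS ClientID, N'999 Sample Workstation' AS DeviceName;
    END;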

In the second part of this article, I’ll discuss our source control arrangement and how it maps to the sprint.

SQLSaturday #7: Birmingham, AL May 30, 2009

I’ve been so busy with all of my other projects at work that I’ve forgotten to really tout this.  On May 30, I’ll be presenting at the Birmingham SQLSaturday (hosted by the Steel City SQL User Group).  My talk is titled “Confessions of a Data Integrator”, and will focus on dealing with other people’s database designs.

Over the last twelve years, I’ve built a career on pulling data from a variety of data sources, most of which were poorly designed (at least according to the basic rules of normalization).  From that experience, I’ve pulled together a set of principles for extracting data, which I plan to share.  This is a bit of a challenge for me, because I’ve never actually written any of these rules of mine down 🙂

Anyway, if you can make it, please do; stop by and see me.

SQL Server & XML: links of interest

I’ve been working on a project recently that requires using XML in SQL Server, so I’ve been searching the web for references. Books Online is unusually opaque on this subject (the samples are sub-par for such complex material), so I’ve been trying to find good examples of how it all works. Here are some of the better ones:

Passing lists to SQL Server 2005 with XML Parameters Jon Galloway gives a short and sweet example of shredding an XML variable into rows.
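If you just want the flavor of the technique without clicking through, here’s a minimal sketch of the pattern (the element names are mine, not Jon’s):

    DECLARE @list XML;
    SET @list = N'<ids><id>1</id><id>2</id><id>3</id></ids>';

    -- nodes() turns each <id> element into a row; value() extracts it.
    SELECT x.id.value('.', 'INT') AS ItemID
    FROM   @list.nodes('/ids/id') AS x(id);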

XQuery Inside SQL Server 2005 A longer set of examples using CROSS APPLY to associate XML data in a column to a value in a standard column. This is probably the most important query I needed, because we have a lot of XML values (config files) that are stored in a table; those config files are specific to particular clients.
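Here’s a hedged sketch of that pattern, with hypothetical table and element names standing in for our config storage:

    -- Assume each ClientConfig row holds one client's XML config file
    -- in an XML column named ConfigXml.
    SELECT  c.ClientID,
            s.setting.value('@name',  'VARCHAR(50)')  AS SettingName,
            s.setting.value('@value', 'VARCHAR(100)') AS SettingValue
    FROM    dbo.ClientConfig AS c
    CROSS APPLY c.ConfigXml.nodes('/config/setting') AS s(setting);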

SQL Server 2008 Books Online Here are the basic data elements; again, the vocabulary is dense, but you at least need to know where it is.

Now, on to FLWOR (which I’m just beginning to study): Blooming FLWOR – An Introduction to the XQuery FLWOR Expression
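As a teaser, here’s about the simplest FLWOR expression that will run in SQL Server; the for/where/return clauses supply most of the acronym’s letters, and the XML is made up:

    DECLARE @x XML;
    SET @x = N'<servers><server name="Alpha"/><server name="Beta"/></servers>';

    -- for/where/return: filter the sequence, return the matching nodes.
    SELECT @x.query('for $s in /servers/server
                     where $s/@name = "Alpha"
                     return $s');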

Death by a thousand cuts…

This has been an awful week; things have just not gone as planned from the get-go.  I’m gonna run through several issues in this post, and perhaps someone will find some value in them; lessons learned may help others avoid the pain of my ways.


VSTS:DB Pro

This week, I’ve spent way too much time just trying to figure out how to use Visual Studio; I’ve mentioned in previous posts that I’m a database developer, but I’ve spent most of my career working with the standard SQL Server DBA tools: Query Analyzer and Management Studio.  When we migrated to TFS, my manager encouraged me to adopt VSTS:DB Pro as a method of developing, and I’ve put some real effort into learning how the system works.  I’m slowly becoming accustomed to it, but there are still some very quirky things that I’ve yet to figure out.

Issue 1: Logins.  Whenever I import a database into a database project, I get a long list of errors because we have Windows accounts associated with users in the database; since those users don’t exist on my laptop, the project throws all kinds of ugly errors.  Googling the errors reveals very little information; the only method I’ve found to eliminate them is to drop the associated users and schemas, which, of course, can lead to VERY BAD THINGS in deployment scripts.  I’m sure there is a method to resolve it; I just haven’t found it yet (a query for at least spotting the offending accounts is below).
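For what it’s worth, here’s a quick way to enumerate the offending accounts: find database users whose SIDs have no matching server login (run this in the imported database):

    -- Windows users (U) and groups (G) in this database that have no
    -- corresponding login on the local instance:
    SELECT dp.name AS OrphanedUser
    FROM   sys.database_principals AS dp
    LEFT JOIN sys.server_principals AS sp
           ON dp.sid = sp.sid
    WHERE  dp.type IN ('U', 'G')
      AND  sp.sid IS NULL;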

Issue 2: The Silent Source Control Failure.  Most of my projects involve references to vendor databases; we have a policy of not modifying vendor databases, so when I need to write a stored procedure that accesses data from a third-party source, I create a linked database that sits on the server beside the vendor database, and include my proc there.  I don’t touch their code, but am able to customize my queries to pull data.  When setting up projects in VSTS:DB Pro, I usually make my custom database the primary project, and include a database reference to a project based on the vendor db.  This usually works out OK, but yesterday I hit a case where the project wizard would complete, yet the project was unable to load.  Nothing I did would fix it; I finally removed the vendor db from the solution, and was able to load my project (albeit with tons of errors because of the missing references).

The problem? One of the stored procs in the vendor db has an extremely long name (nearly 100 characters); combined with the path to the workspace, the name of the file exceeds Windows’ 260-character MAX_PATH limit.  TFS apparently doesn’t know how to tell me that the file name is too long, so the project just refuses to open.  Try wasting two hours of precious development time tracking that down, and you can imagine how frustrating that is.


Haste makes waste.

A few months ago, we made a decision to adopt a version of Scrum to standardize our development processes; we’ve made up some of our rules along the way, and for the most part, we’ve done OK with it.  However, the pressure to show progress and meet the needs of the organization means that sometimes I’ve made the wrong decision in my desire to please our Product Users.  One such case came up today.  Two of our rules are that we a) strive to deploy only once a month (at the end of a sprint), and b) if we need to deploy an emergency patch, we deploy only on one day of the week (Wednesdays).

One of our users found an issue, and a production DBA researched it and recommended a relatively simple code change; I was out of the office Monday, so I only saw the recommendation on Tuesday.  The issue was important and the code looked good, so I thought I could get away with rolling it out today; unfortunately, I didn’t do due diligence on the code, and neglected to check for dependencies on that particular proc (a check as simple as the sketch below).  We deployed my change today at 8:30 AM; by 3:30 PM, I was rolling it back.  Luckily, the only side effect was that we generated a few extra tickets for our clients, but still, it was a painful experience.
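The check itself is cheap; a couple of minutes with something like the following would have flagged the dependent objects (the proc name is hypothetical):

    -- The old standby; note it can miss deferred-name references:
    EXEC sp_depends N'dbo.usp_PatchedProc';

    -- Cruder, but often more reliable: search module definitions directly.
    SELECT OBJECT_NAME(m.object_id) AS ReferencingObject
    FROM   sys.sql_modules AS m
    WHERE  m.definition LIKE N'%usp_PatchedProc%';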

The one thing I want to carry away from this experience is that as our system becomes increasingly complex, it becomes less likely that a simple change can safely be made in less than a day. The code may be simple, but the context of the code, in terms of surrounding processes, SHOULD warrant time for thorough review.  I need to quit reacting to the perceived urgency of an issue, and think about long-term stability instead.


Even the best deployments can have issues.

Overall, I think our deployment process works well, but even a good process can have points of failure.  Even though we have several sets of eyes examining all of our code before it goes into production, things can still slip by.  For example, at 3:45 this afternoon, I was informed that a stored procedure in production was not returning all of the data; the procedure pulls data from a third-party database, and makes certain assumptions about the nature of that data.  In this case, we have a standard naming syntax for all of the devices we manage: ClientID, followed by a space, then the device name (e.g., 999 Joe’s Server).  The stored proc had a WHERE clause that ended with LIKE ‘[0-9][0-9][0-9] %’; the missing row of data had a device name with the space omitted (a sketch of the problem follows).
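Here’s the shape of the problem, with made-up table and column names; the original pattern insists on the space, so the misnamed device falls out of the result set:

    -- Matches '999 Joe''s Server' but NOT '999Joe''s Server':
    SELECT DeviceName
    FROM   dbo.ManagedDevices
    WHERE  DeviceName LIKE '[0-9][0-9][0-9] %';

    -- Treating the space as optional catches the typo, at the cost of
    -- also matching longer digit prefixes:
    SELECT DeviceName
    FROM   dbo.ManagedDevices
    WHERE  DeviceName LIKE '[0-9][0-9][0-9]%';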

Now, it wasn’t entirely my fault, because I had written the code to spec; however, I should have anticipated that the naming convention could be slightly off and still be parseable (if they had a 2-digit client id, it would be a different story).  In the end, our company looked bad to a client because of a typo, and that’s not good.  It’s easy to fix the code, but hard to regain the trust of the client.


Tomorrow WILL be a better day.

I’m not much for mumbo-jumbo, but I do believe that life is what you make of it.  If I go back to work tomorrow and assume that it’s going to be a bad day, then I’ll probably be proven right.  I’m changing my attitude as of this moment, and I choose to believe that the priority issue that I address tomorrow (while it may not be the issue I want to work on) will be the issue I was meant to work on. 

Educational opportunities a-blooming…

Just in case you’re not following http://sqlsaturday.com, there have been a few announcements of upcoming SQLSaturdays.  I’d love to go to a few of these, but we’ll have to wait and see how the economy shakes out a bit.

  • SQLSaturday #7 – May 30, 2009 – Birmingham, AL.  I’m still working on my slides for this one; if you show up, please come find me and say hi.
  • SQLSaturday #12 – June 6, 2009 – Portland, OR.  I’d love to go to this one, but obviously Oregon’s a bit of a haul from the ATL (plus, see the reason for not going to #14 below).
  • SQLSaturday #14 – June 6, 2009 – Pensacola, FL.  My family’s leaving on a cruise on June 7th, and I’m very tempted to say “we could swing by Pensacola a day early…”  Ain’t gonna happen.
  • SQLSaturday #16 – August 8, 2009 – Miramar, FL.  South Florida will rock.


Of course, there are lots of other development code camps going on; here are a couple that I want to attend:

  • Codestock – The dev community in Knoxville, TN is an incredibly awesome bunch of people; it’s southern-fried hippy geek culture at its finest.
  • DevLink – Technically, DevLink is not a code camp (they actually charge a substantial fee – a whopping $100), but they’re close enough that I feel comfortable placing them on this list.

You can find a much more complete list at http://codecampjukie.com

Gonna be some changes around here….

If you’ve been here recently, you may have noticed a new theme; I’m trying to clean up the look around here a bit. I’ve decided to be a little more proactive about my blogging, and wanted a new look to go with it. Why the sudden interest in blogging?

First, SQLSaturday is over, and I need a new challenge to keep me busy until the next one rolls around.

Second, I’ve noticed that I’ve gotten rather stale about doing anything technical; I’m not challenging myself to learn something new, and that was the whole point of this blog in the first place.

Anyway, I hope you enjoy the new theme, but keep an eye out for some other minor changes around the bend.

Stu