How many licks DOES it take…?

So, it’s been a while since I’ve posted, but I’ve finally managed to eke out some time after Summit 2010, and I wanted to follow up on a conversation that Tim Mitchell, Arnie Rowland, and I had at the Friends of Red Gate dinner.  We were doing a SQL Server-oriented trivia contest, which asked the following question:

How many nonclustered indexes does SQL Server 2008 support on a single table?

And the answer is: 999, which is enough to cause most DBAs to grumble under their breath about the consequences of setting up that many indexes and what they would do if they ever found that many on a single table.  Of course, being the amateur math geek that I am, I casually asked Tim and Arnie:

What’s the smallest table (in terms of number of columns) that would support 999 indexes?

After knocking it around for a bit, we came up with an estimate of 6, which actually isn’t too far off; however, our method of getting there was mostly intuitive, and I wanted to figure out the actual formula for calculating that number.  I knew it had to do with factorials, but I wasn’t exactly sure how to get there.  After searching the internet, I finally worked out the following principles:

  • Column order matters when building indexes, so when choosing pairs from a set of columns, ab <> ba.
  • The more columns in the table, the wider the indexes can be, so you have to iterate over every possible index width.  In other words, if you have 3 columns on a table, you would have 3 single-column indexes, 6 double-column indexes, and 6 triple-column indexes.

The formula that represents this is SUM(n!/(n-k)!) for k = 1 to n, where n represents the number of columns in the table and k represents the number of columns in the index.  Plugging this into a spreadsheet, you get the following matrix:

                      Number of Columns in Index (k)
Number of Possible        1     2     3     4     5     6     SUM
Columns (n)       1       1                                     1
                  2       2     2                               4
                  3       3     6     6                        15
                  4       4    12    24    24                  64
                  5       5    20    60   120   120           325
                  6       6    30   120   360   720   720   1,956
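As a sanity check on one cell: for n = 3, the formula gives

3!/(3−1)! + 3!/(3−2)! + 3!/(3−3)! = 3 + 6 + 6 = 15

which matches the third row of the matrix.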

 

At first glance, we’re done; it looks like 6 was the right answer: a 5-column table tops out at 325 possible indexes, well short of 999, while a 6-column table offers a whopping 1,956 to choose from.  However, there’s more to the story: SQL Server 2005 introduced the INCLUDE option for indexes, which throws a kink in the above formula.

At first, I thought it was relatively simple: you have two subsets for each n, where the elements in each subset can’t appear in the other.  It turns out to be trickier than that.  Here are the principles for generating the combinations:

  • For a set (n) of possible columns, there are two mutually exclusive subsets: the base columns (k) and the included columns (l).  The combined number of elements in the two subsets must be less than or equal to the number of elements in the master set.
  • Column order matters in the base columns but not in the included columns, so the formula above works for the base set, while the included columns should be counted as unique, unordered sets.

And here’s the part where my brain exploded: I couldn’t figure out a way to demonstrate the two relationships mathematically, so I built a SQL script (below) that iterates through a set of 5 columns.  All in all, I ended up with a listing of 840 possible combinations, still short of 999, which means that 6 stands as the minimum number of columns a table needs to support the maximum number of nonclustered indexes.
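(In hindsight, there is a closed form lurking in those two principles, assuming I’ve done the algebra right: once you’ve picked a base of k ordered columns, the included columns can be any subset of the n − k leftover columns, and a set of n − k elements has 2^(n−k) subsets.  That gives:

Total = SUM(k = 1 to n) of n!/(n−k)! × 2^(n−k)

For n = 5, that works out to 5(16) + 20(8) + 60(4) + 120(2) + 120(1) = 840, which agrees with the script’s output.)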

The point to this story?  None, really.  Just a silly geek exercise.  However, I think it does point out that index strategy is a complex problem, and there are multiple ways to index any given table.  Choosing the right one is more difficult than it looks.


-- Enumerate every possible nonclustered index on a 5-column table.
-- n = total columns used in the index (key + included),
-- k = key columns (order matters, so every permutation counts),
-- l = included columns (order doesn't matter, so we keep only one
--     ordering of each set by requiring name1 < name2 < ...).
DECLARE @c TABLE ( NAME VARCHAR(100) );

INSERT INTO @c ( NAME )
VALUES ( 'a' ), ( 'b' ), ( 'c' ), ( 'd' ), ( 'e' );

-- n = 1: one key column
SELECT  n = 1, k = 1, l = 0, NAME, INCLUDE = NULL
INTO    #tmp
FROM    @c
UNION ALL
-- n = 2: two key columns
SELECT  n = 2, k = 2, l = 0, NAME = c1.NAME + ',' + c2.NAME, INCLUDE = NULL
FROM    @c c1
        CROSS JOIN @c c2
WHERE   c1.NAME <> c2.NAME
UNION ALL
-- n = 2: one key column, one included column
SELECT  n = 2, k = 1, l = 1, NAME = c1.NAME, INCLUDE = c2.NAME
FROM    @c c1
        CROSS JOIN @c c2
WHERE   c1.NAME <> c2.NAME
UNION ALL
-- n = 3: three key columns
SELECT  n = 3, k = 3, l = 0, NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME, INCLUDE = NULL
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c2.NAME <> c3.NAME
UNION ALL
-- n = 3: two key columns, one included column
SELECT  n = 3, k = 2, l = 1, NAME = c1.NAME + ',' + c2.NAME, INCLUDE = c3.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c2.NAME <> c3.NAME
UNION ALL
-- n = 3: one key column, two included columns (c2 < c3 dedupes the set)
SELECT  n = 3, k = 1, l = 2, NAME = c1.NAME, INCLUDE = c2.NAME + ',' + c3.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c2.NAME < c3.NAME
UNION ALL
-- n = 4: four key columns
SELECT  n = 4, k = 4, l = 0,
        NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME + ',' + c4.NAME,
        INCLUDE = NULL
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c3.NAME <> c4.NAME
UNION ALL
-- n = 4: three key columns, one included column
SELECT  n = 4, k = 3, l = 1,
        NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME,
        INCLUDE = c4.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c3.NAME <> c4.NAME
UNION ALL
-- n = 4: two key columns, two included columns (c3 < c4)
SELECT  n = 4, k = 2, l = 2,
        NAME = c1.NAME + ',' + c2.NAME,
        INCLUDE = c3.NAME + ',' + c4.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c3.NAME < c4.NAME
UNION ALL
-- n = 4: one key column, three included columns (c2 < c3 < c4)
SELECT  n = 4, k = 1, l = 3,
        NAME = c1.NAME,
        INCLUDE = c2.NAME + ',' + c3.NAME + ',' + c4.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME
        AND c2.NAME < c3.NAME AND c2.NAME < c4.NAME AND c3.NAME < c4.NAME
UNION ALL
-- n = 5: five key columns
SELECT  n = 5, k = 5, l = 0,
        NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME + ',' + c4.NAME + ',' + c5.NAME,
        INCLUDE = NULL
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
        CROSS JOIN @c c5
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME AND c1.NAME <> c5.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c2.NAME <> c5.NAME
        AND c3.NAME <> c4.NAME AND c3.NAME <> c5.NAME AND c4.NAME <> c5.NAME
UNION ALL
-- n = 5: four key columns, one included column
SELECT  n = 5, k = 4, l = 1,
        NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME + ',' + c4.NAME,
        INCLUDE = c5.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
        CROSS JOIN @c c5
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME AND c1.NAME <> c5.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c2.NAME <> c5.NAME
        AND c3.NAME <> c4.NAME AND c3.NAME <> c5.NAME AND c4.NAME <> c5.NAME
UNION ALL
-- n = 5: three key columns, two included columns (c4 < c5)
SELECT  n = 5, k = 3, l = 2,
        NAME = c1.NAME + ',' + c2.NAME + ',' + c3.NAME,
        INCLUDE = c4.NAME + ',' + c5.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
        CROSS JOIN @c c5
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME AND c1.NAME <> c5.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c2.NAME <> c5.NAME
        AND c3.NAME <> c4.NAME AND c3.NAME <> c5.NAME AND c4.NAME < c5.NAME
UNION ALL
-- n = 5: two key columns, three included columns (c3 < c4 < c5)
SELECT  n = 5, k = 2, l = 3,
        NAME = c1.NAME + ',' + c2.NAME,
        INCLUDE = c3.NAME + ',' + c4.NAME + ',' + c5.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
        CROSS JOIN @c c5
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME AND c1.NAME <> c5.NAME
        AND c2.NAME <> c3.NAME AND c2.NAME <> c4.NAME AND c2.NAME <> c5.NAME
        AND c3.NAME < c4.NAME AND c3.NAME < c5.NAME AND c4.NAME < c5.NAME
UNION ALL
-- n = 5: one key column, four included columns (c2 < c3 < c4 < c5)
SELECT  n = 5, k = 1, l = 4,
        NAME = c1.NAME,
        INCLUDE = c2.NAME + ',' + c3.NAME + ',' + c4.NAME + ',' + c5.NAME
FROM    @c c1
        CROSS JOIN @c c2
        CROSS JOIN @c c3
        CROSS JOIN @c c4
        CROSS JOIN @c c5
WHERE   c1.NAME <> c2.NAME AND c1.NAME <> c3.NAME AND c1.NAME <> c4.NAME AND c1.NAME <> c5.NAME
        AND c2.NAME < c3.NAME AND c2.NAME < c4.NAME AND c2.NAME < c5.NAME
        AND c3.NAME < c4.NAME AND c3.NAME < c5.NAME AND c4.NAME < c5.NAME;

-- Count the combinations by total index width
SELECT  n, COUNT(*) AS Combinations
FROM    #tmp
GROUP BY n
ORDER BY n;

DROP TABLE #tmp;
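Assuming I’ve transcribed the script correctly, the final SELECT should break down like this:

n   Combinations
1        5
2       40
3      150
4      320
5      325

for a grand total of 840 possible index definitions on a 5-column table.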

#SQLPASS Summit 2010… Here we go again…

Just finished the first night of PASS Summit, and it’s been a great day.  Saw a lot of friends that I don’t get to spend enough time with, and it was a real blessing to talk about all of the changes that had happened in my life over the past year.  It’s hard to believe that nearly a year ago, I was leaving Summit and headed toward a divorce.  Life has changed in the last year, and I think for the better.  I haven’t progressed as much professionally as I had hoped, but nonetheless, I think I’m doing OK.  I’m happy, healthier, and the future looks bright.

I reflected on this a bit this weekend during Brent Ozar’s FreeCon, where I was lucky enough to sit in a room with some very brilliant people and spend time talking about what it takes to be a better technical writer.  I enjoyed the experience and walked away with a lot of rough ideas for future blog posts, and one key thought burning in my brain: I’m not doing enough fun work.

Fun work? Let me try to explain.  As a technical person, I enjoy challenges, and I especially enjoy learning new ways to solve problems.  I’ve spent a lot of my career solving problems, but unfortunately (and I think it’s due in part to the personal issues I’ve faced over the last year), I’ve lost my way a bit.  I believe that a technical person should spend at least 10 hours out of a 40-hour work week learning something new (fun work); my balance isn’t there.  I probably spend 38 hours of the week doing work that needs to be done but isn’t really fun anymore; my goal is to change that balance over the next year and find ways to enjoy my career again.

I’m still fleshing out what this means to me as a blogger, but I’m excited about the possibility of mastering my craft.   Hopefully, I can use this blog as a vehicle to take me there.

#sqlpass #awesomesauce Moving up to the big leagues…

Just got word that my XML session was moved from an alternate to a full-fledged presentation.   The only problem?  I had already submitted a trimmed-down version as a Chalk Talk, and it got accepted.  Soooooo… here’s my plan:  I’ve got the weekend to beef up the XML presentation and add some spiffy new material.  It’s gonna be a busy couple of days.

BTW, I’ve enjoyed walking around Seattle the last couple of days; it’s a beautiful city.  Can’t wait until the conference starts and I get to see some old friends and learn some new stuff.

#sqlpass Less than an hour to go…

Till I drive to the airport.  I’m looking forward to this year’s Summit in a lot of ways, and I hope that it will energize me into blogging again.  It’s been a rough few weeks, and I’m looking forward to getting excited about my craft again.  I’m also looking forward to the few extra days ahead of the conference; taking some time to explore the city.

Anyway, if you’d like to meet me, I’ll be around.  Here’s a snapshot of my itinerary so far:

MONDAY

SQLSaturday Networking Group
Welcome Reception
Expressor Community Dinner
Speaker\Volunteer AfterParty

TUESDAY

I’m presenting, as well as going to other presentations.  Looking forward to lots of networking in the hallways.

AD145CT

XML in SQL Server 2008: An Introduction to XQuery

Chalk Talk

Stuart Ainsworth

While most database professionals have had some exposure to XML, few have actually embraced the potential of using XML in their relational design. This presentation focuses on a basic introduction to XML. This session will cover the basic structure of XML (it’s more than text with tags), and the basic XQuery methods in SQL Server. In addition, specific use case scenarios will be introduced, including how to shred an XML document, moving data as XML, and other tips and tricks. Driven from the perspective of the average database professional, this talk is designed to "get your feet wet" in handling XML data within SQL Server 2008. While some advanced topics will be referenced (such as FLWOR queries), this talk is primarily designed to introduce the audience to basic concepts and tools; participants should expect to walk away with unanswered questions, as well as the motivation to learn more.

Expo Hall 4A
Tuesday, November 9
1:30 PM – 2:10 PM

LT100T

Lightning Talk – Tuesday

Regular session

Adam Machanic, Brent Ozar, Buck Woody, Craig Utley, Joe Webb, Michelle Ufford, Stuart Ainsworth

NEW to PASS this year – Lightning Talks.

A lightning talk is a short, prepared talk that is no more than 5 minutes in length, which keeps things moving quickly. One session every day will be dedicated to lightning talks, and 7 speakers will participate in each session.

201
Tuesday, November 9
3:00 PM – 4:15 PM

THURSDAY

Chapter Leader Luncheon

 

Of course, there’s lots of stuff I haven’t listed, but I’m sure you’ll see me around.  Gotta run.

#TSQL2sDay – My Least Favorite SQL Server Myth


It’s time again for another T-SQL Tuesday, hosted this month by Sankar Reddy; the topic is misconceptions in SQL Server.  It’s been a while since I wrote one of these (I usually forget about them until the following Wednesday), but this topic is a good one.  I’ve been having discussions about the following myth for a long time, so it’s nice to be able to put it to rest again.

The myth?  “You should always use stored procedures in your code, because SQL Server is optimized for stored procedure re-use.”  Don’t get me wrong; there are lots of arguments to use stored procedures (security, obfuscation, code isolation), but performance is not necessarily a good one.   This myth has been around for a long time (since SQL Server 7), and Binging “stored procedures SQL Server performance” yields such gems as the following:

SQL Server Performance Tuning for Stored Procedures
“As you know, one of the biggest reasons to use stored procedures instead of ad-hoc queries is the performance gained by using them. The problem that is that SQL Server will only …”
www.sqlserverperformance.com/tips/stored_procedures_p2.aspx

Increase SQL Server stored procedure performance with these tips
“Database developers often use stored procedures to increase performance. Here are three tips to help you get the most from your SQL Server stored …”
articles.techrepublic.com.com/5100-10878_11-1045447.html


The guts of this myth originate from the fact that prior to version 7 (released in 1998), SQL Server WOULD precompile stored procedures and save an execution plan for future reuse of that procedure, BUT THAT CHANGED AS OF VERSION 7.0.  Here’s a quote from Books Online (SQL Server 2000) that tries to explain what happened (emphasis added by me):

Stored Procedures and Execution Plans

In SQL Server version 6.5 and earlier, stored procedures were a way to partially precompile an execution plan. At the time the stored procedure was created, a partially compiled execution plan was stored in a system table. Executing a stored procedure was more efficient than executing an SQL statement because SQL Server did not have to compile an execution plan completely, it only had to finish optimizing the stored plan for the procedure. Also, the fully compiled execution plan for the stored procedure was retained in the SQL Server procedure cache, meaning that subsequent executions of the stored procedure could use the precompiled execution plan.

SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans. The database engine uses an efficient algorithm for comparing new Transact-SQL statements with the Transact-SQL statements of existing execution plans. If the database engine determines that a new Transact-SQL statement matches the Transact-SQL statement of an existing execution plan, it reuses the plan. This reduces the relative performance benefit of precompiling stored procedures by extending execution plan reuse to all SQL statements.

SQL Server 2000 and SQL Server version 7.0 offer new alternatives for processing SQL statements. For more information, see Query Processor Architecture.

Note from the above quote that the query optimizer now caches and reuses execution plans for ALL T-SQL statements, not just stored procedures.  The perceived performance gain from stored procedures stems not from some magic use of CREATE PROC, but rather from plan re-use, which is available to ad-hoc queries as well.

So what promotes plan re-use?  The simplest answer is parameterization: SQL statements which use parameters efficiently (which includes many stored procedures) are more likely to reuse a plan.  Developers should focus on making the most of parameters, rather than assuming that a stored procedure will be efficient because of some magical aspect of said procs.
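As a quick sketch of what that means in practice (dbo.Orders and @CustomerID here are hypothetical, just for illustration), an ad-hoc statement executed through sp_executesql with a parameter gets one cached plan that subsequent executions reuse, just like a stored procedure would:

-- Both executions share a single cached plan, because the statement
-- text is identical and the literal is passed as a parameter.
EXEC sp_executesql
    N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID',
    N'@CustomerID INT',
    @CustomerID = 42;

EXEC sp_executesql
    N'SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = @CustomerID',
    N'@CustomerID INT',
    @CustomerID = 117;

-- Compare with embedding the literals directly:
--   SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = 42;
--   SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = 117;
-- Unless the engine auto-parameterizes them, each distinct literal compiles
-- and caches its own plan; sys.dm_exec_cached_plans will show the difference.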

A final thought: for a great starting place on understanding SQL Server plan reuse, see http://msdn.microsoft.com/en-us/library/ee343986(SQL.100).aspx, including Appendix A: When Does SQL Server Not Auto-Parameterize Queries.  Also, this post by Zeeshan Hirani explains why LINQ to SQL query plans don’t get reused.

Upcoming events & presentations:

My dance card is filling up fast:

SEPTEMBER, 2010

13 AtlantaMDF: I’m not presenting, but I’m on the leadership team.  We’re watching the PASS Summit 2009 keynote, and hoping to generate some additional interest in Summit 2010. 
21 Columbus GA SSUG: Presenting on Red Gate’s SQL Source Control.  This is a new presentation for me, and I’m still working out the kinks, but I’m excited about it.

 

OCTOBER, 2010

2 SQLSaturday 48: I’ll be in Columbia, SC, presenting 3 different sessions (2 of my own, and 1 for Red Gate).

 

NOVEMBER 2010

8-12 PASS Summit 2010: So far I haven’t been selected to present, but I did submit a Chalk Talk abstract.  I do plan to spend a LOT of time drinking beer and rubbing elbows with some of the brightest and best minds in the industry; I hope to see you there as well.

Of course, somewhere in the midst of all of this prep time, I’ve got to squeeze in my day job, my night job, my kids, and my girlfriend.  Who needs sleep?

#passvotes Tweaks and Data Geeks: PASS Election 2011

Obviously, the angst over the 2010 PASS elections still lives on for some, and many of you are probably ready to move on to more immediate concerns; as for me, I’m stuck here trying to figure out what I should learn from all of this.  I do think we as a community need to move forward, and to that end, I’ve agreed to participate in Kevin Kline’s series on the PASS elections process and answer the following question:

“Many in the community seem to think that the PASS election process is badly broken.  Do you think that PASS needs to implement fundamental and far-reaching changes to its election process, or does it only need some fine tuning?  Please explain your thoughts?”

Learning from the experience…

I don’t want to keep rehashing the past, because it’s been done (here, here, here, and here, to name a few; if you want a LOT more reading material, the elections site has a bunch of blog entries).   While there are lots of lessons to be learned and ideas to be discussed, I think there are two key points that have not been stressed enough:

  1. Our discourse as a community has been healthy, and
  2. Nobody involved in the process has walked away clean.

For the first point, I think we’ve been able to keep our disagreements civil; there have been some comments made by several key contributors to the discussion that have been more cutting than others, and I have said some things that I should have kept to myself, but all in all, I don’t think we’ve burned any bridges that can’t be rebuilt.   The only lingering meme that continues to bother me is the occasional “community-vs.-PASS” theme implied in some of the discussions; I’ll talk more about why that bothers me in a bit, but I fear we too easily fall into the trap of needing to define an invisible “Them” as an antithesis to our “We”.   We is Them, and They are We.

The second point was driven home to me like a dagger to the heart in a response from Andy Warren to a dialogue in the comments section of Brent Ozar’s post on the issue:

Agreeing to disagree is seldom satisfying, but often necessary. I’m entirely biased about my views on this, but I think it’s easy to forget the pain that continues along with this discussion. Replace the name of SJ with yours, and imagine how it would feel to be left off the slate (fairly, unfairly or otherwise), and then have a conversation continue that seems to imply a great failure during the interview, but no details emerge. Do you defend yourself or lay low? What if you allow the inner details to be published, but they are not?

We don’t agree on the way things worked out. Ok. I think we should let things heal a little more, then have the values and process conversation that we should have had last year – one that I’ve admitted I should have helped to make happen. Yell at me. Yell at the process. But I think we’re at the point where we leave a good man to tend his wounds and we try to do better next year, or we have the entire conversation and let the chips fall where they way on each side. I vote for moving forward.

Not my place to tell you not to talk about it, and much of this has been good conversation. But please remember my friend is not an abstraction. Actually, I wish I could say that better. I know that you don’t he is an abstraction, but I’m not sure that this conversation, right now, helps any of us.

I’ve been so busy defending what I perceived to be attacks on the character of the Nom Com or the quality of my work that I’ve forgotten what it must be like to be in Steve’s place, and for that I’m truly sorry.   I don’t regret my choice to point out what I think are inaccurate statements about the process, but I should have tempered my comments with more grace than I did.  That being said, I think we all need to step back and realize that nobody was completely happy with this year’s election process; obviously, the controversy was not a desired outcome by anyone on the Nom Com, the Board, or the membership at large.  So how do we fix the process moving forward?

Is the process irreparably broken?

Short answer: No.

Longer answer:  I think if you look at any business that has an elected Board of Directors, the nominations process is similar: the Board appoints a committee to find qualified candidates, and then votes to approve or reject the slate.   The elections process should be simple but fair; I don’t think the Board of an organization needs to be run like the federal government (with election cycles dictating workflow). That being said, I think there are lots of opportunities to tweak the process and learn from our mistakes.

Tweak 1: Standardize the qualifications for a Director

I think PASS made strides this year, but there should be very specific minimum requirements for a seat at the table.  Andy Leonard and K. Brian Kelley both made excellent points about the inadequacy of using a ranking system to evaluate certain concrete measures like Education and References.  If we must use a ranking system, then the criteria for the ranks need to be carefully defined, like so:

Applicant must have a Bachelor’s Degree (or equivalent University degree).
Add 1 point for a Master’s Degree in any subject.
Add 1 point for a Doctorate Degree in any subject.

Please note that the above example is merely intended to illustrate specific ranking criteria; the definition of Education itself warrants more discussion than I have time for in this post.

Experience with PASS is another great example of an objective measure that needs standard ranking: how long have you been a member?  What volunteer roles have you performed?  Is experience as a chapter leader more or less valuable than experience on the Program Committee?  Whatever standards are chosen, they should be well-defined and applied evenly from year to year; if an applicant doesn’t change their behavior from year to year (i.e., an applicant with little PASS experience in year 1 doesn’t get involved in year 2), they should receive the same score.

Tweak 2: Transparent Application, Opaque Interviews, Translucent Board

Applications should be made available to the public, along with the rankings from the standardized questionnaire described above; the general members of PASS should feel confident that the scoring system used to evaluate an application reflects the applicant’s ability to meet the minimum requirements.

However, I think that discussions within the Nominations Committee, including interviews with the applicants, should remain opaque: a black box.  I know others think that the Nom Com should be completely transparent, but I think that the job of the Nom Com is to probe areas beyond the minimum qualification, and in order to do so, the members of that group need the ability to ask questions that may not be appropriate for general consumption.  I think this protects both the applicants and the volunteer interviewers (what if I, as a volunteer, were to ask a really stupid question?  Should I be vilified on the Internet by the membership?).  But here’s the rub: the interviews need to be recorded.

The interviews should NEVER be released to the general membership, but once the slate has been presented to the Board of Directors for approval or rejection, the interview tapes need to be included as part of the recommendation in order to give the Board full insight into why the Nom Com chose to recommend or deny certain candidates.  The board should then accept or reject the slate, and if they choose to reject the slate, decide how they’re going to move forward; the discussions surrounding the slate should not be released, but the vote should.  That way, if the general membership felt that the process was unfair, they could contact the Board members and move forward with resolving the issue.

Tweak 3: Maximize member involvement

I think one of the hardest problems to tackle in this discussion is the issue of member involvement; as I mentioned earlier, the theme of “community vs. PASS” is a difficult pill for me to swallow because I see every member of PASS (including the Board) as a member of the community.  I also realize (as a chapter leader) that there are many members of the community who are not PASS members, and members of both subsets who are not active in the social networks that were abuzz with concerns over the process.

Let me back up and clarify: I’m a chapter leader for AtlantaMDF, and at one point, we had nearly 1,500 SQL Server professionals on our private mailing list.  These were people who at one point had registered for one of our meetings; I’d consider them members of the community.  Are they PASS members? Maybe.  Are they active on Twitter, or active bloggers?  Dunno; my experience presenting on these topics leads me to believe that the vast majority of SQL Server professionals are NOT involved in social networking.  Kendra Little had a similar take on the problem of uninvolved membership (I stole the graph from her):

Voting stats from the last few elections would probably back up this idea; although I don’t remember the actual numbers, I do remember thinking that the voter turnout was abysmal compared to the number of people that PASS claims as members (remember that membership is free).  It’s the same feeling I get when I think that AtlantaMDF invites 1,500 people to a meeting every month, and 75 show up; I’m grateful we get 75, but we’re missing out on interaction with the other 1,425.

So how do we involve the membership?  I think that Chapter Leaders (including Virtual Chapter Leaders) should be randomly selected and asked to serve on the Nom Com every election cycle, and the number of Chapter Leaders should always outweigh the number of Board seats by at least 1.   Why Chapter Leaders?  To me, they represent the local faces of PASS, and are the most likely ones to understand the pulse of the community.  Why not bloggers, tweeters, or columnists?  Although I think that social networkers provide a great service to the community, they don’t necessarily have an official relationship with PASS.   PASS serves the community, but I think the election process needs to be membership driven.

Involving the membership via Chapter Leaders on the Nom Com should (I hope) instill some trust in the process, even if the interview process yields an unpopular result.  It’s a lot harder to accuse the Nom Com of predetermining the outcome of an interview because of their ties to the Board if the Nom Com consists of a majority of non-Board members.

Summing up.

I have several other ideas (such as the possibility of a Chapter Congress), but many of them require drastic overhauls of how PASS does business; that’s simply not feasible.  I think many of the problems of the last two elections can be easily resolved with a few minor tweaks.  However, the first step is the conversation, and that conversation needs to begin well before the election season.  It’s easy to be upset about the outcome, but in order for real change to occur, we (the community) need to step up and participate in the conversations.  I am eagerly awaiting the chance; are you?

Querying XML in SQL Server 2005+: namespaces

I recently helped a friend solve an XML problem and thought I would post the solution here.  Although there are lots of notes out there on how to use XQuery in SQL Server 2005+, this was a real-world scenario that was trickier than I expected.  The friend works for an insurance broker, and in one of their applications, accident questionnaires (and their answers) are stored as XML.  This allows them to treat all questionnaires the same, regardless of origin, as long as the QuestionCodes are common across vendors.

Below is the sample data that he was asking me about; he needed to get one question and answer per row into a data set:


DECLARE @T TABLE ( RowID INT, Fragment XML );

INSERT INTO @T ( RowID, Fragment )
VALUES  ( 1, '<Questionnaire xmlns="http://www.xxx.com/schemas/xxxXXXXXxxx.0">
<Question>
<QuestionCode>74</QuestionCode>
<Question>Why did you wreck your car?</Question>
<Answer>I was drunk</Answer>
<Explanation />
</Question>
<Question>
<QuestionCode>75</QuestionCode>
<Question>Why is the rum all gone?</Question>
<Answer>Because I drank it.</Answer>
<Explanation />
</Question>
</Questionnaire>' ),
        ( 2, '<Questionnaire xmlns="http://www.xxx.com/schemas/xxxXXXXXxxx.0">
<Question>
<QuestionCode>74</QuestionCode>
<Question>Why did you wreck your car?</Question>
<Answer>Stuart was drunk</Answer>
<Explanation />
</Question>
<Question>
<QuestionCode>75</QuestionCode>
<Question>Why is the rum all gone?</Question>
<Answer>Because I made mojitos.</Answer>
<Explanation />
</Question>
</Questionnaire>' );

I thought it was a simple query: just use the .nodes() method to rip each question and its corresponding answer into its own row.  But when I ran the following, I got interesting results:

SELECT  t.RowID
      , QuestionCode = t1.frag.value('(QuestionCode)[1]', 'int')
      , Question = t1.frag.value('(Question)[1]', 'varchar(max)')
      , Answer = t1.frag.value('(Answer)[1]', 'varchar(max)')
      , Explanation = t1.frag.value('(Explanation)[1]', 'varchar(max)')
FROM    @T t
        CROSS APPLY Fragment.nodes('//Questionnaire/Question') AS t1 ( frag )

RowID  QuestionCode    Question    Answer  Explanation

That’s right, nothing.  Strange, considering I’ve done variations of this query for a couple of years now to parse firewall data fragments.  I looked closer, tried to see what was different about the XML fragment in this example compared to mine, and it was clear: a namespace declaration.   Most of the data I deal with is not true XML, but rather fragments I convert to XML to facilitate easy transformations.  To test, I stripped the namespace declaration (xmlns="http://www.xxx.com/schemas/xxxXXXXXxxx.0") out, and voila!  Results!

RowID  QuestionCode  Question                     Answer                   Explanation
1      74            Why did you wreck your car?  I was drunk
1      75            Why is the rum all gone?     Because I drank it.
2      74            Why did you wreck your car?  Stuart was drunk
2      75            Why is the rum all gone?     Because I made mojitos.

Well, that was great because it showed me where the problem was, but how do I fix it?  I stumbled upon a solution, but to be honest, I’m not sure it’s the best one.  If I modify my query to refer to any namespace (the old wildcard: *) like so:

 

SELECT  t.RowID
      , QuestionCode = t1.frag.value('(*:QuestionCode)[1]', 'int')
      , Question = t1.frag.value('(*:Question)[1]', 'varchar(max)')
      , Answer = t1.frag.value('(*:Answer)[1]', 'varchar(max)')
      , Explanation = t1.frag.value('(*:Explanation)[1]', 'varchar(max)')
FROM    @T t
        CROSS APPLY Fragment.nodes('//*:Questionnaire/*:Question') AS t1 ( frag )

I get the correct results.

RowID  QuestionCode  Question                     Answer                   Explanation
1      74            Why did you wreck your car?  I was drunk
1      75            Why is the rum all gone?     Because I drank it.
2      74            Why did you wreck your car?  Stuart was drunk
2      75            Why is the rum all gone?     Because I made mojitos.

Here’s the question for any XML guru who stumbles along this way: is there a better way to do this?
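One alternative that should also work here (and reads more cleanly than wildcarding every element) is to declare the namespace up front with WITH XMLNAMESPACES; a sketch against the same sample data:

-- Declare the default namespace once; every unprefixed element in the
-- XQuery expressions below is then resolved against it.
WITH XMLNAMESPACES ( DEFAULT 'http://www.xxx.com/schemas/xxxXXXXXxxx.0' )
SELECT  t.RowID
      , QuestionCode = t1.frag.value('(QuestionCode)[1]', 'int')
      , Question = t1.frag.value('(Question)[1]', 'varchar(max)')
      , Answer = t1.frag.value('(Answer)[1]', 'varchar(max)')
      , Explanation = t1.frag.value('(Explanation)[1]', 'varchar(max)')
FROM    @T t
        CROSS APPLY Fragment.nodes('//Questionnaire/Question') AS t1 ( frag );

Because the default namespace applies to the whole statement, the paths stay as clean as the namespace-free version.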

#passvotes Welcome to the Nom Com: Here’s Your Helmet.

It’s been over a week since my last post on this subject, and to be honest, I was hoping that I wasn’t going to have to write any more.  I felt like I had stated my position: the Nominations Committee for the PASS elections of 2010 was faced with an unpopular decision, and we chose to take the high road of following the process we established before picking the slate.  Since then, I’ve had the peculiar experience of both being hailed as a hero (for making a public statement) and yet having my work deemed inferior (the decision we reached as a committee).  I’m hoping that I can clarify some things, and publicly answer some questions that have been asked of me online and face-to-face.

Catching up…

If you haven’t been following along, or (like me) have missed a couple of really good postings on the issue, you may want to check out the list of postings at the Discussion page on http://elections.sqlpass.org.  I think it speaks volumes about our community that we worry about things like fairness and openness, and that the organization which is at the center of the controversy is hosting comments (both positive and negative).

The caveats: It’s all about me, really.

I should note that I’m making this post as an individual, not as a sanctioned representative of the Nominating Committee or PASS.  It’s all me, and these are my beliefs and my opinions.  I like to think that I hold myself to the highest standard possible, so there are some things that I’m going to frame in terms of my beliefs, even if those beliefs may be more restrictive than what is legally or professionally required.

First, I believe that the Nominations Committee should be a “black box”; you define the standard by which decisions should be reached, the applicants to whom those decisions apply, and the decision makers before putting that in the box.  Whatever happens in the box, stays in the box.  I had earlier alluded to protecting the privacy of the applicants, but to be honest, the Committee should also protect the privacy of the members (in my opinion).  This was a volunteer gig, and even though many of the volunteers were sitting Board members, a few of us were not.  I know I said some things in those meetings about applicants that I don’t necessarily want broadcast, and according to the election procedure, my comments should be kept in committee; I believe that volunteers should have the right to express ideas in discussion that are unpopular, and those comments should not be used in a court of public opinion.   If I thought that Jack Corbett was an evil genius intent on destroying the world, I should have the right to use that to spark a discussion in committee without ruining my relationship with him (by the way, I don’t believe that; Jack is one of the sweetest guys I know).  

Second, I believe that no matter how unpopular the decision, groups that are composed of volunteers shouldn’t throw the committee under the bus when the work is done.   I SUPPORT THE RESULTS OF THE NOM COM.  Period.  This is not “circling the wagons” or “hiding behind the process”; this is a fundamental precept of my understanding of civilized discourse.   Seven people had an opportunity to discuss all of the applicants, and seven people reached a majority decision to promote five applicants to the slate.  I respect the work of my colleagues on the Nom Com, and I hope they feel the same way about me.

Third, we’ve all got to work together tomorrow.  The SQL Server community is large, but it’s still a small village in some ways, and I want to be careful in what I say because I want to engage in productive dialogue, not destructive.  It was funny to see some of the tweets about the perceived “us-vs-them” (and I know those tweets were in jest), but the truth is, I respect a lot of people that I don’t always agree with.  Andy Leonard and I had some great discussions over the weekend at SQL Saturday 51, as did Andy Warren and I.  Lynda Rab hugged me.   Steve Jones and I have exchanged DM’s, and I’m hoping to sit down over a beer with him at some point.  I’ve also had several emails and phone calls from others with whom I disagree; I’m still friends with these people, and I hope that at the end of the day, we can walk away respecting each other even though we may have different ideas about how to solve the problem.  There’s no “them” in this discussion; we’re all us.

On to the issues…

Andy Leonard recently posted a great wrap-up about the process as he understands it; he does make some conjectures about the process which I think are incorrect, but for the most part, I think his analysis is insightful and a great read for people who can only observe from the outside.

Andy makes a great argument about process failures versus execution failures, and I can see his point. 

I left Farmville Virginia at 4:00 AM EDT Friday morning heading to SQL Saturday #51 in Nashville Tennessee. I thought about where I was going the night before, printed some basic instructions, pre-programmed a few addresses into my Garmin, and then drove roughly 550 miles in about 9 hours… this was my process. If I’d ended up in New York City instead of Nashville I could offer the excuse that I followed a process, but you would see right through that excuse, wouldn’t you? You’d say things like "That’s weak Andy". If you had a vested interest in meeting me in Nashville and I called emailed you from New York City to tell you I wasn’t in Nashville and that I was, in fact, even further from Nashville than I was at the beginning of following my process, you would be understandably put out.

It’s a great analogy, but it’s hinged on one troublesome precept: the Nom Com and the Board of Directors failed.  I don’t think we failed (and I know that statement’s about to open up a heap of trouble).   Did we (the Nom Com) arrive at an unpopular decision?  Yep.  Did the Board of Directors support that unpopular decision and ratify the slate?  Yep.   Is that a failure?  No.  If you’re assuming that Steve Jones deserves to be a candidate, then I can see how you want to blame somebody or something for reaching a different outcome than you desire.  But that’s not a failure; it was an unplanned outcome <G>.

Here’s my take on it: We (the Nom Com) had an agreement with PASS to find qualified candidates for the 2010 Board of Directors; to fulfill that agreement, we developed a process that we published and had explicitly approved by the BOD, and implicitly approved by the PASS membership (I don’t remember getting any feedback at all from the membership after making this post).  We used a template to evaluate written applications, and then interviews (more about the template below).  After all was said and done, it came down to a Yes/No/Abstain vote from each of the Nom Com members (as explained by Tom LaRock); Steve and Jack didn’t get enough votes to go on.  Did the Nom Com deliver a slate of qualified candidates?  Damn skippy; I believe that each and every one of the five individuals on the slate deserves an opportunity to be a Director.  Did we leave off candidates that were qualified according to public opinion? Obviously so.  Is that a failure?  I wouldn’t call it one; to rephrase Andy’s analogy, I think we made it to Nashville, but we left behind part of the slide deck.  Can we still go on?

What about the Numbers?

Both Andy Leonard and K. Brian Kelley provide some insight into the numbers from the templates used to rank the candidates.  Again, very well written and thoughtful posts that certainly raise some questions about why Steve (and to some degree, Jack, who is getting a very short shrift in this controversy) didn’t make the slate.  It’s understandable, because those templates are comfortable to look at for data-oriented geeks.  We like numbers, and we like it when numbers make sense.   It would be great if we had some sort of objective measurement by which to determine the qualifications of an applicant; unfortunately, the template ain’t it.

Brian’s analysis shows the holes in the theory that we had an objective measure by questioning how we (the Nom Com) arrived at those ratings; granted, his rankings are based on his personal knowledge of Steve Jones, and he’s a sample of 1, but it shows the subjectivity inherent in a ranking system.  Granted, the averages were supposed to ease some of the subjectivity (by reducing the emphasis on outliers), but 7 is still a small sample, so extreme differences among Nom Com members could have definitely impacted the outcome.  Furthermore, the Nom Com was privy to information that was not public (the complete application and interview); our rankings were based on our perception of how well the candidate met the criteria using all information available to us, and most of that information came from the application and the interview.  Did I know about Steve’s black belt? No.  Did I know his son was an Eagle Scout?  It wasn’t on the application form.

Ultimately, the numbers were used as a guide to facilitate discussion; there was no cut-off point, and 7 interviewees were well within the maximum number of candidates for the Board (as defined by the bylaws; see the election process for more details).  It came down to a majority vote, and the majority of the committee felt that Steve and Jack were not ready at this time to be on the slate (for Jack, it was 0 yeas, 6 nos, and an abstention; for Steve, it was 1 yea, 5 nos, and an abstention).

The question of fitness.

Here’s a quote from Andy Leonard:

I know Steve. I believe he would be disruptive. I think he would challenge the status quo and defend the SQL Server Community with every ounce of his being. I believe he would start his two years of service on the Board more stubborn and obstinate than he would end it. Like every Board of Director member that’s taken time to share their experience, Steve would evolve. For some, that’s a problem. They cannot tolerate the disruptive-ness. I believe Steve was deemed "unFit" for the Board for this reason.  

I may be misreading this, but Andy seems to be accusing some members of the Nom Com and the Board of Directors of bias against disruption or discussion, and to that end I ask: Have you met Rick Bolesta?  Andy Warren?  Lynda Rab?   None of these folks would I consider to be peacekeepers; they’re all opinionated, strong-willed, intelligent, and fair people (and I can name others on the BoD).   To assume that we (the Nom Com) deliberately excluded an applicant because he shared the same characteristics as many of the sitting BoD is illogical.

To be fair, I can see how Andy got there; Steve is not known for pulling punches when it comes to PASS, and I can see how it looks like “Steve was a bully, so I’m not going to let him play on my team”.   But to linger on that assumes that we (the Nom Com) are incapable of being professional when it comes to handling criticism.  To pull another quote from Andy regarding the Nom Com:

You could argue that they did their job with the same zeal as the people who shrink-wrap CDs and DVDs. But understand they were told to protect us from another marketing executive. This isn’t a complaint about the NomCom – it’s constructive criticism. I believe members of the NomCom know the difference.

I, too, believe the Nom Com knows when criticism is constructive, and I think we factored that into the definition of Fit.  But one’s ability to deliver constructive criticism is NOT the only component of that measure; we examined all nominees’ skills, experience, and strengths to derive it.  Granted, it was a subjective measure masked as objectivity, but it should not have been used as a bias against Steve (or any other candidate) because he’s a rabble-rouser.

Other bits and bytes…

After all of that is said and done, there are still some lingering questions to be answered.  I’d been meaning to write a post about this, but obviously the controversy is outweighing the need to understand what happened in committee.  Let me briefly summarize some of the main points:

Would I have changed anything about the process?

Yes.  I think the Nom Com qualification process needs to be standardized and used every year; that would alleviate some of the concerns as to why the quality of candidates varied so wildly from last year to this one.   I think the template is a great idea, but the Board needs to define (in conjunction with the membership at large; I stole that idea from Andy Warren) what constitutes a qualified candidate.  Those qualifications need to be simple and easy to understand, and allow a broad range of candidates to qualify.

I also think the board needs to have more community members sitting on it, as others have suggested (I couldn’t find the reference, so if you were first to post this, sorry).  I do believe that a Nom Com is necessary for PASS, and I’m open to discussions about the role of that committee.

Am I mad about the controversy?

There have been times over the last week that I have wanted to haul a few posters aside and have a discussion out in the parking lot (despite the fact that the last fight I was in was in 7th grade, and I lost because my lip got stapled to my braces).  I’ve gotten very good at relieving stress via Wii Boxing.   The criticism that hurt most came from people whom I respect who posted without thinking about the people involved in the process; most I have forgiven (and we’ve talked it out).  Others will take a while before they gain my trust back.

I also HATE that the 5 talented individuals who did make the slate (and 1 who did not) are being marginalized because of the outcry over Steve.  We’ve got to move on, and soon, or we’re going to miss out on an opportunity to get to know the next set of Directors.

Would I serve on the Nom Com again?

Hell, yes.   I plan to serve until I get the chance to deny Paul Randal a seat on the board (j/k).   To be honest, I’m very proud of my service, and I hope to do it again.  I realize it’s not the easiest job in the world, but nothing important is.