
SQL Saturday #25: reflections

Another SQLSaturday down!  This was my second one to organize in a year, and it was much different than the first event we held in April.  I learned a lot this time about managing the event, so it’s time for another wrap-up post 🙂  (YAY! You know you’re excited!).

Before I go too far, I need to express a huge THANK YOU to the faculty and students at Gainesville State College and the Institute for Environmental and Spatial Analysis, particularly Allison Ainsworth (and yes, we’re related).  I’ll explain more about their role in a bit, but they really made the event flow flawlessly, and their presence helped make this event a huge success in a number of ways.  I’d also like to thank my daughter Isabel, who (like a typical twelve-year-old) alternated between being bored and excited about being there.  She was constantly in motion throughout the event, and she made me very proud.

I also want to thank our sponsors: Interworks and RedGate, as well as our hosting user group AtlantaMDF.  Without their involvement, there would be no SQL Saturday, and I hope that we can continue this tradition in our area for a long, long time.

Basic stats

  • 103 registered; 75 attendees
  • 12 faculty and student workers
  • 3 tracks; 14 speakers

What worked well…

  • Faculty & Student Volunteers rocked.  Having someone running the show who did not have a vested interest in seeing the speakers was a great asset; if you can partner with a local school and work out a deal to have student workers, DO IT!  The team at GSC worked very hard, and while they managed to sneak in a few sessions, they didn’t expect to see all of them.  This was very different from the Alpharetta event earlier in the year, where I felt guilty because several volunteers were unable to attend sessions.
  • Attendee Volunteers were awesome.  I had several attendees who were very willing to proctor the sessions they were in (thus relieving the student workers of that responsibility).  They handled speaker time, took care of announcements, and made sure that the sessions flowed smoothly from one to the next.
  • Lunch!  Lunch was a bit of an accident, really; we had made arrangements with Sonny’s Bar-B-Q to provide the meal, but had neglected to follow up until right before the event.  We were expecting box lunches, but because of the timing, they provided us with a buffet.  This happy accident worked in our favor: our costs were lower, and the amount of food provided was enough to ensure that everyone was well-fed.  In general, everybody ate well (breakfast, lunch, and a snack).
  • Speakers!  Just thumbing through the evaluation forms, I saw a lot of positive comments.  I also heard a lot from the attendees about how well the speakers did in general.  We had several first-time speakers at this event, and I think they handled themselves very well.
  • Facility.  I had several attendees compliment me on the choice of the facility; even though GSC is a bit of a drive from downtown Atlanta, several people seemed to indicate that the spaciousness of the floor plan was worth the drive.

What needs improvement…

  • Sponsor recruitment.  We only had a couple of sponsors, and I wish I had managed to contact more.  Part of the difficulty was in approaching sponsors so soon after the Alpharetta event.  However, I did have a few attendees remark that they missed having the opportunity to meet with the sponsors.
  • Prizes-for-evals.  In each session, we had attendees complete an evaluation form for a ticket; this way the form could be confidential, and they would have a chance to win a prize at the end of the day.  According to Allison, this was tough to manage (making sure that tickets and forms were collected at the end of each session).   I also think it biased the evals; since people HAD to complete one to get a ticket, many of them simply circled all #1’s or #4’s.
  • Session conflicts.  I had a couple of situations where I wished I could have touted the GIS sessions more than I did; at the end of the day, for example, most of the SQL Server folks went to the open spaces sessions rather than the GIS session describing the program at GSC.  People have free choice, but I wish I could have found a way to encourage them to learn more about the hosting organization.
  • Student presenters.  We had originally intended that there be a SQL Saturday #24 on Friday, featuring Student presenters.  It didn’t happen, so we need to figure out a way to open doors to students at GSC (and other schools).  We did have two presenters on Saturday, but I wish we had more.

What I would change next time…

  • Speaker polos; student worker T-shirts.  We had speaker t-shirts made up for the event, and we also shared them with most student volunteers.  I had a couple of speakers remark that they missed the polos (but they understood it was a budgeting issue).  It was nice, however, to see a sea of green t-shirts manning the registration desk and helping attendees find their way.
  • More vegetarian options.   No one went hungry; however, we had ordered vegetarian meals from a local Chinese restaurant for the attendees who had registered as vegetarians.  We had a few people who thought they had registered as vegetarians, but it didn’t show up that way on the extract from the site (my guess is they never actually clicked the button).  IF we do a buffet next year, we may do it as a Chinese buffet with additional vegetables.
  • Bill the event as a North Atlanta or Lake Lanier event.  The space is great, and we’d love to make it a bigger event next year; however, we have to convince the Atlanta development community that it’s worth the drive, AND we have to figure out a way to get more sponsors involved.   IF we could get a well-known keynote (or special topic) speaker to appear, that could boost our attendance rate, and boost the sponsorship.
  • Give away prizes during the last few minutes of a session.  Let’s do away with the prize-for-eval: still hand out eval forms and tickets, but let the proctor in each session do the prize pull at the end of the session REGARDLESS of whether or not they have a completed form.  Hang on to all of the tickets for extra prizes at the end of the day.

Additional writeups:

A couple of blog posts have already made it around the net; here are a couple of different perspectives on the day.

http://dyfhid.blogspot.com/2009/10/sql-saturday-25-and-epiphany.html

http://ammonsonline.com/jeffablog/post/SQL-Saturday-25-Fun.aspx

http://arcanecode.com/2009/10/10/sql-saturday-25-gainesville-ga-october-10-2009/

<facedesk>

So, I’ve been answering questions on StackOverflow for two days now, and I still like the experience.  However, I have noticed a disturbing trend in several of the questions being asked that involve SQL Server.  Here’s an example of one; I’ll give you the question, and my answer, and then tell you what’s bugging me.

TSQL Generate 5 character length string, all digits [0-9] that doesn’t already exist in database

What’s the best way to do this?

I need to generate a 5 digit length string where all the characters are numeric. However, I need to be able to do this ‘x’ amount of times (user variable) and store this random strings in a database. Furthermore, I can’t generate the same string twice. Old strings will be removed after 6 months.

Pseudo-code

DECLARE @intIterator INT,
        @intMax INT

SET @intIterator = 1
SET @intMax = 5 -- number of strings to generate

WHILE @intIterator <= @intMax
BEGIN

  -- GENERATE RANDOM STRING OF 5 NUMERIC DIGITS
   ???

  -- INSERT INTO DB IF DOESN'T ALREADY EXIST
  INSERT INTO TSTRINGS
  SELECT @RANDOMSTRING

  IF @@ERROR = 0
    SET @intIterator = @intIterator + 1

END

I know this probably isn’t the best way to do it, so advice is appreciated. But really looking for ideas on how to generate the numeric 5 length strings.

 

I think this is a great question; the guy’s obviously looking for a list of randomly generated five-character strings composed entirely of digits, which means there are only 100,000 possibilities.  The question is tagged as SQL and TSQL, which means that he’s looking for a database-driven solution (I’m aware that there are several ways to skin this cat; I’m just showing you that he’s specifically looking for a SQL solution).  Note, however, that he’s using a loop to create, validate, and insert one row at a time.

Here’s where a lot of reading lately on numbers tables comes into play; I think about this a bit, and post the following solution:

 

Here’s a set-based approach, using SQL 2005 syntax (would be a little easier with SQL 2008, but you didn’t specify). Also, if you had a numbers table, a large chunk of it can be cut out.

No looping, no duplicates, and should be nearly instantaneous (assuming the target column is indexed).

DECLARE @intMax INTEGER
SET @intMax = 5
INSERT  INTO TSTRINGS
        SELECT  q.nString
        FROM    ( SELECT    ROW_NUMBER() OVER ( ORDER BY ( NEWID() ) ) AS N2 ,
                            RIGHT(REPLICATE('0', 5) + CONVERT(VARCHAR(5), N), 5) AS nString
                  FROM      --the subquery below could be replaced by a numbers table
                            ( SELECT TOP 100000
                                        ROW_NUMBER() OVER ( ORDER BY ( ac1.Object_ID ) ) - 1 AS N
                              FROM      Master.sys.columns ac1
                                        CROSS JOIN Master.sys.columns ac2
                                        CROSS JOIN Master.sys.columns ac3
                            ) numbers
                  WHERE     RIGHT(REPLICATE('0', 5) + CONVERT(VARCHAR(5), N), 5) NOT IN ( SELECT    nString
                                                                                          FROM      TSTRINGS ) --check to see if reused
                ) q
        WHERE   q.N2 <= @intMax
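
If you do keep a permanent numbers table around, the same answer collapses nicely.  Here’s a sketch of that variant; dbo.Numbers and its column N are assumed names, and the table needs to cover 0 through 99999:

DECLARE @intMax INTEGER
SET @intMax = 5
INSERT  INTO TSTRINGS
        SELECT  q.nString
        FROM    ( SELECT    ROW_NUMBER() OVER ( ORDER BY ( NEWID() ) ) AS N2 ,
                            RIGHT(REPLICATE('0', 5) + CONVERT(VARCHAR(5), N), 5) AS nString
                  FROM      dbo.Numbers --assumed: int column N covering 0 through 99999
                  WHERE     N BETWEEN 0 AND 99999
                            AND RIGHT(REPLICATE('0', 5) + CONVERT(VARCHAR(5), N), 5) NOT IN ( SELECT nString FROM TSTRINGS )
                ) q
        WHERE   q.N2 <= @intMax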

A nice, tidy, set-based solution that I’ll probably keep in mind for generating passwords in the future (with a little work).  So, what’s bugging me?  Is it the fact that I have 0 votes for this solution?  No, that’s not really it, because votes are kind of whimsical; sometimes you get them, and sometimes you don’t.

No, what bugs me is that out of the 6 answers posted, 3 of them involve CURSORS.  For a SQL question.   I know.  Shock and horror.

Seriously, though: why do people still use cursors?  I know that there are times when cursors are an appropriate tool, but you shouldn’t use them every day.  It’s kind of like buying those specialized drill bits for removing a stripped screw; for that specific application, they work great.  But if you leave them in your drill and try to use them in an everyday situation, guess what?  You’ve stripped out a screw.

Let me be as clear as I can:

CURSORS SUCK, PEOPLE.  HEAR ME?  THEY SUCK.

There you go; I feel better now.

Becoming a better me…

This will be a short post, mainly because I’m in the home stretch for SQLSaturday #25; lots of little stuff to get done between now and Saturday.  However, a few interesting things have been happening lately, and I thought I’d share some of them.

First, Microsoft announced its latest round of MVPs.  No, I didn’t get one; things haven’t been THAT exciting.  However, I do know a lot of people who were first-time recipients, and that means that even though I don’t contribute nearly as much to the community as they do, I am brushing shoulders with them.   It’s kind of like being the geek in high school who tutors the quarterback; I’m not cool, but I know people who are.

Speaking of those people, I had a brief email exchange with one of them to get a feel for how you become an MVP, and he shared with me his list of accomplishments over the last year.  I know that the MVP process is completely subjective, but seeing what this friend of mine did over the last year helped me realize that a) I do a lot for the community in terms of organizing events and making sure that educational opportunities are available, but b) I don’t do enough technical writing and knowledge-sharing to establish myself as a SQL Server expert.  

Do I want to be a MVP?  Well, yes, that would be nice.  It’s always a good thing to be recognized by your peers for contributing to the community.  However, my ultimate goal is to be the best I can be at what I do, and that path just happens to coincide with some of the same stated goals of the MVP program.  If I want to up the ante, and be a rock star, I need to find ways to share knowledge, not just networking skills (and still manage to keep my marriage and family intact), regardless of whether or not those accomplishments are recognized by my peers.

That brings me to my second point: I’m now sharing on StackOverflow.com.  Brent Ozar’s been a long-time advocate for the site, and I finally saw the light.  I like it; the interactive interface is richer than USENET, and the reputation system gives you a way of validating that the advice you’re getting (and giving) is good.  Today is my second day of posting, and I’m already up to a reputation score of 216 (woo-hoo!).

Finally, Brent Ozar, Andy Warren, and Jeremiah Peschka have just announced the PASS Log Reader Awards.  I submitted a couple of entries from my blog for their review, and I’m hoping that I’ll pick up a few readers in the process.  I know a lot of database bloggers started this year, and while this is a competition, I’m hoping it’s more of a motivational experience for me.  I need to write more technical content, and I need inspiration from others to do so.

SQLSaturday 25, October 10, 2009 – Final Stretch!

Less than two weeks until SQLSaturday 25, and I have lots of stuff to wrap up; however, there’s been some exciting stuff happening in the last couple of weeks.

Attendance is up!

Attendance – Planning to Attend: 95  (as of today)

New Sponsor!

Interworks, Inc. has really stepped up to the plate and become a Gold Sponsor for us!  YAY!  The more I hear about this company, the more I like them.

Student Sessions!

Way back when I agreed to take on this project with Gainesville State College and IESA, it was our intent to make this a student-centered conference as much as possible; unfortunately, things haven’t gone according to plan.  However, we do have two student presentations (and we’re waiting on a few more):

Patrick Taylor: ArcIMS and SQL Server

Jeremy Rylee: Feature Data Object

We’ve also got a full roster of other sessions, which you can view here:

8:30 AM
  • DBA: Trevor Barkhouse – Deadlock Detection, Troubleshooting, & Prevention
  • Development: Audrey Hammonds – The T-SQL Trifecta: Enhancements for Everyone
  • GIS/BI/Special: Michael Stark – SQL Server 2008 Spatial Data

9:45 AM
  • DBA: Neal Waterstreet – SQL Server Backups
  • Development: Robert Cain – Introduction to SQL Server Integration Services
  • GIS/BI/Special: Sudhanshu Panda – SQL Application in Geographic Information Systems

11:00 AM
  • DBA: Trevor Barkhouse – Leveraging SQLDiag for Efficient Troubleshooting
  • Development: Julie Smith – Cool Tricks to Pull from your SSIS Hat
  • GIS/BI/Special: Mark Tabladillo – Data Mining Beyond Adventure Works

1:30 PM
  • DBA: Stuart Ainsworth – The Social DBA: Resources for Career Building
  • Development: Sergey Barskiy – CLR integration
  • GIS/BI/Special: Dan Murray – Data Visualization and High Value Low Cost BI

2:45 PM
  • DBA: Jeff Ammons – Data Persistence in a Web Hosted World
  • Development: Robert Cain – SQL Server Full Text Searching: An Introduction
  • GIS/BI/Special: College Students – Student Showcase Session 1

4:00 PM
  • DBA: Chris Semerjian & Allison Ainsworth – GIS Certificate Program at IESA
  • Development: Cliff Jacobson – DBA vs ORM: Strategies for Data in a Code Focused
  • GIS/BI/Special: More College Students – Student Showcase Session 2

 

And, here comes the pain…

Stuff left to do includes:

  • Follow up with volunteers at IESA to make sure food is ready
  • Speaker dinner arrangements
  • Make sure Speaker shirts are on their way
  • Write my presentation – D’oh!
  • Make sure signage is done properly
  • Remember to breathe

Building ranges using numbers

A while back, I posted a solution to a problem I was facing where I needed to generate a table of ranges.  While I haven’t experienced any issues with the solution, a couple of things have recently occurred which have caused me to go back, take another look at the problem, and see if I can come up with a more elegant solution.

The first was a comment by Jeff Moden on the original post; Jeff suggested that recursion may not be the best choice for a solution, given the relatively large number of logical reads compared to his solution and/or a permanent numbers table.  Jeff posted a solution, but his solution only addressed the first part of the article (building a dynamic numbers table) and NOT the second part (coming up with a table of ranges).  The problem I was struggling with using a table of numbers was the dynamic span between rows; in other words, I couldn’t figure out how to force row 1 to start with 10 and row 2 to start with 215.

The second driving factor was the FizzBuzz solution I proffered yesterday; while I was reviewing that code, I had a glimmer of an idea about how to handle the dynamic spanning using MODULO.  I reworked the solution using Jeff’s suggestion, as well as using a permanent numbers table.  My examples are below:

Original code:

/*A dynamically generated table of ranges.
2009-01-25 Stuart R Ainsworth
    http://codegumbo.com/

Thanks to Stefan Keir Gordon for the inspiration
 http://www.keirgordon.com */


DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start=10, @End=200000

/*@MaxRanges must be <= 32767;
  if the range between Start and End is large,
  and @MaxRanges > 32767, the End point won't be reached.*/


SELECT @MaxRanges=100, @MinChunk=2

IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
    SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))

;WITH Ranges(MinValue) AS
  (SELECT @Start AS MinValue
   UNION ALL
   SELECT (Minvalue + @MinChunk) AS MinValue
   FROM Ranges WHERE MinValue < @End)
SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                 THEN MinValue + @MinChunk
                 ELSE @End End
FROM Ranges
WHERE MinValue < @End
OPTION(MAXRECURSION 32767)

Modification based on Jeff’s suggestion:

/*A dynamically generated table of ranges. part 2
2009-09-27 Stuart R Ainsworth
    http://codegumbo.com/

Thanks to Jeff Moden for the inspiration*/

DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start=10, @End=200000
SELECT @MaxRanges=100, @MinChunk=2

IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
    SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))


SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                 THEN MinValue + @MinChunk
                 ELSE @End End
FROM (    SELECT 
            ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID)) -1 + @Start AS MinValue
        FROM Master.sys.columns ac1
        CROSS JOIN Master.sys.columns ac2) x
WHERE (MinValue = @Start
    OR (Minvalue-@Start) % @MinChunk = 0)
    AND MinValue + @MinChunk <= @End + @MinChunk

I also used a permanent table of numbers (per Jeff’s suggestion); given the similarity to Jeff’s modification, I’ll only sketch it here.  Simply replace the subquery with a numbers table (you’ll need to monkey with the math to get the dynamic starting point).
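Something like the following, assuming a permanent table named NUMBERS with an integer column N starting at 1 and enough rows to cover the range (the column name is my assumption; adjust to your own table):

/*A dynamically generated table of ranges, permanent numbers table variant.
  Sketch only; assumes dbo.NUMBERS(N int), N starting at 1.*/
DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start=10, @End=200000
SELECT @MaxRanges=100, @MinChunk=2

IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
    SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))

SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                 THEN MinValue + @MinChunk
                 ELSE @End END
FROM (    SELECT N - 1 + @Start AS MinValue --shift the numbers table to the dynamic starting point
        FROM dbo.NUMBERS) x
WHERE (MinValue = @Start
    OR (MinValue-@Start) % @MinChunk = 0)
    AND MinValue + @MinChunk <= @End + @MinChunk

What was interesting to me was the output for each of the three solutions when using SET STATISTICS IO and TIME: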

Original code:

Table 'Worktable'. Scan count 2, logical reads 607, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:   CPU time = 16 ms,  elapsed time = 88 ms.

Modification (dynamic table):

Table 'syscolpars'. Scan count 2, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:   CPU time = 187 ms,  elapsed time = 203 ms.

Modification (permanent table):

Table 'NUMBERS'. Scan count 1, logical reads 424, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:   CPU time = 125 ms,  elapsed time = 156 ms.
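
(A quick note on methodology: each version was wrapped in the standard statistics switches, like so.)

SET STATISTICS IO ON;
SET STATISTICS TIME ON;
-- run one of the three versions here
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;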

At first glance, it would appear that Jeff’s assertion is correct; the number of logical reads is greatly reduced by his suggested modification (more so by the dynamic table than the permanent table, interestingly enough).  However, the CPU time is increased.  And, while I know that execution plans are no guarantee of performance, in every situation SQL Server estimated the original code at 0% of the combined cost of the three plans (the dynamic modification at 65% and the permanent numbers table at 35%).

So now, I’m stuck.  I can see Jeff’s point, but I’m not sure that a solution based on his suggestion is that much better than my original solution; do I worry more about I/O or CPU?

FizzBuzz

Stumbled across Mike Hillwig’s post in response to a question posed by Brent Ozar.  Depending on how literally you want to use the word “print” (do you mean the actual command, or do you just want output?), I think I have a slightly more elegant solution than what Mike proposed.

SELECT CASE WHEN n%3=0 AND n%5=0 THEN 'FizzBuzz' 
            WHEN n % 3 = 0 THEN 'Fizz'
            WHEN n % 5 = 0 THEN 'Buzz'
            ELSE CONVERT(varchar(8), n) END as "N"
FROM (    SELECT TOP 100 ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID)) AS "N"
    FROM Master.sys.All_Columns ac1) x

Basically, it avoids the WHILE loop; however, if you must use a PRINT command, then Mike’s suggestion is probably as elegant as I can imagine.  What’s interesting is that this may also provide me a better solution to my need to build a dynamic range that I posted a while back.  I’ll have to think about it a bit more, but I think it’s possible.
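
For contrast, here’s the general shape of the loop-and-PRINT approach (my reconstruction of the idea, not Mike’s actual code):

DECLARE @n int
SET @n = 1
WHILE @n <= 100
BEGIN
    PRINT CASE WHEN @n % 15 = 0 THEN 'FizzBuzz' -- divisible by both 3 and 5
               WHEN @n % 3 = 0 THEN 'Fizz'
               WHEN @n % 5 = 0 THEN 'Buzz'
               ELSE CONVERT(varchar(8), @n) END
    SET @n = @n + 1
END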

Coming to PASS Summit 2009: Twitter BINGO!

So, last year at PASS Summit, I had my first real experience with a heavily-twittered event.  What was interesting was that I met a lot of people in real-life that I had just recently begun following on Twitter; I had made the comment (in passing) to someone that we should have had BINGO cards to keep track of all of the face-to-face interaction.  This year (since I just recently found out that I was going to Summit 2009 – yay!), I decided to do what I could to make the idea come to fruition.  Since I know Brent Ozar is big into social networking and he has some pull with Quest, I approached him with the idea.

We’re still working out the details, but it looks like Twitter BINGO is on for Summit 2009 (note: this is NOT TwitterINGO).  I do know that we’re going to need at least 25 tweeps (is there another name for people who use Twitter?) to volunteer as game board icons.  As an icon, you’ll be sought out by game players seeking to fill in their game cards.  If you’re interested in volunteering, please leave a comment with your twitter handle below (or DM me, @stuarta); to be eligible you must:

        • Use Twitter semi-regularly;
        • Be planning to attend PASS Summit 2009; and
        • Be willing to talk face-to-face to lots of people you don’t know yet.

More details will follow as we start working the kinks out, but I’m hoping this will help the online twitter community intersect with meatspace.  Looking forward to hearing from volunteers!

EDIT: Since it sounds like Quest and/or SQLServerPedia might be putting up prizes for this, we’re going to have to exclude vendor twitter accounts from playing.  Personal accounts are fine, so if you work for Quest, and they have a booth at PASS, you can still play as yourself.  Kapiche?

Leave-A-Comment Day!

I recently watched the movie Julie and Julia with my wife; even though I was the only male in the audience (seriously), I really enjoyed the film.  It was pretty funny, particularly if you remember who Julia Child was.  Back when I was a kid (before cable TV), we only had 4 channels, and my mom loved watching this super-tall lady with a cleaver cooking on PBS.  Meryl Streep was amazing.

I’m sure that I’m giving nothing away by recapping the basic plot line; the story is about Julie, a middle-class government employee who escapes from life by cooking; she decides to write a blog detailing her experiences working through Julia Child’s classic French cookbook in a year.  The movie correlates events in Julie’s life with similar experiences in Julia’s.

Why am I telling you all of this?  Because, as a blogger, there’s a moment in the film where I can completely identify with Julie.  She’s been blogging for a while, and the only comment she had received was from her mother.  She was beginning to question her reason for writing, when she started receiving small gifts from her readers; realizing that “there was someone out there” helped her focus on achieving her goal.

 

Yeah, it’s sappy, and yeah, it’s a dramatization of something that probably isn’t true, but it did make me realize something: I’m a big believer in contributing to the IT community via blogs (and other means).  But a blog needs to be more than a lonely man shouting into cyberspace; at some point, there needs to be a conversation.  So that leads me to today: I, as king of my own little universe, hereby declare that today is “Leave-A-Comment Day!”  Huzzah!

What does this mean?  It means that I will pick at least three blogs today, and join in on the conversation; I will do my best to leave a pithy comment (something more than “thank you” or “oh, wow”).  My goal is to absorb and reflect on the material I encounter, and try to encourage the authors I’ve been stealing ideas from to generate more.

If you wish to encourage me on this quest, leave your own comment below.  Onward!

At what point did I switch from a film about French cooking to Monty Python & the Holy Grail?  Must be an attempt to salvage my testosterone.  Anyway, enjoy the Bridge of Death.

SQLSaturday #25 – big news!

Just a quick post to announce a couple of cool developments that are happening for SQLSaturday #25 in Gainesville, GA, Oct 10.

  • We’ve got 75 attendees registered!  I was really worried about the low turnout, but I’m feeling a lot more comfortable with it now.  One of the struggles that we’re facing is that Gainesville, GA is 55 miles from Atlanta; many residents seem to think that Gainesville is a long way away from the tech corridors.
  • We’ve got 15 presentations scheduled; that’s almost 3 full tracks, and I’m working to get those last 3 slots filled. 
  • Finally, WE HAVE A SPONSOR WITH CASH!  I’ve been very successful at getting the book publishers (Apress, Pearson, & O’Reilly) to donate stuff to give away, but I’ve had a hard time getting cash money collected.  Luckily, this is a small event, so our costs will be small, and we’re charging a lunch fee, but still… we need money 🙂   Anyway, thanks to Red Gate for stepping up and helping out; I love this company!

More news to come; we’re still working out the kinks with the student conference, but I’m excited about the opportunity. 

Another tale of the unwilling DBA – tempdb and the LUN

As many of you know, I’m a database DEVELOPER; I’ve been a DBA in the past, but my real specialty lies in getting different database platforms to exchange information with SQL Server.  I haven’t done a lot of hands-on administration (backups, maintenance plans, etc) in a while, but that’s changed recently.  Our primary production DBA has been out of the office for an extended period of time, and I’ve had to step back into a support role.

One of the projects he was working on before he left was performance tuning our databases on the SAN; he had recently segregated the tempdb into its own LUN, and everything looked OK, but during some of our overnight jobs we started seeing messages like:

Exception Information: System.Data.SqlClient.SqlException: The transaction log for database 'tempdb' is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

Uh-oh.  I had no clue where to begin; I knew that the database was on its own LUN, and I knew that that LUN was full, but how big should the tempdb LUN be?  I tried searching the Internet for guidance, but got nowhere; there doesn’t seem to be a rule-of-thumb for sizing tempdb based on other databases.
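
In hindsight, the error text itself points at the first diagnostic step; something along these lines shows why tempdb’s log can’t be reused:

SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = 'tempdb';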

The LUN was currently sized for 10GB; we have 1.5TB of data.  I knew that seemed small, but we had been running for a couple of weeks without issues, so I suggested to our production DBA team that we should double it, and see what happens.   All was well until we got to the month-end processes, and BAM! 20GB of space sucked up by tempdb.  Now what?

I turned to Twitter, and got several helpful (and some not-so-helpful) responses; as many people pointed out, tempdb needs as much free space available as possible, and its size depends more on activity than on the amount of data.  Both excellent points, but neither one addresses the need to pick a specific LUN size on a SAN.  Space ain’t free.

@venzann came up with the best guide: “Size of indexes + estimate on largest transaction size + wiggleroom.”

Using his suggestion on our 1.5TB server, I came up with an estimate of 430GB in indexes.   I stopped there, knowing that my SAN admin would shoot me if I tried to run that past them; besides, our server had run OK for a couple of weeks on 10GB; if we needed 430GB, we would have choked long before that (in retrospect, our production DBA should have recorded the size of tempdb BEFORE he moved it to a LUN; c’est la vie).   I decided to pick a smaller number: 100 GB.  Why?
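
For the curious, the index number came from summing index pages; here’s a sketch of one way to get it on SQL 2005 and up (run it in each database and add the results; I won’t swear this is the exact query I ran):

SELECT SUM(used_page_count) * 8.0 / 1024 / 1024 AS IndexSizeGB -- 8KB pages converted to GB
FROM sys.dm_db_partition_stats
WHERE index_id > 1; -- nonclustered indexes only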

First, it made logical sense using @venzann’s formula.  We use a partitioning schema for most of our data, so the sum of the active indexes was really only about 75GB.  Other indexes may be occasionally used by queries, but 90% of our searches only look at the last 2 days worth of data.

Second, it’s a nice round number.  100GB is roughly 7% of our 1.5TB, but round numbers are good for rules-of-thumb.  You never hear anyone say “that ____ should be 6.7% of ____; just a rule-of-thumb”.

Finally, it was the maximum size I could request from our SAN admin without them ordering new drives.  Hey, if someone says you can have $100, you don’t say “all I need is 50” (unless of course, you’re Porter).

Anyway, it seems to be working; we’re humming along now without errors; in another 20 days or so, we’ll be doing month-end again, and we’ll see what happens.  I can’t wait for the production DBA to get back, so I can quit thinking about this stuff, and go back to my safe zone.