Stuart Ainsworth

Leave-A-Comment Day!

  I recently watched the movie Julie and Julia with my wife; even though I was the only male in the audience (seriously), I really enjoyed the film.  It was pretty funny, particularly if you remember who Julia Child was.  Back when I was a kid (before cable TV), we only had 4 channels, and my mom loved watching this super-tall lady with a cleaver cooking on PBS.  Meryl Streep was amazing.

I’m sure that I’m giving nothing away by recapping the basic plot line; the story is about Julie, a middle-class government employee who escapes from life by cooking; she decides to write a blog detailing her experiences working through Julia Child’s classic French cookbook in a year.  The movie correlates events in Julie’s life with similar experiences in Julia’s.

Why am I telling you all of this?  Because, as a blogger, there’s a moment in the film where I can completely identify with Julie.  She’s been blogging for a while, and the only comment she had received was from her mother.  She was beginning to question her reason for writing, when she started receiving small gifts from her readers; realizing that “there was someone out there” helped her focus on achieving her goal.

 

Yeah, it’s sappy, and yeah, it’s a dramatization of something that probably isn’t true, but it did make me realize something; I’m a big believer in contributing to the IT community via blogs (and other means).  But a blog needs to be more than a lonely man shouting into cyberspace; at some point, there needs to be a conversation.  So that leads me to today: I, as king of my own little universe, hereby declare that today is “Leave-A-Comment Day!”  Huzzah!

What does this mean?  It means that I will pick at least three blogs today, and join in on the conversation; I will do my best to leave a pithy comment (something more than “thank you” or “oh, wow”).  My goal is to absorb and reflect on the material I encounter, and try to encourage the authors I’ve been stealing ideas from to generate more.

If you wish to encourage me on this quest, leave your own comment below.  Onward!

At what point did I switch from a film about French cooking to Monty Python & the Holy Grail?  Must be an attempt to salvage my testosterone.  Anyway, enjoy the Bridge of Death.

SQLSaturday #25 – big news!

Just a quick post to announce a couple of cool developments that are happening for SQLSaturday #25 in Gainesville, GA, Oct 10.

  • We’ve got 75 attendees registered!  I was really worried about the low turnout, but I’m feeling a lot more comfortable with it now.  One of the struggles that we’re facing is that Gainesville, GA is 55 miles from Atlanta; many residents seem to think that Gainesville is a long way away from the tech corridors.
  • We’ve got 15 presentations scheduled; that’s almost 3 full tracks, and I’m working to get those last 3 slots filled. 
  • Finally, WE HAVE A SPONSOR WITH CASH!  I’ve been very successful at getting the book publishers (Apress, Pearson, & O’Reilly) to donate stuff to giveaway, but I’ve had a hard time getting cash money collected.  Luckily, this is a small event, so our costs will be small, and we’re charging a lunch fee, but still… we need money 🙂   Anyway, thanks to Red Gate for stepping up and helping out; I love this company!

More news to come; we’re still working out the kinks with the student conference, but I’m excited about the opportunity. 

Another tale of the unwilling DBA – tempdb and the LUN

As many of you know, I’m a database DEVELOPER; I’ve been a DBA in the past, but my real specialty lies in getting different database platforms to exchange information with SQL Server.  I haven’t done a lot of hands-on administration (backups, maintenance plans, etc) in a while, but that’s changed recently.  Our primary production DBA has been out of the office for an extended period of time, and I’ve had to step back into a support role.

One of the projects he was working on before he left was performance tuning our databases on the SAN; he had recently segregated the tempdb into its own LUN, and everything looked OK, but during some of our overnight jobs we started seeing messages like:

Exception Information: System.Data.SqlClient.SqlException: The transaction log for database ‘tempdb’ is full. To find out why space in the log cannot be reused, see the log_reuse_wait_desc column in sys.databases

Uh-oh.  I had no clue where to begin; I knew that the database was on its own LUN, and I knew that that LUN was full, but how big should the tempdb LUN be?  I tried searching the Internet for guidance, but got nowhere; there doesn’t seem to be a rule-of-thumb for sizing tempdb based on other databases.

The LUN was currently sized for 10GB; we have 1.5TB of data.  I knew that seemed small, but we had been running for a couple of weeks without issues, so I suggested to our production DBA team that we should double it, and see what happens.   All was well until we got to the month-end processes, and BAM! 20GB of space sucked up by tempdb.  Now what?

I turned to Twitter, and got several helpful (and some not so helpful) responses; as many people pointed out, it needs to have as much free space available as possible, and the size of it was more dependent on activity than the amount of data.  Both excellent points, but neither one addresses the need to have a specific LUN size on a SAN.  Space ain’t free.

@venzann came up with the best guide: “Size of indexes + estimate on largest transaction size + wiggleroom.”

Using his suggestion on our 1.5TB server, I came up with an estimate of 430GB in indexes.   I stopped there, knowing that my SAN admin would shoot me if I tried to run that past them; besides, our server had run OK for a couple of weeks on 10GB; if we needed 430GB, we would have choked long before that (in retrospect, our production DBA should have recorded the size of tempdb BEFORE he moved it to a LUN; c’est la vie).   I decided to pick a smaller number: 100 GB.  Why?

First, it made logical sense using @venzann’s formula.  We use a partitioning schema for most of our data, so the sum of the active indexes was really only about 75GB.  Other indexes may be occasionally used by queries, but 90% of our searches only look at the last 2 days worth of data.

Second, it’s a nice round number.  100GB is 15% of 1.5TB.  Round numbers are good for rules-of-thumb.  You never hear anyone say “that ____ should be 14% of ____; just a rule-of-thumb”.

Finally, it was the maximum size I could request from our SAN admin without them ordering new drives.  Hey, if someone says you can have $100, you don’t say “all I need is 50” (unless of course, you’re Porter).

Anyway, it seems to be working; we’re humming along now without errors; in another 20 days or so, we’ll be doing month-end again, and we’ll see what happens.  I can’t wait for the production DBA to get back, so I can quit thinking about this stuff, and go back to my safe zone.

SQLSaturday #25 Gainesville GA Oct 10, 2009

Just a quick post to note that a) I’m still alive and posting, and b) I’ve been super busy trying to get this little conference off the ground.  I say little, because it looks like the Gainesville event is going to be a lot more intimate than the Atlanta event in April.  That’s not necessarily a bad thing, because I’m hoping that it will stimulate the growth of the North GA development community.

We’re still looking for sponsors and speakers, so if you’re interested in either, feel free to let me know (http://www.sqlsaturday.com/eventhome.aspx?eventid=26).  The biggest obstacle that I’m running into is the fact that many sponsors are concerned that the event is too close to the Atlanta event; on the other hand, many of the typical attendees are concerned that the event is too far from Atlanta.  It’s an odd catch-22.

Nevertheless, we have managed to garner some impressive presentations, and we’re continuing to slowly add attendees.  If we don’t have a lot of sponsors, we’ll be OK; as long as we can build community, we should.

Speaking of building community, if you haven’t read Alan Stevens’ recent blog postings about his experiences at the Jacksonville Code Camp, you should. 

http://netcave.org/AnEyeOpeningExperience.aspx

http://netcave.org/AnEyeOpeningApology.aspx

I’m not saying that I necessarily agree with everything Alan posted, but his experience should be considered when planning one of these events; although we want to keep SQLSaturdays (and other code camps; note the little “c”) community-focused, we need to be sure that expectations between speakers, attendees, and organizers are well-managed.  I appreciate the fact that Alan apologized for his statements; I also think that as an organizer, I need to learn from his experience to avoid similar situations at the events I help with.

Gotta run.

Wrapup: Columbus, GA SQL Server User Group, August 25, 2009

Tuesday, I drove to Columbus,GA to speak at the Columbus GA SQL Server User Group; I had first met Ken Simmons at the Birmingham SQL Saturday, and had promised him that I would drive down there to present.  Ken’s done a great job of organizing a PASS chapter in a small-town setting; Columbus doesn’t have a lot of technical employers, so founding a specialized user group requires a lot of work and input from the few large employers in town.   There were about 14 people in attendance, and 10 of them worked for the same company (although in different divisions).

Just as a side note; the group is meeting at the Columbus Public Library, which is a BEAUTIFUL facility.  The staff was very nice and accommodating.  The only limitation to meeting there is that they have to shut down by 8 PM.

My talk went OK; I’m always critical of my own work, and I realized that it’s a bit esoteric for some people.  One thing that I have learned (but haven’t really applied) is that many people who attend user group sessions and SQLSaturday’s don’t always have the same experiences that I do.  For example, I briefly discussed indexes in this presentation (Confessions of a Data Integrator; look for the upcoming blog posts), and I was a little surprised that there were so many questions about indexing strategies.  I was glad I was able to discuss it, but if I’d known that was still a hot topic, I would have prepared more.

So here’s the lesson learned: if you’re a beginning speaker, or even an advanced speaker, don’t assume that fundemental topics are overdone.  There’s still a great demand for people who can take an important concept (like index strategies, or disk I/O) and explain it in a simple fashion.  Even people who have been DBA’s for years may still have a question about something that you assume is too basic to focus on.

Speaking of basics, just a quick reminder that the Gainesville GA SQLSaturday (#25) is coming up in October; we’re still looking for speakers and attendees.  If you’re in the Atlanta area, Gainesville is only a half-hour away up I-85 and I-985.

Posting to post (part 2)

Another one of my infamous postings to post; not much to write about, really.  I’ve been extremely busy at work, and there’s lots of stuff going on in my personal “meatspace”.  When I get done for the day, I really don’t want to write or think anything.  I just want to sit in a chair and veg.

Anyway, I’m hoping that I’ll be more productive this week, and get a few postings out.  I’ve been working a lot getting ready for rollout, so not much time to experiment with new ideas.  Here’s hoping that will change.

Columbus, GA SQL Server Users Group – August 25th

Just confirmed today with Ken Simmons that I’ll be presenting at the Columbus GA SQL Server Users Group on August 25th at 6 pm at the Columbus Public library; if you’re in mid-west Georgia (and a data geek), come join us, and stop by and say “hi!”

I’ll be re-hashing a talk I gave at the Birmingham SQLSaturday regarding dealing with “somebody else’s bad data”; this is always a fun subject, and it’s great for interactive discussions (because every database professional deals with somebody else’s problem).

See you there!

Fun with NEWSEQUENTIALID()

We’re thinking about converting the constrain on an existing column in a table with billions of rows of data.  The column is a uniqueidentifier serving as a nonclustered primary key; we’re planning on replacing the NEWID() constraint to a NEWSEQUENTIALID().  What is challenging me is my concern over primary key collisions; will the new default values eventually run into an existing primary key value?

I used the following script to test, and I was a little surprised at the results:

CREATE TABLE TestNewSeqID(ID uniqueidentifier CONSTRAINT DF1 DEFAULT newID(),
                            Marker integer Identity(1,1))
GO

DECLARE @x int
SET @x= 1
WHILE @x <=10
BEGIN
    INSERT INTO TestNewSeqID DEFAULT VALUES
    SET @x = @x +1
END


SELECT *
FROM TestNewSeqID
ORDER BY ID DESC


ALTER TABLE TestNewSeqID
DROP CONSTRAINT DF1;

ALTER TABLE TestNewSeqID
ADD CONSTRAINT DF1 DEFAULT newsequentialid() for ID;

SET @x= 1
WHILE @x <=10
BEGIN
    INSERT INTO TestNewSeqID DEFAULT VALUES
    SET @x = @x +1
END

SELECT *
FROM TestNewSeqID
ORDER BY ID DESC


My results looked something like the following:

ID    Marker
F664E06A-51BF-4FC1-960E-C9E5E854ADAD    9
0746DBAC-8B14-463C-8C4E-9CC789794963    7
24217231-1865-4E2A-B8EF-944DEC9AC6A2    1
323C15B4-59DA-4B73-94E5-54A1A0860F3D    8
B0C1E7EE-3701-4916-A3E2-465FE8462965    2
BFE8C2AB-4788-4596-867E-221F75BAB338    10
C0BC2263-F5E6-41F1-83F6-14BD9D50A6DD    3
9FA2A683-FF02-458E-8703-10D1DCDEFA97    4
75445937-6B46-4B88-9525-0DD809A7BE4B    6
E93284BE-E93B-4009-B206-04CB2B107293    5
4390AD7A-CD7D-DE11-98B8-00196665566A    20
4290AD7A-CD7D-DE11-98B8-00196665566A    19
4190AD7A-CD7D-DE11-98B8-00196665566A    18
4090AD7A-CD7D-DE11-98B8-00196665566A    17
3F90AD7A-CD7D-DE11-98B8-00196665566A    16
3E90AD7A-CD7D-DE11-98B8-00196665566A    15
3D90AD7A-CD7D-DE11-98B8-00196665566A    14
3C90AD7A-CD7D-DE11-98B8-00196665566A    13
3B90AD7A-CD7D-DE11-98B8-00196665566A    12
3A90AD7A-CD7D-DE11-98B8-00196665566A    11

If I looped through several more times, the seed changed and the values got progressively higher, but they were still sorting lower than the original data set.  I’d appreciate it if others would run this script and see if they got the same results.

SQLSaturday #25 (Gainesville, GA) – No good topic turned away!

I think in my rush to emphasize the hybrid nature of this session of SQLSaturday with its focus on Geographic Information Systems and Environmental Policy, I inadvertently downplayed the fact that this is STILL a SQLSaturday.  If you’re passionate about SQL Server, and you want to share that passion with others, this is a great way to interact with others in the community.

So, if you’ve got a topic to share (even if it’s NOT one of our special topics), please sign up to do so.  I’m sorry for the confusion.

SQLSaturday #25
Gainesville, GA
October 10, 2009

You CAN teach an old dog….

This is more of a sticky note kind of post, to remind me to use a better syntax in the future.  Maybe it’ll help some of you as well.

A long time ago, when I was starting to write T-SQL, and I needed to extract only the date from a DATETIME column, I used to do something like this:

SELECT CONVERT(varchar(10), GETDATE(), 101)

 

and it worked OK.  That is, until someone pointed out that I needed to return a datetime instead of a varchar to my calling applications, so I wrapped the whole thing in another CONVERT function, like so:

SELECT CONVERT(datetime, CONVERT(varchar(10), GETDATE(), 101))

At least I now had the correct datetype.  But then I read on the newsgroups, that there may be a (marginal) performance benefit to using the DATEADD and DATEDIFF functions to return the date portion of a datetime, and the following method was suggested:

SELECT DATEADD(day, 0, DATEDIFF(day, 0, GETDATE()))

and that works great! I’ve been using that for years now.  However, it doesn’t work so well when trying to find the first date of a month or year or week.   I figured out that if I flip the parts of the DATEADD statement, then I could easily extract those values:

SELECT DATEADD(month, DATEDIFF(month, 0, GETDATE()),0) --returns 1st day of month
SELECT DATEADD(year, DATEDIFF(year, 0, GETDATE()),0) --returns 1st day of year
SELECT DATEADD(hour, DATEDIFF(hour, 0, GETDATE()),0) --returns the hour
SELECT DATEADD(quarter, DATEDIFF(quarter, 0, GETDATE()),0) --returns the 1st day of the quarter

And while that’s good, I still revert to the original positions out of habit when converting a datetime to a date (e.g., 20090724 23:59:00 to 20090724).  This post is more of a reminder to me to STOP DOING THAT, and use a consistent syntax for date conversions.

SELECT DATEADD(day, DATEDIFF(day, 0, GETDATE()), 0)