We use a queueing metaphor for dataloads for the majority of our imports, and it’s been a challenge to figure out how to generate dynamic ranges to add to the queue. In other words, we want to tell our system to go and pcik data from a source that has an id value >x and <=y, and then do it again in a few minutes with a new x and y value. I should say that it’s been a challenge because I’m still not used to SQL 2005 features; I came across a bit of code using a CTE and recursion to generate a numbers table (thanks to Stefan Keir Gordon). Here’s a slightly modified version of his code:
/*thanks to Stefan Keir Gordon for the inspiration http://www.keirgordon.com*/ SET STATISTICS IO ON ;WITH Numbers(n) AS (SELECT 1 AS n UNION ALL SELECT (n + 1) AS n FROM Numbers WHERE n <= 32767) SELECT n FROM Numbers WHERE n<=32767 OPTION(MAXRECURSION 32767)
;WITH Ranges(MinValue) AS (SELECT 1 AS MinValue UNION ALL SELECT (Minvalue + 1) AS MinValue FROM Ranges WHERE MinValue <= 32767) SELECT MinValue, MinValue + 1 as MaxValue FROM Ranges WHERE MinValue <= 32767 OPTION(MAXRECURSION 32767)
I end up with a two-column table with a MaxValue > MinValue, and the ranges increase by 1. But I had some issues to address:
- I needed to be able to pass in Start and End values to this code.
- I needed chunks bigger than 1 value in size; I also had a minimum size I wanted to use.
- In some cases 32767 ranges would be appropriate; in some cases, I needed less.
Here’s the final code; setting the Start and End valeus for the range of ranges is easy to understand, but the other two values may be a bit confusing. The first is @MaxRanges, which simply says “I want no more than this number of ranges to get from the Start to the End”. The maximum value for this number is 32767. @MinChunk refers to the minumum chunk size, or the space between MinValue and MaxValue for a given row.
It may be easier to just play with the values and see; you should never have more rows than @MaxValue, and the difference between MinValue and MaxValue for a given row should always be greater than or equal to @MinChunk:
/*A dynamically generated table of ranges. 2009-01-25 Stuart R Ainsworth http://codegumbo.com/ Thanks to Stefan Keir Gordon for the inspiration http://www.keirgordon.com */ SET STATISTICS IO ON DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int SELECT @Start=10, @End=200000 /*@MaxRanges must be <= 32767; if the range between Start and End is large, and @MaxRanges > 32767, the End point won't be reached.*/ SELECT @MaxRanges=1000, @MinChunk=205 IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00)) ;WITH Ranges(MinValue) AS (SELECT @Start AS MinValue UNION ALL SELECT (Minvalue + @MinChunk) AS MinValue FROM Ranges WHERE MinValue < @End) SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End THEN MinValue + @MinChunk ELSE @End End FROM Ranges WHERE MinValue < @End OPTION(MAXRECURSION 32767)
Good luck; happy coding!