We use a queueing metaphor for dataloads for the majority of our imports, and it’s been a challenge to figure out how to generate dynamic ranges to add to the queue. In other words, we want to tell our system to go and pcik data from a source that has an id value >x and <=y, and then do it again in a few minutes with a new x and y value. I should say that it’s been a challenge because I’m still not used to SQL 2005 features; I came across a bit of code using a CTE and recursion to generate a numbers table (thanks to Stefan Keir Gordon). Here’s a slightly modified version of his code:
/*thanks to Stefan Keir Gordon for the inspiration http://www.keirgordon.com*/ SET STATISTICS IO ON ;WITH Numbers(n) AS (SELECT 1 AS n UNION ALL SELECT (n + 1) AS n FROM Numbers WHERE n <= 32767) SELECT n FROM Numbers WHERE n<=32767 OPTION(MAXRECURSION 32767)
;WITH Ranges(MinValue) AS (SELECT 1 AS MinValue UNION ALL SELECT (Minvalue + 1) AS MinValue FROM Ranges WHERE MinValue <= 32767) SELECT MinValue, MinValue + 1 as MaxValue FROM Ranges WHERE MinValue <= 32767 OPTION(MAXRECURSION 32767)
I end up with a two-column table with a MaxValue > MinValue, and the ranges increase by 1. But I had some issues to address:
- I needed to be able to pass in Start and End values to this code.
- I needed chunks bigger than 1 value in size; I also had a minimum size I wanted to use.
- In some cases 32767 ranges would be appropriate; in some cases, I needed less.
Here’s the final code; setting the Start and End valeus for the range of ranges is easy to understand, but the other two values may be a bit confusing. The first is @MaxRanges, which simply says “I want no more than this number of ranges to get from the Start to the End”. The maximum value for this number is 32767. @MinChunk refers to the minumum chunk size, or the space between MinValue and MaxValue for a given row.
It may be easier to just play with the values and see; you should never have more rows than @MaxValue, and the difference between MinValue and MaxValue for a given row should always be greater than or equal to @MinChunk:
/*A dynamically generated table of ranges. 2009-01-25 Stuart R Ainsworth http://codegumbo.com/ Thanks to Stefan Keir Gordon for the inspiration http://www.keirgordon.com */ SET STATISTICS IO ON DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int SELECT @Start=10, @End=200000 /*@MaxRanges must be <= 32767; if the range between Start and End is large, and @MaxRanges > 32767, the End point won't be reached.*/ SELECT @MaxRanges=1000, @MinChunk=205 IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00)) ;WITH Ranges(MinValue) AS (SELECT @Start AS MinValue UNION ALL SELECT (Minvalue + @MinChunk) AS MinValue FROM Ranges WHERE MinValue < @End) SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End THEN MinValue + @MinChunk ELSE @End End FROM Ranges WHERE MinValue < @End OPTION(MAXRECURSION 32767)
Good luck; happy coding!
Hi Stuart,
After running the following comparison code, you may want to reconsider the idea of using recursion…
SET STATISTICS IO ON
PRINT REPLICATE(‘=’,80)
PRINT ‘========== Stefan Keir Gordon”s method ==========’
SET STATISTICS TIME ON
;WITH Numbers(n) AS
(SELECT 1 AS n
UNION ALL
SELECT (n + 1) AS n
FROM Numbers
WHERE n <= 32767)
SELECT n
FROM Numbers
WHERE n<=32767
OPTION(MAXRECURSION 32767)
SET STATISTICS TIME OFF
PRINT REPLICATE('=',80)
PRINT '========== Jeff Moden''s method =========='
SET STATISTICS TIME ON
SELECT TOP 32767
ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID)) AS N
FROM Master.sys.All_Columns ac1
CROSS JOIN Master.sys.All_Columns ac2
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT REPLICATE('=',80)
You may also want to start delving into the idea of having a permanent Tally or Numbers table. Please see the following article.
http://www.sqlservercentral.com/articles/T-SQL/62867/
Hey Jeff,
I’ve used Numbers table before, but I couldn’t figure out a way to use them to generate a table of ranges; see the latter half of my post.
I tried modifying your method, but it’s 3:30 on a Friday, and I’m brain dead 🙂
Stu
Heh… after more than a year, I’m just curious… have you embraced the idea of using a numbers table or are you still using recurrsion for such a thing?
Hey Jeff,
Sorry for the late reply; been a busy couple of weeks. I never disavowed the method; I jsut never went back and changed the code. After looking at it for a couple of minutes this morning, I figured out how to alter my proc, but honestly, it’s not a priority. I’ll keep it in mind forthe future, though.
Stu
I realize this post is old, but that isn’t the max number of recursions a CTE query can do, set MAXRECURSIONS to 0.