A while back, I posted a solution to a problem I was facing where I needed to generate a table of ranges. While I haven’t experienced any sort of issues with the solution, a couple of things have recently occurred which has caused me to go back and take a look at the problem, and see if I can come up with a more elegant solution.
The first was a comment by Jeff Moden on the original post; Jeff suggested that recursion may not be the best choice for a solution given the relatively large number of logical reads compared to his solution and/or a permanent Numbers table. Jeff posted a solution, but his solution only addressed the first part of the article (building a dynamic numbers table) and NOT the second part (coming up with a table of ranges). The problem that I was struggling with using a table of numbers was the dynamic span between rows; in other words, I couldn’t figure out how to force row 1 to start with 10, and row 2 start with 215
The second driving factor was the FizzBuzz solution I proffered yesterday; while I was reviewing that code, I had a glimmer of an idea about how to handle the dynamic spanning using MODULO. I reworked the solution using Jeff’s suggestion, as well as using a permanent numbers table. My examples are below:
Original code:
/*A dynamically generated table of ranges. 2009-01-25 Stuart R Ainsworth http://codegumbo.com/ Thanks to Stefan Keir Gordon for the inspiration http://www.keirgordon.com */ DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int SELECT @Start=10, @End=200000 /*@MaxRanges must be <= 32767; if the range between Start and End is large, and @MaxRanges > 32767, the End point won't be reached.*/ SELECT @MaxRanges=100, @MinChunk=2 IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00)) ;WITH Ranges(MinValue) AS (SELECT @Start AS MinValue UNION ALL SELECT (Minvalue + @MinChunk) AS MinValue FROM Ranges WHERE MinValue < @End) SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End THEN MinValue + @MinChunk ELSE @End End FROM Ranges WHERE MinValue < @End OPTION(MAXRECURSION 32767)
Modification based on Jeff’s suggestion:
/*A dynamically generated table of ranges. part 2 2009-09-27 Stuart R Ainsworth http://codegumbo.com/ Thanks to Jeff Moden for the inspiration*/ DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int SELECT @Start=10, @End=200000 SELECT @MaxRanges=100, @MinChunk=2 IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00)) SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End THEN MinValue + @MinChunk ELSE @End End FROM ( SELECT ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID)) -1 + @Start AS MinValue FROM Master.sys.columns ac1 CROSS JOIN Master.sys.columns ac2) x WHERE (MinValue = @Start OR (Minvalue-@Start) % @MinChunk = 0) AND MinValue + @MinChunk <= @End + @MinChunk
I also used a permanent table of numbers (per Jeff’s suggestion), but given the similarity to Jeff’s modification, I won’t post the code here. Simply replace the subquery with a numbers table (you’ll need to monkey with the math to get the dynamic starting point). What was interesting to me was the output for each of the three solutions when using SET STATISTICS IO and TIME:
Original code:
Table 'Worktable'. Scan count 2, logical reads 607, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 16 ms, elapsed time = 88 ms.
Modification (dynamic table):
Table 'syscolpars'. Scan count 2, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 187 ms, elapsed time = 203 ms.
Modification (permanent table):
Table 'NUMBERS'. Scan count 1, logical reads 424, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. SQL Server Execution Times: CPU time = 125 ms, elapsed time = 156 ms.
At first glance, it would appear that Jeff’s assertion is correct; the number of logical reads is greatly reduced by his suggested modification (more so by the dynamic table than the permanent table, interestingly enough). However, the CPU time is increased. And, while I know that execution plans are no guarantee for performance, in every situation SQL Server estimated that avoiding the original code was 0% of the three plans (dynamic modification 65% and permanent numbers table 35%).
So now, I’m stuck. I can see Jeff’s point, but I’m not sure that a solution based on his suggestion is that much better than my original solution; do I worry more about I/O or CPU?
Gosh… you never gave the “Tally Table” a chance to succeed. You put limits on your recursive CTE but not on the CROSS JOIN that created the “Tally Table” and it created more than 300,000 rows in the background. No wonder it took so long.
Try the following code which has limits on the “Tally Table” and watch it blow the doors off the recursive CTE. 😉
DECLARE @MaxRanges INT, @MinChunk INT, @Start INT, @End INT;
SELECT @Start = 10, @End = 200000,
@MaxRanges = 100, @MinChunk = 2;
IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))
;
WITH
cteTally AS
(
SELECT TOP (@MaxRanges)
N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
)
SELECT MinValue = @Start + (@MinChunk*N),
MaxValue = CASE
WHEN @Start + (@MinChunk*(N+1)) <= @End
THEN @Start + (@MinChunk*(N+1))
ELSE @End
END
FROM cteTally
;
Trying again to see if I can preserve some of the formatting I do in my code for readability…
DECLARE @MaxRanges INT, @MinChunk INT, @Start INT, @End INT;
SELECT @Start = 10, @End = 200000,
@MaxRanges = 100, @MinChunk = 2;
IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))
;
WITH
cteTally AS
(
SELECT TOP (@MaxRanges)
N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1
FROM sys.all_columns ac1
CROSS JOIN sys.all_columns ac2
)
SELECT MinValue = @Start + (@MinChunk*N),
MaxValue = CASE
WHEN @Start + (@MinChunk*(N+1)) <= @End
THEN @Start + (@MinChunk*(N+1))
ELSE @End
END
FROM cteTally
;
Heh… ok… I give up. I don’t know how to make “pretty” code on this blog.