A while back, I posted a solution to a problem I was facing where I needed to generate a table of ranges. While I haven't experienced any issues with that solution, a couple of things have recently occurred which have caused me to go back, take another look at the problem, and see if I can come up with a more elegant solution.
The first was a comment by Jeff Moden on the original post; Jeff suggested that recursion may not be the best choice for a solution, given the relatively large number of logical reads compared to his solution and/or a permanent Numbers table. Jeff posted a solution, but it only addressed the first part of the article (building a dynamic numbers table) and NOT the second part (coming up with a table of ranges). The problem I was struggling with when using a table of numbers was the dynamic span between rows; in other words, I couldn't figure out how to force row 1 to start with 10 and row 2 to start with 215.
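To make that concrete, the kind of output I was after looks something like this (the chunk size of 205 here is just an illustration, implied by those two starting values):

MinValue   MaxValue
10         215
215        420
420        625
...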
The second driving factor was the FizzBuzz solution I proffered yesterday; while I was reviewing that code, I had a glimmer of an idea about how to handle the dynamic spanning using the modulo operator. I reworked the solution based on Jeff's suggestion, and also tried a permanent numbers table. My examples are below:
Original code:
/*A dynamically generated table of ranges.
  2009-01-25 Stuart R Ainsworth
  http://codegumbo.com/
  Thanks to Stefan Keir Gordon for the inspiration
  http://www.keirgordon.com */
DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start = 10, @End = 200000

/*@MaxRanges must be <= 32767;
  if the range between Start and End is large,
  and @MaxRanges > 32767, the End point won't be reached.*/
SELECT @MaxRanges = 100, @MinChunk = 2

--widen the chunk size if needed so the number of ranges never exceeds @MaxRanges
IF (SELECT CEILING((@End - @Start) / (@MaxRanges * 1.00))) > @MinChunk
    SET @MinChunk = CEILING((@End - @Start) / (@MaxRanges * 1.00))

--recursive CTE walks from @Start to @End in steps of @MinChunk
;WITH Ranges(MinValue) AS
    (SELECT @Start AS MinValue
     UNION ALL
     SELECT (MinValue + @MinChunk) AS MinValue
     FROM Ranges WHERE MinValue < @End)
SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                                 THEN MinValue + @MinChunk
                                 ELSE @End END
FROM Ranges
WHERE MinValue < @End
OPTION(MAXRECURSION 32767)
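With the parameters above (@Start = 10, @End = 200000, @MaxRanges = 100), the CEILING calculation bumps @MinChunk up to 2000, so the result set should look something like this (first, second, and last rows shown):

MinValue   MaxValue
10         2010
2010       4010
...
198010     200000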
Modification based on Jeff’s suggestion:
/*A dynamically generated table of ranges, part 2
  2009-09-27 Stuart R Ainsworth
  http://codegumbo.com/
  Thanks to Jeff Moden for the inspiration*/
DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start = 10, @End = 200000
SELECT @MaxRanges = 100, @MinChunk = 2

--widen the chunk size if needed so the number of ranges never exceeds @MaxRanges
IF (SELECT CEILING((@End - @Start) / (@MaxRanges * 1.00))) > @MinChunk
    SET @MinChunk = CEILING((@End - @Start) / (@MaxRanges * 1.00))

SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                                 THEN MinValue + @MinChunk
                                 ELSE @End END
FROM (SELECT
          --ROW_NUMBER over a cross join of two system views yields a contiguous run of integers starting at @Start
          ROW_NUMBER() OVER (ORDER BY (ac1.Object_ID)) - 1 + @Start AS MinValue
      FROM Master.sys.columns ac1
      CROSS JOIN Master.sys.columns ac2) x
--the modulo filter keeps only @Start and the values that fall on a chunk boundary
WHERE (MinValue = @Start
       OR (MinValue - @Start) % @MinChunk = 0)
  AND MinValue + @MinChunk <= @End + @MinChunk
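The WHERE clause is where the dynamic spanning happens: the subquery generates a contiguous run of integers starting at @Start, and the modulo filter keeps only the values that are an exact multiple of @MinChunk away from @Start. A stripped-down illustration (the hard-coded 10 and 2000 assume the same parameters as above, where @MinChunk works out to 2000):

SELECT MinValue
FROM (SELECT 10 AS MinValue UNION ALL SELECT 11
      UNION ALL SELECT 2009 UNION ALL SELECT 2010
      UNION ALL SELECT 2011 UNION ALL SELECT 4010) x
WHERE (MinValue = 10 OR (MinValue - 10) % 2000 = 0)
--returns only 10, 2010, and 4010; everything between chunk boundaries is discarded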
I also used a permanent table of numbers (per Jeff's suggestion), but given the similarity to Jeff's modification, I won't post that code in full here. Simply replace the subquery with a Numbers table (you'll need to monkey with the math to get the dynamic starting point); a rough sketch of what I mean appears after the statistics below. What was interesting to me was the output for each of the three solutions when using SET STATISTICS IO and TIME:
Original code:
Table 'Worktable'. Scan count 2, logical reads 607, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 16 ms, elapsed time = 88 ms.
Modification (dynamic table):
Table 'syscolpars'. Scan count 2, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 187 ms, elapsed time = 203 ms.
Modification (permanent table):
Table 'NUMBERS'. Scan count 1, logical reads 424, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 125 ms, elapsed time = 156 ms.
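Here's the rough sketch of the permanent-table substitution I mentioned, assuming a NUMBERS table with a single integer column N that starts at 1 (adjust the math if your numbers table is laid out differently):

--rough sketch only: assumes dbo.NUMBERS has a single int column N populated 1, 2, 3, ...
DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start = 10, @End = 200000
SELECT @MaxRanges = 100, @MinChunk = 2

IF (SELECT CEILING((@End - @Start) / (@MaxRanges * 1.00))) > @MinChunk
    SET @MinChunk = CEILING((@End - @Start) / (@MaxRanges * 1.00))

SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
                                 THEN MinValue + @MinChunk
                                 ELSE @End END
FROM (SELECT N - 1 + @Start AS MinValue   --shift the tally so it begins at @Start
      FROM dbo.NUMBERS
      WHERE N <= @End - @Start + 1) x
WHERE (MinValue = @Start
       OR (MinValue - @Start) % @MinChunk = 0)
  AND MinValue + @MinChunk <= @End + @MinChunk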
At first glance, it would appear that Jeff's assertion is correct: the number of logical reads is greatly reduced by his suggested modification (more so by the dynamic table than by the permanent table, interestingly enough). However, the CPU time is increased. And, while I know that execution plans are no guarantee of performance, in every situation SQL Server estimated that the original code accounted for 0% of the combined cost of the three plans (the dynamic modification was 65%, and the permanent numbers table 35%).
So now, I’m stuck. I can see Jeff’s point, but I’m not sure that a solution based on his suggestion is that much better than my original solution; do I worry more about I/O or CPU?