Building ranges using a dynamically generated numbers table

We use a queueing metaphor for dataloads for the majority of our imports, and it’s been a challenge to figure out how to generate dynamic ranges to add to the queue. In other words, we want to tell our system to go and pcik data from a source that has an id value >x and <=y, and then do it again in a few minutes with a new x and y value. I should say that it’s been a challenge because I’m still not used to SQL 2005 features; I came across a bit of code using a CTE and recursion to generate a numbers table (thanks to Stefan Keir Gordon). Here’s a slightly modified version of his code:

/*thanks to Stefan Keir Gordon for the inspiration
 http://www.keirgordon.com*/

SET STATISTICS IO ON

;WITH Numbers(n) AS
   (SELECT 1 AS n
	UNION ALL
	SELECT (n + 1) AS n
	FROM Numbers
	WHERE n <= 32767)
SELECT n
FROM Numbers
WHERE n<=32767
OPTION(MAXRECURSION 32767)
In case you’re wondering, 32767 is the maximum number of recursions supported by SQL Server in a CTE; Stefan’s code neatly generates a single column table with values from 1 to 32767. Looking at that, I realized I could easily get a simple range by adding a second column to the final SELECT, like so:
 ;WITH Ranges(MinValue) AS
  (SELECT 1 AS MinValue
   UNION ALL
   SELECT (Minvalue + 1) AS MinValue
   FROM Ranges WHERE MinValue <= 32767)
SELECT MinValue, MinValue + 1 as MaxValue
FROM Ranges
WHERE MinValue <= 32767
OPTION(MAXRECURSION 32767)

I end up with a two-column table with a MaxValue > MinValue, and the ranges increase by 1. But I had some issues to address:

  • I needed to be able to pass in Start and End values to this code.
  • I needed chunks bigger than 1 value in size; I also had a minimum size I wanted to use.
  • In some cases 32767 ranges would be appropriate; in some cases, I needed less.

Here’s the final code; setting the Start and End valeus for the range of ranges is easy to understand, but the other two values may be a bit confusing. The first is @MaxRanges, which simply says “I want no more than this number of ranges to get from the Start to the End”. The maximum value for this number is 32767. @MinChunk refers to the minumum chunk size, or the space between MinValue and MaxValue for a given row.

It may be easier to just play with the values and see; you should never have more rows than @MaxValue, and the difference between MinValue and MaxValue for a given row should always be greater than or equal to @MinChunk:

/*A dynamically generated table of ranges.
2009-01-25 Stuart R Ainsworth
	http://codegumbo.com/

Thanks to Stefan Keir Gordon for the inspiration
 http://www.keirgordon.com */

SET STATISTICS IO ON

DECLARE @MaxRanges int, @MinChunk int, @Start int, @End int
SELECT @Start=10, @End=200000

/*@MaxRanges must be <= 32767;
  if the range between Start and End is large,
  and @MaxRanges > 32767, the End point won't be reached.*/
SELECT @MaxRanges=1000, @MinChunk=205

IF (SELECT CEILING((@End-@Start)/(@MaxRanges*1.00)))>@MinChunk
	SET @MinChunk = CEILING((@End-@Start)/(@MaxRanges*1.00))

;WITH Ranges(MinValue) AS
  (SELECT @Start AS MinValue
   UNION ALL
   SELECT (Minvalue + @MinChunk) AS MinValue
   FROM Ranges WHERE MinValue < @End)
SELECT MinValue, MaxValue = CASE WHEN MinValue + @MinChunk < @End
			     THEN MinValue + @MinChunk
			     ELSE @End End
FROM Ranges
WHERE MinValue < @End
OPTION(MAXRECURSION 32767)

Good luck; happy coding!

Share