implementing recursive common table expressions (cte)

Last time i needed a table structure for groups and subgroups so i decided to create two tables and when i needed to query the groups from the database a i reminded myself about commont table expressions (cte) and the possibility to use them to query recursive data. So i looked at the msdn and found a good example. Nonetheless this is a good example i decided to post my own example too.

First of all i will describe the two included tables. In the group table the name of the groups gets saved. The data in the table are parent and sub groups. Human is the parentgroup of child, juvenile and adult. Child is the parentgroup of baby and tottler.

create table Groups (PK_Groups int IDENTITY primary key, Name nvarchar(256))

To reflect the hirachie of the groups i created the GroupsToGroups table. In this table the parent child relationship between groups is definied. Child is the child of human so FK_Groups reflects the parent key and FK_SubGroups reflects the child key.

create table GroupsToGroups (PK_GroupsToGroups int IDENTITY primary key, FK_Groups int, FK_SubGroups int)

So i need a function that returns a group with all the subgroups to check if a person is member of a specific group or subgroup. To reach this goal i am using a recursive common table expression. The GroupsFunc takes on parameter PK_Group and returns a list of the overtaken group plus all subgroups.

CREATE function dbo.GroupsFunc(@PK_Group int)
returns table
as
return
(
    with GroupsCte (PK_Groups, Name, FK_SubGroups, [Level])
    as
    (
        select
            PK_Groups, Name, null FK_SubGroups, 1 [Level]
        from
            dbo.Groups g
        where
            g.PK_Groups = @PK_Group

        union all

        select
            gzg.FK_SubGroups, g.Name, gzg.FK_Groups, [Level] + 1
        from
            GroupsCte gcte inner join
            dbo.GroupsToGroups gzg on gzg.FK_Groups = gcte.PK_Groups inner join
            dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
    )
    select PK_Groups, Name, FK_SubGroups, [Level]
    from GroupsCte
)

The cte consists of two parts. The first part is the entry part, the second part is the recursive part. As you see in the second part the GroupsCte is used to join on the GroupsToGroups table. The recusion gets executed until the second query does not bring an result. If the query returns the result is feeded to the cte and so the next recursion step is executed. Here is the result of the executed GroupsFunc.

select * from GroupsFunc(1)
select * from GroupsFunc(2)
select * from GroupsFunc(5)

After i finished the function with the common table expression i thought about a way to do this without a cte. Here is the result. It gives the same result as the GroupsFunc but i needet two functions and the outer apply keyword to realise this, and the resulting functions are much more complicated than the cte solution. The most imported point in the implementation is the recursive call of the function in the outer apply part. For every group in one hirachie the GroupFuncWithoutCte is called again and returns the child groups. If a group has no childs then the recusion ends.

CREATE function dbo.GroupsFuncWithoutCte(@PK_Group int, @recLevel int)
returns @tempGroups table
(
    PK_Groups int, Name nvarchar(256), FK_SubGroups int, [Level] int
)
as
begin
    declare @count int
    select
        @count = count(gzg.FK_Groups)
    from
        dbo.GroupsToGroups gzg inner join
        dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
    where
        gzg.FK_Groups = @PK_Group
    if(@count > 0)
    begin
        insert @tempGroups
        select
            g.PK_Groups, g.Name, gzg.FK_Groups, @recLevel + 1
        from
            dbo.GroupsToGroups gzg inner join
            dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
        where
            gzg.FK_Groups = @PK_Group

        insert @tempGroups
        select gf.PK_Groups, gf.Name, gf.FK_SubGroups, gf.Level
        from @tempGroups tg cross apply dbo.GroupsFuncWithoutCte(tg.PK_Groups, @recLevel + 1) gf
    end    

    return
end
--Entry point
CREATE function dbo.GroupsFuncWithoutCteMain(@PK_Group int)
returns @tempGroups table
(
    PK_Groups int, Name nvarchar(256), FK_SubGroups int, [Level] int
)
as
begin
        insert @tempGroups
        select  g.PK_Groups, g.Name, null FK_SubGroups, 0 [Level]
        from
            dbo.Groups g
        where
            g.PK_Groups = @PK_Group        

        insert @tempGroups
        select gf.PK_Groups, gf.Name, gf.FK_SubGroups, gf.[Level]
        from @tempGroups tg cross apply dbo.GroupsFuncWithoutCte(tg.PK_Groups, 1) gf

    return
end

The starting point is the GroupsFuncWithoutCteMain function that queries the Group table and the calls the GroupsFuncWithoutCte function.

select * from dbo.GroupsFuncWithoutCteMain(1)
Advertisements

Generate sequential guid

If you work with Sql Server and create a new table you often think about using the right primary key. In Sql Server, if you are using artificially primary keys you normally use an int key or and uniqueidentifier key. For performance reasons an int key (used with identity insert) is better, but you can not generate this key in the business logic. A uniqueidentifier can be generated in the business logic. But if you have a table with millions of enties and a uniqueidentifier primary key and you have lot of inserts in this table, then you could become a performance problem because after every insert the clusted index on the primary key has to be reordered. If you use newid() to generate the keys for the primary key, the generated key is not sequential and your inserts become slow. So since Sql server 2005 there is a possibility to generate sequential guids. With the newsequntialid() function the sql server generates sequential ids.

CREATE TABLE SequentialIdTable (PK uniqueidentifier primary key DEFAULT NEWSEQUENTIALID())

But if you try invoke the NEWSEQUENTIALID() function in the sql server management studio, you get the following error.

The newsequentialid() built-in function can only be used in a DEFAULT expression for a column of type 
'uniqueidentifier' in a CREATE TABLE or ALTER TABLE statement.

So if you want to generate the sequential guid in your business logic, you need to implement the sequential guid by yourself. So you need the following expression to generate the sequential guid at the sql server:

select cast(newid() AS BINARY(10) + CAST(getdate() AS BINARY(6)) AS UNIQUEIDENTIFIER)

This expression takes a new guid generated by the newid() function and takes the first 10 bytes and adds the 6 bytes from a datetime generated with getdate() to it. The cast to binary allows it to concenate the part of the guid with the datetime. So now we always get a new guid that is bigger than the guid generated before. This works because sql server orders guids in a very special way. If you want to know how, there is the blog that describes it. So at the database we now can generate a sequential guid and use it as default value in a tablecolumn or in stored procedure or function. But if we want to generate a sequential guid in the business logic we need the following class.

public class SequentialGuidGenerator
{
    public Guid CreateSequentialGuid(Guid guid, DateTime targetDateTime)
    {
       // the datetime datatype on sql server contains 8 bytes 
       // the first 4 bytes store the days since 1900
       // the second 4 bytes store the milliseconds since 00:00 of the actual day
       // diveded by 3.3333333 because milliseconds get rounded
       // the value for the DateTime '2.1.1900 00:00:00.002' in hex
       // 7 6 5 4 3 2 1 0
       // 0x 00 00 00 01 00 00 00 01 
       var baseDateTime = new DateTime(1900, 1, 1);
       var ticksSince0001 = baseDateTime.Ticks;
       var ticksSince1900 = targetDateTime.Ticks - ticksSince0001;
       // convert ticks (nanoseconds) since 1900 into days
       var daysSince1900 = ticksSince1900 / 10000 / 1000 / 3600 / 24;
       // milliseconds today
       var millisecondsToday = targetDateTime.Hour * 60 * 60 * 1000 +
          targetDateTime.Minute * 60 * 1000 +
          targetDateTime.Second * 1000 +
          targetDateTime.Millisecond;
       // divide milliseconds by 3.3333333 because milliseconds get rounded
       double milliseconds = millisecondsToday / 3.333333;
       // shift days four bytes to the left
       var daysValue = daysSince1900 << 32;
       // add milliseconds of the current day to the days value
        daysValue += Convert.ToInt64(Math.Round(milliseconds, 0));
       // generate new guid
       //Guid guid = System.Guid.NewGuid();
       //Guid guid = System.Guid.Empty;
       // convert guid to byte array
       byte[] guidArray = guid.ToByteArray();
       // convert daysValue to byte array
       byte[] daysValueArray = BitConverter.GetBytes(daysValue);
       // override last 8 bytes of guid array with daysValue array
       foreach (int item in Enumerable.Range(0, 7))
       {
           guidArray[15 - item] = daysValueArray[item];
       }
       // generate guid from overriden guidArray
       Guid resultGuid = new Guid(guidArray);
       return resultGuid;
   }
}

As you see in the comments of the CreateSequentialGuid method first i calculate the base time i need. Sql server datetime type starts at 01-01-1900 and the .net DateTime type starts at 01-01-0001 so i had to bring both to the same base time. Then i had to shift the day value 4 bytes to the left and add the milliseconds (divided by 3.33333) to the value. The last step is to add the generated bytes to the correct position in the Guid. The following tests show how the generated Guids look like.

[TestMethod]
public void CreateSequentialGuidTrue()
{
  var sequentialGuidGenerator = new SequentialGuidGenerator();
  var guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty,
  new DateTime(1900, 1, 1));
  Assert.AreEqual("00000000-0000-0000-0000-000000000000", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 2));
  Assert.AreEqual("00000000-0000-0000-0000-000100000000", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 4, 0, 0, 0, 4));
  Assert.AreEqual("00000000-0000-0000-0000-000300000001", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 3, 0, 0, 0, 9));
  Assert.AreEqual("00000000-0000-0000-0000-000200000003", guid.ToString());
}

So if you generate sequential guids this way you can gernerate the guid at the database and in the business logic, and you avoid big perfomance problems if you have multible inserts into tables with millions of rows and a uniqueidentifier primary key column.


New tsql commands in sql server 2012 (part 5)

In this post i will write about the FORMAT, TRY_CONVERT, PARSE and the TRY_PARSE command.

FORMAT
With the format command it is possible to format a datetime with valid .NET Framework format string.

declare @datetime datetime = getdate();
declare @language nvarchar(5);

set @language = 'en-gb';

select format(@datetime, 'yyyy/MM/dd', @language) as result, @language [language];

set @language = 'de-de';

select format(@datetime, 'yyyy/MM/dd', @language) as result, @language [language];

This is the result despite of the .net format string (‘en-gb’ or ‘de-de’).

TRY_CONVERT
The TRY_CONVERT command does work as the CONVERT command, but if the conversion fails the TRY_CONVERT command does not throw an error, despite of it returns null.

SELECT try_convert(datetime, 'date', 101) as Date

This is the result, as you see the command returns null because it can not convert the string ‘date’ to an correct datetime.

PARSE and TRY_PARSE

In the following query i try to convert the german date string ‘Montag, 13 Dez 2010’ to datetime. With
the language string ‘de-DE’ it works fine but with the language string ‘en-GB’ the parse command produces an error.

declare @language nvarchar(5);

set @language = 'de-DE';

SELECT PARSE('Montag, 13 Dez 2010' AS datetime USING @language) AS Result

set @language = 'en-GB';

SELECT PARSE('Montag, 13 Dez 2010' AS datetime USING @language) AS Result

Result of the PARSE query:

If we try the same query with the TRY_PARSE command, the parsing with the english language string fails but try_parse reports no error, instead of it returns null.

declare @language nvarchar(5);

set @language = 'de-DE';

SELECT TRY_PARSE('Montag, 13 Dez 2010' AS datetime USING @language) AS Result

set @language = 'en-GB';

SELECT TRY_PARSE('Montag, 13 Dez 2010' AS datetime USING @language) AS Result

Result of the TRY_PARSE query:


New tsql commands in sql server 2012 (Part 4)

Today i will write about two new tsql commands. The first one is choose command. The second one is the iif command. Here is an example of an query that uses choose and iif. The choose command takes the first parameter wich is an integer evaluates this integer and chooses the corresponding parameter from the parameter list. If you use choose(2,’one’,’two’,’three’) the result is ‘two’. The iif command is simular to the case command but easier to write and it has only one condition. When you write the case command case when (id <= 4) then 'first' else 'second' end you also could write iif(id <= 4, 'first', 'second').

select top 6 
        id, 
        data, 
        case when (id <= 4) then 'small' else 'big' end as [case],
        iif(id <= 4, 'small', 'big') as iif,
        choose(id, 'first', 'second', 'third', 'fourth', 'fifth') as choose
from 
        TestData

That is the result:


New tsql commands in Sql Server 2012 (Part 3)

In the new version of the sql server 2012 it is possible to easily implement paging. To do so you need the offset x rows command (x defines the starting row of your query) and the fetch next x rows only command. (x defines the rowcount of your query).

declare @counter int = 0
while(@counter < 50)
begin
    select 
        id, 
        data, 
        iif(id <= 5, 'small', 'big')
    from 
        testData
    order by id
        offset @counter ROWS
    FETCH NEXT
        5 ROWS ONLY;
    set @counter = @counter + 5;
end

Result of the query.

As you see with the offset and the fetch next rows command it is very easy to generate custom paging at database level.


New tsql commands in Sql Server 2012 (Part 2)

The new tsql feature i am writing about today is the with result set feature. Until now if you wanted to change the column name or datatype of the result set of an stored procedure you needed a temptable.

This is the procedure we use to return a dataset.

create procedure testProcedure
(
    @maxid int
)
as
    select 
            id,
            data 
    from 
            testData 
    where 
            id < @maxid
    

Here is the pre sql server 2012 tsql code to rename or change the datatype of the result set of the procedure.

DECLARE @tempTable TABLE (MyIdbigint, MyText nvarchar(MAX))
 
INSERT INTO @tempTable
EXEC testProcedure @maxid = 5
 
SELECT * FROM @tempTable

If you execute this tsql code you get the following result:

In sql server 2012 you could use the with result set command, to rename or change the datatype of the result set of an stored procedure.

EXEC testProcedure @maxid = 5
WITH RESULT SETS
(
    (
        MyId bigint,
        MyText nvarchar(max)
    )
)

If you execute this tsql code you get the following result:


New tsql commands in Sql Server 2012 (Part 1)

Next year there is a new Sql Server release. In my next posts i will write about some of the TSql commands that are new in Sql Server 2012 (Denali).

Today i will start with the command to create sequences. With the create sequence command it is possible to create a sequence of numbers with a startvalue and an increment value.

Selecting the sequence. Every time the next value for “SequenceName” is called the sequence gets incremented by the defined value.

Result of the selection.

The sequence has some additional keywords. The minvalue defines the minimum value of the sequence the maxvalue defines the maximum value of the sequence and the cycle keyword defines that the sequence reuses the values. That means that if the sequence reaches his maxvalue the sequence count starts at the begining. The following sequence uses the minvalue, maxvalue and cycle keyword.

Selecting the sequnce. Every time the next value for CyclingSequence is called the sequence is incremented by 25. When it reaches the maxvalue it starts again from the minvalue.

Result of the selection.