implementing recursive common table expressions (cte)

Last time i needed a table structure for groups and subgroups so i decided to create two tables and when i needed to query the groups from the database a i reminded myself about commont table expressions (cte) and the possibility to use them to query recursive data. So i looked at the msdn and found a good example. Nonetheless this is a good example i decided to post my own example too.

First of all i will describe the two included tables. In the group table the name of the groups gets saved. The data in the table are parent and sub groups. Human is the parentgroup of child, juvenile and adult. Child is the parentgroup of baby and tottler.

create table Groups (PK_Groups int IDENTITY primary key, Name nvarchar(256))

To reflect the hirachie of the groups i created the GroupsToGroups table. In this table the parent child relationship between groups is definied. Child is the child of human so FK_Groups reflects the parent key and FK_SubGroups reflects the child key.

create table GroupsToGroups (PK_GroupsToGroups int IDENTITY primary key, FK_Groups int, FK_SubGroups int)

So i need a function that returns a group with all the subgroups to check if a person is member of a specific group or subgroup. To reach this goal i am using a recursive common table expression. The GroupsFunc takes on parameter PK_Group and returns a list of the overtaken group plus all subgroups.

CREATE function dbo.GroupsFunc(@PK_Group int)
returns table
as
return
(
    with GroupsCte (PK_Groups, Name, FK_SubGroups, [Level])
    as
    (
        select
            PK_Groups, Name, null FK_SubGroups, 1 [Level]
        from
            dbo.Groups g
        where
            g.PK_Groups = @PK_Group

        union all

        select
            gzg.FK_SubGroups, g.Name, gzg.FK_Groups, [Level] + 1
        from
            GroupsCte gcte inner join
            dbo.GroupsToGroups gzg on gzg.FK_Groups = gcte.PK_Groups inner join
            dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
    )
    select PK_Groups, Name, FK_SubGroups, [Level]
    from GroupsCte
)

The cte consists of two parts. The first part is the entry part, the second part is the recursive part. As you see in the second part the GroupsCte is used to join on the GroupsToGroups table. The recusion gets executed until the second query does not bring an result. If the query returns the result is feeded to the cte and so the next recursion step is executed. Here is the result of the executed GroupsFunc.

select * from GroupsFunc(1)
select * from GroupsFunc(2)
select * from GroupsFunc(5)

After i finished the function with the common table expression i thought about a way to do this without a cte. Here is the result. It gives the same result as the GroupsFunc but i needet two functions and the outer apply keyword to realise this, and the resulting functions are much more complicated than the cte solution. The most imported point in the implementation is the recursive call of the function in the outer apply part. For every group in one hirachie the GroupFuncWithoutCte is called again and returns the child groups. If a group has no childs then the recusion ends.

CREATE function dbo.GroupsFuncWithoutCte(@PK_Group int, @recLevel int)
returns @tempGroups table
(
    PK_Groups int, Name nvarchar(256), FK_SubGroups int, [Level] int
)
as
begin
    declare @count int
    select
        @count = count(gzg.FK_Groups)
    from
        dbo.GroupsToGroups gzg inner join
        dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
    where
        gzg.FK_Groups = @PK_Group
    if(@count > 0)
    begin
        insert @tempGroups
        select
            g.PK_Groups, g.Name, gzg.FK_Groups, @recLevel + 1
        from
            dbo.GroupsToGroups gzg inner join
            dbo.Groups g on gzg.FK_SubGroups = g.PK_Groups
        where
            gzg.FK_Groups = @PK_Group

        insert @tempGroups
        select gf.PK_Groups, gf.Name, gf.FK_SubGroups, gf.Level
        from @tempGroups tg cross apply dbo.GroupsFuncWithoutCte(tg.PK_Groups, @recLevel + 1) gf
    end    

    return
end
--Entry point
CREATE function dbo.GroupsFuncWithoutCteMain(@PK_Group int)
returns @tempGroups table
(
    PK_Groups int, Name nvarchar(256), FK_SubGroups int, [Level] int
)
as
begin
        insert @tempGroups
        select  g.PK_Groups, g.Name, null FK_SubGroups, 0 [Level]
        from
            dbo.Groups g
        where
            g.PK_Groups = @PK_Group        

        insert @tempGroups
        select gf.PK_Groups, gf.Name, gf.FK_SubGroups, gf.[Level]
        from @tempGroups tg cross apply dbo.GroupsFuncWithoutCte(tg.PK_Groups, 1) gf

    return
end

The starting point is the GroupsFuncWithoutCteMain function that queries the Group table and the calls the GroupsFuncWithoutCte function.

select * from dbo.GroupsFuncWithoutCteMain(1)

currying in c#

Today i want to write about a functional programming topic. This topic is curried functions. To curry a function is named after the mathematician Haskell Curry and means to build a function with n – 1 parameters out of a function with n parameters. If you have the function:

Func<int, int, int> mul = (x, y) => x * y;

the curried function out of this Function would be:

Func<int, Func<int, int>> curriedMul = mul.Curry();

That means that the curried function takes only one parameter and returns a function with one parameter. If we now call the curried function:

Func<int, int> executedCurriedMul = curriedMul(2);

Now the function executedCurriedMul has saved the first paramter (2) in the function. If we now call the resulting function

 int mulResult = executedCurriedMul(3);
 Console.WriteLine(mulResult);
 6

it returns 6. That means that we split up the pass of the parameters.
Alternative we could invoke the curried function this way:

int mulResult = curriedMul(2)(3);

In haskell every function is a curried function:

mult :: Integer -> Integer -> Integer
mult a b = a * b

*Main> let curriedMult = mult 2
*Main> curriedMult 3
6

In f# also every function is a curried function:

let mult a b = a * b
let curriedMult = mult 2.0
let multResult = curriedMult 3.0

printfn "result = %f" multResult
6.0

but in c# we need a extension method to curry a given function.

//Currying a function with two parameters using anonyme methodes.
//The method generates two nested anonymus methodes.
//with the call of the method a delegate to a anonymus method 
//with one parameter of type T1 and a return type of type
//Func<T1, Func<T2, TResult>> is created.
//if the curried function is called we overtake the p1 parameter (first parameter)
//and get back a Func<T2, TResult> func delegate.
//if we now call the returned Func<T2, TResult> with parameter p2 (second parameter)
//we call the func function with p1 and p2. That works because p1 becomes a closure and
//so we can use it in the second call.
public static Func<T1, Func<T2, TResult>>
    CurryAnonym<T1, T2, TResult>(this Func<T1, T2, TResult> func)
{
    return (Func<T1, Func<T2, TResult>>)delegate(T1 p1)
    {
        return (Func<T2, TResult>)delegate(T2 p2)
        {
            TResult tresult = func(p1, p2);
            return tresult;
        };
    };
}

//Currying a function with two parameters using lambda expressions
//the return type of the Func<T1, Func<T2, TResult>> delegate is explicite to
//make the return types clear. It works the same way a the CurryAnonym function
//but it uses lambda expressions instead of anonym methodes.
public static Func<T1, Func<T2, TResult>>
    Curry<T1, T2, TResult>(this Func<T1, T2, TResult> func)
{
    return new Func<T1, Func<T2, TResult>>(
            p1 => new Func<T2, TResult>(
                p2 => func(p1, p2)
                )
        );
}

//Currying a function with three parameters using lambda expressions
public static Func<T1, Func<T2, Func<T3, TResult>>>
    Curry<T1, T2, T3, TResult>(this Func<T1, T2, T3, TResult> func)
{
    return p1 => p2 => p3 => func(p1, p2, p3);
}

//Currying a function with four parameters using lambda expressions
public static Func<T1, Func<T2, Func<T3, Func<T4, TResult>>>>
    Curry<T1, T2, T3, T4, TResult>(this Func<T1, T2, T3, T4, TResult> func)
{
    return p1 => p2 => p3 => p4 => func(p1, p2, p3, p4);
}

//Uncurry a function with two parameters. Here we overtake the curried function
//and build a new Func<T1, T2, TResult> lambda function 
//that executes the curried function
public static Func<T1, T2, TResult>
    Uncurry<T1, T2, TResult>(this Func<T1, Func<T2, TResult>> func)
{
    return new Func<T1, T2, TResult>(
            (p1, p2) => func(p1)(p2)
        );
}

With this extension methodes it is easy to build a curried function. With the Uncarry extension method it is also possible to revoke the currying of a function.


Generate sequential guid

If you work with Sql Server and create a new table you often think about using the right primary key. In Sql Server, if you are using artificially primary keys you normally use an int key or and uniqueidentifier key. For performance reasons an int key (used with identity insert) is better, but you can not generate this key in the business logic. A uniqueidentifier can be generated in the business logic. But if you have a table with millions of enties and a uniqueidentifier primary key and you have lot of inserts in this table, then you could become a performance problem because after every insert the clusted index on the primary key has to be reordered. If you use newid() to generate the keys for the primary key, the generated key is not sequential and your inserts become slow. So since Sql server 2005 there is a possibility to generate sequential guids. With the newsequntialid() function the sql server generates sequential ids.

CREATE TABLE SequentialIdTable (PK uniqueidentifier primary key DEFAULT NEWSEQUENTIALID())

But if you try invoke the NEWSEQUENTIALID() function in the sql server management studio, you get the following error.

The newsequentialid() built-in function can only be used in a DEFAULT expression for a column of type 
'uniqueidentifier' in a CREATE TABLE or ALTER TABLE statement.

So if you want to generate the sequential guid in your business logic, you need to implement the sequential guid by yourself. So you need the following expression to generate the sequential guid at the sql server:

select cast(newid() AS BINARY(10) + CAST(getdate() AS BINARY(6)) AS UNIQUEIDENTIFIER)

This expression takes a new guid generated by the newid() function and takes the first 10 bytes and adds the 6 bytes from a datetime generated with getdate() to it. The cast to binary allows it to concenate the part of the guid with the datetime. So now we always get a new guid that is bigger than the guid generated before. This works because sql server orders guids in a very special way. If you want to know how, there is the blog that describes it. So at the database we now can generate a sequential guid and use it as default value in a tablecolumn or in stored procedure or function. But if we want to generate a sequential guid in the business logic we need the following class.

public class SequentialGuidGenerator
{
    public Guid CreateSequentialGuid(Guid guid, DateTime targetDateTime)
    {
       // the datetime datatype on sql server contains 8 bytes 
       // the first 4 bytes store the days since 1900
       // the second 4 bytes store the milliseconds since 00:00 of the actual day
       // diveded by 3.3333333 because milliseconds get rounded
       // the value for the DateTime '2.1.1900 00:00:00.002' in hex
       // 7 6 5 4 3 2 1 0
       // 0x 00 00 00 01 00 00 00 01 
       var baseDateTime = new DateTime(1900, 1, 1);
       var ticksSince0001 = baseDateTime.Ticks;
       var ticksSince1900 = targetDateTime.Ticks - ticksSince0001;
       // convert ticks (nanoseconds) since 1900 into days
       var daysSince1900 = ticksSince1900 / 10000 / 1000 / 3600 / 24;
       // milliseconds today
       var millisecondsToday = targetDateTime.Hour * 60 * 60 * 1000 +
          targetDateTime.Minute * 60 * 1000 +
          targetDateTime.Second * 1000 +
          targetDateTime.Millisecond;
       // divide milliseconds by 3.3333333 because milliseconds get rounded
       double milliseconds = millisecondsToday / 3.333333;
       // shift days four bytes to the left
       var daysValue = daysSince1900 << 32;
       // add milliseconds of the current day to the days value
        daysValue += Convert.ToInt64(Math.Round(milliseconds, 0));
       // generate new guid
       //Guid guid = System.Guid.NewGuid();
       //Guid guid = System.Guid.Empty;
       // convert guid to byte array
       byte[] guidArray = guid.ToByteArray();
       // convert daysValue to byte array
       byte[] daysValueArray = BitConverter.GetBytes(daysValue);
       // override last 8 bytes of guid array with daysValue array
       foreach (int item in Enumerable.Range(0, 7))
       {
           guidArray[15 - item] = daysValueArray[item];
       }
       // generate guid from overriden guidArray
       Guid resultGuid = new Guid(guidArray);
       return resultGuid;
   }
}

As you see in the comments of the CreateSequentialGuid method first i calculate the base time i need. Sql server datetime type starts at 01-01-1900 and the .net DateTime type starts at 01-01-0001 so i had to bring both to the same base time. Then i had to shift the day value 4 bytes to the left and add the milliseconds (divided by 3.33333) to the value. The last step is to add the generated bytes to the correct position in the Guid. The following tests show how the generated Guids look like.

[TestMethod]
public void CreateSequentialGuidTrue()
{
  var sequentialGuidGenerator = new SequentialGuidGenerator();
  var guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty,
  new DateTime(1900, 1, 1));
  Assert.AreEqual("00000000-0000-0000-0000-000000000000", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 2));
  Assert.AreEqual("00000000-0000-0000-0000-000100000000", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 4, 0, 0, 0, 4));
  Assert.AreEqual("00000000-0000-0000-0000-000300000001", guid.ToString());

  guid = sequentialGuidGenerator.CreateSequentialGuid(System.Guid.Empty, new
  DateTime(1900, 1, 3, 0, 0, 0, 9));
  Assert.AreEqual("00000000-0000-0000-0000-000200000003", guid.ToString());
}

So if you generate sequential guids this way you can gernerate the guid at the database and in the business logic, and you avoid big perfomance problems if you have multible inserts into tables with millions of rows and a uniqueidentifier primary key column.


Simulating a guard in c#

Currently i am reading a book about haskell programming and there is one chapter about guards. Because i like the concept of guards a lot i tried to implement one in c#. In this post you can see the result of my efforts. First of all i want to explain what a guard is and how you can implement one in haskell. So a guard is a pattern matching construct. You can define cases and when the overtaken value corresponds with the case then that case is choosen. This is the definition of a guard in Haskell:

guard :: Integer -> String
guard value
  | doublevalue <= 10 && value >= 0 = "small"
  | doublevalue <= 20 && value >= 0 = "middle"
  | doublevalue <= 30 && value >= 0 = "big"
  | otherwise = "not definied"
  where doublevalue = value * 2

If we call the guard with 2 the this:

 guard 2
 small

is the result. In haskell there is the where clause with this clause it is possible define additional values that can be used in the guard. In this case the value gets doubled and this doublevalue is used in the guard. In f# sharp there is also a construct that works like a guard but it is called match pattern. This is the definition of the match pattern (guard) in f#:

let guard v =
    match v with
    | v when v < 10 && v > 0 -> 1
    | v when v = 50 -> 2
    | v when v > 100 -> 3
    | _ -> 0

guard 2
1

I think that guards could also be a practical construct in c#. Here is my implementation of a c# guard like construct. To see if it works i wrote some test cases.

[TestClass]
public class GuardsTests
{
    [TestMethod]
    public void GuardsTestTrue()
    {
        int i = 2;
        var result = i.Match().
            When(v => v < 10, v => 1).
            When(v => v == 50, v => 2).
            When(v => v > 100, v => 3).
            Default(v => 0).Guard();
        Assert.AreEqual(result, 0);

        string value = "test";
        var resultTest = value.Match().
            When(s => s == "not test", s => "not test").
            When(s => s == "test", v => "test").
            Default(v => "no match").Guard();
        Assert.AreEqual(resultTest, "test");

        i = 25;
        int y = 2;
        var doubleresult = i.Match().Where<int>(x => x * y).
            When(v => v < 10, v => 1).
            When(v => v == 50, v => 2).
            When(v => v > 100, v => 3).
            Default(v => 0).Guard();
        Assert.AreEqual(doubleresult, 2);
    }
}

The Guards class is a generic class with two generic types. T is the type of the overtaken value and TResult is the result type. The When method is used to define the cases of the guard, the Default method defines the Default case.

// the Guards class stores the cases that are defined in the when, else and default clauses and
// when the Guard method is called all stored cases are evaluated and the result is calculated.
public class Guards<T, TResult>
{
    private readonly T value;

    // in this list of tuples all defined when cases are stored
    private List<Tuple<Predicate<T>, Func<T, TResult>>> patterns =
        new List<Tuple<Predicate<T>, Func<T, TResult>>>();

    private Tuple<Predicate<T>, Func<T, TResult>> elsePattern = null;
    // in this func the default case is stored
    private Func<T, TResult> defaultPattern = null;
    // in this func the where case is stored
    private Func<T, T> wherePattern = null;

    public Guards(T value)
    {
        this.value = value;
    }
    // every time the when method is called a tuple of a predicate and a result func is stored in 
    // the patterns list
    public Guards<T,TResult> When(Predicate<T> predicate, Func<T, TResult> result)
    {
        patterns.Add(new Tuple<Predicate<T>, Func<T, TResult>>(predicate, result));
        return this;
    }
    // with the where method it is possible to transform the overtaken value before it comes
    // evaluated
    public Guards<T, TResult> Where(Func<T, T> result)
    {
        wherePattern = result;
        return this;
    }

    public Guards<T, TResult> Else(Predicate<T> predicate, Func<T, TResult> result)
    {
        elsePattern = new Tuple<Predicate<T>, Func<T, TResult>>(predicate, result);
        return this;
    }
    // defines the default case
    public Guards<T, TResult> Default(Func<T, TResult> result)
    {
        defaultPattern = result;
        return this;
    }
    // the guard method evaluates the patterns. That means it calls the
    // predicate definied for the pattern and if the pedicate is true it
    // calles the func defined in the tuple. If no pattern matches the predicate
    // then the default pattern is called.
    public TResult Guard()
    {
        if(elsePattern != null)
            patterns.Add(elsePattern);
        if (wherePattern != null)
        {
            var whereValue = wherePattern(value);
            foreach (var item in patterns)
            {
                if (item.Item1(whereValue))
                {
                    return item.Item2(whereValue);
                }
            }
        }
        else
        {
            foreach (var item in patterns)
            {
                if (item.Item1(value))
                {
                    return item.Item2(value);
                }
            }
        }
        if (defaultPattern != null)
            return defaultPattern(value);
        throw new GuardException();
    }
}

The GuardExtensions class definies the Match method that generates a GuardContext object and overtakes the value the Guard should evaluate.

public static class GuardExtensions
{
    public static GuardContext<T> Match<T>(this T value)
    {
        return new GuardContext<T>(value);
    }
}

The GuardContext class defines the connection between the overtaken value that comes from the Match extension method and the Guards object.
// The GuardContext class holds the value that is evaluated
// and is the starting point for the chain of Guards that evaluates the value.
// the GuardContext is nessecary because it has one generic class type T that defines the type of
// the value and the methodes Where, When and Default have the generic TResult type that defines the
// result type of the chaining Methodes. It is the connection between the overtaken value and the Guards object.
// so if the expression:
// string stringResult =
// 10.Match().When(x => x < 10, x => “klein”).When(x => x >= 10, x => “big”).Guard();
// is evaluated the following things happen:
// 1. the extension method Match is called for 10
// 2. in the Match method a GuardContext Object is created and returned
// 3. the when method of the GuardContext Object is called
// 4. a Guards object is created the value is overtaken to this Guard
// and the When method of this object is called
// 5. the predicate and the result func are stored in the patterns collection of the Guards object
// and the Guards object is returned.
// 6. then the When method of the Guards object is executed the predicate and the result func are
// stored in the patterns collection (now the Guard object has two pattern entries).
// 7. The Guard method of the Guard object is called
// and the entries made in the patterns collection are executed with the value
// stored in the Guard object and return a TResult of type string to the stringResult

public classGuardContext<T>
{
    private T value;

    public GuardContext(T value)
    {
        this.value = value;
    }

    public Guards<T, TResult> Where<TResult>(Func<T, T> result)
    {
        var match = new Guards<T, TResult>(value);
        return match.Where(result);
    }
    // the when method generates a Guards object and overtakes the value to this Guards object.
    // After that it executes the When method of the Guards object and starts the evaluation chain also
    // called a Guard.
    public Guards<T, TResult> When<TResult>(Predicate<T> predicate, Func<T, TResult> result)
    {
        var match = new Guards<T, TResult>(value);
        return match.When(predicate, result);
    }
    public Guards<T, TResult> Default<TResult>(Func<T, TResult> result)
    {
        var match = new Guards<T, TResult>(value);
        return match.Default(result);
    }
}

So my self defined Guard in c# is not so beautiful as in haskell or f# but it served its purpose.