Tuesday, March 11, 2008

Generate Sequential GUIDs for SQL Server 2005 in C#

Why generate sequential GUID in C#?

Originally, uniqueidentifier (GUID) column in SQL Server was not supposed to be sequential. But in my case, having sequential GUID is quite useful.
My application needs to know, what record was inserted first.

Fortunately, SQL Server 2005 supports "default newsequentialid()" constraint, that makes uniqueidentifier column grow sequentially [with every inserted record].

That worked quite well for me, until I decided to generate sequential GUID in C#.
(I needed it, because I use SqlBulkCopy and try to save two tables that share the same generated GUID key).

That turned out to be a tricky task. The reason -- .NET and SQL Server treat GUIDs quite different. In particular, they sort them quite differently.

Searching for solution

Alberto Ferrari's post How are GUIDs sorted by SQL Server? gave me a good idea about how to handle the problem.

I used modified Alberto's SQL code to find out what C#.NET GUID bytes are more [or less] significant from SQL Server 2005 ORDER BY clause perspective.

With UIDs As (
Select ID = 3, UID = cast ('01000000-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 2, UID = cast ('00010000-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 1, UID = cast ('00000100-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 0, UID = cast ('00000001-0000-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 5, UID = cast ('00000000-0100-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 4, UID = cast ('00000000-0001-0000-0000-000000000000' as uniqueidentifier)
Union Select ID = 7, UID = cast ('00000000-0000-0100-0000-000000000000' as uniqueidentifier)
Union Select ID = 6, UID = cast ('00000000-0000-0001-0000-000000000000' as uniqueidentifier)
Union Select ID = 8, UID = cast ('00000000-0000-0000-0100-000000000000' as uniqueidentifier)
Union Select ID = 9, UID = cast ('00000000-0000-0000-0001-000000000000' as uniqueidentifier)
Union Select ID = 10, UID = cast ('00000000-0000-0000-0000-010000000000' as uniqueidentifier)
Union Select ID = 11, UID = cast ('00000000-0000-0000-0000-000100000000' as uniqueidentifier)
Union Select ID = 12, UID = cast ('00000000-0000-0000-0000-000001000000' as uniqueidentifier)
Union Select ID = 13, UID = cast ('00000000-0000-0000-0000-000000010000' as uniqueidentifier)
Union Select ID = 14, UID = cast ('00000000-0000-0000-0000-000000000100' as uniqueidentifier)
Union Select ID = 15, UID = cast ('00000000-0000-0000-0000-000000000001' as uniqueidentifier)
)
Select * From UIDs Order By UID


Note, that first line with ID=3 corresponds to:
new Guid(new bytes[16]{ 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
That means:
(new Guid("01000000-0000-0000-0000-000000000000").ToByteArray()[3] == 1)


Now, when I run modified Alberto's query, I'm getting the following sequence:
3, 2, 1, 0, 5, 4, 7, 6, 9, 8, 15, 14, 13, 12, 11, 10

That means, that GUID's byte #3 is the least significant and GUID's byte #10 is the most significant [from SQL Server ORDER BY clause perspective].

Final solution

Now we're ready to write C# code, that would sequentially increment any given GUID.
I also made it more convenient to increment GUID, by using "++" operator.
Here's how it's used:

private void Test()
{
SequentialGuid = new SequentialGuid(Guid.Empty);
SequentialGuid++;
}


and C# code that increments GUIDs sequentially:

public class SequentialGuid
{
Guid _CurrentGuid;
public Guid CurrentGuid
{
get { return _CurrentGuid; }
}

public SequentialGuid()
{
_CurrentGuid = Guid.NewGuid();
}

public SequentialGuid(Guid previousGuid)
{
_CurrentGuid = previousGuid;
}

public static SequentialGuid operator ++(SequentialGuid sequentialGuid)
{
byte[] bytes = sequentialGuid._CurrentGuid.ToByteArray();
for (int mapIndex = 0; mapIndex < 16; mapIndex++)
{
int bytesIndex = SqlOrderMap[mapIndex];
bytes[bytesIndex]++;
if (bytes[bytesIndex] != 0)
{
break; // No need to increment more significant bytes
}
}
sequentialGuid._CurrentGuid = new Guid(bytes);
return sequentialGuid;
}

private static int[] _SqlOrderMap = null;
private static int[] SqlOrderMap
{
get
{
if (_SqlOrderMap == null)
{
_SqlOrderMap = new int[16] { 3, 2, 1, 0, 5, 4, 7, 6, 9, 8, 15, 14, 13, 12, 11, 10 };
// 3 - the least significant byte in Guid ByteArray [for SQL Server ORDER BY clause]
// 10 - the most significant byte in Guid ByteArray [for SQL Server ORDERY BY clause]
}
return _SqlOrderMap;
}
}
}

7 comments:

Magnus said...

Thank you for your article.

In my case I run code on several different machines, and thus is not able to use a single shared SequentialGuid object. Is there a way to call the constructor with a value depending on e.g. DateTime.Now?

In other words, I need a method to call anywhere from any server and still get i sequential guid larger than those generated before.

Regards
Magnus

Dennis Gorelik said...

Magnus,

I see at least two possible ways to solve "Generate Sequential GUID across multiple machines":

Solution #1:
Create web service that all your machines would call. This service would generate unique sequential GUIDs for all other machines.

Solution #2:
Put DateTime.Now into appropriate bytes of GUID (your idea). Select these GUID bytes in such a way that GUIDs will be sorted by using these datetime bytes first and would sort by other bytes later.
You may put machine number into last (the least significant) byte of GUID, so machine number would not affect sorting, but it still would ensure that GUIDs generated on different machines are unique.
Use "middle" GUID bytes to put sequential number in case if you need to generate more than one GUID during the same millisecond.

dotnetchris said...

Dennis,

Could you please elaborate on your discussion of the most and least significant bits. Both on how you determined the significance of the bits and then what the implications are of changing the highest vs the lowest.

Dennis Gorelik said...

dotnetchris: the more significant byte is the stronger it affects overall sorting order.

For example, if you compare
100 with 010 then 100 is bigger, because the most significant [leftmost] bit in "100" is "1", which is bigger than the most significant bit in "010" which is "0".

It seems than in SQL server the most significant byte is not on the left, but on position #10.

Does it help?

dotnetchris said...

Thanks for your answer Dennis, yes that makes sense I just wasn't sure which way it went if the sort was based off the most significant or least significant bytes there until your comment cleared it up for me.

MikeGledhill said...

This is excellent code, and has just solved a huge bulk-insert performance problem we were having.

However, I really don't want records in my database with GUID values like 01000000-0000-0000-0000-000000000000

So, when I use your class, I start it off with a random GUID, and let subsequent GUIDs get calculated from there:

SequentialGuid guid = new SequentialGuid(Guid.NewGuid());
guid++;

Dennis Gorelik said...

Mike, I'm glad it helped.

However even better solution would be not to use GUID as an identity key (or as a primary key) at all.
I figured it out somewhere in between I wrote this blog post (6 years ago) and now.

Int key works so much better:
- Less data.
- Shorter and faster indexes.
- Identical increment operations in SQL Server and C#.

The only real advantage GUID has is that it's hard to guess. But as soon as you start using sequential key generation that advantage disappears.

What is your reason to use GUID?

Followers

About Me

My Photo
Email me: blogger at dennisgorelik.com