SQLPort em Aveiro

(The event will be held in Portuguese, sorry!)

On September 22nd, the LXXXVI SQLPort Community event will take place in Aveiro, at the premises of Fusion Cowork (Fusion 2), the first one ever held in Aveiro!

The speakers are of the highest level: Niko Neugebauer (@NikoNeugebauer) and Eduardo Piairo (@EdPiairo) will talk about Performance Tuning and Database Source Control. The agenda is as follows:

18:30 Opening and welcome
19:00 “Performance Tuning” – Niko Neugebauer (MVP, Consultant – OH22)
20:10 Break
20:30 “Database source control: migrations vs state” – Eduardo Piairo (DBA – Celfinet)
21:40 Raffle
22:00 Dinner (on your own)

You can find more information here and register here. The event is free, but registration is required!

The address of Fusion 2 is:

Rua Carlos Aleluia, 4, Edifício Aveiro Centrum, Loja 3
3810-177 Aveiro

See you there! ;)

Profiling SQL Server in .NET

The main tool for profiling SQL Server is, of course, SQL Server Profiler. You can, however, achieve more or less the same results using .NET; I'll show you how.

First, you need to have the Microsoft.SqlServer.ConnectionInfo and Microsoft.SqlServer.ConnectionInfoExtended assemblies at hand; they are normally found in C:\Program Files\Microsoft SQL Server\120\SDK\Assemblies (for SQL Server 2014), provided you installed the Shared Management Objects package from the SQL Server Feature Pack (again, this link is for SQL Server 2014, but there are feature packs for each version of SQL Server).

Then, write something like this:

using System;
using Microsoft.SqlServer.Management.Common;    // SqlConnectionInfo
using Microsoft.SqlServer.Management.Trace;     // TraceServer

var connInfo = new SqlConnectionInfo(@".\SQLEXPRESS");
//connInfo.UserName = "";
//connInfo.Password = "";

using (var reader = new TraceServer())
{
    reader.InitializeAsReader(connInfo, @"C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Profiler\Templates\Microsoft SQL Server\120\Standard.tdf");

    while (reader.Read() == true)
    {
        for (var i = 1; i < reader.FieldCount; ++i)
        {
            Console.Write(reader.GetName(i));
            Console.Write(": ");
            Console.WriteLine(reader.GetValue(i));
        }

        Console.WriteLine();
    }
}

And that's it: you will get a new record for each event that occurs on the server, such as SQL statements being executed. Among other things, you get the following information:

  • ApplicationName: the name of the application that is executing the SQL, as supplied in its connection string;
  • NTUserName: the Windows user name of the connected user;
  • LoginName: the SQL Server login name of the connected user;
  • Duration: the request duration;
  • SPID: the executing Server Process ID;
  • StartTime: the start timestamp of the request;
  • EndTime: the end timestamp;
  • TextData: textual data, such as the SQL text.

One thing to keep in mind is that you need to pass a valid SQL Server Profiler template to the InitializeAsReader method. You can add one of the existing templates to your Visual Studio project, set it to copy to the output folder, and reference it locally, if you prefer; otherwise, you will need to know the full path where it is installed (C:\Program Files (x86)\Microsoft SQL Server\120\Tools\Profiler\Templates\Microsoft SQL Server\120 is for SQL Server 2014).
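If you go the local-copy route, here is a minimal sketch, assuming Standard.tdf was added to the project with “Copy to Output Directory” enabled and that System.IO is imported for Path:

// Standard.tdf is assumed to sit next to the executable at runtime
var templatePath = Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Standard.tdf");

using (var reader = new TraceServer())
{
    // same reading loop as above, just with the locally-copied template
    reader.InitializeAsReader(connInfo, templatePath);
}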

Freetext Extension in Entity Framework Code First

I have posted before a solution for adding custom SQL functions to Entity Framework Code First as extension methods. This time I am going to show how we can do something similar for the FREETEXT function of SQL Server. Please note that this example will only work if you have the Full Text Search component installed and a full-text index on your table.

OK, so we want to have an extension method like this:

[DbFunction("CodeFirstDatabaseSchema", "FREETEXT")]
public static Boolean Freetext(this String column, String value)
{
    return column.Contains(value);
}

In order for Entity Framework to recognize it, we need to write our own convention; this is because, out of the box, Entity Framework only recognizes a number of SQL Server built-in functions. We can write one like this:

public class FreetextConvention : IStoreModelConvention<EdmModel>
{
    public static readonly FreetextConvention Instance = new FreetextConvention();

    public void Apply(EdmModel item, DbModel model)
    {
        var valueParameter = FunctionParameter.Create("column", this.GetStorePrimitiveType(model, PrimitiveTypeKind.String), ParameterMode.In);
        var formatParameter = FunctionParameter.Create("value", this.GetStorePrimitiveType(model, PrimitiveTypeKind.String), ParameterMode.In);
        var returnValue = FunctionParameter.Create("result", this.GetStorePrimitiveType(model, PrimitiveTypeKind.Boolean), ParameterMode.ReturnValue);

        var function = this.CreateAndAddFunction(item, "FREETEXT", new[] { valueParameter, formatParameter }, new[] { returnValue });
    }

    protected EdmFunction CreateAndAddFunction(EdmModel item, String name, IList<FunctionParameter> parameters, IList<FunctionParameter> returnValues)
    {
        var payload = new EdmFunctionPayload { StoreFunctionName = name, Parameters = parameters, ReturnParameters = returnValues, Schema = this.GetDefaultSchema(item), IsBuiltIn = true };
        var function = EdmFunction.Create(name, this.GetDefaultNamespace(item), item.DataSpace, payload, null);

        item.AddItem(function);

        return (function);
    }

    protected EdmType GetStorePrimitiveType(DbModel model, PrimitiveTypeKind typeKind)
    {
        return (model.ProviderManifest.GetStoreType(TypeUsage.CreateDefaultTypeUsage(PrimitiveType.GetEdmPrimitiveType(typeKind))).EdmType);
    }

    protected String GetDefaultNamespace(EdmModel layerModel)
    {
        return (layerModel.GlobalItems.OfType<EdmType>().Select(t => t.NamespaceName).Distinct().Single());
    }

    protected String GetDefaultSchema(EdmModel layerModel)
    {
        return (layerModel.Container.EntitySets.Select(s => s.Schema).Distinct().SingleOrDefault());
    }
}

This registers a FREETEXT function with two string parameters, returning a boolean. All that is left is to add it to the DbContext in OnModelCreating:

modelBuilder.Conventions.Add(FreetextConvention.Instance);

You might have noticed the usage of an Instance static field: since the FreetextConvention class is stateless, there's no point in creating multiple instances, so we can just reuse the same one.

Now, if we issue a LINQ query such as:

var customers = ctx.Customers.Where(x => x.Name.Freetext("ricardo")).ToList();

It will fail miserably, complaining about this SQL fragment:

WHERE ((FREETEXT(name, N'ricardo') = 1)

The “= 1” part is there because the function is prototyped as returning a boolean, which maps to SQL Server's BIT data type, whose value for true is 1. However, FREETEXT is a predicate, not a scalar function, so SQL Server does not accept comparing its result with 1; but if we run it as:

WHERE ((FREETEXT(name, N'ricardo'))

without the explicit comparison, it works perfectly. So, all we have to do is get rid of “= 1”. Fortunately, Entity Framework, as of version 6, offers some very nice extensibility points. There are at least two ways by which we can achieve this:

  • By intercepting the command tree;
  • By intercepting the raw SQL.

Here we will use option #2 and leave command trees for another post.

We need to identify something with the format “FREETEXT(something) = 1”. We can do it using a regular expression, and the interception of the SQL command can be achieved by implementing IDbCommandInterceptor (no reference documentation yet, but I have reported it and it will soon be fixed, hopefully) and registering one such instance in the DbInterception (same) static class. An IDbCommandInterceptor implementation might look like this:

public class FreetextInterceptor : IDbCommandInterceptor
{
    public static readonly FreetextInterceptor Instance = new FreetextInterceptor();

    private static readonly Regex FreetextRegex = new Regex(@"FREETEXT\(([^)]+\))\) = 1");

    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext)
    {
    }

    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext)
    {
    }

    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
    }

    public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        var matches = FreetextRegex.Matches(command.CommandText);

        if (matches.Count > 0)
        {
            command.CommandText = FreetextRegex.Replace(command.CommandText, "FREETEXT($1)");
        }
    }

    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext)
    {
    }

    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext)
    {
    }
}

You can see that the only method we're interested in is ReaderExecuting (again, no documentation available), which is the one that will be called just before a SQL SELECT query is sent to the database. In it, we analyze the CommandText property of the DbCommand and get rid of the “= 1” clause, using a regular expression. Finally, we need to register the interceptor before we issue the query, maybe in the static constructor of our DbContext:

DbInterception.Add(FreetextInterceptor.Instance);
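Putting it all together, here is a minimal sketch of how the convention and the interceptor might be wired up in a context (the CustomersContext and Customer names are just illustrative):

public class CustomersContext : DbContext
{
    static CustomersContext()
    {
        // register the SQL interceptor once per application domain
        DbInterception.Add(FreetextInterceptor.Instance);
    }

    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        // register the convention that exposes FREETEXT to LINQ queries
        modelBuilder.Conventions.Add(FreetextConvention.Instance);
    }
}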

And now we can finally execute our query:

var customers = ctx.Customers.Where(x => x.Name.Freetext("ricardo")).ToList();

And that’s it. Don’t forget that in order for this to work, you need to enable Full Text Search.
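For reference, and assuming a dbo.Customers table with a Name column and a primary key index named PK_Customers (hypothetical names), enabling it could look something like this:

-- create a default full-text catalog (requires the Full Text Search component)
CREATE FULLTEXT CATALOG ft AS DEFAULT

-- index the column to be queried with FREETEXT;
-- KEY INDEX must reference a unique index, typically the primary key's
CREATE FULLTEXT INDEX ON dbo.Customers ([Name]) KEY INDEX PK_Customers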

Entity Framework Pitfalls: Command Interceptors and Identity Keys

The IDbCommandInterceptor (sorry, no official documentation) interface was introduced in Entity Framework 6 as part of the new interception and logging API, and allows the interception of the SQL and its parameters that are sent to the database as the result of DbContext CRUD operations.

It offers two methods for each of the basic ADO.NET operations (those defined in DbCommand): one called before the operation is executed and the other called afterwards:

  • ExecuteNonQuery (UPDATEs, INSERTs, DELETEs) –> NonQueryExecuting, NonQueryExecuted;
  • ExecuteScalar (SELECTs returning a single value) –> ScalarExecuting, ScalarExecuted;
  • ExecuteReader (SELECTs returning rows) –> ReaderExecuting, ReaderExecuted.

As usual, the methods ending with “ing” are executed before and those ending with “ed” are executed afterwards, always synchronously.

One might naively assume that INSERTs would always trigger a NonQueryExecuting/NonQueryExecuted call, and indeed it is so, unless we have an IDENTITY primary key, in which case Entity Framework will instead call ReaderExecuting/ReaderExecuted. It's easy to understand why: when we use an IDENTITY, we need to retrieve the generated value immediately after the INSERT, hence Entity Framework generates SQL like:

INSERT INTO [dbo].[SomeTable] (...)
VALUES (...)
SELECT [Id]
FROM [dbo].[SomeTable]
WHERE @@ROWCOUNT > 0 AND [Id] = SCOPE_IDENTITY()

The INSERT and SELECT are combined in the same command, which justifies the choice of ReaderExecuting/ReaderExecuted. Because in Entity Framework we always use IDENTITY – don’t we? –, it’s ReaderExecuting/ReaderExecuted that you should be implementing if you want to change the INSERT SQL or its parameters.
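To see this for yourself, here is a minimal sketch (the class name is mine) of an interceptor that logs INSERT statements from both “executing” methods; with an IDENTITY key, the output comes from ReaderExecuting:

public class InsertLoggingInterceptor : IDbCommandInterceptor
{
    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext)
    {
        this.LogIfInsert("NonQueryExecuting", command);
    }

    public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        this.LogIfInsert("ReaderExecuting", command);
    }

    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext) { }
    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext) { }
    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext) { }
    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext) { }

    private void LogIfInsert(String method, DbCommand command)
    {
        // only interested in INSERT statements
        if (command.CommandText.TrimStart().StartsWith("INSERT", StringComparison.OrdinalIgnoreCase))
        {
            Console.WriteLine("{0}: {1}", method, command.CommandText);
        }
    }
}

Register it with DbInterception.Add(new InsertLoggingInterceptor()) before saving changes.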

SQL Server Auditing

Probably the best mechanism for auditing changes to tables in SQL Server is Change Data Capture, which I already covered here. Unfortunately, it only works with the Enterprise editions of SQL Server, something that not everyone has access to.

I once had to implement a similar solution that would work on any edition of SQL Server, which is what I am going to talk about. Its only requirement was that, for any changes made to target tables, their old values would be stored in a shared history table, together with the timestamp and the user who made the change.

Let’s imagine we have some table, anything will do:

CREATE TABLE dbo.audit_target
(
    id INT IDENTITY PRIMARY KEY,
    a NVARCHAR(50),
    b INT
)

As you can see, it's just a regular table with a single-column primary key and a couple of other columns; any number and type of columns will do, it doesn't really matter.

Now let’s create a table for holding the change history:

CREATE TABLE dbo.history
(
    id INT IDENTITY PRIMARY KEY,
    [table_name] NVARCHAR(30) NOT NULL,
    table_id INT NOT NULL,
    [changes] NVARCHAR(MAX),
    [timestamp] SMALLDATETIME NOT NULL DEFAULT GETDATE(),
    [user] NVARCHAR(30) NOT NULL
)

The table_name column holds the name of the table whose changes we are tracking; table_id, the id of the updated row; changes, an XML fragment containing all the old values; timestamp, the date and time the change occurred; and user, the person who made the change (more on this later).

OK, next we need a stored procedure to populate the changes table:

CREATE PROCEDURE dbo.generic_audit_procedure
(
    @procid INT,
    @xml XML
)
AS
BEGIN
    SET NOCOUNT ON

    -- check if there are changes, in which case, xml will not be empty
    IF ((@xml IS NULL) OR (CAST(@xml AS NVARCHAR(MAX)) = N''))
    BEGIN
        RETURN 0
    END

    -- get the current table id for the current process (trigger)
    DECLARE @tablename NVARCHAR(30)

    SELECT @tablename = OBJECT_NAME(t.id)
    FROM sys.sysobjects p
    INNER JOIN sys.sysobjects t
    ON p.parent_obj = t.id
    WHERE p.id = @procid

    IF ((@tablename IS NULL) OR (@tablename = N''))
    BEGIN
        RAISERROR(N'Could not get table name', 16, 16)
        ROLLBACK TRAN
        RETURN
    END

    -- get the primary key column for the current table
    DECLARE @idname NVARCHAR(30)

    SELECT @idname = name
    FROM sys.syscolumns
    WHERE id = OBJECT_ID(@tablename)
    AND colstat = 1

    IF ((@idname IS NULL) OR (@idname = ''))
    BEGIN
        RAISERROR(N'Could not get id column for the current table', 16, 16)
        ROLLBACK TRAN
        RETURN
    END

    -- get the current user from the context
    DECLARE @username NVARCHAR(30)

    SELECT @username = CONVERT(NVARCHAR(30), CONTEXT_INFO())

    IF ((@username IS NULL) OR (@username = ''))
    BEGIN
        RAISERROR(N'Could not get current username', 16, 16)
        ROLLBACK TRAN
        RETURN
    END

    DECLARE @tmp TABLE([key] NVARCHAR(MAX), [value] XML)

    INSERT INTO @tmp SELECT q.[key], q.[value]
    FROM
    (
        SELECT T.N.value(N'(*[local-name(.)=sql:variable("@idname")])[1]', N'INT') AS [key], T.N.query(N'*') AS [value]
        FROM @xml.nodes('/row') AS T(N)
    ) q

    DECLARE @ROWS INT
    SET @ROWS = 0

    -- loop all modified records
    WHILE (1 = 1)
    BEGIN
        DECLARE @id NVARCHAR(MAX)
        DECLARE @value XML

        -- pick the first record
        SELECT TOP 1 @id = [key], @value = [value]
        FROM @tmp

        IF (@@ROWCOUNT = 0)
        BEGIN
            BREAK
        END

        -- insert into the shared table
        INSERT INTO dbo.history (table_name, table_id, [user], [changes])
        VALUES (@tablename, @id, @username, CAST(@value AS NVARCHAR(MAX)))

        -- increment the updated rows
        SET @ROWS = @ROWS + @@ROWCOUNT

        -- remove the processed record
        DELETE FROM @tmp
        WHERE [key] = @id
    END

    RETURN @ROWS
END

The generic_audit_procedure gets the target table from the calling trigger's id, passed in the @procid parameter, the current user from CONTEXT_INFO (again, more on this later), and the changes from the @xml parameter, and inserts everything into the history table.

And the final part: a trigger that “glues” the changes made to a table to the stored procedure:

CREATE TRIGGER dbo.generic_audit_trigger
ON dbo.audit_target -- replace this with other target tables' names
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON

    -- get all the old values of the updated rows
    -- this needs to be done in the trigger
    DECLARE @xml XML
    SET @xml = (SELECT * FROM deleted FOR XML PATH('row'))

    -- call the generic stored procedure
    EXEC dbo.generic_audit_procedure @@PROCID, @xml
END

This trigger will run after updates are accepted (AFTER UPDATE), cast the rows' old values (the DELETED pseudo-table) as XML, and then invoke the generic_audit_procedure stored procedure with it as an argument, together with the current process id (@@PROCID). Of course, you need to create one such trigger for every table that you wish to audit.

The only thing left is to provide a way for SQL Server to know the name of the current application user. We do that by executing the following statements (in my case, this was an automated process performed by the data access library):

DECLARE @username VARBINARY(128) = CAST(N'rjperes' AS VARBINARY(128))
SET CONTEXT_INFO @username
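From .NET, a minimal ADO.NET sketch of how such an automated step might look (the connection string and user name are illustrative; note that CONTEXT_INFO is per session, so the audited changes must be issued on the same connection):

using (var connection = new SqlConnection(@"Data Source=.\SQLEXPRESS;Initial Catalog=MyDb;Integrated Security=SSPI"))
using (var command = connection.CreateCommand())
{
    connection.Open();

    command.CommandText = "SET CONTEXT_INFO @username";
    command.Parameters.Add(new SqlParameter("@username", SqlDbType.VarBinary, 128)
    {
        // the stored procedure reads it back with CONVERT(NVARCHAR(30), CONTEXT_INFO()),
        // so the bytes must be Unicode (UTF-16)
        Value = Encoding.Unicode.GetBytes("rjperes")
    });
    command.ExecuteNonQuery();

    // ... issue the actual data modifications on this same connection
}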

So, whenever a change is made to an audited table, a record like this will show up in the history table:

[image: a dbo.history row showing the table name, the row id, the old values as XML, the timestamp and the user]

You can see that the changes column contains an XML fragment with all the old values of the row that was updated in the target table.

In a nutshell:

  1. We add a trigger that calls the generic stored procedure to each target table we want to audit;
  2. We store the application user in CONTEXT_INFO;
  3. We modify the data in a target table;
  4. We get the old values in the shared history table (a minimal walkthrough follows below).
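Here is that walkthrough, a minimal sketch using the dbo.audit_target table from above (the user name is illustrative):

-- 1. identify the application user for this session
DECLARE @username VARBINARY(128) = CAST(N'rjperes' AS VARBINARY(128))
SET CONTEXT_INFO @username

-- 2. create a row and then modify it (only the UPDATE is audited)
INSERT INTO dbo.audit_target (a, b) VALUES (N'before', 1)
UPDATE dbo.audit_target SET a = N'after', b = 2 WHERE id = SCOPE_IDENTITY()

-- 3. the trigger stored the old values as XML, something like:
-- <row><id>1</id><a>before</a><b>1</b></row>
SELECT [table_name], table_id, [changes], [timestamp], [user]
FROM dbo.history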

There are some limitations with this implementation:

  • Only target tables with single-column primary keys are supported;
  • All of the “old” values are recorded, not just those that actually changed;
  • It only keeps the “old” values, not the new ones, although this is by design and easy to change;
  • Binary columns could be handled better;
  • It doesn't keep track of deleted or inserted records;
  • It doesn't track schema changes (adding/dropping/modifying columns, etc.);
  • It is necessary to set CONTEXT_INFO explicitly with the application user that we want to record;
  • Although I didn't implement it, it would be easy to add a version column to the history table, to keep track of how many changes were made to each target table.

However, it suited what I needed perfectly; maybe it will work for you too! ;)