Freetext Extension in Entity Framework Code First

I previously posted a solution for adding custom SQL functions to Entity Framework Code First as extension methods. This time I am going to show how we can do something similar for the FREETEXT function of SQL Server. Please note that this example will only work if you have the Full-Text Search component installed and your table has a full-text index.

OK, so we want to have an extension method like this:

[DbFunction("CodeFirstDatabaseSchema", "FREETEXT")]
public static Boolean Freetext(this String column, String value)
{
    return column.Contains(value);
}

For Entity Framework to recognize it, we need to write our own convention, because out of the box Entity Framework only recognizes a fixed set of SQL Server built-in functions. We can write one like this:

using System;
using System.Collections.Generic;
using System.Data.Entity.Core.Metadata.Edm;
using System.Data.Entity.Infrastructure;
using System.Data.Entity.ModelConfiguration.Conventions;
using System.Linq;

public class FreetextConvention : IStoreModelConvention<EdmModel>
{
    public static readonly FreetextConvention Instance = new FreetextConvention();

    public void Apply(EdmModel item, DbModel model)
    {
        // FREETEXT(column, value): two string inputs and a boolean result
        var columnParameter = FunctionParameter.Create("column", this.GetStorePrimitiveType(model, PrimitiveTypeKind.String), ParameterMode.In);
        var valueParameter = FunctionParameter.Create("value", this.GetStorePrimitiveType(model, PrimitiveTypeKind.String), ParameterMode.In);
        var returnValue = FunctionParameter.Create("result", this.GetStorePrimitiveType(model, PrimitiveTypeKind.Boolean), ParameterMode.ReturnValue);

        this.CreateAndAddFunction(item, "FREETEXT", new[] { columnParameter, valueParameter }, new[] { returnValue });
    }

    protected EdmFunction CreateAndAddFunction(EdmModel item, String name, IList<FunctionParameter> parameters, IList<FunctionParameter> returnValues)
    {
        // Declare the function as a store built-in and add it to the store model
        var payload = new EdmFunctionPayload { StoreFunctionName = name, Parameters = parameters, ReturnParameters = returnValues, Schema = this.GetDefaultSchema(item), IsBuiltIn = true };
        var function = EdmFunction.Create(name, this.GetDefaultNamespace(item), item.DataSpace, payload, null);

        item.AddItem(function);

        return (function);
    }

    protected EdmType GetStorePrimitiveType(DbModel model, PrimitiveTypeKind typeKind)
    {
        // Map the EDM primitive type to the provider's store type
        return (model.ProviderManifest.GetStoreType(TypeUsage.CreateDefaultTypeUsage(PrimitiveType.GetEdmPrimitiveType(typeKind))).EdmType);
    }

    protected String GetDefaultNamespace(EdmModel layerModel)
    {
        return (layerModel.GlobalItems.OfType<EdmType>().Select(t => t.NamespaceName).Distinct().Single());
    }

    protected String GetDefaultSchema(EdmModel layerModel)
    {
        return (layerModel.Container.EntitySets.Select(s => s.Schema).Distinct().SingleOrDefault());
    }
}

This registers a FREETEXT function that takes two string parameters and returns a boolean. With that in place, we add it to the DbContext in OnModelCreating:

modelBuilder.Conventions.Add(FreetextConvention.Instance);

You might have noticed the use of an Instance static field. Since the FreetextConvention class is stateless, there’s no point in creating many of them; we can just reuse the same instance.

Now, if we issue a LINQ query as:

var customers = ctx.Customers.Where(x => x.Name.Freetext("ricardo")).ToList();

It will fail miserably, complaining about this SQL fragment:

WHERE ((FREETEXT(name, N'ricardo')) = 1)

The “= 1” part is here because the function is prototyped as boolean, which maps to SQL Server’s BIT data type, and the value for true is 1. The problem is that FREETEXT is a predicate rather than a scalar function, so SQL Server does not accept comparing its result with 1; but if we run it as:

WHERE ((FREETEXT(name, N'ricardo')))

without the explicit comparison, it works perfectly. So, all we have to do is get rid of “= 1”. Fortunately, Entity Framework, as of version 6, offers some very nice extensibility points. There are at least two ways by which we can achieve this:

  • By intercepting the command tree;
  • By intercepting the raw SQL.

Here we will use option #2 and leave command trees for another post.

We need to identify something with a format of “FREETEXT(something) = 1”. We can do it using a regular expression, and the interception of the SQL command can be achieved by implementing IDbCommandInterceptor (no reference documentation yet, but I have reported it and it will soon be fixed, hopefully) and registering one such instance in the DbInterception (same) static class. An IDbCommandInterceptor implementation might look like this:

using System;
using System.Data.Common;
using System.Data.Entity.Infrastructure.Interception;
using System.Text.RegularExpressions;

public class FreetextInterceptor : IDbCommandInterceptor
{
    public static readonly FreetextInterceptor Instance = new FreetextInterceptor();

    // Matches "FREETEXT(...)) = 1", capturing the arguments together with their closing parenthesis
    private static readonly Regex FreetextRegex = new Regex(@"FREETEXT\(([^)]+\))\) = 1");

    public void NonQueryExecuted(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext)
    {
    }

    public void NonQueryExecuting(DbCommand command, DbCommandInterceptionContext<Int32> interceptionContext)
    {
    }

    public void ReaderExecuted(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
    }

    public void ReaderExecuting(DbCommand command, DbCommandInterceptionContext<DbDataReader> interceptionContext)
    {
        // Drop the "= 1" comparison before the SELECT is sent to the database
        if (FreetextRegex.IsMatch(command.CommandText))
        {
            command.CommandText = FreetextRegex.Replace(command.CommandText, "FREETEXT($1)");
        }
    }

    public void ScalarExecuted(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext)
    {
    }

    public void ScalarExecuting(DbCommand command, DbCommandInterceptionContext<Object> interceptionContext)
    {
    }
}

You can see that the only method we’re interested in is ReaderExecuting (again, no documentation available), which is the one that will be called just before a SQL SELECT query is sent to the database. In here we analyze the CommandText property of the DbCommand and get rid of the “= 1” clause, using a regular expression. Finally, we need to register the interceptor before we issue the query, maybe in the static constructor of our DbContext:

DbInterception.Add(FreetextInterceptor.Instance);
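Putting the two registrations together, a context might look like the sketch below. CustomersContext and Customer are hypothetical names of mine; FreetextConvention and FreetextInterceptor are the classes shown above, and a reference to Entity Framework 6 is assumed.

```csharp
using System.Data.Entity;
using System.Data.Entity.Infrastructure.Interception;

// Hypothetical entity, only here to give the context something to map
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical context that wires up both pieces from this post
public class CustomersContext : DbContext
{
    static CustomersContext()
    {
        // Interceptors are registered globally, so doing it once is enough
        DbInterception.Add(FreetextInterceptor.Instance);
    }

    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Makes the FREETEXT store function known to the model
        modelBuilder.Conventions.Add(FreetextConvention.Instance);

        base.OnModelCreating(modelBuilder);
    }
}
```

A static constructor runs only once, before the type is first used, which is why it is a convenient place for a registration that should not be repeated for every context instance.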

And now we can finally execute our query:

var customers = ctx.Customers.Where(x => x.Name.Freetext("ricardo")).ToList();

And that’s it. Don’t forget that in order for this to work, you need to enable Full Text Search.
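If you want to convince yourself that the regular expression does what we expect, you can exercise it without a database. The following is just a sketch: FreetextSqlRewriter is a name I made up for this check, and the sample SQL assumes that the generated fragment has the shape “FREETEXT(…)) = 1”, which is what the pattern in FreetextInterceptor looks for.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper that isolates the rewrite done in ReaderExecuting
public static class FreetextSqlRewriter
{
    // Same pattern as in FreetextInterceptor: the group captures the
    // arguments plus the parenthesis that closes the FREETEXT call
    private static readonly Regex FreetextRegex = new Regex(@"FREETEXT\(([^)]+\))\) = 1");

    public static string Rewrite(string sql)
    {
        // Text that does not match the pattern is returned unchanged
        return FreetextRegex.Replace(sql, "FREETEXT($1)");
    }
}
```

Passing “WHERE ((FREETEXT(name, N'ricardo')) = 1)” to FreetextSqlRewriter.Rewrite gives back the same text without the comparison. Because the pattern stops at the first closing parenthesis, a literal search term that itself contains “)” will not be matched and the SQL will be left untouched.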

Entity Framework Pitfalls: Command Interceptors and Identity Keys

The IDbCommandInterceptor (sorry, no official documentation) interface was introduced in Entity Framework 6 as part of the new interception and logging API, and allows the interception of the SQL and its parameters that are sent to the database as the result of DbContext CRUD operations.

It offers two methods for each of the basic ADO.NET operations, those defined in DbCommand, one called before the operation is executed, and the other called afterwards:

  • ExecuteNonQuery (UPDATEs, INSERTs, DELETEs) –> NonQueryExecuting, NonQueryExecuted;
  • ExecuteScalar (SELECTs returning a single value) –> ScalarExecuting, ScalarExecuted;
  • ExecuteReader (SELECTs returning rows) –> ReaderExecuting, ReaderExecuted.

As usual, the methods ending with “ing” are executed before and those ending with “ed” are executed afterwards, always synchronously.

One might naively assume that INSERTs would always trigger a NonQueryExecuting/NonQueryExecuted call, and indeed it is so, unless we have an IDENTITY primary key, in which case, Entity Framework will instead call ReaderExecuting/ReaderExecuted. It’s easy to understand why: when we use an IDENTITY, we need to retrieve the generated value immediately after the INSERT, hence Entity Framework will generate code like:

INSERT INTO [dbo].[SomeTable] (...)
VALUES (...)
SELECT [Id]
FROM [dbo].[SomeTable]
WHERE @@ROWCOUNT > 0 AND [Id] = SCOPE_IDENTITY()

The INSERT and SELECT are combined in the same command, which justifies the choice of ReaderExecuting/ReaderExecuted. Because in Entity Framework we always use IDENTITY (don’t we?), it’s ReaderExecuting/ReaderExecuted that you should be implementing if you want to change the INSERT SQL or its parameters.
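Since the same ReaderExecuting method also sees ordinary SELECTs, an interceptor needs some way of telling the two apart. One simple possibility, sketched here under the assumption that the batch starts with the INSERT statement as in the SQL above, is to look at the beginning of the command text. CommandTextClassifier is a hypothetical name, not part of Entity Framework.

```csharp
using System;

// Hypothetical helper for use inside ReaderExecuting/ReaderExecuted
public static class CommandTextClassifier
{
    // True when the command text starts with INSERT, even if a SELECT
    // for the generated key follows in the same batch
    public static bool IsInsert(string commandText)
    {
        return commandText != null
            && commandText.TrimStart().StartsWith("INSERT", StringComparison.OrdinalIgnoreCase);
    }
}
```

Inside ReaderExecuting one could then guard the custom logic with CommandTextClassifier.IsInsert(command.CommandText), so that regular queries pass through unchanged.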

Custom Entity Framework Code First Convention for Discriminator Values

Since version 6, Entity Framework Code First allows the injection of custom conventions. These conventions define rules that will be applied by default to all mapped entities and properties, unless explicitly changed.

The conventions API includes a few interfaces: IConvention (marker only, should always be included), IConceptualModelConvention<T> (for the conceptual space of the model) and IStoreModelConvention<T> (for the store, or physical, side of the model). Worthy of mention, there is also a convenience class, Convention, that allows access to all mapped types and properties and doesn’t override any of the other conventions, and also TypeAttributeConfigurationConvention<T>, for tying a convention to a custom attribute. Some of the included attributes leverage these interfaces to configure some aspects of the mappings at design time; other configuration needs to be done explicitly in an override of OnModelCreating.

Entity Framework permits using a column for distinguishing between different types, when the Table Per Class Hierarchy / Single Table Inheritance pattern (please see Entity Framework Code First Inheritance for more information) is used for mapping a hierarchy of classes to a single table, as part of “soft delete” solutions, or, less known, for differentiating between multiple tenants. This column is called a discriminator.

In order to configure an entity to use a discriminator column, there is no out of the box attribute, so we must resort to code configuration:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<MyMultiTenantEntity>().Map(m => m.Requires("tenant_id").HasValue("first_tenant"));

    base.OnModelCreating(modelBuilder);
}

Because there’s really no need to keep repeating this code, let’s implement an attribute for indicating a discriminator column in an entity:

[Serializable]
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false, Inherited = true)]
public sealed class DiscriminatorAttribute : Attribute
{
    public DiscriminatorAttribute(String columnName, Object discriminatorValue)
    {
        this.ColumnName = columnName;
        this.DiscriminatorValue = discriminatorValue;
    }

    public String ColumnName { get; private set; }

    public Object DiscriminatorValue { get; private set; }

    public override Boolean Equals(Object obj)
    {
        var other = obj as DiscriminatorAttribute;

        if (other == null)
        {
            return (false);
        }

        return ((this.ColumnName == other.ColumnName) && (Object.Equals(this.DiscriminatorValue, other.DiscriminatorValue) == true));
    }

    public override Int32 GetHashCode()
    {
        return (String.Concat(this.ColumnName, ":", this.DiscriminatorValue).GetHashCode());
    }
}

As you can see, the DiscriminatorAttribute attribute can only be applied to a class, at most once. This makes sense, because most likely you will only have a single discriminator column per entity:

[Discriminator("tenant_id", "first_tenant")]
public class MyMultiTenantEntity
{
    //...
}

You need to specify both a column name and a discriminator value, which can be of any type, usually, a string or an integer.
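To make the “any type” remark concrete, here is a small sketch that applies the attribute once with a string and once with an integer value, and reads it back through reflection. It carries a trimmed copy of DiscriminatorAttribute so that it compiles alone; StringTenantEntity, NumericTenantEntity and DiscriminatorReader are hypothetical names.

```csharp
using System;

// Trimmed copy of the attribute from this post (Equals/GetHashCode omitted)
[AttributeUsage(AttributeTargets.Class, AllowMultiple = false, Inherited = true)]
public sealed class DiscriminatorAttribute : Attribute
{
    public DiscriminatorAttribute(String columnName, Object discriminatorValue)
    {
        this.ColumnName = columnName;
        this.DiscriminatorValue = discriminatorValue;
    }

    public String ColumnName { get; private set; }

    public Object DiscriminatorValue { get; private set; }
}

// Hypothetical entities: one with a string value, one with an integer value
[Discriminator("tenant_id", "first_tenant")]
public class StringTenantEntity { }

[Discriminator("tenant_id", 1)]
public class NumericTenantEntity { }

public static class DiscriminatorReader
{
    // Returns the attribute declared on the type, or null if there is none
    public static DiscriminatorAttribute Read(Type entityType)
    {
        return (DiscriminatorAttribute) Attribute.GetCustomAttribute(entityType, typeof(DiscriminatorAttribute));
    }
}
```

The integer is boxed into the Object property, so the attribute itself does not care about the type. Whether a particular value type is then accepted by the mapping depends on the HasValue overload that ends up being invoked, so treat the integer case as an illustration of the attribute only.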

Now, let’s write a custom convention that knows how to handle our custom attribute and perform the mapping:

WARNING! DYNAMICS AND REFLECTION AHEAD!

PROCEED WITH CAUTION!

using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration.Configuration;
using System.Data.Entity.ModelConfiguration.Conventions;
using System.Linq;
using System.Reflection;

public sealed class DiscriminatorConvention : TypeAttributeConfigurationConvention<DiscriminatorAttribute>
{
    private static readonly MethodInfo entityMethod = typeof(DbModelBuilder).GetMethod("Entity");
    //picks the single non-generic HasValue overload
    private static readonly MethodInfo hasValueMethod = typeof(ValueConditionConfiguration).GetMethods().Single(m => (m.Name == "HasValue") && (m.IsGenericMethod == false));

    private readonly DbModelBuilder modelBuilder;
    private readonly ISet<Type> types = new HashSet<Type>();

    public DiscriminatorConvention(DbModelBuilder modelBuilder)
    {
        this.modelBuilder = modelBuilder;
    }

    public override void Apply(ConventionTypeConfiguration configuration, DiscriminatorAttribute attribute)
    {
        if (this.types.Contains(configuration.ClrType) == true)
        {
            //if the type has already been processed, bail out
            return;
        }

        //add the type to the list of processed types
        this.types.Add(configuration.ClrType);

        //equivalent to modelBuilder.Entity<T>() for the type found at runtime
        dynamic entity = entityMethod.MakeGenericMethod(configuration.ClrType).Invoke(modelBuilder, null);

        Action<dynamic> action = arg =>
        {
            var valueConditionConfiguration = arg.Requires(attribute.ColumnName);
            hasValueMethod.Invoke(valueConditionConfiguration, new Object[] { attribute.DiscriminatorValue });
        };

        entity.Map(action);
    }
}

This class uses a bit of dynamics and reflection because types are not known at compile time, and hence we cannot use generics directly. Because the Apply method will be called multiple times, we need to keep track of which entities have already been processed by this convention, so as to avoid reprocessing them. We need to pass it the instance of DbModelBuilder, because otherwise our custom convention would have no way to apply the mapping, but I think it is a reasonable trade-off.

Et voilà! In order to make use of it, we need to register the convention in OnModelCreating:

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Conventions.Add(new DiscriminatorConvention(modelBuilder));

    base.OnModelCreating(modelBuilder);
}

And that’s it! Happy conventions!

Entity Framework Pitfalls: Mapping Discriminator Columns

When you use a discriminator column, that is, a column that holds a value that tells Entity Framework what type the row refers to, or what restriction it shall use when querying the entity, you cannot map this column as a property in your entity.

For example, imagine you want to use “soft deletes”, that is, use a database column to represent the “deleted” state of a record instead of actually physically deleting it from the table; you might have an IS_DELETED column of some integer type, that would hold either a 1 or a 0, depending on whether the record is deleted (1) or not (0). In that case, you would build a mapping like this in OnModelCreating:

modelBuilder.Entity<MySoftDeletableEntity>().Map<MySoftDeletableEntity>(m => m.Requires("is_deleted").HasValue(0));

However, you won’t be able to have a corresponding IsDeleted property in the MySoftDeletableEntity, because Entity Framework will throw an exception complaining about it. It is somewhat sad, because it might be useful to refer to it, but that’s the way it is.