Entity Framework: Be vigilant with Include method!

Include method is very easy to use and it’s great. But its implementation is not always the best one.

Why? The Include method allows you to get entity graph with only one SQL query. The idea is excellent but…

The problem is that to do it, EF has to do a join.

What does it implies?

It implies a query slower which returns too many bytes.

Instead of:

context.Categories.Include("Products")


It is often better to use this:


foreach (var p in context.Products.Where(p => p.Category != null)) ;
context.Categories


But the issue with this way is the fact that we do many DB accesses to have it. It’s a shame.



We can fix it using MutiResultSets.



So I use what I did on my last post and I add the following methods on ObjectContextExtension class.



public static IEnumerable<T1> Include<T1, T2>(this ObjectContext context, IQueryable<T1> query1, IQueryable<T2> query2, MergeOption mergeOption = MergeOption.AppendOnly)
    where T1 : class, new()
    where T2 : class, new()
{
    if (mergeOption == MergeOption.NoTracking)
        throw new NotImplementedException();
    foreach (T1 item in from kvp in Execute(context, query2, query1, mergeOption).SkipWhile(elt => elt.Key != query1)
                        select kvp.Value)
        yield return item;
}

public static IEnumerable<T1> Include<T1, T2, T3>(this ObjectContext context, IQueryable<T1> query1, IQueryable<T2> query2, IQueryable<T3> query3, MergeOption mergeOption = MergeOption.AppendOnly)
    where T1 : class, new()
    where T2 : class, new()
    where T3 : class, new()
{
    if (mergeOption == MergeOption.NoTracking)
        throw new NotImplementedException();
    foreach (T1 item in from kvp in Execute(context, query2, query3, query1, mergeOption).SkipWhile(elt => elt.Key != query1)
                        select kvp.Value)
        yield return item;
}


Now, we will run some performance tests on Northwind DB:



test1:
context.Categories.Include("Products")

test2:
context.Categories.Include("Products").Where(c => c.CategoryName == "Beverages")

test3:
context.Categories.Include("Products.OrderDetails")

test4:
context.Categories.Include("Products.OrderDetails").Where(c => c.CategoryName == "Beverages")

test5:
context.Categories.Include("Products.OrderDetails").Where(c => c.CategoryName == "Unexisting")



We can have the same result with many simple queries:



test1:
foreach (var p in context.Products.Where(p => p.CategoryID != null)) ;
context.Categories



test2:
foreach (var p in context.Products.Where(p => p.Category.CategoryName == "Beverages")) ;
context.Categories.Where(c => c.CategoryName == "Beverages")

test3:
foreach (var od in context.OrderDetails.Where(od => od.Product.CategoryID != null)) ;
foreach (var p in context.Products.Where(p => p.CategoryID != null)) ;
context.Categories

test4:
foreach (var od in context.OrderDetails.Where(od => od.Product.Category.CategoryName == "Beverages")) ;
foreach (var p in context.Products.Where(p => p.Category.CategoryName == "Beverages")) ;
context.Categories.Where(c => c.CategoryName == "Beverages")

test 5:
foreach (var od in context.OrderDetails.Where(od => od.Product.Category.CategoryName == "Unexisting")) ;
foreach (var p in context.Products.Where(p => p.Category.CategoryName == "Unexisting")) ;
context.Categories.Where(c => c.CategoryName == "Unexisting")



We also can use my Include method:



test 1:
context.Include(context.Categories, context.Products.Where(p => p.CategoryID != null))

test 2:
context.Include(
    context.Categories.Where(c => c.CategoryName == "Beverages"),
    context.Products.Where(p => p.Category.CategoryName == "Beverages"))

test 3:
context.Include(context.Categories,
    context.Products.Where(p => p.CategoryID != null),
    context.OrderDetails.Where(od => od.Product.CategoryID != null))

test 4 :
context.Include(
    context.Categories.Where(c => c.CategoryName == "Beverages"),
    context.Products.Where(p => p.Category.CategoryName == "Beverages"),
    context.OrderDetails.Where(od => od.Product.Category.CategoryName == "Beverages"))

test 5:
context.Include(
    context.Categories.Where(c => c.CategoryName == "Unexisting"),
    context.Products.Where(p => p.Category.CategoryName == "Unexisting"),
    context.OrderDetails.Where(od => od.Product.Category.CategoryName == "Unexisting"))



The results are the following:



test 1

Improvement ratio

test 2

Improvement ratio

test 3

Improvement ratio

test 4

Improvement ratio

test 5

Improvement ratio

Execution time (ms)

L2E Include

222

95

342

163

93

Many simple select

159

1,40

61

1,56

124

2,76

66

2,47

24

3,88

My Include

57

3,89

22

4,32

433

0,79

128

1,27

22

4,23

Bytes received from server

L2E Include

847110

132699

2.375347E7

4462675

1534

Many simple select

95335

8,89

12619

10,52

145134

163,67

22065

202,25

663

2,31

My Include

95335

8,89

12619

10,52

145134

163,67

22065

202,25

663

2,31



Note that my DB is local, else it would be worse for L2E Include else.



Now if we consider that Product.CategoryID is not nullable, we can improve test1 and test3 (except for L2E Include) because we can avoid SQL query join.



For test 3 and test 4, many simple select is more efficient than My Include. I don’t know why. The difference is on the DB as if SQL Server has some problem with muti-result set. If anybody can explain me why please, let me know.



The L2E Include can also be the best way with this query for example:



context.OrderDetails.Include("Product.Category").First()


To conclude, Include is very cool because very easy to use. However it sometimes is not the best way. As I illustrate with my test, we sometimes can really improve it significantly.

This entry was posted in 7671, 7674, 9104. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *


*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>