Nastiness with Data in the .NET Compact Framework and the spiral downward (With Updates!)

The First Try

As of late, if you for whatever reason on earth follow me on twitter, you might’ve picked up on the fact that I am working on a Windows Mobile project. Written in VB.NET. Here is my experience with working with a SQL database in the .NET Compact framework, and hopefully save someone some headaches.

I naturally wanted to use SQL CE for a database. It comes out of the box, right? It’s made by the SQL team, which I usually have no problems with. SQL Server has been a solid Microsoft product for years, so let’s use that, I mean Visual Studio was “nice” enough to install it for me.

Well, on a previous application we used SQL CE. And it immediately crashed the application when we ran a query. Let’s look at some numbers. The database contained 2 tables, and one foreign keyed to the other. TableA contained roughly 45,000 records, and TableB contained roughly 4 million records. TableB contained a foreign key to tableA. The database would be read only. No inserts, updates, deletes – nothing of the sort. TableA contained 2 columns, an identity column and a nvarchar lookup that was 250 characters wide (500 bytes for those counting). TableB contained an identity, a reference to TableA, and 5 nvarchar columns that were also 250 characters.

Not the most complex of datasets. Also note that I am generalizing names of tables and columns to avoid being specific about a project.

Now, the device was admittedly underpowered. We’re talking a 250 MHz ARM processor with 16 MB of RAM. 8 MB of the RAM was being used by the OS and a Bluetooth loader. We have 8 MB of memory. Not exactly a ton, right?

SQL CE 3.5 flat out out ran out of memory. Poof, gone. Well, we tweaked some settings in the connection string like pool size, max working set, etc. No dice. The database was already given to me the way it was, was TableA really needed? Let’s merge the two tables – the thought being that the JOIN was killing it. Nope, didn’t make things better. What was the query? A lookup by an ID:

   1: SELECT Id, FirstName, LastName, Location FROM TableB WHERE LookupId = @LookupId

Yeah, pretty simple. It should only return one record, too. It was accessed using a SqlDataReader to avoid the overhead of datasets, LINQ, etc. Here’s where things got interesting. LookupId was an nvarchar, but was really just a set of numbers, like 3487982429803758. Too big to be an Int, but small enough to be a BigInt. The data had leading zeros in them, but regardless the number would be unique. So let’s change it to a BigInt. Though after the conversion, it just really didn’t look right. Some numbers were flat out converted wrong. Here is how I converted it:

   1: ALTER TABLE [TableB]


   2:     ADD [LookupIdTemp] BIGINT NULL


   3: GO


   4: UPDATE [TableB] SET [LookupIdTemp] = CONVERT(BIGINT, [LookupId])


   5: GO


   6: ALTER TABLE [TableB]


   7:     DROP COLUMN [LookupId]


   8: GO


   9: ALTER TABLE [TableB]


  10:     ADD [LookupId] BIGINT NULL


  11: GO


  12: UPDATE [TableB] SET [LookupId] = [LookupIdTemp]


  13: GO


  14: ALTER TABLE [TableB]


  15:     DROP COLUMN [LookupIdTemp]


  16: GO


  17: ALTER TABLE [TableA]


  18:     ALTER COLUMN [LookupId] BIGINT NOT NULL


  19: GO

Not very fast, but I was able to execute it from a desktop machine and let it run for a few minutes (sp_rename doesn’t exist for SQL CE for those unaware). I tried it on some sample data, and it seemed to work fine. No so. Here is a sample input, and sample output of where things didn’t work and did work. See if you notice a pattern.

Input Output
5 5
10 10
150 150
1000 1000
05 0
050 5
001000 10
010 1
0000000010 0

See a pattern emerging? It’s the leading zeros. SQL CE does not convert a string to a BIGINT correctly if it has leading zeros. Simple as that. For every insignificant zero on the left, it chomps that many digits from the right.

Input Output
5 5
10 10
150 150
1000 1000
05 5
050 50
001000 1000
010 10
0000000010 00000001

See now? That’s pretty pathetic. As expected, if you do this with a full SQL engine, it works as expected. Also, it only seems to do this for a BIGINT. The conversion works just fine for INT, SMALLINT, TINYINT, you name it. I got in touch with the SQL CE team, and acknowledged the bug. They were also kind enough to provide a recent work around: put a sign in front of the digit. i.e. “+050” is properly converted to 50.

Even after all of that the query flat out didn’t run on the device. We started to worry that the device just wasn’t powerful enough. So we tested it on a slightly better device, a 550 MHz XScale with 64 MB of RAM. It ran, but was slow. Painfully slow. Now we really started getting worried.  Was our goal just too much for a CE device for a .NET Application? Were we going to have to move to something native like C++ eMbedded?

Well, what about indexes? Did we have our LookupId column indexed? You bet. That’s the first thing you do on a lookup column with a ton of rows. It didn’t seem to help much either. Usually indexes perform miracles. So what were we doing wrong?

We went through all sorts of suggestions, like using the Seek operation,  downgrading to SQL CE 2.0 (you never know) any nothing worked.

Well, let’s try SQLite first at the suggestion of a hardware vendor. We found a nice app on CodePlex that even converted it for us. SQLite is a native implementation, and we had a tricky time finding a managed wrapper for it, but we did here. the conversion took ages, but the managed wrapper was very clean and written in ADO.NET style. Let’s give it a try.

It came back instantly. I thought something went wrong or the conversion didn’t convert all of the rows. A count check confirmed that all of the millions of rows were there and came back. Though I tried a LookupId of “3”. What happens when I try to use 10000000000000?

Came back 10 seconds later. Ah but wait – we didn’t get the index pulled over. Let’s add the index in try again. The size of the database grew significantly, from 400 MB to 750 MB. The database was on a compact flash card, so we weren’t too concerned about that. Aside from the size, the query was always returning instantly (or at least sub-second).

I don’t know exactly why SQLite is infinitely better, but I think indexes are implemented in a strange way – the size of the database didn’t seem to change much in SQL CE without the index. If anyone wants to tell me how indexes work in SQL CE, I’d be happy to know. Or possibly tell me if I was doing something wrong. Some back research seemed to affirm my point with SQL CE. It’s very bloated for the Compact Framework.

One last little annoyance – the SQL CE Framework is not as simple as including an assembly in your project. It requires a full CAB installation. That isn’t exactly correct. A SQL CE MVP was kind enough to tell me that it can be distributed with the application. However, I was not able to find anywhere on MSDN, or much other places on the internet, on how to do that. What files do I need? Where do I get them? Microsoft wants you to use the MSI so they can update it with Windows Updates.

That left me with a pretty bad taste in my mouth for SQL CE. Given that the only difference between SQL CE and SQLite were a few key class names, I didn’t think I ever wanted to use SQL CE again.

The Second Try

I later was working on a project that needed to persist cached data somewhere. This is for a WPF-based Desktop application. SQLite came to mind, but it had one flaw that annoyed me. Different assemblies for x86 and x86-64. I could either have two different builds, or latebind, or do something clever, but I thought it was time to try SQL CE again, I don’t plan on having more than 100 rows in any given table anyway. SQL CE isn’t much better either. Again, it isn’t redistributable in any form other than an MSI. At least the documentation doesn’t recommend it. But at least I had the freedom to just distribute the SQL CE MSI for the correct platform and the app would just work.

You might be thinking, “Why not just force your application to run in x86 all the time and avoid the x86-64 problem?” Well, I am a 64-bit junky. If it’s a managed app, it better well damn work on the OS’s preferred architecture. It’s a pet peeve of mine I suppose. Unless there is a good reason to force x86 (like performance problems with 64-bit – remember the x86-64 Jitter is slower) or moving a legacy application to at least work properly on x86-64 as a starting point.

SQL CE is actually an OK database platform for a desktop. There are some quirky bugs with it, like the issue with BIGINT, but not bad. I suspect it will improve as well. However I will never use it for a Compact Application ever again, regardless of how big I think the data is going to be.

Unit Testing Insanity

Now, I do unit testing whenever I can. Sometimes the client prefers I don’t due to time constraints or don’t see the benefit. I push for it, but there is only so much I can push. The previous client was one of those, so I didn’t notice this issue until recently.

If you’ve never written a Compact Framework application, there are a few things to know. First, an application targeted for the Compact Framework is fully capable of being referenced by Full .NET Framework assemblies, which makes sense. the stranger one is that in a Compact Framework project you can reference a Full .NET Framework assembly, it just may not run properly. That in mind, here is a sample solution setup.

  • ProjectFoo.Data
  • ProjectFoo.Business
  • UnitTests.ProjectFoo.Data
  • UnitTests.ProjectFoo.Business

All of them are Compact Framework Class Libraries in VB.NET. The UnitTests projects need to be Compact Framework projects because for whatever reason, a Full VB.NET Project cannot reference a Compact Framework VB.NET Project (though C# can do it). My reasons for using VB.NET is worth another blog post later.

Regardless, things were going smoothly with this approach. NUnit has no problem working with the Compact Framework, my CI server runs the tests fine, Resharper runs them, seems like a good setup. Time to add some data! I went for SQLite since this is a Compact Framework application. I referenced it, added all of the native dependencies, at this point it’s like clockwork for me. My first test is to test a class that creates the database for the first time. Here was my result:

System.IO.FileLoadException: Could not load file or assembly ‘System.Data.SQLite, Version=1.0.65.0, Culture=neutral, PublicKeyToken=1fdb50b1b62b4c84, Retargetable=Yes’ or one of its dependencies. The given assembly name or codebase was invalid.

Huh? Lets check the Bin directory and make sure it’s there. Yep.

I knew this wasn’t going to be an easy one. I’ve dealt with some bad problems with loading assemblies in the past when I was writing an multi AppDomain project in the past, and knew the best place to start was the Fusion log. It happened to already be enabled from previous problems like this, which is lucky, I guess. Here’s what was in the log:

*** Assembly Binder Log Entry  (9/28/2009 @ 12:33:27 AM) ***

The operation failed.
Bind result: hr = 0x80131047. No description available.

Assembly manager loaded from:  C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
Running under executable  C:\Program Files (x86)\JetBrains\ReSharper\v4.5\Bin\JetBrains.ReSharper.TaskRunner.exe
--- A detailed error log follows. 

=== Pre-bind state information ===
LOG: User = VCSJONESDC\Kevin Jones
LOG: DisplayName = System.Data.SQLite, Version=1.0.65.0, Culture=neutral, PublicKeyToken=1fdb50b1b62b4c84, Retargetable=Yes
 (Fully-specified)
LOG: Appbase = file:///C:/development/Thycotic/SecretServer.WindowsMobile/trunk/UnitTests.SecretServer.WindowsMobile.Data/bin/Debug
LOG: Initial PrivatePath = NULL
LOG: Dynamic Base = NULL
LOG: Cache Base = NULL
LOG: AppName = UnitTests.SecretServer.WindowsMobile.Data
Calling assembly : SecretServer.WindowsMobile.Data, Version=1.0.3557.41832, Culture=neutral, PublicKeyToken=null.
===
LOG: This bind starts in default load context.
LOG: No application configuration file found.
LOG: Using machine configuration file from C:\Windows\Microsoft.NET\Framework\v2.0.50727\config\machine.config.
ERR: Failed to apply policy (hr = 0x80131047).
ERR: Unrecoverable error occurred during pre-download check (hr = 0x80131047).

Not too much help. The HR is the same as the except which I doubled checked like so:

   1: [System.Runtime.InteropServices.Marshal]::ThrowExceptionForHR(0x80131047)

Yes, I used a PowerShell script to test this out. Once you get use to it, I find it very handy for such uses.

So, let’s start digging into the IL of SQLite using our good friend ILDASM. Reflector doesn’t quite show one of the inner workings we want. If you look at the manifest of SQLite, it’s got this oddball in there:

   1: .module extern 'SQLite.Interop.065.DLL'

Reflector simply shows it as a comment at the top of the manifest disassembly. So what on earth is SQL.Interop.065.DLL?

Other assemblies contain the same .module reference, and ironically they can point to unmanaged libraries, like kernel32.dll. This is not the same as the unpopular .NET Modules. A .NET Module is, in short words, an assembly without a manifest. So what is SQL.Interop.065.DLL? It’s not a valid Win32 PE or PE+ file either. A look at it with the Dependency Finder tells us it’s compiled for the ARM architecture and it’s Subsystem is Windows CE 2.0+.

Here’s a screen grab.

ss

For those curious, coredll.dll is the equivalent of kernel32.dll on a Windows CE system, but since I am using Windows 7 it obviously won’t be found on my desktop.

What does this mean? Why is a managed assmebly referencing an unmanaged assembly? Well, it isn’t exactly a “reference”. What happens is, whenever you do platform invoke on a DLL, the DLL is added as a “module” in the manifest. Let’s try it. Make a new C# project and throw in this code:

   1: [DllImport("user32.dll")]


   2: [return: MarshalAs(UnmanagedType.LPTStr)]


   3: internal static extern string CharUpper


   4:     (


   5:         [In, MarshalAs(UnmanagedType.LPTStr)] string lpsz


   6:     );

Now look at our manifest in ILDASM, we have this:

   1: .module extern user32.dll



I’m not entirely sure why it was quoted in the first example and not in the second, or if it has any significance.



The reason at this point is pretty obvious: We’re trying to do Platform Invoke on a library that my processor is incapable of handling.



Makes sense, but I am a bit bummed. Is SQL CE a better solution? Since SQLite is retargetable, can I use different versions of SQLite depending on the platform? I’m not sure right now, but I’ll be sure to follow up. If you have a working solution for this I would love to hear it, too.

4 thoughts on “Nastiness with Data in the .NET Compact Framework and the spiral downward (With Updates!)”

  1. Author,

    How would this be true in MS Compact Database – “The database would be read only. No inserts, updates, deletes – nothing of the sort.” ?

    Aren’t MS Compact DBs read/write ? Or is that a setting? If so, where is that setting?

  2. Word dude!!! SQL CE 3.5 is dev chum. When you use it, you’re essentially asking for Management(the sharks) to come and bite your head off. Dealing with the same exercise of migrating to SQLite.

  3. @John when I say read only I meant as far as how the application would use the database. While we could write to the database, the application would not do it.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>