The company I work for, Interactive Medica, provides SaaS business solutions for the pharmaceutical industry. Many of our clients receive data from third parties which they need us to import into our system. This is pretty straightforward and common in many industries. The data arrives in different formats: CSV files (or other delimited text files), Access databases, Excel spreadsheets, and so on. Sometimes the files are zipped and sometimes not. Sometimes the files are uploaded to us via FTP and at other times we need to pull the data ourselves, also usually via FTP.
Any developer has at some time or another been required to do something like this: shuffling data from one source to another. Luckily, with ADO.NET it’s not very hard, as long as you have the correct connection string. When I first started at IM they only had a few of these import programs. They all worked, but they were all hard-coded: any change to the format of a file we imported meant that we had to open up the source code of the import program, make the change, recompile, and deploy it back to the server. This was more than tedious. Some of the programs were also rather old, written in VB6 in a VBScript style, meaning no Option Explicit, new variables created on the fly, and a developer who seemed to think that reusing a variable wasn’t necessary when you could create a new one, all with nifty names like Foo, Q, P, S1, M12, and so on that gave no clue as to what they contained or were used for. Global scoping apparently also seemed like a good idea, and of course the original developer had long since left the building (the company).
OK, I’m sorry; this wasn’t supposed to be yet another rant about the perils of maintaining old proprietary code. When I was asked to create a handful of new import tools I was determined to make them easier to maintain. What I wanted was a number of classes that contained all the methods necessary to do the importing regardless of the data source. All information about the file and where it should be imported to would live in a config file, easily editable without the need to open up the source code. When all this was done, I showed my colleague, who is responsible for monitoring these tools, how to use them. He isn’t a developer but he does know some VB, and using the framework I created he has been able to create new import tools by himself.
What I didn’t realize then was how the number of necessary imports was going to grow, and grow fast. The problem is that I never created one program that could use different config files at different points in time. So the program had to be copied, sometimes changed a bit for a new customer import, and deployed as a new EXE file that runs on the server on a schedule. I think that we now have around 40 almost identical programs running on the server, and they too are starting to be a mess to administer and maintain. It’s pretty obvious that something needs to be done.
Two years ago, when I first created this import tool framework, I was already thinking that this should really be done with some clever scripting language. Whenever a change is made to a file that we need to import, we could simply open up the script in any text editor, make the necessary changes, and save it again, without the need to launch Visual Studio on my local machine, check out the source code from our source control system, make the changes, test, build, check in the source, and deploy the application on the server. Well, several choices are available. I could have used cool stuff like Python or even PHP, but no, that would require their runtimes to be deployed on the server… Not a biggie, you say? Well, it is when it’s going up on a live environment. It has to be lab tested first, with all kinds of rigorous evaluation, new documentation has to be written and approved, and all that. There was just no time for anything like that since our customer wanted the information yesterday. What we already have on the server is the .NET Framework, live with it chump. OK, so how about scripting for .NET then? Well, there is always VSA (Visual Studio for Applications), but that requires a lot of work. A quicker solution is to use System.CodeDom and reflection, so that’s the path I chose.
Creating the contract
The System.CodeDom.Compiler namespace contains the abstract class CodeDomProvider, from which the language-specific providers inherit. In my case I’m interested in the VBCodeProvider, which resides in the Microsoft.VisualBasic namespace; you could also use the CSharpCodeProvider if you wish. System.CodeDom.Compiler also contains the CompilerParameters class, with which you set, yes you guessed it, the parameters necessary to compile the source into an assembly. I will get to these classes shortly, but first let’s have a look at how we create a contract between a script file and the scripting engine that will run the script. To keep it as simple as possible I created a new class library project which I named Scripting. This library only contains an interface with one single method, ScriptMain(), which all scripts must implement.
    Public Interface IScriptable
        Sub ScriptMain()
    End Interface
With that in place a minimal script that does absolutely nothing would look like this:
    Public Class MyScript
        Implements Scripting.IScriptable

        Public Sub ScriptMain() Implements Scripting.IScriptable.ScriptMain
            'script code goes here
        End Sub
    End Class
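To make the contract a little more concrete, here is a sketch of a script that actually does something visible. The class name HelloScript is made up for illustration, and the script assumes that the engine compiling it references System.Windows.Forms.dll, as the demo engine later in this article does.

```vbnet
Imports System.Windows.Forms

' HelloScript is a hypothetical name; any class implementing IScriptable works.
Public Class HelloScript
    Implements Scripting.IScriptable

    Public Sub ScriptMain() Implements Scripting.IScriptable.ScriptMain
        ' Shows a message box to prove that the script was compiled and run.
        MessageBox.Show("Hello from the script!")
    End Sub
End Class
```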
Using the CodeDom
Now for the fun part: how to run the script. For the sake of this article I’m going to create a small demo application, consisting of a Windows Form with one button and a multi-line textbox.
You write the script in the textbox and hit the button to run it. To this project I also add a class named ScriptEngine and a reference to the above-mentioned class library containing our interface. The ScriptEngine class imports the System.CodeDom.Compiler and System.Reflection namespaces plus our Scripting namespace containing the IScriptable interface.
As I mentioned earlier I’m going to use the VBCodeProvider to compile the script. By default the provider will compile the source to a .NET 2.0 assembly. To be able to use version 3.5 of the .NET Framework we need to supply some provider options to the constructor of the class.
    Dim providerOptions = New Collections.Generic.Dictionary(Of String, String)
    providerOptions.Add("CompilerVersion", "v3.5")
    Dim provider As New VBCodeProvider(providerOptions)
Now that we have our code provider we must also add some parameters. The compiler parameters tell the provider how we want our assembly to be built. In our case we do not want to create an executable file but just an in-memory assembly. We also want to add references to some commonly used assemblies like System.Windows.Forms; most importantly, we add a reference to our Scripting.dll, which contains the IScriptable interface. For the compiler to find this class library it either has to be installed in the GAC or exist in the same folder as our executable (I use the latter). We can also provide some compiler options like Option Explicit. Note that the /OptionInfer flag can only be used with the v3.5 Framework compiler.
    Dim parameters As New CompilerParameters
    With parameters
        .GenerateExecutable = False
        .GenerateInMemory = True
        .IncludeDebugInformation = False
        .ReferencedAssemblies.Add("System.dll")
        .ReferencedAssemblies.Add("System.Windows.Forms.dll")
        .ReferencedAssemblies.Add("Microsoft.VisualBasic.dll")
        .ReferencedAssemblies.Add("Scripting.dll")
        .CompilerOptions = "/OptionExplicit+ /OptionStrict- /OptionInfer+"
    End With
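A bare file name like "Scripting.dll" may fail to resolve if the process is started with a different working directory, for example by a scheduler. One way around that, sketched here as an assumption rather than as what the demo does, is to ask the already loaded interface type where its assembly lives and add that full path instead. The module and method names are made up for illustration.

```vbnet
Imports System.CodeDom.Compiler
Imports Scripting

' ReferenceHelper and AddScriptingReference are hypothetical names.
Module ReferenceHelper
    ' Adds the assembly that defines IScriptable by its full path on disk,
    ' so the compiler does not depend on the current working directory.
    Public Sub AddScriptingReference(ByVal parameters As CompilerParameters)
        Dim contractPath As String = GetType(IScriptable).Assembly.Location
        parameters.ReferencedAssemblies.Add(contractPath)
    End Sub
End Module
```

Calling AddScriptingReference(parameters) would then take the place of the .ReferencedAssemblies.Add("Scripting.dll") line.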
OK, we now have everything set up to compile an assembly out of our script. The CodeDom provider has a number of methods to do this, for example CompileAssemblyFromFile() and CompileAssemblyFromSource(). In this demo we don’t have any source file stored on our hard drive, so we will use the latter of the two, which accepts the source as a string. As a matter of fact it accepts a ParamArray of strings, in case we have several pieces of source code that we want to compile into the same assembly.
    Dim result As CompilerResults
    result = provider.CompileAssemblyFromSource(parameters, source)
As the first argument we pass the CompilerParameters we just created and then the string containing our source code. If the returned CompilerResults doesn’t contain any compiler errors we are ready to run our script.
    If Not result.Errors.HasErrors Then
        Dim script As IScriptable = FindScriptable(result.CompiledAssembly)
        If script IsNot Nothing Then
            script.ScriptMain()
        End If
    End If
Here we call a little helper function named FindScriptable() that uses reflection to find a type in the compiled assembly that implements our IScriptable interface and returns an instance of it. If one is found we simply call its ScriptMain() method to run our script.
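The lookup can be written in more than one way. The following is a sketch of one variant, not necessarily the version used in the engine: it compares types instead of interface names, skips abstract classes, and assumes that the script class has a public parameterless constructor. The module name ScriptLocator is made up for illustration.

```vbnet
Imports System
Imports System.Reflection
Imports Scripting

' ScriptLocator is a hypothetical name for this sketch.
Module ScriptLocator
    ' Returns an instance of the first concrete class in the assembly
    ' that implements IScriptable, or Nothing if there is none.
    Public Function FindScriptable(ByVal compiled As Assembly) As IScriptable
        For Each t As Type In compiled.GetTypes()
            If t.IsClass AndAlso Not t.IsAbstract AndAlso _
               GetType(IScriptable).IsAssignableFrom(t) Then
                ' Requires a public parameterless constructor on the script class.
                Return DirectCast(Activator.CreateInstance(t), IScriptable)
            End If
        Next
        Return Nothing
    End Function
End Module
```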
The full source code for our script engine will look as follows:
    Imports System.CodeDom.Compiler
    Imports System.Reflection
    Imports Scripting

    Public Class ScriptEngine

        ' Compiles the source and runs it. Returns True only if the source
        ' compiled and a class implementing IScriptable was found and run.
        Public Function RunScript(ByVal source As String) As Boolean
            Dim providerOptions = New Collections.Generic.Dictionary(Of String, String)
            providerOptions.Add("CompilerVersion", "v3.5")
            Dim provider As New VBCodeProvider(providerOptions)

            Dim parameters As New CompilerParameters
            With parameters
                .GenerateExecutable = False
                .GenerateInMemory = True
                .IncludeDebugInformation = False
                .ReferencedAssemblies.Add("System.dll")
                .ReferencedAssemblies.Add("System.Windows.Forms.dll")
                .ReferencedAssemblies.Add("Microsoft.VisualBasic.dll")
                ' Needed so that the script can see the IScriptable interface.
                .ReferencedAssemblies.Add("Scripting.dll")
                .CompilerOptions = "/OptionExplicit+ /OptionStrict- /OptionInfer+"
            End With

            Dim result As CompilerResults
            result = provider.CompileAssemblyFromSource(parameters, source)

            If Not result.Errors.HasErrors Then
                Dim script As IScriptable = FindScriptable(result.CompiledAssembly)
                If script IsNot Nothing Then
                    script.ScriptMain()
                    Return True
                Else
                    Return False
                End If
            Else
                Return False
            End If
        End Function

        ' Returns an instance of the first type that implements IScriptable.
        Private Function FindScriptable(ByVal assembly As Assembly) As IScriptable
            For Each t As Type In assembly.GetTypes()
                If t.GetInterface("IScriptable", True) IsNot Nothing Then
                    Return DirectCast(assembly.CreateInstance(t.FullName), IScriptable)
                End If
            Next
            Return Nothing
        End Function

    End Class
Now all that is left is to call the RunScript() method from our little editor. So in the Click event of the button I have the following code:
    Dim engine As New ScriptEngine
    engine.RunScript(TextBox1.Text)
In this simple demo the RunScript() method simply returns True or False on success or failure. In a real application you probably want more information about why the compilation of your script failed. In that case you can let the method return a CompilerErrorCollection instead. You get this collection from the Errors property of the CompilerResults object.
You can then loop through this collection and give information about each compiler error including the error message and the line the error occurred on.
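As a rough sketch of such a loop, the helper below turns a CompilerErrorCollection into one line of text per entry. The module and function names are made up for illustration; the properties used (IsWarning, ErrorNumber, Line, Column, ErrorText) are members of System.CodeDom.Compiler.CompilerError.

```vbnet
Imports System
Imports System.CodeDom.Compiler
Imports System.Text

' CompilerErrorReport and Describe are hypothetical names.
Module CompilerErrorReport
    ' Formats every entry of a CompilerErrorCollection as one line of text.
    Public Function Describe(ByVal errors As CompilerErrorCollection) As String
        Dim report As New StringBuilder()
        For Each item As CompilerError In errors
            Dim kind As String = If(item.IsWarning, "Warning", "Error")
            report.AppendLine(String.Format("{0} {1} at line {2}, column {3}: {4}", _
                kind, item.ErrorNumber, item.Line, item.Column, item.ErrorText))
        Next
        Return report.ToString()
    End Function
End Module
```

In the demo form you could then show the result with something like MessageBox.Show(CompilerErrorReport.Describe(result.Errors)) whenever result.Errors.HasErrors is True.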
Extending the script engine
You now probably want to add your own classes to the script engine. The easiest way of doing that is to add your classes to the Scripting class library, since the engine already adds a reference to that library when it compiles a script. I only used shared methods in my classes to make the scripts easier to write, but you can do as you wish. To keep my scripts free of any extra “noise” I also simply require that a script contains a public sub called ScriptMain(), so that the script file doesn’t even need the Class or the Implements IScriptable declaration. I add those to the source string before I pass it along to the CodeDomProvider.
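One possible way to do that wrapping is sketched below; it is an assumption about how it could be done, not the exact code I use. Because VB needs an explicit Implements clause on the implementing method, the generated class gets a small private method that implements IScriptable.ScriptMain and forwards the call to the ScriptMain() sub written in the script. The names ScriptWrapper, WrapScript, ScriptHost and RunScriptMain are made up for illustration.

```vbnet
Imports System.Text

' ScriptWrapper, WrapScript, ScriptHost and RunScriptMain are hypothetical names.
Module ScriptWrapper
    ' Wraps a bare "Public Sub ScriptMain() ... End Sub" in a class
    ' that implements IScriptable, and returns the complete source.
    Public Function WrapScript(ByVal body As String) As String
        Dim source As New StringBuilder()
        source.AppendLine("Imports Scripting")
        source.AppendLine("Public Class ScriptHost")
        source.AppendLine("    Implements IScriptable")
        ' The interface method forwards to the ScriptMain() sub in the script body.
        source.AppendLine("    Private Sub RunScriptMain() Implements IScriptable.ScriptMain")
        source.AppendLine("        ScriptMain()")
        source.AppendLine("    End Sub")
        source.AppendLine(body)
        source.AppendLine("End Class")
        Return source.ToString()
    End Function
End Module
```

The engine would then be called with engine.RunScript(ScriptWrapper.WrapScript(TextBox1.Text)). Keep in mind that the line numbers in any compiler errors refer to the wrapped source, so they are offset by the header lines added before the script body.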
I also used the ICSharpCode.TextEditor that is part of the SharpDevelop project to build my own custom IDE with syntax highlighting.