In the last post, I’ve shown how to parse SQL Server code with C# and get all tokens in it, showing their types. This is very nice, there is a lot you can do with that, but there is a pitfall: you don’t have the context the token was used. For example, you have an identifier, but you don’t know if it is a parameter or a variable declared in the procedure. You don’t know its type and where it is used.

Sometimes you need more information about the token and you can’t get it unless you analyze the code. Fortunately, Microsoft has already done that for us and provided a wonderful tool, so we can get the context of the tokens: you can create a new class that inherits from TSqlFragmentVisitor and override its methods, to visit the node types you want (if this doesn’t make sense to you right now, keep on reading, you’ll see it’s very simple).

The TSqlFragmentVisitor class is a massive class that uses the Visitor Pattern to analyze the SQL tree and visit all the nodes, so you can take action on the nodes that come to your attention. If you take a look at it, you will see that there are a huge number of Visit and ExplicitVisit overrrides, one for each kind of node you can have in a SQL Server procedure. Using Visit is similar to ExplicitVisit, with the difference that with ExplicitVisit you can control if the children nodes are also visited.

To use it, you must parse the tree in the same way you did in the last article and then create a visitor instance and pass it in the Accept method. This code shows how this is done:

static void Main(string[] args)
{
    var sql = "Select * from customer";
    var parsed = ParseSql(sql);
    if (parsed.errors.Any())
        return;
    var visitor = new SelectVisitor();
    parsed.sqlTree.Accept(visitor);
    Console.ReadLine();
}

private static (TSqlFragment sqlTree, IList<ParseError> errors) ParseSql(string procText)
{
    var parser = new TSql150Parser(true);
    using (var textReader = new StringReader(procText))
    {
        var sqlTree = parser.Parse(textReader, out var errors);

        return (sqlTree, errors);
    }
}

The visitor class is something like this:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void Visit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
    }

    public override void Visit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void Visit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void Visit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}

Running this code, you will get something like this:

As you can see, the node and its children are visited and we have no control on that. If you want to have some control, you must use the ExplicitVisit:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void ExplicitVisit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
    }

    public override void Visit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void Visit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void Visit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}

As you can see, the child nodes are not visited. To visit them, you must call the base method, like this:

internal class SelectVisitor : TSqlFragmentVisitor
{
    public override void ExplicitVisit(SelectStatement node)
    {
        Console.WriteLine($"Visiting Select: {node}");
        base.ExplicitVisit(node);
    }

    public override void ExplicitVisit(QueryExpression node)
    {
        Console.WriteLine($"Visiting QueryExpression: {node}");
    }

    public override void ExplicitVisit(QuerySpecification node)
    {
        Console.WriteLine($"Visiting QuerySpecification: {node}");
    }

    public override void ExplicitVisit(SelectStarExpression node)
    {
        Console.WriteLine($"Visiting SelectStarExpression: {node}");
    }
}

Nice, no? With this simple code, you can write something to check if any procedure in your database has a “select *”, which can be flagged as an error. Just override the Visit for the SelectStarExpression and flag the error if it’s visited.

Generating statistics from the database procedures

With this knowledge, we can write a program that generates statistics from the database procedures. We want to know how many inserts, deletes and updates there are in the procedures and what are the tables used with these commands. We also can know how many tables are created and dropped in the procedures.

For that, create a new console application and name it DataStats Add the NuGet package Microsoft.SqlServer.TransactSql.ScriptDom and add this code to Program.cs:

     static void Main(string[] args)
     {
         using (var con = new SqlConnection("Server=.;Database=WideWorldImporters;Trusted_Connection=True;"))
         {
             con.Open();
             var procTexts = GetStoredProcedures(con)
                 .Select(n => new { ProcName = n, ProcText = GetProcText(con, n) })
                 .ToList();
             var procTrees = procTexts.Select(p =>
             {
                 var processed = ParseSql(p.ProcText);
                 var visitor = new StatsVisitor();
                 if (!processed.errors.Any())
                     processed.sqlTree.Accept(visitor);
                 return new { p.ProcName, processed.sqlTree, processed.errors, visitor };
             }).ToList();
             foreach (var procTree in procTrees)
             {
                 Console.WriteLine(procTree.ProcName);
                 if (procTree.errors.Any())
                 {
                     Console.WriteLine("   Errors found:");
                     foreach (var error in procTree.errors)
                     {
                         Console.WriteLine($"     Line: {error.Line}  Col: {error.Column}: {error.Message}");
                     }
                 }
                 else
                 {
                     var visitor = procTree.visitor;
                     Console.WriteLine($"  Inserts: {visitor.Inserts}");
                     foreach (var table in visitor.InsertTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Updates: {visitor.Updates}");
                     foreach (var table in visitor.UpdateTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Deletes: {visitor.Deletes}");
                     foreach (var table in visitor.DeleteTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Creates: {visitor.Creates}");
                     foreach (var table in visitor.CreateTables)
                     {
                         Console.WriteLine($"      {table}");
                     }

                     Console.WriteLine($"  Drops: {visitor.Drops}");
                     foreach (var table in visitor.DropTables)
                     {
                         Console.WriteLine($"      {table}");
                     }
                 }
             }
         }

         Console.ReadLine();
     }

     private static List<string> GetStoredProcedures(SqlConnection con)
     {
         using (SqlCommand sqlCommand = new SqlCommand(
             "select s.name+'.'+p.name as name from sys.procedures p " + 
             "inner join sys.schemas s on p.schema_id = s.schema_id order by name",
             con))
         {
             using (DataTable procs = new DataTable())
             {
                 procs.Load(sqlCommand.ExecuteReader());
                 return procs.Rows.OfType<DataRow>().Select(r => r.Field<String>("name")).ToList();
             }
         }
     }

     private static string GetProcText(SqlConnection con, string procName)
     {
         using (SqlCommand sqlCommand = new SqlCommand("sys.sp_helpText", con)
         {
             CommandType = CommandType.StoredProcedure
         })
         {
             sqlCommand.Parameters.AddWithValue("@objname", procName);
             using (var proc = new DataTable())
             {
                 try
                 {
                     proc.Load(sqlCommand.ExecuteReader());
                     return string.Join("", proc.Rows.OfType<DataRow>().Select(r => r.Field<string>("Text")));
                 }
                 catch (SqlException)
                 {
                     return null;
                 }
             }
         }
     }

     private static (TSqlFragment sqlTree, IList<ParseError> errors) ParseSql(string procText)
     {
         var parser = new TSql150Parser(true);
         using (var textReader = new StringReader(procText))
         {
             var sqlTree = parser.Parse(textReader, out var errors);

             return (sqlTree, errors);
         }
     }
 }

This code is very similar to the one in the last post, but it will create a visitor and will call the Accept method to visit its nodes. The visitor’s code is:

internal class StatsVisitor : TSqlFragmentVisitor
{
    public int Inserts { get; private set; }
    public int Updates { get; private set; }
    public int Deletes { get; private set; }
    public int Creates { get; private set; }
    public int Drops { get; private set; }
    public List<string> InsertTables { get; }
    public List<string> UpdateTables { get; }
    public List<string> DeleteTables { get; }
    public List<string> CreateTables { get; }
    public List<string> DropTables { get; }

    public StatsVisitor()
    {
        InsertTables = new List<string>();
        UpdateTables = new List<string>();
        DeleteTables = new List<string>();
        CreateTables = new List<string>();
        DropTables = new List<string>();
    }

    public override void Visit(InsertStatement node)
    {
        Inserts++;
        InsertTables.Add((node.InsertSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(UpdateStatement node)
    {
        Updates++;
        UpdateTables.Add((node.UpdateSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(DeleteStatement node)
    {
        Deletes++;
        UpdateTables.Add((node.DeleteSpecification.Target as NamedTableReference)?.
            SchemaObject.BaseIdentifier.Value);
    }

    public override void Visit(CreateTableStatement node)
    {
        Creates++;
        CreateTables.Add(node.SchemaObjectName.BaseIdentifier.Value);
    }

    public override void Visit(DropTableStatement node)
    {
        Drops++;
        DropTables.AddRange(node.Objects.Select(o => o.BaseIdentifier.Value));
    }

}

When there is an Insert, Delete, Update, Create Table or Drop Table, the corresponding method will be called and the properties referring to the node will be updated. At the end you will have the statistics for all the procedures in the database:

There is just a minor glitch in this code: as you can see in the Inserts, there is a blank line. That is due to the fact that the insert target isn’t a table, but a variable. This code solves the issue:

public override void Visit(InsertStatement node)
{
    Inserts++;
    var targetName = (node.InsertSpecification.Target as NamedTableReference)?.
        SchemaObject.BaseIdentifier.Value ??
      (node.InsertSpecification.Target as VariableTableReference)?.Variable.Name;
    InsertTables.Add(targetName);
}

Conclusions

As you can see, Microsoft has put a lot of work in this parser, you can have a lot of power when working with SQL Server code in C# with this parser. The options are endless: check quality of the code, refactor code, make statistics, and so on. In the next post we will see a simple way to reformat your SQL Server code, so you can ensure that the code follows your standards. See you then.

The code for this article is in https://github.com/bsonnino/SqlParsing

Leave a Reply

Your email address will not be published. Required fields are marked *

Post Navigation