Catching up ..

It has been a long time since I last posted anything. The past year has been a busy one: I graduated from college, got into graduate school, and got married. Now that things are starting to settle down, I hope to share more on here and finish a few open-ended projects.


I did want to mention that I have been following Ed Hickey’s learning adventure at http://blogs.msdn.com/edhickey/. I have known Ed for several years now, and I am glad to see that he is sharing his experiences with us.

GFN on DLR

Over the past few weeks I have been developing a version of Joel Pobar’s GFN language that uses the Dynamic Language Runtime (DLR). I have modified Joel’s grammar for the GFN language to make it similar to Visual Basic’s. Here is the updated grammar:


<stmt> := Dim <ident> = <expr>
    | <ident> = <expr>
    | For <ident> = <expr> To <expr> <stmt> End
    | Read <ident>
    | Print <expr>


<expr> := <string>
    | <int>
    | <bin_expr>
    | <ident>


<bin_expr> := <expr> <bin_op> <expr>
<bin_op> := + | - | * | /


<ident> := <char> <ident_rest>*
<ident_rest> := <char> | <digit>


<int> := <digit>+
<digit> := 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9


<string> := " <string_elem>* "
<string_elem> := <any char other than ">


 


Through this process I have discovered that developing a language is both an easy and a difficult task. It is difficult largely because most of the examples available on the internet, such as IronPython, are complex. My rationale for developing this version of the GFN language is to provide a simpler example of how to develop a language using the Dynamic Language Runtime.


In the coming weeks I will be exploring and implementing new features such as user-defined functions, variable scoping, and classes. For those of you who are interested, Kathy Kam has already outlined how to add BCL calls to the original GFN source in her blog post Augmenting to the Good For Nothing Compiler.


 


The solution which contains both a C# and VB.Net version of Joel’s code and mine can be found here.


Edit – 1/09/2009 updated source code

DLR Visual Basic Scanner

While researching how to develop a language I came across several different compilers on CodePlex. For my own compiler I am using the following design:


[Image: compiler design diagram]


In order to keep things simple I am following the grammar of the Visual Basic .NET language. I have started development and have managed to get the scanner written (source code below); I based it upon Joel Pobar’s Nua. I am still fine-tuning it and learning more about the proper way to develop a compiler.


 


Imports Microsoft.Scripting
Imports Microsoft.Scripting.Runtime

''' <summary>
''' A lexical analyzer for GFN. It produces a stream of lexical tokens.
''' </summary>
Public Class Scanner
    ' Buffer used to extract and process tokens
    Private _buffer As TokenizerBuffer
    ' Used to track errors
    Private _errors As ErrorSink
    ' Source code to be read.
    Private _source As SourceUnit
    ' Token being processed
    Private _token As Token

    ''' <summary>
    ''' Fetches the current token without advancing the stream position
    ''' </summary>
    ''' <returns>The current token.</returns>
    Public ReadOnly Property Peek() As Token
        Get
            If (Me._token Is Nothing OrElse Me._token.Type = TokenType.None) Then
                Me._token = ReadToken()
            End If

            Return Me._token
        End Get
    End Property ' Peek

    ''' <summary>
    ''' Constructs a scanner for the specified SourceUnit.
    ''' </summary>
    ''' <param name="errors">Sink that collects the errors found while scanning.</param>
    ''' <param name="source">Represents the source code to be processed.</param>
    Public Sub New(ByVal errors As ErrorSink, ByVal source As SourceUnit)
        Me._errors = errors
        Me._source = source

        If (Me._source Is Nothing) Then Throw New ArgumentNullException("source")

        ' multiEolns - Whether to allow multiple forms of EOLN If false only '\n' is treated as a line separator otherwise '\n', '\r\n' and '\r' are treated as separators
        Me._buffer = New TokenizerBuffer(Me._source.GetReader(), SourceLocation.MinValue, 1024, True)
    End Sub ' New

    ''' <summary>
    ''' Reads the next token available in the stream.
    ''' </summary>
    ''' <returns>Next token available in the stream.</returns>
    Public Function Read() As Token
        Dim readToken As Token = Nothing

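        ' _token is Nothing until the first token has been read, so check for that before touching Type.
        If (Me._token Is Nothing OrElse Me._token.Type = TokenType.None) Then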
            readToken = Me.ReadToken()
            Me._token = Me.ReadToken()
            Return readToken
        End If

        readToken = Me._token
        Me._token = Me.ReadToken()
        Return readToken
    End Function ' Read

    ''' <summary>
    ''' Reads the next available token in the stream.
    ''' </summary>
    ''' <returns>Next available token.</returns>
    Private Function ReadToken() As Token
        Dim token As Token = Nothing
        Dim nchr As Char = Nothing

        ' Discard any white spaces.
        While (Me._buffer.Peek <> -1 AndAlso Char.IsWhiteSpace(ChrW(Me._buffer.Peek)))
            Me._buffer.Read()
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
        End While

        ' Has the end of the buffer been reached?
        If (Me._buffer.Peek = -1) Then
            Me._buffer.MarkTokenEnd(True)
            token = New EndOfStreamToken(New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
            Return token
        End If

        ' Look at the first character available in the buffer without consuming it.
        nchr = ChrW(Me._buffer.Peek())

        If (Char.IsLetter(nchr) OrElse nchr = "_"c) Then
            Return Me.ScanKeywordOrIdentifier()
        ElseIf Char.IsDigit(nchr) Then
            Return Me.ScanNumericLiteral()
        ElseIf (nchr = """"c) Then
            Return Me.ScanStringLiteral()
        ElseIf (nchr = "="c) Then
            Me._buffer.Read()
            Me._buffer.MarkTokenEnd(False)
            token = New PunctuatorToken(New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), TokenType.Equals)
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
            Return token
        ElseIf ("+*/^(){}".IndexOf(nchr) <> -1) Then
            Dim type As TokenType = TokenType.None

            ' Single-character punctuation.
            Select Case Me._buffer.Read()
                Case AscW("+")
                    type = TokenType.Plus
                Case AscW("*")
                    type = TokenType.Star
                Case AscW("/")
                    type = TokenType.ForwardSlash
                Case AscW("^")
                    type = TokenType.Caret
                Case AscW("(")
                    type = TokenType.LeftParenthesis
                Case AscW(")")
                    type = TokenType.RightParenthesis
                Case AscW("{")
                    type = TokenType.LeftCurlyBrace
                Case AscW("}")
                    type = TokenType.RightCurlyBrace
            End Select

            Me._buffer.MarkTokenEnd(False)
            token = New PunctuatorToken(New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), type)
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
            Return token
        Else
            ' An invalid character has been discovered. Consume it so that the
            ' scanner does not see the same character again on the next call.
            Me._buffer.Read()
            Me._buffer.MarkTokenEnd(False)
            token = New ErrorToken(SyntaxErrorType.InvalidCharacter, New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
            Return token
        End If
    End Function ' ReadToken

    ''' <summary>
    ''' Scans in all the digits in the numeric literal.
    ''' </summary>
    ''' <param name="acc">Accumulator that receives the scanned digits.</param>
    Private Sub ScanDigit(ByRef acc As Text.StringBuilder)
        Dim nchr = ChrW(Me._buffer.Peek)
        If (Me._buffer.Peek = -1 OrElse Not Char.IsDigit(nchr)) Then
            Me._errors.Add(Me._source, "Expected digits", New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), 0, Severity.FatalError)
            ' Invalid numeric constant. 
            acc.Append("0")
        Else
            While (Me._buffer.Peek <> -1 AndAlso Char.IsDigit(nchr))
                ' ChrW (not Chr) so that any Unicode digit accepted by Char.IsDigit converts safely.
                acc.Append(ChrW(Me._buffer.Read()))
                nchr = ChrW(Me._buffer.Peek)
            End While
        End If
    End Sub ' ScanDigit

    ''' <summary>
    ''' Scans in the identifier.
    ''' </summary>
    ''' <returns>Token associated with the identifier.</returns>
    Private Function ScanIdentifier() As Token
        Dim acc As New Text.StringBuilder()
        Dim nchr As Char = Nothing
        Dim token As Token = Nothing


        nchr = ChrW(Me._buffer.Peek())
        While (Me._buffer.Peek <> -1 AndAlso (Char.IsLetterOrDigit(nchr) OrElse nchr = "_"c))
            acc.Append(ChrW(Me._buffer.Read()))
            nchr = ChrW(Me._buffer.Peek())
        End While

        Me._buffer.MarkTokenEnd(False)
        token = New IdentifierToken(acc.ToString(), New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), TokenType.Identifier)
        ' Buffer can drop current token.
        Me._buffer.DiscardToken()
        Return token
    End Function ' ScanIdentifier

    ''' <summary>
    ''' Identifies the keyword or identifier.
    ''' </summary>
    ''' <returns>Token associated with the given keyword or identifier.</returns>
    Private Function ScanKeywordOrIdentifier() As Token
        ' ScanIdentifier is declared as returning Token, so cast explicitly.
        Dim identifier As IdentifierToken = DirectCast(Me.ScanIdentifier(), IdentifierToken)
        Dim token As Token = Nothing

        Me._buffer.MarkTokenEnd(False)
        token = New IdentifierToken(identifier.Identifier, New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), IdentifierToken.TokenTypeFromString(identifier.Identifier))
        ' Buffer can drop current token.
        Me._buffer.DiscardToken()
        Return token
    End Function ' ScanKeywordOrIdentifier

    ''' <summary>
    ''' Scans in a numeric literal.
    ''' </summary>
    ''' <returns>Token associated with the numeric literal.</returns>
    Private Function ScanNumericLiteral() As Token
        Dim acc As New Text.StringBuilder()
        Dim token As Token = Nothing

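        Dim value As Integer = 0

        Me.ScanDigit(acc)
        Me._buffer.MarkTokenEnd(False)
        ' Integer.Parse would throw on a literal that does not fit in an Integer;
        ' report it through the error sink instead and fall back to 0.
        If (Not Integer.TryParse(acc.ToString(), value)) Then
            Me._errors.Add(Me._source, "Invalid integer literal", New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), 0, Severity.FatalError)
        End If
        token = New IntegerLiteralToken(value, New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))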
        ' Buffer can drop current token.
        Me._buffer.DiscardToken()
        Return token
    End Function ' ScanNumericLiteral

    ''' <summary>
    ''' Scans in a string literal.
    ''' </summary>
    ''' <returns>Token associated with the string literal.</returns>
    Private Function ScanStringLiteral() As Token
        Dim acc As New Text.StringBuilder()
        Dim nchr As Char = Nothing
        Dim token As Token = Nothing

        ' Discard the initial quote. 
        Me._buffer.Read()

        ' Has the end of the buffer been reached?
        If (Me._buffer.Peek = -1) Then
            Me._errors.Add(Me._source, "Unterminated string literal", New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), 0, Severity.FatalError)
            Me._buffer.MarkTokenEnd(False)
            token = New ErrorToken(SyntaxErrorType.InvalidStringLiteral, New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))
            ' Buffer can drop current token.
            Me._buffer.DiscardToken()
            Return token
        End If

        nchr = ChrW(Me._buffer.Peek)

        ' Read until the terminating quote is read. 
        While (Not nchr = """"c)
            acc.Append(ChrW(Me._buffer.Read()))

            ' Has the end of the buffer been reached?
            If (Me._buffer.Peek = -1) Then
                Me._errors.Add(Me._source, "Unterminated string literal", New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd), 0, Severity.FatalError)
                Me._buffer.MarkTokenEnd(False)
                token = New ErrorToken(SyntaxErrorType.InvalidStringLiteral, New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))
                ' Buffer can drop current token.
                Me._buffer.DiscardToken()
                Return token
            Else
                nchr = ChrW(Me._buffer.Peek)
            End If
        End While

        ' Discard the terminating quote.
        Me._buffer.Read()

        Me._buffer.MarkTokenEnd(False)
        token = New StringLiteralToken(acc.ToString(), New SourceSpan(Me._buffer.TokenStart, Me._buffer.TokenEnd))
        ' Buffer can drop current token.
        Me._buffer.DiscardToken()
        Return token
    End Function ' ScanStringLiteral
End Class ' Scanner

Dynamic Language Runtime

One of the things I have wanted to do is develop a programming language. It always seemed extremely complicated, something I would ultimately do when I was much older. Recently my interest was piqued and I started asking around about what needs to be done in order to develop a compiler. Paul Vick suggested that I look at IronPython and IronRuby, both of which are built on a new framework being developed by Microsoft known as the Dynamic Language Runtime (DLR). The DLR enables developers to more readily develop dynamic languages for the .NET platform.


Over the next several weeks I will be studying the DLR and working on using it to build a basic compiler. Keep an eye out for updates.


 

Generics in Visual Basic 2005

Summary: Provides an overview of generics, constraints and their implementation.

Generics in Visual Basic 2005 make it possible to reuse code and still have strong typing. With Visual Basic .NET 2003 a different collection had to be defined for each data type, causing repetition of code. Generics allow classes and methods to work with unrelated types. A major benefit of generics is that one collection can be created to handle all the data types needed, thus cutting the amount of code needed.

Using Object as the item type for the collection is another way that one collection can handle all the data types needed. However, with Object performance is adversely affected because the items require casting and type validation at run time. With generics, by contrast, the collection is strongly typed for each data type: types are validated automatically at compile time, and the run-time work that would affect performance is avoided.

Defining Generic Types

A generic type is a type, such as a class, that is defined with one or more type parameters. When defining a generic class, the Of keyword followed by one or more type parameter names is added, in parentheses, to the class definition.

This is a definition of a regular class:


Public Class DemoClass
    ' Some methods and types here!
End Class

And now this is a definition of a generic class:


Public Class DemoClass(Of t)
    ' Some methods and types here!
End Class

Notice the difference between the regular class definition and the generic class definition. The generic class has the keyword Of followed by a t in parentheses at the end of the declaration line. The Of keyword tells the compiler that there will be a type named “t” that will be filled in later. That type is determined by whatever type is supplied when the class is used. The name that follows the Of keyword is known as a generic type parameter. Any valid identifier can be used, not just a single letter. In this case “t” is the generic type parameter.
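As a rough sketch of how the type parameter gets "filled in", the example below defines a small generic class and uses it with two different types. The names Box and BoxDemo are made up for this illustration.

```vbnet
Option Strict On
Imports System

' Hypothetical generic class: t is filled in when the class is used.
Public Class Box(Of t)
    Private _value As t

    Public Sub New(ByVal value As t)
        _value = value
    End Sub

    Public ReadOnly Property Value() As t
        Get
            Return _value
        End Get
    End Property
End Class

Public Module BoxDemo
    Public Sub Main()
        ' t becomes Integer for this instance...
        Dim number As New Box(Of Integer)(42)
        ' ...and String for this one.
        Dim word As New Box(Of String)("hello")

        ' No casts are needed; the compiler knows the type of Value.
        Dim n As Integer = number.Value
        Dim s As String = word.Value

        Console.WriteLine(n)
        Console.WriteLine(s)
    End Sub
End Module
```

The same class definition serves both instances; only the type supplied after Of changes.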

 Using Generic Type Parameters 

The use of generic types within methods allows the class to contain more flexible methods. The benefit of using these types within methods can clearly be seen through a strongly typed collection example.

 

Without the use of generics a simple add method within a collection would look like this:

Public Class DemoClass
    Inherits System.Collections.CollectionBase

    Public Function Add(ByVal value As Person) As Int32
        ' InnerList.Add returns the index at which the item was added.
        Return Me.InnerList.Add(value)
    End Function
End Class

However, with generics the collection is able to handle any type passed.

Public Class GenericCollection(Of t)
    Inherits System.Collections.CollectionBase

    Public Function Add(ByVal value As t) As Int32
        ' InnerList.Add returns the index at which the item was added.
        Return Me.InnerList.Add(value)
    End Function
End Class

The amount of code is reduced because one strongly typed collection can be created to handle all the types needed. In the code example above the generic type parameter is used as the type of the parameter of the Add function.

 

When the class is used, a type must be supplied for the generic type parameter for that instance of the class.

Public Class DemoClass
    Public Sub RandomMethod()
        Dim gc As New GenericCollection(Of Person)
    End Sub
End Class

Notice that the Person type has been used for the generic type parameter in this case. The code above creates a new instance of the GenericCollection class and passes the Person type as the type argument. This means that this instance of GenericCollection is strongly typed for Person objects.

Creating Generic Methods 

Creating generic methods is similar to the creation of a regular method. 

This is a definition of a regular method.


Public Class DemoClass
    Public Sub RegularMethod()
        ' Some code goes here!
    End Sub
End Class

This is a definition of a generic method.

Public Class DemoClass
    Public Sub GenericMethod(Of t)()
        ' Some code goes here!
    End Sub
End Class
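To make the definition a little more concrete, here is a minimal sketch of a generic method that does something: it exchanges two variables of any one type. The names Swap and SwapDemo are made up for this illustration.

```vbnet
Option Strict On
Imports System

Public Module SwapDemo
    ' Hypothetical generic method: exchanges two variables of the same type t.
    Public Sub Swap(Of t)(ByRef first As t, ByRef second As t)
        Dim temp As t = first
        first = second
        second = temp
    End Sub

    Public Sub Main()
        Dim a As Integer = 1
        Dim b As Integer = 2
        ' The type argument is given explicitly here.
        Swap(Of Integer)(a, b)

        Dim x As String = "left"
        Dim y As String = "right"
        ' Here the compiler infers the type argument from the arguments.
        Swap(x, y)

        Console.WriteLine("{0} {1}", a, b)
        Console.WriteLine("{0} {1}", x, y)
    End Sub
End Module
```

One method body covers both the Integer and the String case, which is the saving that generic methods are meant to provide.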

 

Generic methods can be used to reduce the amount of code needed to complete several similar tasks. For example, consider obtaining data from a database. One generic method can be written to handle all the types within that database.

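' Requires Imports System.Data.OleDb and Imports System.Collections.Generic.
' "connection" is assumed to be an open OleDbConnection declared elsewhere.
Public Function GetValues(Of t)(ByVal sql As String) As List(Of t)
    Dim cmd As New OleDbCommand(sql, connection)
    Dim list As New List(Of t)

    ' Using closes the reader even if a conversion fails part-way through.
    Using reader As OleDbDataReader = cmd.ExecuteReader()
        While reader.Read()
            list.Add(CType(reader(0), t))
        End While
    End Using

    Return list
End Function

Why use List(Of t) instead of CollectionBase?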

When creating a new collection class, List(Of t) should be used as the base class rather than CollectionBase. CollectionBase is a non-generic class, and there is no efficient way to switch between non-generic and generic code. List(Of t) provides the full functionality of any given collection in generic form.
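The difference can be sketched with two small functions, one working on List(Of Integer) and one on the non-generic ArrayList (used here as a stand-in for the untyped storage that non-generic collections rely on). The module and function names are made up for this illustration.

```vbnet
Option Strict On
Imports System
Imports System.Collections
Imports System.Collections.Generic

Public Module ListDemo
    ' Generic version: every element is already known to be an Integer.
    Public Function SumGeneric(ByVal values As List(Of Integer)) As Integer
        Dim total As Integer = 0
        For Each value As Integer In values
            total += value
        Next
        Return total
    End Function

    ' Non-generic version: elements are stored as Object and must be
    ' converted back to Integer at run time.
    Public Function SumUntyped(ByVal values As ArrayList) As Integer
        Dim total As Integer = 0
        For Each item As Object In values
            total += CInt(item)
        Next
        Return total
    End Function

    Public Sub Main()
        Dim typed As New List(Of Integer)
        typed.Add(1)
        typed.Add(2)
        ' typed.Add("three") is rejected by the compiler under Option Strict On.

        Dim untyped As New ArrayList()
        untyped.Add(1)
        untyped.Add(2)
        ' untyped.Add("three") would compile, and SumUntyped would then fail at run time.

        Console.WriteLine(SumGeneric(typed))
        Console.WriteLine(SumUntyped(untyped))
    End Sub
End Module
```

With the generic list a wrong element type is caught when the code is compiled; with the untyped list the same mistake only shows up when the program runs.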

 What if I want to limit generic types? 

In several cases the generic type parameter needs to be limited to specific types, to ensure that the type supplied can perform the task needed. Types are limited through constraints. Consider a generic collection that contains a sort method: its generic type parameter would have to be limited to types that implement the IComparable interface.


Public Class GenericCollection(Of t As IComparable(Of t))
    Inherits System.Collections.Generic.List(Of t)

    Public Sub New()
        ' Initialize a new instance of this object.
        MyBase.New()
    End Sub
End Class

Notice that when the generic type is defined the constraint is also put into place. The constraint is defined by the use of the As keyword directly after the generic type parameter. It is important that the same name be used for both the generic type parameter and the type argument inside the constraint.

 

IComparable is an interface implemented by types that can be sorted. It defines a single method, CompareTo, that tells the application how two values of the type should be compared. A collection can then be sorted by comparing the values it contains. Notice that the code below does not implement the IComparable interface.

Public Class Person
    Private _firstName As String
    Private _lastName As String

    Public Property FirstName() As String
        Get
            Return _firstName
        End Get
        Set(ByVal value As String)
            _firstName = value
        End Set
    End Property

    Public Property LastName() As String
        Get
            Return _lastName
        End Get
        Set(ByVal value As String)
            _lastName = value
        End Set
    End Property
End Class

When the Person object, whose code is above, is used as the generic type for the GenericCollection class an error occurs.

Public Class DemoClass
    Public Sub DemoMethod()
        Dim gc As New GenericCollection(Of Person)
    End Sub
End Class

The compiler reports that Person does not implement the IComparable interface and therefore does not satisfy the constraint. IComparable is used as a constraint to ensure that the type can be sorted. To correct the error, simply implement the IComparable interface.

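Public Class Person
    Implements IComparable(Of Person)

    ' The rest of the code for the object.

    Public Function CompareTo(ByVal other As Person) As Integer Implements System.IComparable(Of Person).CompareTo
        ' Example ordering only: by last name, then by first name.
        If other Is Nothing Then Return 1
        Dim result As Integer = String.Compare(_lastName, other._lastName)
        If result = 0 Then result = String.Compare(_firstName, other._firstName)
        Return result
    End Function
End Class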
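Once a type implements IComparable(Of t), a list of that type can be sorted without any extra code. The sketch below uses a simplified stand-in for the Person class (public fields instead of properties, and an ordering by last name then first name that is only an assumption for the example).

```vbnet
Option Strict On
Imports System
Imports System.Collections.Generic

' Simplified stand-in for the Person class, for illustration only.
Public Class Person
    Implements IComparable(Of Person)

    Public FirstName As String
    Public LastName As String

    Public Sub New(ByVal first As String, ByVal last As String)
        Me.FirstName = first
        Me.LastName = last
    End Sub

    ' Assumed ordering: last name first, then first name.
    Public Function CompareTo(ByVal other As Person) As Integer Implements IComparable(Of Person).CompareTo
        If other Is Nothing Then Return 1
        Dim result As Integer = String.Compare(LastName, other.LastName, StringComparison.Ordinal)
        If result = 0 Then
            result = String.Compare(FirstName, other.FirstName, StringComparison.Ordinal)
        End If
        Return result
    End Function
End Class

Public Module SortDemo
    Public Sub Main()
        Dim people As New List(Of Person)
        people.Add(New Person("Carol", "Smith"))
        people.Add(New Person("Alice", "Young"))
        people.Add(New Person("Bob", "Smith"))

        ' Sort() with no arguments relies on Person.CompareTo.
        people.Sort()

        For Each p As Person In people
            Console.WriteLine("{0}, {1}", p.LastName, p.FirstName)
        Next
    End Sub
End Module
```

After the call to Sort the list is ordered Smith, Bob; Smith, Carol; Young, Alice.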

 What if I need multiple constraints? 

In some cases generic types need to be constrained by more than one constraint. Multiple constraints are defined just like single constraints with a small exception.

Public Class GenericCollection(Of t As {IComparable(Of t), New})
    Inherits System.Collections.Generic.List(Of t)

    Public Sub New()
        ' Initialize a new instance of this object.
        MyBase.New()
    End Sub
End Class

Notice that when defining multiple constraints they are enclosed in { } and separated by a comma.

 

Special Constraints

 

There are a few special constraints, such as class constraints and New. These constraints are important because they ensure that the generic type parameter is capable of the task required. For instance, when a generic type parameter needs to be instantiated, the New constraint must be used.

Public Class DemoClass(Of t As New)
    Public Function GetItem() As t
        Dim newType As New t

        ' Some more code here.

        Return newType
    End Function
End Class

 

The code above only illustrates the point and has no real functionality as it stands. New is used as a constraint to ensure that the type supplied has a parameterless constructor that can be called. A class is used as a constraint to require that the type inherit from that particular class. Using a class as a constraint is a good idea because it ensures that the type will be able to perform the functions defined in that class.
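A class constraint can be sketched as follows, combined with New so that the method can also create the object. Shape, Square and CreateAndMeasure are hypothetical names used only for this illustration.

```vbnet
Option Strict On
Imports System

' Hypothetical base class used as a constraint.
Public Class Shape
    Public Overridable Function Area() As Double
        Return 0.0
    End Function
End Class

Public Class Square
    Inherits Shape

    Public Side As Double = 2.0

    Public Overrides Function Area() As Double
        Return Side * Side
    End Function
End Class

Public Module ConstraintDemo
    ' t must inherit from Shape (class constraint) and must have a
    ' parameterless constructor (New constraint).
    Public Function CreateAndMeasure(Of t As {Shape, New})() As Double
        Dim item As New t()
        ' Calling Area is allowed because t is known to be a Shape.
        Return item.Area()
    End Function

    Public Sub Main()
        Console.WriteLine(CreateAndMeasure(Of Square)())
        ' CreateAndMeasure(Of String)() would not compile: String is not a Shape.
    End Sub
End Module
```

Without the Shape constraint the call to Area would not compile, because the compiler would know nothing about what t can do.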

Generics and constraints reduce the amount of work that the programmer used to have to do. In the past a programmer had to create several collections that all served the same purpose, and perhaps had to spend a great deal of time debugging that code. With generics that headache has become a thing of the past.