Don't Break The Build .Net Tutorials

16 Feb 2012

Learning CIL Series – Part 1

This begins part 1 of a many-part series on the Common Intermediate Language (formerly known as MSIL), which I will refer to as CIL from now on. CIL is the assembly language of the .NET platform and is what all .NET languages ultimately get boiled down to. My reason for doing these tutorials is simple: I want to learn more about CIL and the .NET internals, and what better way to do that than blogging and teaching others? Why do I want to learn about .NET internals as well as CIL? I’m creating a programming language that I want to target the .NET framework, and I don’t want to be constrained by C#.

Before we get started, you will need a few things:

  •  ilasm.exe : this comes with the .NET framework and is located in the .NET folder. For me, it is located at C:\Windows\Microsoft.NET\Framework\v2.0.50727 (the version folder may differ on your machine).
  •  a text editor : there are no nifty IDEs for CIL that I could find, so I used Notepad++. You are free to use any kind of text editor you want as long as it isn’t Microsoft Word.
  •  some basic command line knowledge : without an IDE, we have to build our applications using ilasm.exe at the command line. PLEASE NOTE: If you are planning on using the plain command line (and not the Visual Studio Command Prompt), you will need to add the path for ilasm to your PATH variable.
  •  a basic idea of what a stack is

Getting Started:

Now, let’s get started with the ceremonial “Hello World” application, shall we? First I’ll post the code and we can go over it line by line:

.assembly extern mscorlib {}
.assembly HelloWorld {}

.method public static void main() cil managed
{
    .entrypoint

    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}

We can save it as HelloWorld.il and compile it by running:
ilasm HelloWorld.il

That will generate an executable, HelloWorld.exe, which we can then run by just typing “HelloWorld.exe” in the command prompt and BAM! it should spit out “Hello World”. But why?

Breaking It Down:

We start with two lines:

.assembly extern mscorlib {}
.assembly HelloWorld {}

The first line states that our application references the external assembly mscorlib. What is mscorlib? It is the core library (mscorlib.dll) that holds the fundamental types of the System namespace, including System.Console. It is closer to adding an assembly reference than to writing using System; in C#, because CIL always spells out the full type name wherever a type is used. The curly braces hold properties of the assembly, such as its version. We don’t need any right now, so I will cover them in a later post. The next line declares our own assembly as HelloWorld, again with blank properties.
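For the curious, here is a rough sketch of what those braces can hold. It is the same Hello World program with a version and public key token filled in for mscorlib and a version for our own assembly. The mscorlib values assume the v2.0.50727 framework mentioned earlier, and the 1:0:0:0 version for HelloWorld is just an illustrative number, not anything the program requires.

```cil
// Reference mscorlib by version and public key token.
// These values assume the .NET 2.0 mscorlib; adjust them for your framework.
.assembly extern mscorlib
{
    .publickeytoken = (B7 7A 5C 56 19 34 E0 89)
    .ver 2:0:0:0
}

// Our own assembly, with an illustrative (made-up) version number.
.assembly HelloWorld
{
    .ver 1:0:0:0
}

.method public static void main() cil managed
{
    .entrypoint

    ldstr "Hello World"
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```

It assembles and runs the same way as the version with empty braces; the only difference is the metadata recorded in the executable.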

Following comes the long line of:

.method public static void main() cil managed

If you didn’t know already, this line declares a public static method named “main” that takes no arguments and returns nothing (void). The trailing cil managed says that the method body is written in CIL and runs as managed code under the .NET runtime. This is similar to the C#:

public class Program1
{
    public static void Main()
    {
        // stuff goes here
    }
}

An astute reader may notice that in our CIL file, we didn’t declare a class. How does our program work then? You certainly can’t have a method with no class in C#. The answer is this: while CIL has an object-oriented focus, it allows you to act procedurally as well. The rule of no global methods is not a CIL rule but a C# rule. Now you may start to see what I meant when I said I didn’t want the limitations of C# as a target language.
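To make the comparison concrete, here is a sketch of the same program with the method wrapped in a class, which is roughly the shape the C# snippet takes once compiled. The class name Program1 comes from that snippet. The attributes auto, ansi, beforefieldinit and hidebysig are what the C# compiler typically emits; they are included as an assumption about its output, not because our global-method version needs them.

```cil
.assembly extern mscorlib {}
.assembly HelloWorld {}

// A class wrapper, as C# would require. It extends System.Object explicitly.
.class public auto ansi beforefieldinit Program1
       extends [mscorlib]System.Object
{
    // The same method as before, now a static member of Program1.
    .method public hidebysig static void Main() cil managed
    {
        .entrypoint

        ldstr "Hello World"
        call void [mscorlib]System.Console::WriteLine(string)
        ret
    }
}
```

Both versions print the same message. The class simply gives the method a home, which CIL treats as optional.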

We are now to the meat of the source, the stuff that actually DOES something. The first thing you’ll notice is:

.entrypoint

This tells the framework that this method is the entry point of the application. Just as a C# application has a single Main that serves as its entry point, a CIL application can have only one entry point. You will need to mark exactly one method with .entrypoint in order for your application to run. The only time you don’t need it is if you are writing a library (.dll).

ldstr "Hello World"

This instruction pushes a reference to a string object onto the stack. .NET is a stack-based virtual machine, which means that if we want to call a method that takes two arguments, we had better make sure those two arguments are on top of the stack, pushed in the order the method declares them. If not, things won’t work. So we push our message “Hello World” onto the stack so that we can call Console.WriteLine:

call void [mscorlib]System.Console::WriteLine(string)

We call System.Console.WriteLine by using the call instruction followed by the method’s return type (void here). Next, in square brackets, comes the assembly that contains the System.Console class. Then comes the full class name, with :: pointing to the method we are calling. Finally, the parameter types go in parentheses. Since we are writing a string to the screen, we put string there so the framework knows which version of WriteLine we mean.
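The two-argument case mentioned earlier can be sketched like this. The example pushes two strings and calls System.String::Concat(string, string), which pops both and pushes the joined string back onto the stack, ready for WriteLine. Concat is not part of the original program; it is only here to show a call that takes more than one argument and returns a value.

```cil
.assembly extern mscorlib {}
.assembly HelloConcat {}

.method public static void main() cil managed
{
    .entrypoint

    ldstr "Hello "    // first argument; stack: "Hello "
    ldstr "World"     // second argument; stack: "Hello ", "World"

    // Pops both strings and pushes the result; stack: "Hello World"
    call string [mscorlib]System.String::Concat(string, string)

    // Pops the result and prints it; the stack is now empty
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```

Saved as HelloConcat.il and built with ilasm in the same way, it prints “Hello World”. Notice that the return type in front of Concat is string rather than void, which is how we know something is left on the stack after the call.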

Something to note: Whenever we call a method that takes an argument, that argument is popped off the top of the stack and handed to the method. Once it has been popped, it is no longer on the stack for us to use. There are ways around that, which we will cover in future lessons.
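As a small preview, and only as a sketch of one possible workaround, the dup instruction copies the value on top of the stack. Here the string reference is duplicated before the first call, so a copy is still there for a second call.

```cil
.assembly extern mscorlib {}
.assembly HelloTwice {}

.method public static void main() cil managed
{
    .entrypoint

    ldstr "Hello World"   // stack: "Hello World"
    dup                   // stack: "Hello World", "Hello World"

    // The first call pops one copy; the other stays on the stack
    call void [mscorlib]System.Console::WriteLine(string)

    // The second call pops the remaining copy; the stack is now empty
    call void [mscorlib]System.Console::WriteLine(string)
    ret
}
```

This prints “Hello World” twice. Without the dup, the second call would find nothing on the stack to pop, and the method would not be valid.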

Back to the code. So, we just called the WriteLine method of the Console class and that means we are finished, right? I mean, we did everything that we wanted to do. Why not pack up and call it a day? Not so fast. We missed one thing. The return statement. We need to specify that our code is done running so we return using the ret instruction. After that, we close our braces and are done.

Fin.

Congratulations, you made it through the first tutorial. I hope that this has sparked an interest in learning CIL and more about the internals of the .NET framework. If you have any questions/comments/concerns, please feel free to leave them in the comments. I’m not sure what is on tap for the next tutorial, so please stay tuned.

Check out Part 2 of the series where creating our own methods is covered.

