I’d like to introduce my understanding of “Instruction Set Architecture” (ISA) here, and close with a pretty cool example we learned in class. To kick it off, here’s a question that floated around in my head, though I sort of took the answer for granted:

Do different programming languages compile into fundamentally different programs?

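And the answer to this is no. Once source code is compiled (for languages that compile down to native code, anyway), the resulting program is machine code: the binary instructions that the processor actually executes. Assembly is the human-readable way of writing those same instructions. (There are different versions of assembly, and which one you get depends on the processor family, such as x86 or ARM, rather than on the programming language.) So no matter your programming language, ultimately the compiled program ends up in the one language that your computer can actually read and run.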

What does this language do? Our programs have tons of instructions that seemingly never affect the actual, real computer they run on. If I define a variable with ‘int x = 0;’, what happens? Assembly makes it clear what each line will do, and which actual pieces of hardware it manipulates.

This is where Instruction Set Architecture comes in. The ISA is the line between software and hardware in a computer: it is how a program communicates with the components of the computer. I’ll introduce the different components in a later post, but first understand this: every program instruction changes something in the computer. Be it defining a variable, adding two numbers, or calling a function, something physical has to happen to execute it.

So what happens? Glad you asked. As its name implies, the ISA is a set of instructions, sort of like function names. If I want to add two numbers, I might call the “add two numbers” instruction, and the computer will run some stuff through its memory and ALU (arithmetic and logic unit) and add the two numbers for me. The actual implementation of this adding instruction is found in something called the “microarchitecture”. The microarchitecture essentially contains the implementations of the functions named in the ISA.

There may not be an “add two numbers” instruction that does exactly that in the ISA. Instead, there are instructions that are capable of adding two numbers, but may not do only that. Let’s put that more clearly. Every programming language has thousands upon thousands of basic commands and functions. When compiled, all of this has to be transformed into a bunch of ISA instructions. So we know the ISA has enough instructions to perform everything you would want to do. But instead of creating carbon copies of those thousands of commands, the ISA uses far fewer, along with a smart compiler that can express all sorts of commands as combinations of these instructions.

How many fewer? Let’s get to my cool example (yay!). What is the smallest number of instructions that we would need in order to express every single operation in C++, Java, and other programming languages? It may seem overwhelming, but often operations can be grouped together. Let’s break down the different types of tasks we want to do:
1. Arithmetic. Adding, subtracting, and the like are all important. Note that once we can add (and negate), we can build the rest: subtracting is adding a negative, multiplying is repeated adding, and other operations can be built up or approximated in similar ways.
2. Access memory. Everything has to be stored somewhere. If we want to reference a variable, we need to read it from somewhere. If we performed an operation, we need to store the result somewhere.
3. Conditional logic. The ones that probably immediately come to mind are if() statements and for() and while() loops, but even things as simple as calling a function involve jumping around in the code. We need ways to do all of these.
And that’s it! Every single operation we ever write can be coded into one of the above. (Side note, when learning this our teacher never mentioned input/output such as displaying text, but I assume this is somewhere else in the architecture, since things like display are part of the hardware.)
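To make the three categories a bit more concrete, here is a rough Python sketch of a toy machine and a multiplication built from repeated addition. The instruction names ("add", "jump_if_zero"), the named memory cells, and the program layout are all made up for illustration and are not from any real ISA. Memory access is folded into the instructions themselves, since each one reads and writes cells directly.

```python
def run(program, mem):
    """Run a toy program. mem is a dict of named memory cells (hypothetical layout)."""
    pc = 0  # index of the instruction to execute next
    while 0 <= pc < len(program):
        op, x, y = program[pc]
        if op == "add":
            # Arithmetic + memory access: mem[x] = mem[x] + mem[y]
            mem[x] += mem[y]
            pc += 1
        elif op == "jump_if_zero":
            # Conditional logic: jump to instruction y when cell x holds 0
            pc = y if mem[x] == 0 else pc + 1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return mem


# result = a * count, using only addition and a conditional jump.
# Assumes count starts out as a non-negative integer.
multiply = [
    ("jump_if_zero", "count", 4),   # 0: done when count reaches 0 (4 is past the end)
    ("add", "result", "a"),         # 1: result += a
    ("add", "count", "minus_one"),  # 2: count -= 1, written as adding -1
    ("jump_if_zero", "zero", 0),    # 3: "zero" always holds 0, so this always jumps back
]

mem = run(multiply, {"a": 6, "count": 4, "result": 0, "minus_one": -1, "zero": 0})
print(mem["result"])  # 24
```

Notice that the loop is nothing more than a conditional jump backwards, and the unconditional jump is a conditional jump on a cell that is always 0.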
So how many instructions do we really need? If you’re ballsy you might say 3, one for each category. But actually, we only need one! That’s right, every single operation out there can be compiled down to sequences of one instruction. And it looks like this:
subleq(a,b,c){
Mem[b] = Mem[b] - Mem[a];   // subtract
if(Mem[b] <= 0) goto c;     // branch if the result is less than or equal to zero
}
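Subtraction is easy to see here. We can read a value from memory without changing it by letting b be the cell we want and letting a point at a cell that holds 0, since subtracting 0 leaves Mem[b] as it was. We can jump to c by making sure the result is zero or negative, for example by putting a negative number into Mem[b] and again letting a point at a cell that holds 0. Addition takes a couple of steps: subtract Mem[a] from a scratch cell holding 0 to get its negative, then subtract that negative from Mem[b].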
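Here is a minimal Python sketch of a machine that knows only subleq, running a three-instruction program that adds two numbers. The memory layout (three cells per instruction, code first, data after), the cell names X, Y, Z, and the convention that jumping to a negative address halts are my own assumptions for illustration, not something from class.

```python
def run_subleq(mem, pc=0, max_steps=10_000):
    """Execute subleq instructions stored in mem as consecutive triples (a, b, c)."""
    steps = 0
    # Stop when pc leaves memory (assumed halt convention) or after max_steps.
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] = mem[b] - mem[a]             # the subtraction
        pc = c if mem[b] <= 0 else pc + 3    # the "less than or equal" branch
        steps += 1
    return mem


# Addresses of the data cells, placed right after the nine cells of code.
X, Y, Z = 9, 10, 11

program = [
    X, Z, 3,    # addr 0: Z = Z - X, so Z now holds -X; continue at 3 either way
    Z, Y, 6,    # addr 3: Y = Y - Z, which is Y + X; continue at 6 either way
    Z, Z, -1,   # addr 6: Z = Z - Z = 0, which is <= 0, so jump to -1 and halt
    7,          # addr 9:  X
    5,          # addr 10: Y
    0,          # addr 11: Z, a scratch cell that starts at 0
]

result = run_subleq(list(program))
print(result[Y])  # 12
```

The last instruction shows the jump trick: subtracting a cell from itself always gives 0, so the branch is always taken.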
That’s not fair, you say! That function clearly uses things like subtraction and an if conditional in it. Where’s the implementation for those? Well, this whole thing is all one instruction, you see. The microarchitecture carries out the subtraction, the comparison, and the jump as one inseparable step, so as far as the ISA is concerned it is still a single instruction. Obviously it’s not the most efficient way to do anything, but it’s the smallest! (If it seems like a loophole, better suck it up, because CS is all about loopholes!)
