
Assembly vs. C: Why Learn Assembly?

September 20, 2019 by Colin Walls

This article compares C and assembly language and explains why knowledge of assembly still matters when programming embedded systems.

Assembly Language and the Rise of Inexpensive Memory

Currently, most embedded systems programming is done in C; if not C, then another high-level language like C++.

It was not always like this. In the early days of embedded systems, all code was written in assembly language; that was the only option. Memory was extremely limited, so very tight control of its use was essential, and assembly provided that control. In any case, no high-level language tools were available.

It was some years before such tools arrived on the market and quite a few more years before their quality was really good enough for serious code development. The tools came along at just the right time, as processors were becoming more powerful (16-bit and 32-bit devices became viable), memory was getting cheaper and denser, and application complexity was increasing.

So, what about today? We have hugely powerful processors which may be provided with enormous amounts of memory, running extremely complex applications, which are developed by large teams of programmers.

Where do assembly language skills fit in?


Why Learn Assembly? Embedded Systems Programming Skills

There are really two skills, each of which may be valuable: the ability to read/understand assembly language and the ability to write it.


Why You Should Know How to Read Assembly Language

Most embedded software developers should have some ability to read assembly language. This is needed for two reasons.

First, the efficiency of code in an embedded system is almost always critical. Modern compilers generally do a really great job of optimizing code. However, it is important to be able to understand what the compiler has actually done; otherwise, the generated code may cause confusion while debugging.

Compilers tend not to just translate C to assembly language. A good, modern compiler takes an algorithm expressed in C and outputs a functionally equivalent algorithm expressed in assembly. Not the same thing. This is why debugging can be challenging.

It is also possible that the compiler did not do a perfect job (perhaps the C code was not written in the clearest way), and the developer needs to be able to understand what has gone awry. Inspection of compiler-generated code should be a routine part of the development process. This gives the opportunity to ensure that the compiler output really does what the programmer intended and that the intent has not been misinterpreted by an overzealous optimizer.

The second reason why some developers need to be able to read assembly is that it is essential when coding “close to the hardware”. Drivers are not necessarily written in 100% assembly nowadays, but some assembly language content is almost inevitable. Being able to understand what a driver is doing, in detail, is necessary to use it most effectively and to perform troubleshooting.


Why You Should Know How to Write Assembly Language

What about writing assembly language? Nowadays, it would be very unusual for an entire application to be written in assembly language; most of the code, at least, is written in C. So, C programming skills are the key requirement for embedded software development. However, a few developers need to have a grasp of assembly language programming. Of course, this skill is specific to a particular processor, but if a designer has mastered the assembly language of one CPU, migrating to another need not be too challenging.

There are two reasons to write assembly language. The first and most important reason is to implement some functionality that is not possible to express in C. A simple example might be disabling interrupts. This might be achieved by writing an assembly language subroutine and calling it as if it were a C function. To do that, the call/return protocol of the C compiler in use must be known, but this is generally easy to figure out. You could just look at compiler-generated code, for example.

The other way to implement assembly language code is to insert it inline into the C code, normally using the asm extension keyword. This makes particular sense when only one or a few assembly instructions are needed, as the call/return overhead is eliminated. The implementation of this extension varies from one compiler to another, but commonly an asm statement takes this kind of form:

asm(" trap #0");
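As a rough sketch of how such a statement is often wrapped for use from C, the following assumes a GCC- or Clang-compatible compiler (other toolchains use different syntax). The function names cpu_nop and issue_nops are hypothetical, and nop is only a stand-in, because the instruction that actually disables interrupts is specific to each processor.

```c
/* Sketch only: assumes GCC/Clang-style inline assembly.
 * cpu_nop and issue_nops are illustrative names, not from any library. */
static inline void cpu_nop(void)
{
    /* "volatile" tells the compiler not to discard this statement
     * during optimization, even though it has no visible outputs. */
    __asm__ volatile ("nop");
}

/* Hypothetical caller: executes `count` NOP instructions and
 * returns how many were issued. */
int issue_nops(int count)
{
    int issued = 0;
    for (int i = 0; i < count; i++) {
        cpu_nop();
        issued++;
    }
    return issued;
}
```

Because the wrapper is declared static inline, the compiler can place the single instruction directly at the call site, which is the overhead saving described above.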

Typically, the only places where functionality that cannot be expressed in C is required are start-up code and device drivers. This part of the development of an embedded software application involves a small number of developers. Thus, the need for assembly-writing skills is, as mentioned above, limited to a select group of engineers.

Some developers feel that they need to know how to write assembly language in order to implement code in a “more efficient” way than the compiler will manage. It is possible that, on some very rare occasions, they may be right. However, most modern compilers do a remarkable job of optimizing and generating efficient code (keep in mind that “efficient” can mean fast or compact—you choose, though sometimes you can get both).

Here is an example:

#define ARRAYSIZE 4

char aaa[ARRAYSIZE];

int main()
{
    int i;
    for (i=0; i<ARRAYSIZE; i++)
        aaa[i] = 0;
}

This looks like a simple loop that sets each element of the array to zero. If you compile this with a reasonable amount of optimization activated and try to debug the code, you will get an odd result: execution appears to jump straight through the loop (i.e., it behaves as if there were no loop at all). This is because the compiler determines that a single 32-bit store of zero into the array does the job much more efficiently than a loop.

The resulting code (in this case for an Arm processor) looks something like this:

  mov r3, #0
  ldr r2, .L3
  mov r0, r3
  str r3, [r2]
  bx lr
.L3:
  .word .LANCHOR0
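In C terms, the listing above behaves roughly as if the loop had been replaced by one 4-byte store. The sketch below makes that comparison explicit; the function names clear_with_loop and clear_with_word_store are hypothetical, and memcpy is used here only as a portable way to express a single word-sized write, not as a claim about what the compiler does internally.

```c
#include <stdint.h>
#include <string.h>

#define ARRAYSIZE 4

/* The loop as written in the source: one byte store per iteration. */
void clear_with_loop(char *buf)
{
    for (int i = 0; i < ARRAYSIZE; i++)
        buf[i] = 0;
}

/* Approximately what the optimized code does: write all four
 * bytes at once as a single 32-bit zero. */
void clear_with_word_store(char *buf)
{
    const uint32_t zero = 0;
    memcpy(buf, &zero, sizeof zero);
}
```

Both functions leave the array in the same state, which is why the optimizer is free to substitute one for the other.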

Tweaking the value of ARRAYSIZE produces some interesting results. Setting it to 5 gives this:

  mov r3, #0
  ldr r2, .L3
  mov r0, r3
  str r3, [r2]
  strb r3, [r2, #4]

Still no loop. Making it 8 carries on in this vein:

  mov r3, #0
  ldr r2, .L3
  mov r0, r3
  str r3, [r2]
  str r3, [r2, #4]

Then, building this code for a 64-bit CPU gets even better:

  mov w0, 0
  str xzr, [x1, #:lo12:.LANCHOR0]


And so it continues. Larger array sizes result in efficient loops or perhaps just a call to memset(), a standard C library function that can be called from assembly.
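To illustrate why a memset() call is a legitimate replacement for the loop, here is a small sketch showing the two forms side by side. The function names zero_by_loop and zero_by_memset are hypothetical, and whether a given compiler actually makes this substitution depends on the toolchain and optimization settings.

```c
#include <stddef.h>
#include <string.h>

/* Element-by-element clear, in the style of the article's loop,
 * generalized to any length. */
void zero_by_loop(char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = 0;
}

/* The same effect expressed with the standard library call that
 * an optimizer may choose to emit for larger arrays. */
void zero_by_memset(char *buf, size_t len)
{
    memset(buf, 0, len);
}
```

Since the two functions are functionally equivalent, seeing a call to memset() in the generated code where the source has a loop is expected behavior rather than a compiler fault.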

The bottom line is that assembly language skills are far from obsolete, but many highly skilled and very productive embedded software developers may need no more than the ability to read assembly code competently.



If you'd like to learn more about the other side of this concept, check out Robert Keim's article on C language for embedded programming.

Share your thoughts and experiences regarding the utility of assembly language in the comments below.

  • Lo_volt September 23, 2019

    I believe it was the MOVEM command in the Motorola MC68000 assembly language that allowed my coworker to move a block of data far faster than could be achieved using C++ alone.  MOVEM allowed the programmer to set the addresses for the data locations (from and to) and a counter.  Once the MOVEM command was executed, the processor moved the data without any further commands.  A C++ implementation would have required a loop and several repeated commands to complete it.  Execution time using assembly was a small fraction of the compiled C++ code time.

    • 42BS September 27, 2019
      MOVEM => move multiple. You could load/store multiple registers. But no looping. Max. was movem.l d0-d7/a0-a6,(a7)+, so 15*4 bytes at once.
  • MIS42N September 27, 2019

    Long ago I looked at a program written in FORTRAN that did some serious array manipulation. The computer was in use on business days, and the program took too long to run overnight, so it ran only on weekends. This seriously impacted the research the program was used for. I calculated how long the array manipulation should take if written in assembler, and came up with around 5 hours. The compiler was able to print out the assembler equivalent of the instructions it generated; to my surprise, it was only about 10% slower than a hand-coded equivalent. It turned out the time was being wasted in moving array elements to and from disk. After a few changed lines, the disk I/O was done 1000 elements at a time and the program could easily be run overnight or multiple times on the weekend. Although there was no assembler in the final program, it was an appreciation of assembler that led first to the suspicion that the program was not running efficiently, and then to the ability to hunt down the real culprit.
