•  Assembly language programs are written in mnemonics, so they must be translated into machine code before the CPU can execute them. 
  • The program responsible for this translation is known as an assembler.
  • Assembly language is termed a low-level language because it works directly with the internal structure of the CPU. 
  • To program in assembly language, a programmer must know the registers of the CPU.
  • Languages such as C, C++, and Java are called high-level languages because the programmer does not have to deal with the internal details of the CPU.

  • An assembler translates an assembly language program into machine code (sometimes also called object code or opcode). 
    • Similarly, a compiler translates a high-level language program into machine code. For example, a program written in C must be translated into machine language by a C compiler.

    Structure of Assembly Language

    • An assembly language program is a series of statements, which are either assembly language instructions such as ADD and MOV, or statements called directives.
    • An instruction tells the CPU what to do, while a directive (also called a pseudo-instruction) gives directions to the assembler. For example, ADD and MOV are instructions that the CPU executes, while ORG and END are assembler directives. 
    • ORG tells the assembler the memory address at which to place the code that follows (ORG 0 places it at location 0), while END marks the end of the source code. An assembly language instruction consists of the following four fields:
    • [ label: ]   mnemonics  [ operands ]   [;comment ]
    • A square bracket ( [ ] ) indicates that the field is optional.
    • The label field allows the program to refer to a line of code by name. The maximum length of a label depends on the assembler.
    • The mnemonics and operands fields together perform the real work of the program and accomplish the tasks. 
    • In statements like ADD A,B and MOV A,#67, ADD and MOV are the mnemonics (which produce opcodes), while "A,B" and "A,#67" are the operands. Instead of a mnemonic and operands, these two fields could contain directives. 
    • Directives do not generate machine code and are used only by the assembler, whereas instructions are translated into machine code for the CPU to execute.
    • The comment field begins with a semicolon which is a comment indicator.
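
To make the four fields concrete, here is a minimal sketch of a complete program in the commonly used 8051 assembler syntax. The label HERE and the data values are illustrative choices, and exact syntax can vary between assemblers.

```asm
        ORG  0H           ;directive: place the following code at location 0
        MOV  R5,#25H      ;load 25H into R5
        MOV  R7,#34H      ;load 34H into R7
        MOV  A,#0         ;clear the accumulator
        ADD  A,R5         ;A = 0 + 25H = 25H
        ADD  A,R7         ;A = 25H + 34H = 59H
        ADD  A,#12H       ;A = 59H + 12H = 6BH
HERE:   SJMP HERE         ;label, mnemonic, operand, comment: loop here forever
        END               ;directive: end of the source file
```

Only the SJMP line uses all four fields. The ORG and END lines are directives, so they produce no machine code.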


    Assembling and Running an 8051 Program

    • Having seen the basic form of an assembly language program, we now look at the steps to create, assemble, and run it:
    • First, we use an editor to type in the program. Any editor that can produce a plain ASCII file will do, for example the MS-DOS EDIT program or Windows Notepad. The source file is saved with the "asm" extension, which the assembler expects in the next step. Because this file is also called the source file, some assemblers require the "src" extension instead.
    • Second, the "asm" source file created in the first step is fed to an 8051 assembler. The assembler converts the assembly language instructions into machine code and produces an object file ("obj") and a list file ("lst"). 
    • The "lst" file is optional. It is very useful to the programmer because it lists all the opcodes and addresses, as well as any errors that the assembler detected.
    • Third, assemblers require a step called linking. The link program takes one or more object files and produces an absolute object file with the extension "abs".
    • Finally, the "abs" file is fed to a program called "OH" (object-to-hex converter), which creates a file with the extension "hex" that is ready to be burned into ROM.


    8051 data types and directives

    • The 8051 microcontroller has only one data type. 
    • It is 8 bits wide, and each register is also 8 bits wide. 
    • An 8-bit value ranges from 00 to FFH (0 to 255 in decimal). It is the job of the programmer to break down larger data into 8-bit pieces to be processed by the CPU. 
    • The data used by the 8051 can be positive or negative.
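
As a sketch of what breaking down larger data looks like in practice, the following adds two 16-bit numbers one byte at a time. The operand values (3CE7H and 3B8DH) and the choice of R6 and R7 for the result are assumptions made for illustration.

```asm
        MOV  A,#0E7H      ;low byte of the first number (3CE7H)
        ADD  A,#8DH       ;add low byte of 3B8DH: E7H + 8DH = 174H, so A = 74H, CY = 1
        MOV  R6,A         ;save the low byte of the sum
        MOV  A,#3CH       ;high byte of the first number
        ADDC A,#3BH       ;add high bytes plus carry: 3CH + 3BH + 1 = 78H
        MOV  R7,A         ;save the high byte; R7:R6 = 7874H
```

ADDC (add with carry) is what carries the overflow of the low-byte addition into the high byte.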


    DB (define byte)

    The DB directive is the most widely used data directive in the assembler. It defines 8-bit data. The numbers can be written in decimal, binary, hex, or ASCII format. For decimal, the "D" after the number is optional, but the "B" (binary) and "H" (hexadecimal) suffixes are required. To indicate ASCII, simply place the characters in quotation marks ('like this').
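
The formats described above can be sketched as follows. The labels (DATA1 and so on), the ORG addresses, and the data values are illustrative assumptions, and quoting rules for ASCII strings differ between assemblers.

```asm
        ORG  500H
DATA1:  DB   28               ;decimal (stored as 1CH)
DATA2:  DB   00110101B        ;binary (stored as 35H)
DATA3:  DB   39H              ;hexadecimal
        ORG  510H
DATA4:  DB   "2591"           ;ASCII digits: one byte per character
        ORG  518H
DATA5:  DB   "My name is Joe" ;ASCII characters
```

Each DB item occupies one byte, so a quoted string takes as many bytes as it has characters.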


    Assembler Directives

    • Assembler directives give directions to the assembler, not to the CPU. 8051 assemblers support various directives that control how the program is assembled. The most useful directives are: 
    • ORG
    • EQU
    • END

    ORG (origin)

    • This directive indicates the address at which the code or data that follows is placed. It sets the assembler's location counter during assembly. For example, ORG 0000h tells the assembler to place all subsequent code starting at address 0000h.
    • Syntax: ORG 0000h
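
A short sketch of ORG used twice in one program follows. The label MAIN and the address 0030H are assumptions for illustration. Starting the main code at 0030H is a common convention that leaves the low addresses free for the interrupt vectors.

```asm
        ORG  0000H        ;reset address: execution starts here
        LJMP MAIN         ;jump over the interrupt vector area
        ORG  0030H        ;place the main program at 0030H
MAIN:   MOV  A,#55H       ;first instruction of the main program
HERE:   SJMP HERE         ;stay here
        END
```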

    EQU (equate)

    • This is used to define a constant without occupying a memory location. The EQU directive does not set aside storage for a data item. It associates a constant value with a label so that wherever the label appears in the program, the assembler substitutes its constant value. For example, the line "COUNT EQU 25" defines the constant COUNT, which can then be used to load the R3 register. 

    • When the instruction "MOV R3,#COUNT" is executed, register R3 is loaded with the value 25 (notice the # sign, which marks an immediate value). 
    • Assume that a constant (a fixed value) is used in many different places in the program, and the programmer wants to change its value throughout. 
    • With EQU, the programmer changes it once and the assembler changes all of its occurrences, rather than searching the entire program for every occurrence.
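
The EQU usage described above can be sketched as a small program. The surrounding ORG, the HERE loop, and END are added only to make the fragment complete.

```asm
COUNT   EQU  25           ;COUNT stands for the constant 25; no memory is used
        ORG  0000H
        MOV  R3,#COUNT    ;assembled as MOV R3,#25, so R3 = 25 (19H)
HERE:   SJMP HERE         ;stay here
        END
```

Changing the single EQU line to a different value changes every instruction that uses COUNT the next time the program is assembled.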

    END

    • Another important pseudo-instruction is the END directive. 
    • It indicates to the assembler the end of the source (asm) file. 
    • The END directive is the last line of an 8051 program. Anything after it in the source code is ignored by the assembler. 
    • Some assemblers use ".END" (notice the dot) instead of "END". 

    Compiler

    In computing, a compiler is a computer program that translates computer code written in one programming language (the source language) into another language (the target language). 
    The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language, object code, or machine code) to create an executable program.

    Types of Compiler

    There are many different types of compilers. 
    If the compiled program can run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is a cross-compiler. 
    A bootstrap compiler is written in the language that it intends to compile. 
    A program that translates from a low-level language to a higher-level one is a decompiler.


    Functions of Compiler

    • A compiler is likely to perform many or all of the following operations: 
    • Pre-processing (a preprocessor is a program that processes its input data to produce output that is used as input to another program)
    • Lexical analysis (lexing, or tokenization, is the process of converting a sequence of characters into a sequence of tokens)
    • Parsing (the process of analyzing a string of symbols according to the rules of a grammar)
    • Semantic analysis (this usually includes type checking and making sure a variable is declared before use)
    • Conversion of input programs to an intermediate representation
    • Code optimization (a computer program may be optimized so that it executes more rapidly, or to make it capable of operating with less memory storage or other resources, or draw less power.)  
    • Code generation [code generator converts some intermediate representation of source code into a form (e.g., machine code) that can be readily executed by a machine.]

    Multiplication Program

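    MOV B,#06H      ; B = 06H (multiplier)
    MOV A,#02H      ; A = 02H (multiplicand)
    MUL AB          ; A x B = 000CH: low byte 0CH in A, high byte 00H in B
    MOV R4,A        ; save the low byte of the product in R4
    RET             ; return to the caller (assumes this code was entered by a call)
    END             ; end of the source file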
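
The program above only saves the low byte, which is enough when the product fits in 8 bits. As a sketch of the general case, the fragment below multiplies two values whose product exceeds FFH and saves both bytes. The operand values and the use of R4 and R5 are illustrative assumptions.

```asm
        MOV  A,#25H       ;A = 25H (37 decimal)
        MOV  B,#65H       ;B = 65H (101 decimal)
        MUL  AB           ;25H x 65H = 0E99H: A = 99H (low byte), B = 0EH (high byte)
        MOV  R4,A         ;save the low byte of the product
        MOV  R5,B         ;save the high byte of the product
```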


    Addition Program

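    MOV R6,#25H     ; R6 = 25H
    MOV A,#05H      ; A = 05H
    ADD A,R6        ; A = 05H + 25H = 2AH
    RET             ; return to the caller (assumes this code was entered by a call)
    END             ; end of the source file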


    Division Program

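    MOV B,#02H      ; B = 02H (divisor)
    MOV A,#16H      ; A = 16H (dividend, 22 decimal)
    DIV AB          ; A = 0BH (quotient), B = 00H (remainder)
    MOV R0,A        ; save the quotient in R0
    RET             ; return to the caller (assumes this code was entered by a call)
    END             ; end of the source file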
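
The division above leaves no remainder. As a sketch of a case that does, the fragment below divides 95 by 10 and saves both results. The operand values and the use of R0 and R1 are illustrative assumptions.

```asm
        MOV  A,#95        ;A = 95 decimal (dividend)
        MOV  B,#10        ;B = 10 decimal (divisor)
        DIV  AB           ;A = 9 (quotient), B = 5 (remainder)
        MOV  R0,A         ;save the quotient
        MOV  R1,B         ;save the remainder
```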
