L1. Introduction to assembly and hybrid programming
Assembly languages
Language instructions corresponding to processor instructions
Symbolic (alphanumeric) notation
- instructions and arguments are readable by humans
- machine language, by contrast, uses binary words to express instructions
Every processor family has its own assembly language
For some architectures, there is more than one assembly language
Assembly mnemonics
Old-style assemblers - every machine instruction had its own unique mnemonic
Modern assemblers - a mnemonic describes the function of an instruction; several different machine instructions may perform the same operation with different arguments
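As an illustration of one mnemonic covering several machine instructions, the sketch below lists three forms of the x86 mov instruction together with one valid 32-bit encoding of each. The choice of x86 and of Intel syntax is an assumption made for this example; the notes themselves do not name an architecture.

```c
#include <stdio.h>
#include <stddef.h>

/* One assembled form of an instruction: source text and resulting bytes. */
struct encoding {
    const char *source;       /* assembly text, Intel syntax (assumed) */
    unsigned char bytes[5];   /* machine code for 32-bit x86 */
    size_t length;            /* number of bytes actually used */
};

/* Three different machine instructions that share the mnemonic "mov". */
static const struct encoding mov_forms[] = {
    { "mov eax, ebx",   { 0x89, 0xD8 },                   2 }, /* register to register  */
    { "mov eax, [ebx]", { 0x8B, 0x03 },                   2 }, /* memory to register    */
    { "mov eax, 1",     { 0xB8, 0x01, 0x00, 0x00, 0x00 }, 5 }, /* immediate to register */
};

/* Count how many distinct opcode bytes (first byte) the table contains. */
size_t distinct_mov_opcodes(void)
{
    size_t n = sizeof mov_forms / sizeof mov_forms[0];
    size_t distinct = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++)
            if (mov_forms[j].bytes[0] == mov_forms[i].bytes[0])
                seen = 1;
        if (!seen)
            distinct++;
    }
    return distinct;
}

/* Print each source form next to its bytes in hexadecimal. */
void print_mov_forms(void)
{
    size_t n = sizeof mov_forms / sizeof mov_forms[0];
    for (size_t i = 0; i < n; i++) {
        printf("%-16s", mov_forms[i].source);
        for (size_t b = 0; b < mov_forms[i].length; b++)
            printf(" %02X", mov_forms[i].bytes[b]);
        printf("\n");
    }
}
```

The assembler picks the opcode from the argument types, so the programmer writes mov in all three cases.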
Assembly notation
The program consists of directives (instructions for the assembler) and actual processor instructions
Every instruction or directive occupies a single line of text
Elements of assembly notation (in order):
- Label/name
- Instruction/directive with arguments (directive names usually start with a dot)
- Comment
Labels and other names
A label is a symbolic representation of an address
A label definition associates the label with the address of the object defined immediately after the label
In many assembly languages a label definition must be followed by a colon
Directives
Assembler commands that are not symbolic names of the processor's machine instructions
Used for:
- Defining and declaring data
- Selecting the memory sections in which the following data and instructions will be placed
- Enforcing address values
- Declaring public and external symbols
Assembly notation
Label/name/symbol | Processor instruction or assembler directive | Arguments | Comment
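The field order above can be made concrete with a small C routine that splits one source line into its four parts. This is only a sketch under stated assumptions: a label ends with a colon, a comment starts with a semicolon, and neither character appears inside the arguments. Real assemblers differ in these details, and the names asm_line, split_asm_line and is_directive are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Fields of one assembly source line; an empty string means "absent". */
struct asm_line {
    char label[32];
    char operation[32];   /* processor instruction or assembler directive */
    char arguments[64];
    char comment[64];
};

/* Copy [begin, end) into dst with surrounding whitespace removed,
   truncating if the text does not fit. */
static void copy_trimmed(char *dst, size_t dst_size,
                         const char *begin, const char *end)
{
    while (begin < end && isspace((unsigned char)*begin)) begin++;
    while (end > begin && isspace((unsigned char)end[-1])) end--;
    size_t n = (size_t)(end - begin);
    if (n >= dst_size) n = dst_size - 1;
    memcpy(dst, begin, n);
    dst[n] = '\0';
}

/* Split a line of the form  [label:] [operation [arguments]] [; comment]. */
void split_asm_line(const char *text, struct asm_line *out)
{
    const char *end = text + strlen(text);
    memset(out, 0, sizeof *out);

    const char *semi = strchr(text, ';');          /* comment comes last */
    if (semi) {
        copy_trimmed(out->comment, sizeof out->comment, semi + 1, end);
        end = semi;
    }
    const char *colon = memchr(text, ':', (size_t)(end - text));
    if (colon) {                                   /* label comes first */
        copy_trimmed(out->label, sizeof out->label, text, colon);
        text = colon + 1;
    }
    while (text < end && isspace((unsigned char)*text)) text++;
    const char *op_end = text;                     /* operation: first word */
    while (op_end < end && !isspace((unsigned char)*op_end)) op_end++;
    copy_trimmed(out->operation, sizeof out->operation, text, op_end);
    copy_trimmed(out->arguments, sizeof out->arguments, op_end, end);
}

/* A leading dot is taken to mark a directive (a common convention). */
int is_directive(const struct asm_line *line)
{
    return line->operation[0] == '.';
}
```

For the line "loop:  add eax, 1   ; increment" the routine yields the label loop, the operation add, the arguments eax, 1 and the comment increment.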
Assembly directives
In addition to the uses listed above, directives are also used for:
- Aligning data or instructions
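What an alignment directive does to the current address can be sketched in C as follows. The sketch assumes the alignment is a power of two given in bytes; the directive's name and the meaning of its argument vary between assemblers. The function names are hypothetical.

```c
#include <stdint.h>

/* Round addr up to the next multiple of alignment.
   Assumption: alignment is a power of two (1, 2, 4, 8, ...). */
uintptr_t align_up(uintptr_t addr, uintptr_t alignment)
{
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Number of padding bytes needed to move from addr to the aligned address. */
uintptr_t padding_for(uintptr_t addr, uintptr_t alignment)
{
    return align_up(addr, alignment) - addr;
}
```

For example, aligning address 0x1003 to 8 bytes gives 0x1008, which means 5 bytes of padding; an address that is already aligned is left unchanged.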
Applications of assembly programming
Programming microcontrollers with limited resources, or architectures not suited to high-level languages (HLLs)
Time-critical routines (computational kernels)
Access to resources not visible to HLLs
- special registers
- instructions that HLL compilers do not generate (e.g. BCD arithmetic)
- vector units
Exception processing
OS kernel routines
Code optimization by hand
Hybrid Programming
Using procedures written in different languages in a single program, e.g. C and assembly
Routines that are not time-critical are programmed in an HLL
Control-passing mechanisms for HLLs
HLL code and assembly code must follow agreed-upon control-passing mechanisms (calling conventions) in order to interface with each other
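Seen from the C side, the interface to an assembly routine is an ordinary function prototype. The sketch below assumes a hypothetical time-critical kernel named sum_u32; in a real hybrid program its body would sit in an assembly module that follows the platform's calling convention. A plain C stand-in is supplied here only so that the example builds as a single file.

```c
#include <stddef.h>
#include <stdint.h>

/* Prototype of the time-critical kernel (hypothetical name). The C code
   only needs this declaration; the body may come from an assembly module. */
uint32_t sum_u32(const uint32_t *data, size_t count);

/* HLL part: not time-critical, it validates the input and calls the kernel. */
uint32_t checksum_block(const uint32_t *block, size_t count)
{
    if (block == NULL || count == 0)
        return 0;
    return sum_u32(block, count);
}

/* Stand-in body written in C so that this sketch links on its own.
   It would be removed once an assembly implementation is linked in. */
uint32_t sum_u32(const uint32_t *data, size_t count)
{
    uint32_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += data[i];
    return total;
}
```

The caller does not change when the stand-in is replaced, provided the assembly routine takes its arguments and returns its result the way the C compiler expects.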
Application program in an OS environment
The OS and the startup module (which is part of the application) are responsible for preparing the program's working environment:
- Program loading
- Memory allocation
- Stack initialization
- I/O
- File system
The programmer only creates the body of the application
The above refers to a program running on a general-purpose computer
The C standard refers to this as a 'hosted environment'
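A C program can check which kind of environment it was compiled for through the standard predefined macro __STDC_HOSTED__, which is 1 for a hosted implementation and 0 for what the C standard calls a freestanding one. A minimal sketch (the function name is hypothetical):

```c
/* Returns 1 when compiled for a hosted environment, 0 otherwise. */
int built_for_hosted_environment(void)
{
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;   /* the full standard library and the usual startup are available */
#else
    return 0;   /* freestanding: only a minimal subset of the library is guaranteed */
#endif
}
```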
Standalone program
A program running on a bare computer, without an OS or with a minimal OS
It must be able to set up its working environment by itself (memory allocation, stack)
Examples:
- OS kernel
- OS loader
- Startup module of a program running in an OS environment
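The kind of work a startup module performs before the program body runs can be sketched in C. This is a simulation under assumptions not stated in the notes: initialized variables are copied from a load image into RAM and zero-initialized variables are cleared, with all addresses passed in a hypothetical memory_layout structure. On real hardware the addresses would come from the linker, and the stack pointer would be set in assembly before any C code runs.

```c
#include <stddef.h>

/* Memory regions that the startup code must prepare (hypothetical layout). */
struct memory_layout {
    const unsigned char *data_load;  /* initial values, e.g. stored in ROM     */
    unsigned char *data_start;       /* where initialized variables live in RAM */
    size_t data_size;
    unsigned char *zero_start;       /* variables that must start as zero       */
    size_t zero_size;
};

/* Prepare memory with plain loops: a standalone program cannot assume
   that library routines such as memcpy are available yet. */
void prepare_memory(const struct memory_layout *m)
{
    for (size_t i = 0; i < m->data_size; i++)
        m->data_start[i] = m->data_load[i];
    for (size_t i = 0; i < m->zero_size; i++)
        m->zero_start[i] = 0;
}

/* Order of operations: prepare the environment first, then pass control
   to the body of the program and return its result. */
int start_program(const struct memory_layout *m, int (*body)(void))
{
    prepare_memory(m);
    return body();
}
```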
Phases of program creation
Compilation
- The compiler translates an HLL program into an assembly program or directly into an object file
- Every source file is translated into a separate output file (assembly or object)
Assembly
- An assembly source file written by the programmer, or a temporary assembly file created by the compiler, is translated by the assembler into an object file
- Each assembly source file is translated into a separate object file
Linking
- The linker joins object files and library files into an executable file, which is later loaded into the computer's memory
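The three phases can be followed with a one-function module. The commands in the comment assume a Unix-like toolchain in which cc is a compiler driver such as gcc or clang; the file names square.c, main.o and app are hypothetical.

```c
/* square.c: a one-function module used to follow the three phases.
 *
 *   cc -S square.c               compilation: square.c -> square.s (assembly)
 *   cc -c square.s               assembly:    square.s -> square.o (object file)
 *   cc square.o main.o -o app    linking:     object files -> executable "app"
 *
 * main.o stands for a second module that calls square().
 */
int square(int x)
{
    return x * x;
}
```

Running cc -c square.c instead performs compilation and assembly in one step, which corresponds to translating directly into an object file.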
Object file
Contains binary machine code in which some references are left unresolved (marked as undefined)
An executable program usually consists of many linked modules
A module may reference variables and functions defined in other modules
An object file contains a description of its external references (symbols used but not defined in the module) and of the global symbols it defines
The symbols are linked using their names
Each symbol may have only a single definition across all of the linked object files
Usual object file names:
- Windows: file.obj
- Unix: file.o
External symbols
Symbols referenced in a given module, defined in other modules
The symbols must be declared as external in a way that depends on the programming language used:
- C: the extern keyword; functions not defined in the module are treated as external by default
- Assembler: explicit declaration required in most assemblers (extern/extrn)
Public symbols
Symbols defined in a given module which may be used by other modules
Must be declared public:
- C: every symbol defined at the external level (file scope) is considered public unless declared with the static keyword
- Assembler: explicit declaration required, with the keyword public, global or globl
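The C rules for external and public symbols can be seen together in one small module. The names below are hypothetical, and the definition of event_count is placed at the end of the same file only so that the sketch builds on its own; normally it would sit in a different module.

```c
/* counter.c: one module of a larger program (hypothetical). */

/* External symbol: declared here, expected to be defined in another module. */
extern int event_count;

/* Private symbol: static keeps this function invisible to other modules. */
static int clamp_to_limit(int value, int limit)
{
    return value > limit ? limit : value;
}

/* Public symbol: defined at file scope without static, so other modules
   may reference it by name. */
int record_events(int n)
{
    event_count = clamp_to_limit(event_count + n, 100);
    return event_count;
}

/* Stand-in for the defining module: the single definition of the symbol. */
int event_count = 0;
```

If a second module also defined event_count with an initializer, the linker would report a duplicate definition; if it defined its own static clamp_to_limit, there would be no conflict.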
Single module program
Short and simple programs written in assembly may consist of only one module with no external references
Linking such a program is a simple conversion from object form to executable form
Some assemblers are capable of generating the executable form directly (e.g. NASM in a DOS environment)
Libraries
A library is a collection of object modules which may be linked with other modules written by the programmer
A library file contains archived object files created from several source modules
Libraries containing the standard library functions for a given HLL are usually supplied with the compiler
Programmers may create their own libraries, which may subsequently be used by many programs
Library files
- Unix/Linux - .a extension
- Windows - .lib extension
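Creating and using a library on a Unix-like system can be sketched as follows. The commands in the comment assume the standard cc and ar tools; the module, library and program names are hypothetical.

```c
/* mathutil.c: one of the modules archived into a library.
 *
 *   cc -c mathutil.c                   -> mathutil.o
 *   ar rcs libmathutil.a mathutil.o    -> library (archive of object files)
 *   cc main.o -L. -lmathutil -o app    -> link main.o against the library
 */

/* Limit value to the range [low, high]. */
int clamp(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```

Further object files can be listed in the same ar command, so one library file may serve many programs.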