Basic Assembly Tutorial, part 3
February 2016, Cremona, Italy
Here we are at part III.
Things are starting to get interesting!
This time we look at constants, macros and some basic arithmetic instructions.
Constants with EQU
This is a pretty simple topic.
You can define a constant with the EQU directive.
Wherever the label you choose appears, the assembler will replace it with the given value.
my_length equ 13
One useful way to use equ is defining the length of a data section,
such as a string.
section .data
msg db 'This is a message.'
len equ $ - msg
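The meaning is simple:
remember that the name 'msg' is just a label for a memory address, right?
The assembler stores data in order.
One piece of data after the other, consecutively.
The '$' symbol means 'the current address', which at this point is the byte right
after the end of the string. Subtracting the address labeled 'msg' from it gives
the length of our stored message in bytes (18 in this case).
For this to work, the equ line must come immediately after the data it measures.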
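To make the role of '$' more concrete, here is a small sketch with two strings stored one after the other. The labels msg1, msg2, len1, len2 and total are names I made up for this illustration. It assumes the same 32-bit Linux setup used in the rest of the tutorial.

```nasm
section .data
msg1    db 'Hello'              ; 5 bytes
len1    equ $ - msg1            ; 5: '$' points right after msg1
msg2    db ', world', 0xA       ; 7 characters + newline = 8 bytes
len2    equ $ - msg2            ; 8
total   equ $ - msg1            ; 13: both strings, since they are stored consecutively

section .text
global _start
_start:
        mov eax, 4              ; sys_write
        mov ebx, 1              ; stdout
        mov ecx, msg1           ; start printing at msg1...
        mov edx, total          ; ...and go on for 13 bytes, into msg2
        int 0x80

        mov eax, 1              ; sys_exit
        mov ebx, 0
        int 0x80
```

Since the data is consecutive, printing 'total' bytes starting at msg1 runs straight through both strings. If len1 had been defined after msg2, it would have measured 13 bytes instead of 5.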
Macros
Macros are named groups of instructions that can be very handy.
In NASM, you specify the start and end of a macro with the directives
"%macro" and "%endmacro".
Parameters are passed to the macro just like operands to the mov
instruction, separated by commas.
Here's the syntax:
%macro name-of-macro number-of-parameters
[instructions]
%endmacro
Note that extensive use of macros will result in pretty ugly code.
A macro is expanded in place every time you use it, so its instructions are copied
into the program at each use.
For sets of instructions more than a few lines long, it is better to use
subroutines, which can be stored in another file if you like, making everything
cleaner.
"Yes, but one can include an external file containing macros", you're saying...
Macros are ugly and subroutines are good.
Remember this :-)
Example: macros for printing and for exiting
;;;;;; Macro print_msg ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
%macro print_msg 2
; This macro has 2 parameters: the message address
; and its length.
; The macro fills the registers and calls sys_write
mov eax, 4 ; sys_write
mov ebx, 1 ; stdout
mov ecx, %1 ; passing first parameter to ecx
mov edx, %2 ; passing second parameter to edx
int 0x80 ; perform system call
%endmacro
;;;;;; Macro exit ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
%macro exit 1
; This macro exits with the specified exit status
mov eax, 1 ; sys_exit
mov ebx, %1 ; parameter, exit status
int 0x80
%endmacro
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .data
msg db 'Here is my message.', 0xA ; 0xA is newline
len equ $ - msg
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .text
global _start
_start:
; Printing message
print_msg msg, len ; 'msg' and 'len' are the parameters
; Now exit
exit 0 ; 0 is the exit status (0 means success)
Note that:
- I have defined 'len' with the equ directive, so it is
a constant.
- I have defined the macros *before* using them in the .text section.
- Inside the macro, you refer to the parameters with the '%' symbol followed by
their position: %1 is the first parameter, %2 the second, and so on.
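For comparison with the print_msg macro above, here is a rough sketch of the same printing code written as a subroutine. It uses the call and ret instructions, which this lesson does not cover, and the name print_sub and its convention (address in ecx, length in edx) are my own assumptions for the example.

```nasm
section .data
msg     db 'Here is my message.', 0xA
len     equ $ - msg

section .text
global _start

; print_sub (hypothetical name): expects the message address in ecx
; and its length in edx, then calls sys_write on stdout.
print_sub:
        mov eax, 4              ; sys_write
        mov ebx, 1              ; stdout
        int 0x80
        ret                     ; go back to the instruction after 'call'

_start:
        mov ecx, msg
        mov edx, len
        call print_sub          ; first use

        mov ecx, msg
        mov edx, len
        call print_sub          ; second use: the same code runs again

        mov eax, 1              ; sys_exit
        mov ebx, 0
        int 0x80
```

The difference is that the body of print_sub exists only once in the program, however many times it is called, while a macro body is copied at every use.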
Basic Instructions
Time to introduce some basic instructions.
Instruction | Meaning                                              | 1st Operand                      | 2nd Operand
INC         | Increments by 1 register or variable                 | register / variable in memory    | -
DEC         | Decrements by 1 register or variable                 | register / variable in memory    | -
ADD         | Binary addition, adds value to 1st operand           | register / variable to increment | quantity to add
SUB         | Binary subtraction, subtracts value from 1st operand | register / variable to decrement | quantity to subtract
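As a quick sketch of the four instructions in the table, the program below does some arithmetic in ebx and then exits, using the result as the exit status. The starting value 10 and the other numbers are arbitrary choices for the illustration.

```nasm
section .text
global _start
_start:
        mov ebx, 10             ; ebx = 10
        add ebx, 5              ; ebx = 10 + 5 = 15
        sub ebx, 3              ; ebx = 15 - 3 = 12
        inc ebx                 ; ebx = 13
        dec ebx                 ; ebx = 12

        mov eax, 1              ; sys_exit, the exit status is taken from ebx
        int 0x80
```

After running it, typing `echo $?` in the shell shows the exit status of the last program, which here should be 12. This is a handy trick for looking at a small number without having to convert it to ASCII.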
Example 1
This example:
- saves in memory the decimal value 65, that is ASCII character 'A', and prints it
- increments it to 66 (that is ASCII 'B') and prints it again
The output of the program is simply 'AB'.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .data
val1 db 65 ; that is ASCII character 'A'
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .text
global _start
_start:
; Print value at address val1
mov eax, 4 ; sys_write
mov ebx, 1 ; stdout
mov ecx, val1 ; value in memory
mov edx, 1 ; 1 byte long
int 0x80
; increment val1. New value should be 66, that is 'B'
; in ASCII code.
; I move the value at val1 to the ah register, since it has
; the same size as my data (1 byte)
mov ah, [val1] ; moving value into ah
inc ah ; increment ah
mov [val1], ah ; move value in ah back into val1
; Print value at val1
mov eax, 4
mov ebx, 1
mov ecx, val1
mov edx, 1
int 0x80
; exit
mov eax, 1
mov ebx, 0
int 0x80
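As the table says, INC also accepts a variable in memory as its operand, so the round trip through ah is not strictly needed. Here is a minimal sketch of that variant. Note the 'byte' keyword: with a memory operand and no register involved, NASM cannot guess the operand size, so it has to be stated.

```nasm
section .data
val1    db 65                   ; ASCII 'A'

section .text
global _start
_start:
        inc byte [val1]         ; increment the byte at val1 in place: 65 -> 66

        mov eax, 4              ; sys_write
        mov ebx, 1              ; stdout
        mov ecx, val1           ; address of the byte to print
        mov edx, 1              ; 1 byte long
        int 0x80                ; prints 'B'

        mov eax, 1              ; sys_exit
        mov ebx, 0
        int 0x80
```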
Example 2: horsing around with numbers
Well, since sys_write just outputs bytes, and the terminal shows them as ASCII characters,
printing the result of operations between numbers is not trivial.
In the same way, all we can enter from the keyboard is characters, so it's useful to see a way to convert the
ASCII character '1' to the actual number 1.
In this example we enter a number (a single digit, from 0 to 9) and add it to another digit.
In order to do this:
- we take a character as keyboard input (say, '1', that is ASCII 0x31, decimal 49)
- we convert it to the corresponding digit (say, 1)
- we perform the operation
- we transform the result back to ASCII and print it
The result must not be greater than 9 or the program won't work... :-(
(a sum of 10 or more needs two characters, and we only print one.)
Take a look at the ASCII table: since the character '0' corresponds to decimal 48 and the other
digits follow in order, one can convert the ASCII value to the actual value by subtracting
the value 48.
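The conversion on its own can be sketched in a few lines. This little program starts from the character '7' (an arbitrary choice), subtracts 48 and exits with the result as exit status, so that you can check it from the shell with `echo $?`.

```nasm
section .text
global _start
_start:
        mov ebx, 0              ; clear the whole ebx first
        mov bl, '7'             ; bl = ASCII code of '7', that is decimal 55
        sub bl, '0'             ; 55 - 48 = 7, the actual number

        mov eax, 1              ; sys_exit, the exit status is taken from ebx
        int 0x80
```

The exit status should be 7. Going the other way is just the opposite operation: adding '0' (48) to a number from 0 to 9 gives back the ASCII code of the corresponding character.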
PLEASE, SEE THE CAVEATS, RIGHT AFTER THE CODE BELOW!
;;;;;; MACROS ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
; ---- print_msg macro ----
%macro print_msg 2
mov edx, %2 ; 2nd argument, message length
mov ecx, %1 ; 1st argument, memory position
mov ebx, 1 ; stdout
mov eax, 4 ; sys_write
int 0x80 ; perform system call
%endmacro
; ---- get_key macro ----
%macro get_key 1
mov edx, 2 ; length: 2 bytes (the character + ENTER)
mov ecx, %1 ; destination
mov ebx, 0 ; stdin
mov eax, 3 ; sys_read
int 0x80
%endmacro
; ---- exit macro ----
%macro exit 1
mov ebx, %1 ; error status
mov eax, 1 ; sys_exit
int 0x80 ; perform system call
%endmacro
;;;;;; Data Section ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .data
msg1 db 'Enter first digit: '
len1 equ $ - msg1
msg2 db 'You have entered: '
len2 equ $ - msg2
msg3 db 0xA, 0xA, 'Enter second digit: '
len3 equ $ - msg3
msg4 db 0xA, 0xA, 'The sum is: '
len4 equ $ - msg4
;;;;;; bss Section ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .bss
; Please note that if you enter a character and then
; press ENTER, sys_read will get TWO characters (also
; ENTER is included!!!) Thus digit1 and digit2 must be
; 2 bytes long.
digit1 resb 2 ; 1 byte is *NOT* enough!
digit2 resb 2 ; 1 byte is *NOT* enough, again.
sumresult resb 1 ; result of sum
;;;;;; text Section ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
section .text
global _start
_start:
; Print message: 'Enter first digit:' and read keyboard
print_msg msg1, len1
get_key digit1
; Print message: 'You have entered:' and print digit
print_msg msg2, len2
print_msg digit1, 1
; Print message: 'Enter second digit:' and read keyboard
print_msg msg3, len3
get_key digit2
; Print message: 'You have entered:' and print digit
print_msg msg2, len2
print_msg digit2, 1
; SUM THE DIGITS.
; Since we are dealing with 1 digit numbers (saved in 1 byte of
; memory), using the 8-bit registers ah and bh is enough (they
; are 1 byte long).
; STEP 1) put data into registers
; NOTE THAT I RESERVED 2 BYTES FOR digit1 AND digit2
; BUT I NOW PUT THEM INTO 1 BYTE REGISTERS! THIS IS OK
mov ah, [digit1]
mov bh, [digit2]
; STEP 2) subtract decimal 48 from the data since it is in ASCII form
; note that subtracting character '0' is the same since
; its ASCII value is precisely 48
sub ah, '0'
sub bh, '0'
; STEP 3) sum the results into the accumulator ah for example
add ah, bh
; STEP 4) add decimal 48 again so the result becomes ASCII
; again
add ah, 48
; STEP 5) save it to variable
mov [sumresult], ah
; Print the variable
print_msg msg4, len4
print_msg sumresult, 1
; exit
exit 0
There are some caveats in the code above.
- First of all, we are dealing with 1 byte long numbers (digits from 0 to 9..), so why
are we reserving 2 bytes at addresses digit1 and digit2?
For the same reason why, in the get_key macro, we are telling
sys_read to read 2 bytes: when entering a character and pressing ENTER,
you are giving *two* characters, the first being the character itself and the second
being the newline (0xA) produced by ENTER.
So, the first byte reserved at digit1 will be filled with the character,
while the second byte will be filled with the ASCII code for ENTER... and we will
just leave it there, without using it.
If you had read only 1 byte with sys_read, the ENTER ASCII code would
remain pending in the input and would be picked up by the following call to sys_read, thus
spoiling it. You can try it.
- The next apparently weird thing is that we move the value at digit1 into
ah, which is a 1 byte register.
Didn't we reserve 2 bytes from digit1 on?
Yes, but as I said the first byte stores the actual character, while the second byte
stores the ENTER ASCII code, so we do not care about it.
We put into the register only the interesting byte, which is located at digit1
(and not at digit1 + 1)
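If you want to try the first caveat yourself, here is a small sketch of the experiment. It reads 1 byte, then reads 1 byte again, and exits with the second byte as exit status. The labels first and second are names I made up for this test, and I am assuming the program is run at a terminal where you type one character and press ENTER.

```nasm
section .bss
first   resb 1                  ; deliberately only 1 byte each
second  resb 1

section .text
global _start
_start:
        ; First read: 1 byte only. Type a character and press ENTER.
        mov eax, 3              ; sys_read
        mov ebx, 0              ; stdin
        mov ecx, first
        mov edx, 1              ; read 1 byte: ENTER is left pending
        int 0x80

        ; Second read: it does not wait for the keyboard, because
        ; the pending ENTER is already there to be read.
        mov eax, 3              ; sys_read
        mov ebx, 0              ; stdin
        mov ecx, second
        mov edx, 1
        int 0x80

        ; Exit, using the byte from the second read as exit status
        mov ebx, 0              ; clear the whole ebx first
        mov bl, [second]
        mov eax, 1              ; sys_exit
        int 0x80
```

With one character followed by ENTER, `echo $?` should show 10, that is 0xA: the second sys_read picked up the leftover ENTER instead of a new key.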