Basic Assembly Tutorial - Digression over executable files

February 2016, Cremona, Italy


Digression over executable files

When we've assembled programs with NASM, we've specified the output format as ELF64. Let's take a closer look at the available options and at what that choice actually means.
To see the available output formats, run:
nasm -hf
You should see something like:
  * bin       flat-form binary files (e.g. DOS .COM, .SYS)
    ith       Intel hex
    srec      Motorola S-records
    aout      Linux a.out object files
    aoutb     NetBSD/FreeBSD a.out object files
    coff      COFF (i386) object files (e.g. DJGPP for DOS)
    elf32     ELF32 (i386) object files (e.g. Linux)
    elf64     ELF64 (x86_64) object files (e.g. Linux)
    elfx32    ELFX32 (x86_64) object files (e.g. Linux)
    as86      Linux as86 (bin86 version 0.3) object files
    obj       MS-DOS 16-bit/32-bit OMF object files
    win32     Microsoft Win32 (i386) object files
    win64     Microsoft Win64 (x86-64) object files
    rdf       Relocatable Dynamic Object File Format v2.0
    ieee      IEEE-695 (LADsoft variant) object file format
    macho32   NeXTstep/OpenStep/Rhapsody/Darwin/MacOS X (i386) object files
    macho64   NeXTstep/OpenStep/Rhapsody/Darwin/MacOS X (x86_64) object files
    dbg       Trace of all info passed to output stage
    elf       ELF (short name for ELF32)
    macho     MACHO (short name for MACHO32)
    win       WIN (short name for WIN32)

As we can see, we can choose among various output formats, such as ELF, flat binary, a.out and so on. If you've been horsing around a little with C/C++, you've surely come across the famous a.out file: that is the default name gcc gives to your compiled program.

What does this all mean?
If you want to write machine code from scratch and run it on your operating system, you have to write the actual program in machine code and then put it into a file laid out according to a certain format, which carries some extra information for the OS. Without this information, the operating system wouldn't know how to load the file into memory, where execution should start, and so on. This is what the various a.out, ELF and similar formats are about.

Back in 1992, the first Linux distributions (such as SLS) used the a.out format, whose name stands for "assembler output", but they soon switched to the ELF format. Nowadays compilers such as gcc still use a.out as the default name for the compiled output, although the file is no longer in the original a.out format.

Imagine you weren't working on GNU/Linux and could instead pass a raw binary file directly to the CPU: then you would need no ELF headers or any other wrapping at all!

What kind of file is this file?

Let's say we have a file and don't know what it is because its name has no extension. What can we do?
There's a very useful utility that tells us what type of file we are dealing with: it's called file.
Let's analyze my assembled Hello World file, called "hello".
file ./hello
./hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, not stripped

The file utility has various methods for working out what kind of file it has been given. One of them is looking at the "magic number", a value stored in the header of the file. In the case of ELF files, the magic number is 0x7F 'E' 'L' 'F', that is, the bytes 7f 45 4c 46.
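
The magic-number check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the file utility is actually implemented: it tests just the one ELF signature mentioned here, and the names has_elf_magic and is_elf_file are made up for this example.

```python
# The ELF magic number: 0x7F 'E' 'L' 'F', i.e. the bytes 7f 45 4c 46.
ELF_MAGIC = b"\x7fELF"


def has_elf_magic(data: bytes) -> bool:
    """Return True if the given bytes start with the ELF magic number."""
    return data[:4] == ELF_MAGIC


def is_elf_file(path: str) -> bool:
    """Read the first four bytes of the file at `path` and test them."""
    with open(path, "rb") as f:
        return has_elf_magic(f.read(4))
```

Assuming the assembled program from this tutorial is in the current directory, is_elf_file("./hello") should return True, while a plain text file should give False.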

The ELF format

When you're dealing with the ELF format, you can use the commands readelf and elfedit (provided by the GNU project).
Here's an example of readelf on an assembled "Hello World":
readelf -a ./hello
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4000b0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          272 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         2
  Size of section headers:           64 (bytes)
  Number of section headers:         6
  Section header string table index: 3

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         00000000004000b0  000000b0
       0000000000000027  0000000000000000  AX       0     0     16
[........]
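
To get a feeling for where readelf finds the "ELF Header" values, here is a rough Python sketch that decodes the first 64 bytes of an ELF64 little-endian file with the standard struct module. The field names follow the usual e_* naming of the ELF specification. The function parse_elf64_header and the SAMPLE bytes are assumptions made for this example: SAMPLE is rebuilt from the numbers printed above, it is not read from the real hello file. Treat it as a sketch, not as a replacement for readelf.

```python
import struct

# ELF64 header, little endian, 64 bytes in total:
# 16-byte e_ident, then e_type, e_machine, e_version, e_entry, e_phoff,
# e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
# e_shstrndx.
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")

FIELDS = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry", "e_phoff",
    "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
    "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_elf64_header(data: bytes) -> dict:
    """Decode an ELF64 little-endian header into a dict of named fields."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("an ELF64 header needs at least 64 bytes")
    header = dict(zip(FIELDS, ELF64_HEADER.unpack_from(data)))
    ident = header["e_ident"]
    if ident[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    # ident[4] == 2 means 64-bit class, ident[5] == 1 means little endian.
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles ELF64 little endian")
    return header


# Hypothetical sample, rebuilt from the readelf output shown above:
# type 2 (EXEC), machine 62 (x86-64), version 1, entry 0x4000b0, ...
SAMPLE = ELF64_HEADER.pack(
    bytes.fromhex("7f454c46020101000000000000000000"),
    2, 62, 1, 0x4000B0, 64, 272, 0, 64, 56, 2, 64, 6, 3,
)

header = parse_elf64_header(SAMPLE)
print(hex(header["e_entry"]), header["e_phnum"], header["e_shnum"])
```

On a real file you would pass the first 64 bytes read with open(path, "rb").read(64) instead of SAMPLE.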