Guide to x86 Assembly (2023)

Contents: Registers | Memory and Addressing | Instructions | Calling Convention

This is a version adapted by Quentin Carbonneaux from David Evans' original document. The syntax was changed from Intel to AT&T, the standard syntax on UNIX systems, and the HTML code was purified.

This guide describes the basics of 32-bit x86 assembly language programming, covering a small but useful subset of the available instructions and assembler directives. There are several different assembly languages for generating x86 machine code. The one we will use in CS421 is the GNU Assembler (gas). We will use the standard AT&T syntax for writing x86 assembly code.

The full x86 instruction set is large and complex (Intel's x86 instruction set manuals comprise over 2900 pages), and we do not cover it all in this guide. For example, there is a 16-bit subset of the x86 instruction set. Using the 16-bit programming model can be quite complex. It has a segmented memory model, more restrictions on register usage, and so on. In this guide, we will limit our attention to more modern aspects of x86 programming, and delve into the instruction set only in enough detail to get a basic feel for x86 programming.


Registers

Modern (i.e., 386 and beyond) x86 processors have eight 32-bit general purpose registers, as depicted in Figure 1. The register names are mostly historical. For example, EAX used to be called the accumulator since it was used by a number of arithmetic operations, and ECX was known as the counter since it was used to hold a loop index. Whereas most of the registers have lost their special purposes in the modern instruction set, by convention, two are reserved for special purposes: the stack pointer (ESP) and the base pointer (EBP).

For the EAX, EBX, ECX, and EDX registers, subsections may be used. For example, the least significant 2 bytes of EAX can be treated as a 16-bit register called AX. The least significant byte of AX can be used as a single 8-bit register called AL, while the most significant byte of AX can be used as a single 8-bit register called AH. These names refer to the same physical register. When a two-byte quantity is placed into DX, the update affects the value of DH, DL, and EDX. These sub-registers are mainly hold-overs from older, 16-bit versions of the instruction set. However, they are sometimes convenient when dealing with data that are smaller than 32 bits (e.g., 1-byte ASCII characters).
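The aliasing between EDX, DX, DH, and DL can be modeled with bit masks. The following is a minimal Python sketch rather than assembly; the helper names write_dx, read_dh, and read_dl are hypothetical and exist only for this illustration.

```python
# Model EDX as one 32-bit value; DX, DH, and DL are views onto its low bits.

def write_dx(edx, value16):
    """Return EDX after a 16-bit write to DX (the upper 16 bits are kept)."""
    return (edx & 0xFFFF0000) | (value16 & 0xFFFF)

def read_dh(edx):
    """DH is bits 8..15 of EDX."""
    return (edx >> 8) & 0xFF

def read_dl(edx):
    """DL is bits 0..7 of EDX."""
    return edx & 0xFF

edx = 0x11223344
edx = write_dx(edx, 0xABCD)   # models: movw $0xABCD, %dx
print(hex(edx), hex(read_dh(edx)), hex(read_dl(edx)))
```

A single write to DX changes what is later read through DH, DL, and EDX, because all four names refer to the same storage.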

Figure 1. x86 Registers

Memory and Addressing Modes

Declaring Static Data Regions

You can declare static data regions (analogous to global variables) in x86 assembly using special assembler directives for this purpose. Data declarations should be preceded by the .data directive. Following this directive, the directives .byte, .short, and .long can be used to declare one-, two-, and four-byte data locations, respectively. To refer to the address of the data created, we can label them. Labels are very useful and versatile in assembly: they give names to memory locations that will be figured out later by the assembler or the linker. This is similar to declaring variables by name, but abides by some lower level rules. For example, locations declared in sequence will be located in memory next to one another.

Example declarations:

    .data
var:
    .byte 64     /* Declare a byte, referred to as location var, containing the value 64. */
    .byte 10     /* Declare a byte with no label, containing the value 10. Its location is var + 1. */
x:
    .short 42    /* Declare a 2-byte value initialized to 42, referred to as location x. */
y:
    .long 30000  /* Declare a 4-byte value, referred to as location y, initialized to 30000. */
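To see how these declarations sit next to one another in memory, here is a small Python sketch that packs the same four values into bytes. It assumes that the data section starts at offset 0 (a hypothetical base address) and that no alignment padding is inserted between the items; x86 stores multi-byte values little-endian.

```python
import struct

# "<" means little-endian with no padding; B = 1 byte, H = 2 bytes, I = 4 bytes.
data = struct.pack("<BBHI", 64, 10, 42, 30000)

# Offsets of the labels from the start of the data (assumed base address 0).
var, x, y = 0, 2, 4

print(data.hex())
print(data[var], data[var + 1])
print(struct.unpack_from("<H", data, x)[0], struct.unpack_from("<I", data, y)[0])
```

The unlabeled byte is reachable only as var + 1, which is why the order of declarations matters.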

Unlike in high level languages where arrays can have many dimensions and are accessed by indices, arrays in x86 assembly language are simply a number of cells located contiguously in memory. An array can be declared by just listing the values, as in the first example below. For the special case of an array of bytes, string literals can be used. When a large area of memory must be filled with zeroes, the .zero directive can be used.

Some examples:

s:
    .long 1, 2, 3    /* Declare three 4-byte values, initialized to 1, 2, and 3.
                        The value at location s + 8 will be 3. */
barr:
    .zero 10         /* Declare 10 bytes starting at location barr, initialized to 0. */
str:
    .string "hello"  /* Declare 6 bytes starting at the address str initialized to
                        the ASCII character values for hello followed by a nul (0) byte. */
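Because an array is only a run of contiguous cells, element i of a 4-byte array lives at the array's address plus 4*i. The Python sketch below models the three declarations as byte strings; element_offset is a hypothetical helper, not an assembler feature.

```python
import struct

s = struct.pack("<3I", 1, 2, 3)   # models: .long 1, 2, 3 at location s
barr = bytes(10)                  # models: .zero 10 at location barr
text = b"hello" + b"\x00"         # models: .string "hello" at location str

def element_offset(index, size=4):
    """Byte offset of element `index` in an array of `size`-byte cells."""
    return index * size

third = struct.unpack_from("<I", s, element_offset(2))[0]   # the value at s + 8
print(third, len(barr), len(text))
```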

Addressing Memory

Modern x86-compatible processors are capable of addressing up to 2^32 bytes of memory: memory addresses are 32 bits wide. In the examples above, where we used labels to refer to memory regions, these labels are actually replaced by the assembler with 32-bit quantities that specify addresses in memory. In addition to supporting referring to memory regions by labels (i.e., constant values), the x86 provides a flexible scheme for computing and referring to memory addresses: up to two of the 32-bit registers and a 32-bit signed constant can be added together to compute a memory address. One of the registers can be optionally pre-multiplied by 2, 4, or 8.

The addressing modes can be used with many x86 instructions (we'll describe them in the next section). Here we illustrate some examples using the mov instruction that moves data between registers and memory. This instruction has two operands: the first is the source and the second specifies the destination.

Some examples of mov instructions using address computations are:

mov (%ebx), %eax /* Load 4 bytes from the memory address in EBX into EAX. */
mov %ebx, var(,1) /* Move the contents of EBX into the 4 bytes at memory address var.
(Note, var is a 32-bit constant). */
mov -4(%esi), %eax /* Move 4 bytes at memory address ESI + (-4) into EAX. */
mov %cl, (%esi,%eax,1) /* Move the contents of CL into the byte at address ESI+EAX. */
mov (%esi,%ebx,4), %edx /* Move the 4 bytes of data at address ESI+4*EBX into EDX. */

Some examples of invalid address calculations include:

mov (%ebx,%ecx,-1), %eax /* Can only add register values. */
mov %ebx, (%eax,%esi,%edi,1) /* At most 2 registers in address computation. */
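The address computation behind an operand of the form disp(base,index,scale) can be written out as a formula: disp + base + index*scale, truncated to 32 bits. The Python sketch below is an illustrative model only; effective_address and the register values are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def effective_address(disp=0, base=0, index=0, scale=1):
    """Compute disp + base + index*scale, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (disp + base + index * scale) & MASK32

esi, ebx, eax = 0x1000, 3, 0x20   # hypothetical register contents

print(hex(effective_address(base=esi, index=ebx, scale=4)))   # (%esi,%ebx,4)
print(hex(effective_address(disp=-4, base=esi)))              # -4(%esi)
print(hex(effective_address(base=esi, index=eax)))            # (%esi,%eax,1)
```

The model rejects a scale of -1 for the same reason the first invalid example is rejected: only the multipliers 1, 2, 4, and 8 can be encoded.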

Operation Suffixes

In general, the intended size of the data item at a given memory address can be inferred from the assembly code instruction in which it is referenced. For example, in all of the above instructions, the size of the memory regions could be inferred from the size of the register operand. When we were loading a 32-bit register, the assembler could infer that the region of memory we were referring to was 4 bytes wide. When we were storing the value of a one byte register to memory, the assembler could infer that we wanted the address to refer to a single byte in memory.

However, in some cases the size of a referred-to memory region is ambiguous. Consider the instruction mov $2, (%ebx). Should this instruction move the value 2 into the single byte at address EBX? Perhaps it should move the 32-bit integer representation of 2 into the 4 bytes starting at address EBX. Since either is a valid possible interpretation, the assembler must be explicitly directed as to which is correct. The size suffixes b, w, and l serve this purpose, indicating sizes of 1, 2, and 4 bytes respectively.

For example:

movb $2, (%ebx) /* Move 2 into the single byte at the address stored in EBX. */
movw $2, (%ebx) /* Move the 16-bit integer representation of 2 into the 2 bytes starting at the address in EBX. */
movl $2, (%ebx) /* Move the 32-bit integer representation of 2 into the 4 bytes starting at the address in EBX. */
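The difference between the three suffixes is how many bytes of memory are overwritten. The Python sketch below models memory as a bytearray pre-filled with 0xff so the touched bytes stand out; store and the address 4 are hypothetical choices for illustration.

```python
import struct

def store(memory, address, value, suffix):
    """Write `value` little-endian using the width selected by the suffix."""
    fmt = {"b": "<B", "w": "<H", "l": "<I"}[suffix]
    struct.pack_into(fmt, memory, address, value)

ebx = 4   # hypothetical address held in EBX
for suffix in "bwl":
    memory = bytearray(b"\xff" * 12)
    store(memory, ebx, 2, suffix)          # models: mov<suffix> $2, (%ebx)
    print("mov" + suffix, memory.hex())
```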


Instructions

Machine instructions generally fall into three categories: data movement, arithmetic/logic, and control-flow. In this section, we will look at important examples of x86 instructions from each category. This section should not be considered an exhaustive list of x86 instructions, but rather a useful subset. For a complete list, see Intel's instruction set reference.

We use the following notation:

<reg32>   Any 32-bit register (%eax, %ebx, %ecx, %edx, %esi, %edi, %esp, or %ebp)
<reg16>   Any 16-bit register (%ax, %bx, %cx, or %dx)
<reg8>    Any 8-bit register (%ah, %bh, %ch, %dh, %al, %bl, %cl, or %dl)
<reg>     Any register
<mem>     A memory address (e.g., (%eax), 4+var(,1), or (%eax,%ebx,1))
<con32>   Any 32-bit immediate
<con16>   Any 16-bit immediate
<con8>    Any 8-bit immediate
<con>     Any 8-, 16-, or 32-bit immediate

In assembly language, all the labels and numeric constants used as immediate operands (i.e., not in an address calculation like 3(%eax,%ebx,8)) are always prefixed by a dollar sign. When needed, hexadecimal notation can be used with the 0x prefix (e.g., $0xABC). Without the prefix, numbers are interpreted as decimal.


Data Movement Instructions

mov — Move

The mov instruction copies the data item referred to by its first operand (i.e., register contents, memory contents, or a constant value) into the location referred to by its second operand (i.e., a register or memory). While register-to-register moves are possible, direct memory-to-memory moves are not. In cases where memory transfers are desired, the source memory contents must first be loaded into a register, then can be stored to the destination memory address.

mov <reg>, <reg>
mov <reg>, <mem>
mov <mem>, <reg>
mov <con>, <reg>
mov <con>, <mem>

mov %ebx, %eax — copy the value in EBX into EAX
movb $5, var(,1) — store the value 5 into the byte at location var

push — Push on stack

The push instruction places its operand onto the top of the hardware supported stack in memory. Specifically, push first decrements ESP by 4, then places its operand into the contents of the 32-bit location at address (%esp). ESP (the stack pointer) is decremented by push since the x86 stack grows down, i.e., the stack grows from high addresses to lower addresses.

push <reg32>
push <mem>
push <con32>

push %eax — push eax on the stack
push var(,1) — push the 4 bytes at address var onto the stack

pop — Pop from stack

The pop instruction removes the 4-byte data element from the top of the hardware-supported stack into the specified operand (i.e., register or memory location). It first moves the 4 bytes located at memory location (%esp) into the specified register or memory location, and then increments ESP by 4.

pop <reg32>
pop <mem>

pop %edi — pop the top element of the stack into EDI.
pop (%ebx) — pop the top element of the stack into memory at the four bytes starting at location EBX.
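The order of operations in push (decrement, then store) and pop (load, then increment) can be modeled in a few lines. This Python sketch uses a hypothetical Stack class with a 64-byte memory; it illustrates the rule described above and is not a description of how the hardware is built.

```python
import struct

class Stack:
    def __init__(self, size=64):
        self.memory = bytearray(size)
        self.esp = size   # empty stack: ESP points just past the highest address

    def push(self, value):
        self.esp -= 4     # decrement ESP first...
        struct.pack_into("<I", self.memory, self.esp, value & 0xFFFFFFFF)   # ...then store

    def pop(self):
        value = struct.unpack_from("<I", self.memory, self.esp)[0]   # load first...
        self.esp += 4     # ...then increment ESP
        return value

stack = Stack()
stack.push(0x11111111)
stack.push(0x22222222)
print(stack.esp)            # two pushes moved ESP down by 8 bytes
print(hex(stack.pop()))     # the last value pushed comes off first
```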

lea — Load effective address

The lea instruction places the address specified by its first operand into the register specified by its second operand. Note, the contents of the memory location are not loaded, only the effective address is computed and placed into the register. This is useful for obtaining a pointer into a memory region or to perform simple arithmetic operations.

lea <mem>, <reg32>

lea (%ebx,%esi,8), %edi — the quantity EBX+8*ESI is placed in EDI.
lea val(,1), %eax — the value val is placed in EAX.

Arithmetic and Logic Instructions

add — Integer addition

The add instruction adds together its two operands, storing the result in its second operand. Note, whereas both operands may be registers, at most one operand may be a memory location.

add <reg>, <reg>
add <mem>, <reg>
add <reg>, <mem>
add <con>, <reg>
add <con>, <mem>

add $10, %eax — EAX is set to EAX + 10
addb $10, (%eax) — add 10 to the single byte stored at the memory address held in EAX

sub — Integer subtraction

The sub instruction subtracts the value of its first operand from the value of its second operand and stores the result in its second operand. As with add, whereas both operands may be registers, at most one operand may be a memory location.

sub <reg>, <reg>
sub <mem>, <reg>
sub <reg>, <mem>
sub <con>, <reg>
sub <con>, <mem>

sub %ah, %al — AL is set to AL - AH
sub $216, %eax — subtract 216 from the value stored in EAX

inc, dec — Increment, Decrement

The inc instruction increments the contents of its operand by one. The dec instruction decrements the contents of its operand by one.

inc <reg>
inc <mem>
dec <reg>
dec <mem>

dec %eax — subtract one from the contents of EAX
incl var(,1) — add one to the 32-bit integer stored at location var

imul — Integer multiplication

The imul instruction has two basic formats: two-operand (first two syntax listings below) and three-operand (last two syntax listings below).

The two-operand form multiplies its two operands together and stores the result in the second operand. The result (i.e., second) operand must be a register.


The three-operand form multiplies its first and second operands together and stores the result in its last operand. Again, the result operand must be a register. Furthermore, the first operand is restricted to being a constant value.

imul <reg32>, <reg32>
imul <mem>, <reg32>
imul <con>, <reg32>, <reg32>
imul <con>, <mem>, <reg32>


imul (%ebx), %eax — multiply the contents of EAX by the 32-bit contents of the memory at location EBX. Store the result in EAX.

imul $25, %edi, %esi — ESI is set to EDI * 25

idiv — Integer division

The idiv instruction divides the contents of the 64-bit integer EDX:EAX (constructed by viewing EDX as the most significant four bytes and EAX as the least significant four bytes) by the specified operand value. The quotient result of the division is stored into EAX, while the remainder is placed in EDX.

idiv <reg32>
idiv <mem>


idiv %ebx — divide the contents of EDX:EAX by the contents of EBX. Place the quotient in EAX and the remainder in EDX.

idivl (%ebx) — divide the contents of EDX:EAX by the 32-bit value stored at the memory location in EBX. Place the quotient in EAX and the remainder in EDX.
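The way EDX and EAX combine into one 64-bit dividend is easy to get wrong, so here is a Python model of the 32-bit signed division. The function names are hypothetical. The model relies on two facts about idiv that the text above does not state: the quotient is truncated toward zero, and the remainder takes the sign of the dividend. Division by zero and a quotient too large for EAX raise a divide error on real hardware; the sketch represents both as Python exceptions.

```python
def to_signed(value, bits):
    """Interpret an unsigned bit pattern as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def idiv32(edx, eax, divisor):
    """Return (EAX, EDX) after a 32-bit idiv, as unsigned register values."""
    dividend = to_signed((edx << 32) | eax, 64)   # EDX is the high half
    d = to_signed(divisor, 32)
    if d == 0:
        raise ZeroDivisionError("divide error")
    quotient = abs(dividend) // abs(d)
    if (dividend < 0) != (d < 0):
        quotient = -quotient                      # truncate toward zero
    remainder = dividend - quotient * d           # sign follows the dividend
    if not -2**31 <= quotient < 2**31:
        raise OverflowError("quotient does not fit in EAX")
    return quotient & 0xFFFFFFFF, remainder & 0xFFFFFFFF

# -7 / 2: EDX:EAX holds -7 sign-extended to 64 bits.
eax, edx = idiv32(0xFFFFFFFF, 0xFFFFFFF9, 2)
print(to_signed(eax, 32), to_signed(edx, 32))
```

Note that EDX must hold the sign extension of EAX before dividing a 32-bit value; leaving stale data in EDX changes the dividend.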

and, or, xor — Bitwise logical and, or, and exclusive or

These instructions perform the specified logical operation (logical bitwise and, or, and exclusive or, respectively) on their operands, placing the result in the second operand location.

and <reg>, <reg>
and <mem>, <reg>
and <reg>, <mem>
and <con>, <reg>
and <con>, <mem>

or <reg>, <reg>
or <mem>, <reg>
or <reg>, <mem>
or <con>, <reg>
or <con>, <mem>

xor <reg>, <reg>
xor <mem>, <reg>
xor <reg>, <mem>
xor <con>, <reg>
xor <con>, <mem>

and $0x0f, %eax — clear all but the last 4 bits of EAX.
xor %edx, %edx — set the contents of EDX to zero.

not — Bitwise logical not

Logically negates the operand contents (that is, flips all bit values in the operand).

not <reg>
not <mem>

not %eax — flip all the bits of EAX

neg — Negate

Performs the two's complement negation of the operand contents.

neg <reg>
neg <mem>

neg %eax — EAX is set to (- EAX)

shl, shr — Shift left and right


These instructions shift the bits in their second operand's contents left and right, padding the resulting empty bit positions with zeros. The shifted operand can be shifted up to 31 places. The number of bits to shift is specified by the first operand, which can be either an 8-bit constant or the register CL. In either case, shift counts greater than 31 are performed modulo 32.

shl <con8>, <reg>
shl <con8>, <mem>
shl %cl, <reg>
shl %cl, <mem>

shr <con8>, <reg>
shr <con8>, <mem>
shr %cl, <reg>
shr %cl, <mem>


shl $1, %eax — Multiply the value of EAX by 2 (if the most significant bit is 0)

shr %cl, %ebx — Store in EBX the floor of the result of dividing the value of EBX by 2^n, where n is the value in CL. Caution: for negative integers, it is different from the C semantics of division!
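The caution about negative integers is easiest to see with concrete bit patterns. The Python sketch below models 32-bit shl and shr; the function names are hypothetical, and the count is masked to its low 5 bits to reflect the modulo-32 behavior described above.

```python
MASK32 = 0xFFFFFFFF

def shl32(value, count):
    """Logical left shift of a 32-bit value; the count is taken modulo 32."""
    return (value << (count & 31)) & MASK32

def shr32(value, count):
    """Logical right shift of a 32-bit value; vacated bits are filled with 0."""
    return (value & MASK32) >> (count & 31)

minus_eight = -8 & MASK32             # 0xFFFFFFF8, the 32-bit pattern for -8

print(shl32(5, 1))                    # 5 * 2
print(shr32(40, 3))                   # 40 / 2^3
print(hex(shr32(minus_eight, 1)))     # not -4: a zero is shifted into the sign bit
print(shl32(1, 33))                   # a count of 33 behaves like a count of 1
```

Because shr always fills with zeros, shifting the pattern for -8 right by one yields a large positive number rather than -4.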

Control Flow Instructions

The x86 processor maintains an instruction pointer (EIP) register that is a 32-bit value indicating the location in memory where the current instruction starts. Normally, it increments to point to the next instruction in memory after executing an instruction. The EIP register cannot be manipulated directly, but is updated implicitly by the provided control flow instructions.

We use the notation <label> to refer to labeled locations in the program text. Labels can be inserted anywhere in x86 assembly code text by entering a label name followed by a colon. For example,

       mov 8(%ebp), %esi
begin:
       xor %ecx, %ecx
       mov (%esi), %eax

The second instruction in this code fragment is labeled begin. Elsewhere in the code, we can refer to the memory location of this instruction using the more convenient symbolic name begin. This label is just a convenient way of expressing the location instead of its 32-bit value.

jmp — Jump

Transfers program control flow to the instruction at the memory location indicated by the operand.

jmp <label>

jmp begin — Jump to the instruction labeled begin.

jcondition — Conditional jump

These instructions are conditional jumps that are based on the status of a set of condition codes that are stored in a special register called the machine status word. The contents of the machine status word include information about the last arithmetic operation performed. For example, one bit of this word indicates if the last result was zero. Another indicates if the last result was negative. Based on these condition codes, a number of conditional jumps can be performed. For example, the jz instruction performs a jump to the specified operand label if the result of the last arithmetic operation was zero. Otherwise, control proceeds to the next instruction in sequence.

A number of the conditional branches are given names that are intuitively based on the last operation performed being a special compare instruction, cmp (see below). For example, conditional branches such as jle and jne are based on first performing a cmp operation on the desired operands.

je <label> (jump when equal)
jne <label> (jump when not equal)
jz <label> (jump when last result was zero)
jg <label> (jump when greater than)
jge <label> (jump when greater than or equal to)
jl <label> (jump when less than)
jle <label> (jump when less than or equal to)


cmp %ebx, %eax
jle done

If the contents of EAX are less than or equal to the contents of EBX, jump to the label done. Otherwise, continue to the next instruction.

cmp — Compare

Compare the values of the two specified operands, setting the condition codes in the machine status word appropriately. This instruction is equivalent to the sub instruction, except the result of the subtraction is discarded instead of replacing the second operand.

cmp <reg>, <reg>
cmp <mem>, <reg>
cmp <reg>, <mem>
cmp <con>, <reg>

cmpb $10, (%ebx)
je loop

If the byte stored at the memory location in EBX is equal to the integer constant 10, jump to the location labeled loop.
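How a conditional jump reads the condition codes left behind by cmp can be modeled directly. The Python sketch below computes three of the flags for a 32-bit cmp in AT&T operand order and evaluates je and jle from them. The function names are hypothetical, and the flag names ZF, SF, and OF (zero, sign, overflow) are the standard x86 names, which this guide does not otherwise use.

```python
MASK32 = 0xFFFFFFFF

def to_signed(value):
    """Interpret a 32-bit pattern as a two's complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def cmp_flags(src, dst):
    """Flags after `cmp src, dst` (AT&T order): computed from dst - src."""
    result = ((dst & MASK32) - (src & MASK32)) & MASK32
    exact = to_signed(dst) - to_signed(src)      # difference without wraparound
    return {
        "ZF": result == 0,                       # result was zero
        "SF": bool(result & 0x80000000),         # result looks negative
        "OF": not -2**31 <= exact < 2**31,       # signed overflow occurred
    }

def je_taken(flags):
    return flags["ZF"]

def jle_taken(flags):
    """jle jumps when ZF is set or when SF differs from OF."""
    return flags["ZF"] or flags["SF"] != flags["OF"]

# cmp %ebx, %eax ; jle done   with EAX = 3 and EBX = 5
print(jle_taken(cmp_flags(src=5, dst=3)))
# Overflow case: EAX = -2**31 and EBX = 1 still compares as "less than".
print(jle_taken(cmp_flags(src=1, dst=-2**31)))
```

Comparing SF with OF, rather than looking at SF alone, is what keeps signed comparisons correct when the subtraction overflows.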

call, ret — Subroutine call and return

These instructions implement a subroutine call and return. The call instruction first pushes the current code location onto the hardware supported stack in memory (see the push instruction for details), and then performs an unconditional jump to the code location indicated by the label operand. Unlike the simple jump instructions, the call instruction saves the location to return to when the subroutine completes.

The ret instruction implements a subroutine return mechanism. This instruction first pops a code location off the hardware supported in-memory stack (see the pop instruction for details). It then performs an unconditional jump to the retrieved code location.


call <label>
ret

Calling Convention

To allow separate programmers to share code and develop libraries for use by many programs, and to simplify the use of subroutines in general, programmers typically adopt a common calling convention. The calling convention is a protocol about how to call and return from routines. For example, given a set of calling convention rules, a programmer need not examine the definition of a subroutine to determine how parameters should be passed to that subroutine. Furthermore, given a set of calling convention rules, high-level language compilers can be made to follow the rules, thus allowing hand-coded assembly language routines and high-level language routines to call one another.

In practice, many calling conventions are possible. We will describe the widely used C language calling convention. Following this convention will allow you to write assembly language subroutines that are safely callable from C (and C++) code, and will also enable you to call C library functions from your assembly language code.

The C calling convention is based heavily on the use of the hardware-supported stack. It is based on the push, pop, call, and ret instructions. Subroutine parameters are passed on the stack. Registers are saved on the stack, and local variables used by subroutines are placed in memory on the stack. The vast majority of high-level procedural languages implemented on most processors have used similar calling conventions.

The calling convention is broken into two sets of rules. The first set of rules is employed by the caller of the subroutine, and the second set of rules is observed by the writer of the subroutine (the callee). It should be emphasized that mistakes in the observance of these rules quickly result in fatal program errors since the stack will be left in an inconsistent state; thus meticulous care should be used when implementing the call convention in your own subroutines.

Stack during Subroutine Call

[Thanks to James Peterson for finding and fixing the bug in the original version of this figure!]

A good way to visualize the operation of the calling convention is to draw the contents of the nearby region of the stack during subroutine execution. The image above depicts the contents of the stack during the execution of a subroutine with three parameters and three local variables. The cells depicted in the stack are 32-bit wide memory locations, thus the memory addresses of the cells are 4 bytes apart. The first parameter resides at an offset of 8 bytes from the base pointer. Above the parameters on the stack (and below the base pointer), the call instruction placed the return address, thus leading to an extra 4 bytes of offset from the base pointer to the first parameter. When the ret instruction is used to return from the subroutine, it will jump to the return address stored on the stack.

Caller Rules

To make a subroutine call, the caller should:

  1. Before calling a subroutine, the caller should save the contents of certain registers that are designated caller-saved. The caller-saved registers are EAX, ECX, EDX. Since the called subroutine is allowed to modify these registers, if the caller relies on their values after the subroutine returns, the caller must push the values in these registers onto the stack (so they can be restored after the subroutine returns).
  2. To pass parameters to the subroutine, push them onto the stack before the call. The parameters should be pushed in inverted order (i.e., last parameter first). Since the stack grows down, the first parameter will be stored at the lowest address (this inversion of parameters was historically used to allow functions to be passed a variable number of parameters).
  3. To call the subroutine, use the call instruction. This instruction places the return address on top of the parameters on the stack, and branches to the subroutine code. This invokes the subroutine, which should follow the callee rules below.

After the subroutine returns (immediately following the call instruction), the caller can expect to find the return value of the subroutine in the register EAX. To restore the machine state, the caller should:

  1. Remove the parameters from the stack. This restores the stack to its state before the call was performed.
  2. Restore the contents of caller-saved registers (EAX, ECX, EDX) by popping them off of the stack. The caller can assume that no other registers were modified by the subroutine.


The code below shows a function call that follows the caller rules. The caller is calling a function myFunc that takes three integer parameters. The first parameter is in EAX, the second parameter is the constant 216; the third parameter is in the memory location stored in EBX.

push (%ebx)    /* Push last parameter first */
push $216      /* Push the second parameter */
push %eax      /* Push first parameter last */

call myFunc    /* Call the function (assume C naming) */

add $12, %esp

Note that after the call returns, the caller cleans up the stack using the add instruction. We have 12 bytes (3 parameters * 4 bytes each) on the stack, and the stack grows down. Thus, to get rid of the parameters, we can simply add 12 to the stack pointer.

The result produced by myFunc is now available for use in the register EAX. The values of the caller-saved registers (ECX and EDX) may have been changed. If the caller uses them after the call, it would have needed to save them on the stack before the call and restore them after it.

Callee Rules

The definition of the subroutine should adhere to the following rules at the beginning of the subroutine:

  1. Push the value of EBP onto the stack, and then copy the value of ESP into EBP using the following instructions:
     push %ebp
     mov %esp, %ebp
     This initial action maintains the base pointer, EBP. The base pointer is used by convention as a point of reference for finding parameters and local variables on the stack. When a subroutine is executing, the base pointer holds a copy of the stack pointer value from when the subroutine started executing. Parameters and local variables will always be located at known, constant offsets away from the base pointer value. We push the old base pointer value at the beginning of the subroutine so that we can later restore the appropriate base pointer value for the caller when the subroutine returns. Remember, the caller is not expecting the subroutine to change the value of the base pointer. We then move the stack pointer into EBP to obtain our point of reference for accessing parameters and local variables.
  2. Next, allocate local variables by making space on the stack. Recall, the stack grows down, so to make space on the top of the stack, the stack pointer should be decremented. The amount by which the stack pointer is decremented depends on the number and size of local variables needed. For example, if 3 local integers (4 bytes each) were required, the stack pointer would need to be decremented by 12 to make space for these local variables (i.e., sub $12, %esp). As with parameters, local variables will be located at known offsets from the base pointer.
  3. Next, save the values of the callee-saved registers that will be used by the function. To save registers, push them onto the stack. The callee-saved registers are EBX, EDI, and ESI (ESP and EBP will also be preserved by the calling convention, but need not be pushed on the stack during this step).

After these three actions are performed, the body of the subroutine may proceed. When the subroutine returns, it must follow these steps:

  1. Leave the return value in EAX.
  2. Restore the old values of any callee-saved registers (EBX, EDI, and ESI) that were modified. The register contents are restored by popping them from the stack. The registers should be popped in the inverse order that they were pushed.
  3. Deallocate local variables. The obvious way to do this might be to add the appropriate value to the stack pointer (since the space was allocated by subtracting the needed amount from the stack pointer). In practice, a less error-prone way to deallocate the variables is to move the value in the base pointer into the stack pointer: mov %ebp, %esp. This works because the base pointer always contains the value that the stack pointer contained immediately prior to the allocation of the local variables.
  4. Immediately before returning, restore the caller's base pointer value by popping EBP off the stack. Recall that the first thing we did on entry to the subroutine was to push the base pointer to save its old value.
  5. Finally, return to the caller by executing a ret instruction. This instruction will find and remove the appropriate return address from the stack.

Note that the callee's rules fall cleanly into two halves that are basically mirror images of one another. The first half of the rules apply to the beginning of the function, and are commonly said to define the prologue to the function. The latter half of the rules apply to the end of the function, and are thus commonly said to define the epilogue of the function.


Here is an example function definition that follows the callee rules:

  /* Start the code section */
  .text

  /* Define myFunc as a global (exported) function. */
  .globl myFunc
  .type myFunc, @function
myFunc:

  /* Subroutine Prologue */
  push %ebp            /* Save the old base pointer value. */
  mov %esp, %ebp       /* Set the new base pointer value. */
  sub $4, %esp         /* Make room for one 4-byte local variable. */
  push %edi            /* Save the values of registers that the function */
  push %esi            /* will modify. This function uses EDI and ESI. */
  /* (no need to save EBX, EBP, or ESP) */

  /* Subroutine Body */
  mov 8(%ebp), %eax    /* Move value of parameter 1 into EAX. */
  mov 12(%ebp), %esi   /* Move value of parameter 2 into ESI. */
  mov 16(%ebp), %edi   /* Move value of parameter 3 into EDI. */

  mov %edi, -4(%ebp)   /* Move EDI into the local variable. */
  add %esi, -4(%ebp)   /* Add ESI into the local variable. */
  add -4(%ebp), %eax   /* Add the contents of the local variable */
                       /* into EAX (final result). */

  /* Subroutine Epilogue */
  pop %esi             /* Recover register values. */
  pop %edi
  mov %ebp, %esp       /* Deallocate the local variable. */
  pop %ebp             /* Restore the caller's base pointer value. */
  ret

The subroutine prologue performs the standard actions of saving a snapshot of the stack pointer in EBP (the base pointer), allocating local variables by decrementing the stack pointer, and saving register values on the stack.

In the body of the subroutine we can see the use of the base pointer. Both parameters and local variables are located at constant offsets from the base pointer for the duration of the subroutine's execution. In particular, we notice that since parameters were placed onto the stack before the subroutine was called, they are always located below the base pointer (i.e., at higher addresses) on the stack. The first parameter to the subroutine can always be found at memory location (EBP+8), the second at (EBP+12), the third at (EBP+16). Similarly, since local variables are allocated after the base pointer is set, they always reside above the base pointer (i.e., at lower addresses) on the stack. In particular, the first local variable is always located at (EBP-4), the second at (EBP-8), and so on. This conventional use of the base pointer allows us to quickly identify the use of local variables and parameters within a function body.

The function epilogue is basically a mirror image of the function prologue. The caller's register values are recovered from the stack, the local variables are deallocated by resetting the stack pointer, the caller's base pointer value is recovered, and the ret instruction is used to return to the appropriate code location in the caller.
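To check that the caller and callee rules really leave the stack balanced, the whole call can be traced in a small model. The Python sketch below mirrors the myFunc example line by line. The Machine class, the 256-byte memory, the return address 0x1234, and the argument values are all hypothetical; only the sequence of pushes, pops, and EBP-relative accesses comes from the text.

```python
import struct

MASK32 = 0xFFFFFFFF
RETURN_ADDRESS = 0x1234   # hypothetical; stands in for the address after `call`

class Machine:
    def __init__(self, size=256):
        self.mem = bytearray(size)
        self.esp = size
        self.eax = self.ebp = self.esi = self.edi = 0

    def load(self, addr):
        return struct.unpack_from("<I", self.mem, addr)[0]

    def store(self, addr, value):
        struct.pack_into("<I", self.mem, addr, value & MASK32)

    def push(self, value):
        self.esp -= 4
        self.store(self.esp, value)

    def pop(self):
        value = self.load(self.esp)
        self.esp += 4
        return value

def my_func(m):
    # Prologue
    m.push(m.ebp)                                    # push %ebp
    m.ebp = m.esp                                    # mov %esp, %ebp
    m.esp -= 4                                       # sub $4, %esp
    m.push(m.edi)                                    # push %edi
    m.push(m.esi)                                    # push %esi
    # Body
    m.eax = m.load(m.ebp + 8)                        # mov 8(%ebp), %eax
    m.esi = m.load(m.ebp + 12)                       # mov 12(%ebp), %esi
    m.edi = m.load(m.ebp + 16)                       # mov 16(%ebp), %edi
    m.store(m.ebp - 4, m.edi)                        # mov %edi, -4(%ebp)
    m.store(m.ebp - 4, m.load(m.ebp - 4) + m.esi)    # add %esi, -4(%ebp)
    m.eax = (m.eax + m.load(m.ebp - 4)) & MASK32     # add -4(%ebp), %eax
    # Epilogue
    m.esi = m.pop()                                  # pop %esi
    m.edi = m.pop()                                  # pop %edi
    m.esp = m.ebp                                    # mov %ebp, %esp
    m.ebp = m.pop()                                  # pop %ebp
    return m.pop()                                   # ret: pop the return address

def call_my_func(m, a, b, c):
    m.push(c)                 # push the last parameter first
    m.push(b)
    m.push(a)
    m.push(RETURN_ADDRESS)    # call myFunc pushes the return address
    returned_to = my_func(m)
    m.esp += 12               # add $12, %esp
    return returned_to

m = Machine()
m.esi, m.edi, m.ebp = 0xAAAA, 0xBBBB, 0xCCCC   # values the caller expects to keep
esp_before = m.esp
returned_to = call_my_func(m, 1, 216, 7)
print(m.eax, m.esp == esp_before, hex(returned_to))
```

After the call, EAX holds the sum of the three parameters, ESP is back where it started, and ESI, EDI, and EBP hold the caller's values again. Swapping the two pops in the epilogue, or forgetting the final add, breaks one of these checks.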

Credits: This guide was originally created by Adam Ferrari many years ago,
and since updated by Alan Batson, Mike Lack, and Anita Jones.
It was revised for 216 Spring 2006 by David Evans.
It was finally modified by Quentin Carbonneaux to use the AT&T syntax for Yale's CS421.



What is x86 assembly instructions? ›

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

How many instructions are in x86 assembly? ›

x86 integer instructions. Below is the full 8086/8088 instruction set of Intel (81 instructions total). Most if not all of these instructions are available in 32-bit mode; they just operate on 32-bit registers (eax, ebx, etc.) and values instead of their 16-bit (ax, bx, etc.) counterparts.

What is '%' in assembly language? ›

$ signifies a constant (integer literal). Without it the number is an absolute address. % denotes a register.

What are the most important x86 instructions? ›

The most popular instruction is MOV (35% of all instructions). Note that PUSH is twice as common as POP. These instructions are used in pairs for preserving the EBP, ESI, EDI, and EBX registers across function calls, and PUSH is also used for passing arguments to functions; that's why it is more frequent.

Is x86 assembly faster than C? ›

Not necessarily. Hand-written assembly can in principle match or beat compiled C, since any machine code a C compiler emits could also be written by hand. In practice, modern optimizing compilers often produce code as fast as, or faster than, what most programmers write by hand.

How do you say hello world in x86 assembly? ›

A “hello world” program writes to stdout (calling write) and then exits (calling exit). On 64-bit Linux, the write system call is made as follows:
  1. put the system call number 1 in rax.
  2. put the fd argument in rdi.
  3. put the buf argument in rsi.
  4. put the count argument in rdx.
  5. finally, call syscall.
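The steps above can be sketched as a complete program in AT&T syntax. Because the steps name rax, rdi, rsi, and rdx, this is 64-bit Linux code rather than the 32-bit code used elsewhere in this guide. The file name hello.s, the label names, and the build commands in the comments are assumptions.

```asm
# Assumed build and run on x86-64 Linux:
#   as hello.s -o hello.o && ld hello.o -o hello && ./hello

        .section .rodata
msg:    .ascii  "Hello, world!\n"
        .set    msg_len, . - msg    # length of the message in bytes

        .text
        .globl  _start
_start:
        mov     $1, %rax            # system call number 1: write
        mov     $1, %rdi            # fd argument: 1 is stdout
        lea     msg(%rip), %rsi     # buf argument: address of the message
        mov     $msg_len, %rdx      # count argument: number of bytes
        syscall

        mov     $60, %rax           # system call number 60: exit
        xor     %edi, %edi          # exit status 0
        syscall
```

The program defines _start directly instead of main, so it links without the C library and must call exit itself rather than return.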

What is the longest x86 instruction? ›

The x86 instruction set (16-, 32-, or 64-bit, all variants and modes) requires that instructions be at most 15 bytes long. Anything longer raises an exception instead of executing. Reaching that limit requires redundant prefixes (e.g. multiple 0x66 or 0x67 prefixes).

Is assembly still used today? ›

How Are Assembly Languages Used Today? Though considered lower level languages compared to more advanced languages, assembly languages are still used. Assembly language is used to directly manipulate hardware, access specialized processor instructions, or evaluate critical performance issues.

Is x86 based on RISC or CISC? ›

The x86 lineage began in 1978 with the 16-bit 8086 microprocessor. They are known as CISC - Complex Instruction Set Computing - processors. Unlike RISC, CISC instructions can perform complex tasks that take more than one cycle to execute.

Is Python an assembly language? ›

Python is an example of a high-level language; other high-level languages you might have heard of are C++, PHP, and Java. As you might infer from the name high-level language, there are also low-level languages, sometimes referred to as machine languages or assembly languages.

Is assembly just binary? ›

Assembly is basically binary code written in a form that humans can read. The assembler then takes the assembly code and translates it line by line to the corresponding bit code.

Is assembly like C? ›

Assembler is a lower-level programming language than C, which makes it a good choice for programming directly to hardware. Hardware programming can be done directly in either language. The only things you can't do in C are accessing stack pointers, condition registers, and similar parts of the CPU core itself.

Is x86 more powerful than ARM? ›

ARM chips, by design, are much more power-efficient than x86 CPUs. They're RISC processors, so they're simpler in design.

Is x86 always 32-bit? ›

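No. In common usage, particularly on Windows, x86 refers to a 32-bit CPU and operating system while x64 refers to a 64-bit CPU and operating system. The x86 family itself began with the 16-bit 8086 and also includes the 64-bit x86-64 extension.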

Why do people still use x86? ›

All x86 processors are developed from the CISC (Complex Instruction Set Computers) architecture. The x86 processors allow you to perform several activities at the same time from a single instruction. Also, they can perform numerous simultaneous tasks without any of them being affected.

Is x86 becoming obsolete? ›

As long as Windows 10 keeps being supported (so until 2025) x86 will not become irrelevant. Probably even until 2030 we'll still have some support.

Is Python faster than assembly? ›

Not surprisingly, Python was quick to code, but the performance of this language was ten times slower than C++ and assembly.

How hard is coding in assembly? ›

Programming in assembly language is hard work; it's slow, tedious and needs a lot of concentration. You have no variables, just registers and memory locations. Throw away any aversion to using Goto because the JMP instruction (Goto's equivalent in assembly language) gets used quite a bit.

What is 0000h in assembly language? ›

The ORG directive sets the origin address during assembly. For example, ORG 0000h tells the assembler to place all subsequent code starting at address 0000h. DB (define byte) defines one or more bytes of data, such as a string of bytes.

Which assembly language should I learn first? ›

We start off with inline assembly, that is, assembly code written inside C/C++. I chose to do so because I want to start simple and then move on to real assembly.

Is it easy to learn assembly? ›

Assembly language is not difficult, in the sense that there is no hard concept to grasp. The main difficulty is: memorizing the various instructions, addressing modes, etc... when programming, having enough short term memory to remember what you are using the various registers for.

Is x86 actually RISC? ›

Intel's x86's do NOT have a RISC engine “under the hood.” They implement the x86 instruction set architecture via a decode/execution scheme relying on mapping the x86 instructions into machine operations, or sequences of machine operations for complex instructions, and those operations then find their way through the ...

Why does x86 have so few registers? ›

The memory that registers use is really expensive to engineer in the CPU. Aside from the design difficulties in doing so, increasing the number of available registers makes CPU chips more expensive. Even if more were introduced, the instruction set would still need to be updated and compilers modified to use them.

Are x86 processors still made? ›

As of June 2022, most desktop and laptop computers sold are based on the x86 architecture family, while mobile categories such as smartphones or tablets are dominated by ARM. At the high end, x86 continues to dominate computation-intensive workstation and cloud computing segments.

Why do hackers use assembly language? ›

Assembly language helps a hacker manipulate systems straight up at the architectural level. It is also the most appropriate coding language to build malware like viruses and trojans. Assembly is also the go-to choice if you want to reverse engineer a piece of software that has already been compiled.

Is C++ an assembly language? ›

C++ code does not "consist" of assembly code; it consists of, well, C++ code. A compiler translates this C++ code, ultimately into executable machine code that can be run on a computer (usually under the direction of an operating system). Assembly code is a human-readable symbolic representation of machine code.

Is Apple M1 a RISC? ›

Usually, general-purpose computers use Complex Instruction Set Computing (CISC) architecture for CPUs. Still, the Apple M1 platform is based on ARM, in Reduced Instruction Set Computing (RISC).

Is x86-64 bit or 32-bit? ›

For a 32-bit version operating system, it will say X86-based PC. For a 64-bit version, you'll see X64-based PC.

Is x86 Intel or ARM? ›

ARM is RISC (Reduced Instruction Set Computing) based while Intel (x86) is CISC (Complex Instruction Set Computing). Arm's CPU instructions are reasonably atomic, with a very close correlation between the number of instructions and micro-ops.

Is SQL an assembly language? ›

Absolutely not. Assembly language is a very low level language where you instruct the processor exactly what to do, including what registers you want to use etc.

Is Python more OOP or functional? ›

C++ and Python are languages that support object-oriented programming, but don't force the use of object-oriented features. Functional programming decomposes a problem into a set of functions.

What are the 3 levels of programming languages? ›

Programming Languages:
  • Machine Language.
  • Assembly Language.
  • High level Language.

Does NASA use assembly language? ›

HAL/S (High-order Assembly Language/Shuttle) is a real-time aerospace programming language compiler and cross-compiler for avionics applications used by NASA and associated agencies (JPL, etc.).

Does Apple use assembly language? ›

Summary. The new Apple M1 Macintoshes are running ARM processors as part of all that Apple Silicon and you can run standard ARM 64-bit Assembly Language. LLVM is a standard open source development tool which contains an Assembler that is similar to the GNU Assembler.

Do engineers use assembly language? ›

Assembly language concepts are fundamental for the understanding of many areas of computer engineering/science.

Does assembly have OOP? ›

OOP is just a paradigm. It is not strictly necessary to use an object-oriented programming language to do object-oriented programming. You can do OOP for instance in C but you can even do it in assembly language.

Should I learn assembly or C? ›

Nowadays, it would be very unusual for an entire application to be written in assembly language; most of the code, at least, is written in C. So, C programming skills are the key requirement for embedded software development. However, a few developers need to have a grasp of assembly language programming.

What is the closest language to assembly? ›

An assembler is a program that takes assembly language code and translates it into the machine language code that can be run on the computer. Assembly language is called a low-level language because it is so close to machine language.

Why is x86 so inefficient? ›

The main disadvantage of x86 is the variable length instruction encoding. That means that each instruction depends on the one before it. On most ARM flavors, instructions are 32 bits long, so to decode 3 instructions you fetch 96 bits.

What are the drawbacks of x86? ›

The disadvantages of x86 are increased power consumption and heat generation. In addition, the instruction set is complicated and intricate because of its long history of development.

Will ARM make x86 obsolete? ›

As more compatibility is added, more users will switch to ARM because of speed, reliability, security and price. More people will leave x86 CPUs with their glaring vulnerabilities, and replace them with ARM powered devices. Of course this won't happen in a year or even two, but it will eventually happen.

Why is x86 not called x32? ›

x86 is the name of the architecture that it's built to run on (the name comes from a series of old Intel processors, the names of which all ended in 86, The first of which was the 8086). Although x86 was originally a 16-bit architecture, the version in use today is the 32-bit extension.

Will there be a 128 bit operating system? ›

As of 2022, there are no 128-bit computers on the market. A 128-bit processor may never occur because there is no practical reason for doubling the basic register size.

Are 32-bit computers obsolete? ›

This generation of personal computers coincided with and enabled the first mass-adoption of the World Wide Web. While 32-bit architectures are still widely-used in specific applications, their dominance of the PC market ended in the early 2000s.

Is M1 better than x86? ›

Fortunately, due to its speed, the M1 will still outperform older Intel chips in most scenarios, even with legacy x86 apps. Also, some teething problems are to be expected, as exotic tools and applications might not run out of the box, or they may incur a performance penalty.

Does Apple still use x86? ›

All Apple apps included with macOS Big Sur are compatible with x86-64 and ARM architectures. Many third-party apps are similarly being made dual-platform, including prominent software packages such as Adobe Photoshop and Microsoft Word.

Can Intel beat ARM? ›

Intel processors are more powerful and speedier than ARM processors. ARM chips, on the other hand, are more mobile-friendly than Intel processors (in most cases).

What is the difference between x86 and x64 instruction set? ›

What is the difference between x86 and x64? As you guys can already tell, the obvious difference will be the amount of bit of each operating system. x86 refers to a 32-bit CPU and operating system while x64 refers to a 64-bit CPU and operating system.

What does System type x86 mean? ›

X86 based PC means the Windows currently installed is 32 bit. Right Click This PC and select Properties. Locate System Type. If the line states. 32-bit Operating system and 64-bit processor.

Why is x86 used for 32-bit? ›

The architecture was first introduced for 16-bit machines and later extended to 32-bit machines. Because of its popularity, later models kept 86 at the end of the model number, which is why 32-bit Windows is labeled x86 after the processor series.

Why does x86 mean 32? ›

The x86 moniker comes from Intel's processor model numbers ending in 86 (8086, 80286, 80386, 80486). Because the 32-bit processors from the 80386 onward all run the same 32-bit instruction set (and hence are all compatible), x86 has become a de facto name for that set (and hence for 32-bit).

Why is it called x86 and not x32? ›

The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486 processors. For some advanced features, x86 may require a license from Intel; x86-64 may require an additional license from AMD.

Is x86 still being used? ›

For the most part, the vast majority of computers are x86 even today, despite the architecture being several decades old. However, a new competitor has begun to arise in recent years.

How do I know if I want x64 or x86? ›

Left-click on System. There will be an entry under System called System Type. If it lists 32-bit Operating System, then the PC is running the 32-bit (x86) version of Windows. If it lists 64-bit Operating System, then the PC is running the 64-bit (x64) version of Windows.

Is Windows x86 32 or 64-bit? ›

If the value that corresponds to Processor starts with x86, the computer is running a 32-bit version of Windows. If the value that corresponds to Processor starts with ia64 or AMD64, the computer is running a 64-bit version of Windows.

Is x86 32-bit or 64bit? ›

In the right pane, look at the System Type entry. For a 32-bit version operating system, it will say X86-based PC. For a 64-bit version, you'll see X64-based PC.

Is x86 slower than x64? ›

x64 processes large files by mapping the entire file into the process's address space. Faster than x86 due to its faster parallel processing, 64-bit memory and data bus, and larger registers.

Why can 32-bit only use 4GB? ›

Every byte of RAM requires its own address, and the processor limits the length of those addresses. A 32-bit processor uses addresses that are 32 bits long. There are only 4,294,967,296, or 4GB, possible 32-bit addresses. There are workarounds to these limitations, but they don't really apply to most PCs.

Why is x86 more powerful than ARM? ›

The key difference is that the x86 processors are prioritized for maximum performance, while ARM processors are prioritized for high power efficiency. And the reason for this is that it's not yet possible to combine both of these properties in one type of processor.

What is 64 bits called? ›

Alternatively called x64, 64-bit refers to a CPU architecture whose registers, addresses, and basic data units are 64 bits wide. It is an improvement over previous 32-bit processors. The number "64" represents the size of the basic unit of data the CPU can process.

