The first step in Linux boot: how the startup assembly calls the main function

To make the principle clear, we first look at the C function calling mechanism, and then at how assembly code calls a C function.

1. C Function Call Mechanism

At the assembly level, a C function uses the stack to pass parameters and to store temporary (local) variables. The frame pointer EBP is used to locate parameters and local variables, while the stack pointer ESP manages everything stored on the stack: the saved EBP, return addresses, parameters, and local variables.
A stack frame looks like this:


Stack Frame Structure
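
As a rough companion to the diagram, the offsets in a stack frame can be written down as simple arithmetic. This is only a sketch: it assumes 32-bit x86 with the usual cdecl convention and 4-byte int parameters and locals, and the names param_offset, local_offset and print_frame_layout are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 32-bit x86, cdecl, every parameter and local is a 4-byte int. */
enum { WORD = 4 };

/* Offset from EBP of parameter number `index` (0-based).
   0(%ebp) holds the caller's saved EBP and 4(%ebp) the return address,
   so the first parameter sits at 8(%ebp). */
int param_offset(int index)
{
    assert(index >= 0);
    return 2 * WORD + index * WORD;
}

/* Offset from EBP of local slot number `index` (0-based).
   Locals sit below the saved EBP, so their offsets are negative. */
int local_offset(int index)
{
    assert(index >= 0);
    return -(index + 1) * WORD;
}

/* Hypothetical helper: print a frame from high addresses to low addresses. */
void print_frame_layout(int nparams, int nlocals)
{
    int i;
    for (i = nparams - 1; i >= 0; i--)
        printf("%4d(%%ebp)  parameter %d\n", param_offset(i), i);
    printf("%4d(%%ebp)  return address\n", WORD);
    printf("%4d(%%ebp)  saved EBP of the caller\n", 0);
    for (i = 0; i < nlocals; i++)
        printf("%4d(%%ebp)  local slot %d\n", local_offset(i), i);
}
```

Calling print_frame_layout(2, 1) from a main function lists the frame of a function with two parameters and one local variable. Which local ends up in which slot is the compiler's decision, so the local slot numbers are only a model.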

Next, let us work through an example.
First, the C source code, test.c:

int add(int a, int b)
{
    int c;
    c=a+b;
    return c;
}

int main()
{
    int a, b, c;
    a=1; b=2;
    c=add(a, b);
    return c;
}

Compile it to assembly:

gcc -Wall -S -o test.s test.c

This produces the assembly source (32-bit x86, AT&T syntax):

    .file   "test.c"
    .text
.globl add
    .type   add, @function
add:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $16, %esp
    movl    12(%ebp), %eax
    movl    8(%ebp), %edx
    leal    (%edx,%eax), %eax
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    ret
    .size   add, .-add
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    subl    $24, %esp
    movl    $1, -12(%ebp)
    movl    $2, -8(%ebp)
    movl    -8(%ebp), %eax
    movl    %eax, 4(%esp)
    movl    -12(%ebp), %eax
    movl    %eax, (%esp)
    call    add
    movl    %eax, -4(%ebp)
    movl    -4(%ebp), %eax
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-18)"
    .section    .note.GNU-stack,"",@progbits

You can see that this version of GCC allocates more stack space than the local variables need. main allocates 24 bytes although 12 are enough for its three int variables, and add allocates 16 bytes although 4 are enough. The extra bytes are padding added by the compiler, most likely to keep the stack aligned. I edited the assembly source to allocate only the minimum space and found that it works just as well.
The corresponding stack frame diagram is as follows:


Stack frame structure before and after call of add function
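
One detail is worth checking with a little arithmetic: main also writes its outgoing arguments to (%esp) and 4(%esp), so why does a 12-byte frame still work? The sketch below assumes the trimmed main uses subl $12, %esp (the exact value used in the experiment is not given above) and relies on the fact that, after the prologue, ESP = EBP - N. The function names are made up for illustration.

```c
#include <assert.h>

/* After "pushl %ebp; movl %esp,%ebp; subl $N,%esp" we have ESP = EBP - N,
   so the slot k(%esp) is the same memory as (k - N)(%ebp). */
int esp_slot_as_ebp_offset(int frame_bytes, int esp_offset)
{
    assert(frame_bytes >= 0 && esp_offset >= 0);
    return esp_offset - frame_bytes;
}

/* Returns 1 if either outgoing argument slot, (%esp) or 4(%esp), lands on
   one of main's locals at -12(%ebp), -8(%ebp) or -4(%ebp); 0 otherwise. */
int args_overlap_locals(int frame_bytes)
{
    int k;
    for (k = 0; k <= 4; k += 4) {
        int off = esp_slot_as_ebp_offset(frame_bytes, k);
        if (off >= -12 && off <= -4)
            return 1;
    }
    return 0;
}
```

With N = 24 the argument slots are -24(%ebp) and -20(%ebp), clear of the locals. With N = 12 they become -12(%ebp) and -8(%ebp), which are exactly a and b. The stores then copy a and b onto themselves, so add still receives 1 and 2. Under this model, 20 bytes is the smallest frame that keeps the arguments and the locals apart.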

Three instructions in this listing are particularly critical:

call: pushes the return address (the address of the instruction after the call) onto the stack and jumps to the called function
ret: pops the address at the top of the stack and jumps to it
leave: equivalent to
  movl %ebp, %esp
  popl %ebp
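
To make these three instructions concrete, here is a toy model in C that replays the call of add(1, 2) on an array used as a stack. It is a sketch, not an emulator: the stack is word-addressed, the Cpu struct and the do_* helpers are invented for this example, and the two "addresses" are arbitrary numbers.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy model: the stack is an array of 32-bit words that grows toward
   index 0. esp and ebp are word indices, not byte addresses. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;  /* index of the word on top of the stack */
    uint32_t ebp;  /* index of the current frame's saved-EBP slot */
    uint32_t eip;  /* pretend instruction pointer */
    uint32_t eax;  /* return value register */
} Cpu;

void push(Cpu *c, uint32_t v) { assert(c->esp > 0); c->mem[--c->esp] = v; }
uint32_t pop(Cpu *c) { assert(c->esp < STACK_WORDS); return c->mem[c->esp++]; }

/* call: push the return address, then jump to the target. */
void do_call(Cpu *c, uint32_t target, uint32_t return_addr)
{
    push(c, return_addr);
    c->eip = target;
}

/* ret: pop the return address and jump to it. */
void do_ret(Cpu *c) { c->eip = pop(c); }

/* Prologue: pushl %ebp; movl %esp, %ebp; subl $(4 * local_words), %esp */
void do_prologue(Cpu *c, uint32_t local_words)
{
    push(c, c->ebp);
    c->ebp = c->esp;
    assert(c->esp >= local_words);
    c->esp -= local_words;
}

/* leave: movl %ebp, %esp; popl %ebp */
void do_leave(Cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}

enum { ADDR_ADD = 100, ADDR_AFTER_CALL = 42 };  /* made-up addresses */

/* Replays main calling add(1, 2). The listing stores the arguments with
   movl instead of push; the resulting layout is the same in this model. */
uint32_t simulate_add_call(Cpu *c)
{
    c->esp = STACK_WORDS; c->ebp = STACK_WORDS; c->eip = 0; c->eax = 0;

    push(c, 2);                             /* second argument, b */
    push(c, 1);                             /* first argument, a */
    do_call(c, ADDR_ADD, ADDR_AFTER_CALL);

    do_prologue(c, 4);                      /* subl $16, %esp is 4 words */
    c->eax = c->mem[c->ebp + 3];            /* movl 12(%ebp), %eax */
    c->eax += c->mem[c->ebp + 2];           /* add the value at 8(%ebp) */
    c->mem[c->ebp - 1] = c->eax;            /* movl %eax, -4(%ebp) */
    c->eax = c->mem[c->ebp - 1];            /* movl -4(%ebp), %eax */
    do_leave(c);
    do_ret(c);
    return c->eax;
}
```

After simulate_add_call returns, eip holds the made-up return address, ebp is restored to the caller's value, and esp points at the two arguments again, which is where the caller left it before the call.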

2. Assembly Calls C Functions

Once the C function call mechanism is understood, the next step is simple: we split the example above into an assembly file and a C file.
First, extract main from the assembly source into 1.s:

    .text
.globl main
    .type   main, @function
main:
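    pushl   %ebp                # save the caller's frame pointer
    movl    %esp, %ebp          # start main's stack frame
    subl    $24, %esp           # reserve space for locals and outgoing arguments
    movl    $1, -12(%ebp)       # a = 1
    movl    $2, -8(%ebp)        # b = 2
    movl    -8(%ebp), %eax
    movl    %eax, 4(%esp)       # second argument: b
    movl    -12(%ebp), %eax
    movl    %eax, (%esp)        # first argument: a
    call    add                 # add is defined in 2.c and resolved by the linker
    movl    %eax, -4(%ebp)      # c = return value of add (in EAX)
    movl    -4(%ebp), %eax      # return c
    leave                       # tear down main's stack frame
    ret                         # return to the caller of main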
    .size   main, .-main
    .ident  "GCC: (GNU) 4.4.7 20120313 (Red Hat 4.4.7-18)"
    .section    .note.GNU-stack,"",@progbits

Then put the corresponding C function in 2.c:

int add(int a, int b)
{
    int c;
    c=a+b;
    return c;
}

Next, compile and link both files with gcc (1.s is 32-bit code, so on a 64-bit system the -m32 option is needed as well):

gcc -o test 1.s 2.c

Running ./test and then checking its exit status with echo $? should give 3, the value returned by main.
This shows how the Linux boot assembly can call the main function, and it gives us a starting point.
Of course, you also need to understand x86 protected mode, for which I recommend Li Zhong's book "x86 Assembly Language: From Real Mode to Protected Mode".
Writing the code was quick, but drawing the diagrams took time, so I hope you like it.


Posted on Sat, 14 Mar 2020 09:36:01 -0700 by mimilaw123