How does the Linux kernel "listen" to the C library?
I'm trying to build a "big picture" of how things work in the Linux kernel and userspace, and I'm quite confused. I know that userspace makes use of system calls to "talk" to the kernel, but I don't know how. I tried to read the C library and kernel source code, but they are complex and not easy to understand. I've also read several books on the conceptual side of operating systems, such as managing processes, memory, and devices, but they don't make the "transition" (userspace -> kernel) clear. So, how does the transition between userspace and kernel space happen? How does the C library run code that's inside the Linux kernel running on the machine?
To make an analogy: imagine there is a house. The house is locked, and the key to open it is inside the house itself. There is only one person inside the house: the kernel. Userspace is someone trying to enter the house. The question would be: how does the kernel know there is someone outside the house wanting the key, and what mechanism allows the house to be opened with that key?
That's quite easy: the person outside can use a doorbell to let the kernel know they are waiting. In our case the doorbell is a special CPU exception, a software interrupt, or a dedicated instruction that userspace is allowed to use and that the kernel can handle.
So the procedure looks like this:
- First, you need to know the system call number. Each syscall has its own unique number, and there is a table inside the kernel that maps those numbers to specific functions. Each architecture has a different table, so the same number on two different architectures may map to a different syscall.
- Then you set up the arguments. This is also architecture specific, but not very different from passing arguments in ordinary function calls. Usually, you put the arguments in specific CPU registers. Which registers to use is described in the ABI of the architecture.
- Then you enter the syscall. Depending on the architecture, this may mean causing an exception or using a dedicated CPU instruction.
- The kernel has a special handler function that runs in kernel mode when a syscall is made. It pauses the process's execution and stores the information specific to that process (this is often loosely called a context switch, although strictly speaking it is a switch from user mode to kernel mode within the same process), reads the syscall number and arguments, and calls the proper syscall routine. It also makes sure to put the return value in the proper place for userspace to read, and lets the process run again once the syscall routine is done (restoring its saved state).
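The steps above can be tried from C without writing any assembly. This is a minimal sketch, assuming Linux with glibc or musl, where the generic syscall() wrapper and the SYS_* constants from <sys/syscall.h> are available. The helper names raw_getpid and raw_write are made up for illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid, SYS_write: this architecture's syscall numbers */
#include <sys/types.h>     /* pid_t, ssize_t */
#include <unistd.h>        /* syscall() */

/* Step 1: pick the number (SYS_getpid). There are no arguments to set up.
 * Step 3: syscall() performs the architecture-specific entry for us. */
static pid_t raw_getpid(void)
{
    return (pid_t)syscall(SYS_getpid);
}

/* Step 2: the three arguments are handed to syscall(), which places them
 * in whatever registers the ABI of this architecture requires.
 * Returns the number of bytes written, or -1 with errno set. */
static ssize_t raw_write(int fd, const void *buf, size_t len)
{
    return (ssize_t)syscall(SYS_write, fd, buf, len);
}
```

Calling raw_getpid() should give the same value as the ordinary getpid() wrapper, and raw_write(1, "hi\n", 3) should print to standard output just as write() would. The ordinary C library wrappers do essentially this, plus some error handling.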
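As an example, to let the kernel know you want to make a syscall on x86_64, you can use the syscall instruction with the syscall number in the %rax register (sysenter is a related fast-entry instruction used on 32-bit x86). The arguments are passed in the registers %rdi, %rsi, %rdx, %r10, %r8, and %r9. Note that the fourth argument goes in %r10 rather than %rcx, where it would go in an ordinary function call, because the syscall instruction itself overwrites %rcx. The return value comes back in %rax.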
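To make the x86_64 convention concrete, here is a hedged sketch using GCC/Clang inline assembly. The function name raw_syscall3 is hypothetical, and it covers only syscalls with up to three arguments. On any other architecture the sketch falls back to the C library's syscall() wrapper so that it still compiles.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid, SYS_write */
#include <unistd.h>        /* syscall(), used only in the fallback path */

/* Invoke syscall number nr with up to three arguments. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                 /* return value comes back in %rax */
        : "a"(nr),                  /* syscall number in %rax */
          "D"(a1),                  /* 1st argument in %rdi */
          "S"(a2),                  /* 2nd argument in %rsi */
          "d"(a3)                   /* 3rd argument in %rdx */
        : "rcx", "r11", "memory");  /* the instruction clobbers %rcx and %r11 */
    return ret;                     /* on failure the kernel returns -errno */
#else
    /* Not x86_64 Linux: let libc do the entry (errors are -1 plus errno). */
    return syscall(nr, a1, a2, a3);
#endif
}
```

For instance, raw_syscall3(SYS_write, 1, (long)"hi\n", 3) asks the kernel to write three bytes to standard output. Unlike the libc wrapper, the raw instruction reports failure as a negative error number in the return value instead of setting errno.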
There is also the older way that was used on 32-bit x86 CPUs: software interrupt number 0x80 (the int 0x80 instruction). Again, the syscall number is specified in a register, this time %eax, and the arguments go (again, if I'm not mistaken) in %ebx, %ecx, %edx, %esi, %edi, and %ebp.
ARM is similar: you use the "supervisor call" instruction (svc #0). The syscall number goes in the r7 register, the arguments go in registers r0-r6, and the return value of the syscall is stored in r0.
Other architectures and operating systems use similar techniques. The details vary: software interrupt numbers may be different, and arguments may be passed in different registers or even on the stack, but the core idea is the same.
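Whatever the entry mechanism, the kernel-side step is a lookup of the syscall number in a table of functions. The following is a toy userspace model of that idea only, not actual kernel code. Every name in it (toy_regs, toy_table, toy_syscall_entry, and the two fake syscalls) is invented for illustration.

```c
#include <stddef.h>

/* Hypothetical snapshot of the registers saved when the kernel is entered. */
struct toy_regs {
    long nr;        /* syscall number placed by userspace */
    long args[6];   /* up to six arguments */
    long ret;       /* where the handler leaves the return value */
};

typedef long (*toy_syscall_fn)(const long args[6]);

/* Two made-up "syscalls". */
static long toy_sys_add(const long args[6]) { return args[0] + args[1]; }
static long toy_sys_neg(const long args[6]) { return -args[0]; }

/* The table that maps numbers to functions. */
static const toy_syscall_fn toy_table[] = {
    toy_sys_add,    /* number 0 */
    toy_sys_neg,    /* number 1 */
};

#define TOY_NR_SYSCALLS (sizeof toy_table / sizeof toy_table[0])

/* The "handler": validate the number, dispatch, store the result. */
static void toy_syscall_entry(struct toy_regs *regs)
{
    if (regs->nr < 0 || (size_t)regs->nr >= TOY_NR_SYSCALLS) {
        regs->ret = -1;             /* unknown syscall number */
        return;
    }
    regs->ret = toy_table[regs->nr](regs->args);
}
```

Filling a struct toy_regs with nr = 0 and arguments 2 and 3, then calling toy_syscall_entry(), leaves 5 in ret. A real kernel does the same kind of lookup, but with the values coming from the CPU registers saved at entry and with the mode switch handled by hardware.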