- Using ABI - dependent on system or kernel ABI
- imported ABI contamination - place where a library's ABI is incorporated in another library or executable
-
programming language
- names
- allowed names & character encoding determined by language
- scope determines accessibility of name
- global
- class
- function/method
- block
- may be nested
- classes
- namespaces
- visibility
- within compilation unit only
- visible to other compilation units via declarations
- types
- can be named
- data
- fundamental types
- integers
- encoding
- signed/unsigned/other
- size
- floating point
- encoding
- size
- derived types
- processor specific vector types
- arrays
- class/struct/record
- collection of named data objects & methods
- derived class
- extends another class with data and methods
- polymorphic (virtual) class
- derived class
- virtual methods determined at runtime
- union
- collection of named data objects & methods
- all members share the same memory
- pointers
- enums
- stored as some int type
- has enumerators that map names to constants
- bit fields
- complex number
- type qualifiers
- const
- volatile
- atomic
- thread local
- function/method
- return type
- optional type qualifiers
- methods have an implicit parameter pointing to the object
- parameters
- typed parameters
- optional variadic parameters
- type and number may be fixed (e.g. open)
- type and number determined at runtime (e.g. printf)
- language runtime support library
- dynamic memory management
- malloc, std::allocator
- runtime type information
- exceptions
- atexit functionality
- object
- has a type and size
- located in memory or register(s)
- represents a
- named variable
- temporary object
- parameter
- dynamically allocated
- function
- has a type
- statements define the function's data transformations and calls to other functions
- declaration
- maps a name to a type
- definition
- also a declaration
- of a type
- of a variable
- is a named object
- of a function
- macros
- has a name
- has substitution text
- optionally parameterized
- name in source code is textually replaced by substitution text before compilation
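A few of the derived types listed above (enum, union, bit field) can be sketched in C++ as follows. This is only an illustration: the names `Color`, `Word`, `Flags`, `shared_bytes` and `max_mode` are made up for the example, and the exact packing of bit fields is left to the ABI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Enum stored as an explicitly chosen integer type; each enumerator
// maps a name to a constant.
enum class Color : std::uint8_t { Red = 1, Green = 2, Blue = 3 };

// Union: all members start at the same address and share storage.
union Word {
    std::uint32_t as_int;
    unsigned char as_bytes[4];
};

// Bit fields: small integers packed into a shared storage unit.
struct Flags {
    unsigned ready : 1;
    unsigned mode  : 3;
};

// Number of bytes occupied by the union (not the sum of its members).
inline std::size_t shared_bytes() { return sizeof(Word); }

// Largest value that fits in the 3-bit 'mode' field.
inline unsigned max_mode() {
    Flags f{};
    f.mode = 7;            // binary 111 fits in 3 bits
    assert(f.mode == 7);
    return f.mode;
}
```

The union occupies four bytes rather than eight because its members overlap, and the enum occupies one byte because its underlying type was fixed to `std::uint8_t`.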
-
source code
- written in a programming language
- textual description of a program
- names defined
- types
- functions
- variables
- macros
- names declared
- types
- functions
- variables
- macros
-
architecture/processor
- the processor that the software will run on
- has an instruction set architecture (ISA) that is the encoding and semantics of instructions
- may have levels where new instructions are added or removed
- directs mapping of types and properties
- based on efficient hardware support for size and encoding
- alignment requirements based on hardware requirements and efficiency
- use of emulation software if not available
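The effect of architecture-driven size and alignment rules can be observed from C++ with `sizeof`, `alignof` and `offsetof`. The sketch below uses a made-up struct `Sample` and hypothetical helper names; the concrete numbers depend on the target, so the code only computes them rather than assuming them.

```cpp
#include <cassert>
#include <cstddef>

struct Sample {
    char   tag;    // 1 byte
    double value;  // usually needs stricter alignment than char
    char   flag;   // 1 byte
};

// Offset at which the compiler placed 'value' inside Sample.
inline std::size_t value_offset() { return offsetof(Sample, value); }

// Bytes added as padding so that members and array elements stay aligned.
inline std::size_t padding_bytes() {
    const std::size_t payload = sizeof(char) + sizeof(double) + sizeof(char);
    assert(sizeof(Sample) >= payload);  // padding can only add bytes
    return sizeof(Sample) - payload;
}
```

On x86-64 under the System V ABI, for example, `value` lands at offset 8 and `sizeof(Sample)` is 24, so 14 of the 24 bytes are padding. Other targets may choose differently, which is exactly why the mapping belongs to the ABI.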
-
API
- software interface that allows a program to use libraries
- declarations present in a library for use by external code
- functions
- types
- data objects
- public API is a stable interface to a library's functions and data, allowing the implementation to be enhanced and bugs to be fixed
- no modification required to user code if
- types in API are changed to compatible types
- additional functions and objects can be added
- semantics of functions and data described in documentation
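One common way to keep a public API stable while the implementation changes is an opaque handle. The following is a minimal sketch, assuming a hypothetical counter library; `Counter` and the `counter_*` functions are invented for illustration and both halves are shown in one file only for brevity.

```cpp
#include <cassert>

// ---- public API (would live in a hypothetical counter.h) ----
struct Counter;                           // opaque: layout is not part of the API
Counter* counter_create();
void     counter_add(Counter* c, int n);
int      counter_value(const Counter* c);
void     counter_destroy(Counter* c);

// ---- implementation (would live inside the library) ----
struct Counter { long total; };           // free to change between releases

Counter* counter_create() { return new Counter{0}; }

void counter_add(Counter* c, int n) {
    assert(c != nullptr);
    c->total += n;
}

int counter_value(const Counter* c) {
    assert(c != nullptr);
    return static_cast<int>(c->total);
}

void counter_destroy(Counter* c) { delete c; }
```

Because user code only ever holds a `Counter*`, the library can add fields to `Counter` or add new functions without requiring changes to user code.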
-
ABI
- conventions that allow for the interoperability and compatibility of binary code objects to form an executable that can be run on an OS and architecture.
- allows interoperability between
- compiler vendors and versions
- kernel versions
- library implementations and versions
- categories
- kernel/OS ABI
- interface for making OS system calls from a user space process
- typically different from the function call ABI in the system ABI
- user code will rarely directly use this ABI, instead accessing system calls via a library such as libc
- for each OS syscall
- syscall number
- semantics of functionality
- number of parameters & return value
- each has a data type
- may be input or output
- parameter and return type semantics
- syscall ABI calling convention
- how privileged call to OS is made
- how syscall number is passed
- how parameters are passed
- how results are returned
- modification to process state the syscall is allowed to make
- stack
- registers
- flags
- system ABI
- conventions to call functions, use libraries, begin/end execution
- library/executable format
- mapping of programming language types to architecture types
- function call interface
- how parameters are passed
- registers
- stack (memory)
- how results are returned
- requirements of the caller
- stack alignment
- flags/mode requirements
- requirements of the callee
- flag/mode requirements
- register/memory modifications
- modification allowed by the callee to CPU state & stack
- mechanism to transfer control to function
- stack usage
- find address of function's entry point
- direct call if in same object and not exported
- if to external object or exported
- GOT, PLT
- register usage
- loading
- how kernel loads executable into memory to begin execution
- initial state setup
- optionally loads loader
- transfers control to object or loader to complete initialization
- initial process state
- CPU flags
- registers
- meanings
- stack
- meaning of data
- data from user space and OS
- argv array
- environment variables
- auxiliary vector
- rules for signal handlers
- mapping of hardware exceptions to signals
- threading
- thread local data
- exception handling
- throwing, catching
- stack unwinding
- CPU, stack, memory conventions
- special registers
- GOT
- frame/base pointer
- linkage
- stack
- red zone
- signal
- library ABI
- created by linking object files, compiled from source code, into a library
- manifestation of library API as realized in the object code using the system ABI
- exported typed functions
- exported typed data
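The point that user code normally reaches the kernel/OS ABI through a library such as libc can be sketched as below. This assumes Linux with glibc or a compatible libc; `pid_via_libc` and `pid_via_syscall_number` are hypothetical helper names. Note that `syscall()` is itself a libc function that places the syscall number and arguments where the kernel ABI expects them.

```cpp
#include <cassert>
#include <sys/syscall.h>  // SYS_getpid (Linux-specific)
#include <unistd.h>       // getpid(), syscall()

// Usual route: call the libc wrapper, which uses the function call ABI.
inline long pid_via_libc() {
    const long pid = static_cast<long>(getpid());
    assert(pid > 0);
    return pid;
}

// Lower-level route: name the syscall by number and let libc's syscall()
// perform the privileged call according to the kernel ABI.
inline long pid_via_syscall_number() {
    return syscall(SYS_getpid);
}
```

Both functions should report the same process ID; only the layer at which the syscall number appears differs.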
-
compilation
- transforms source file(s) to object file
- generally source files are compiled separately (a compilation unit)
- generates code
- use target processor's ISA
- using ABI conventions for external function calls/definitions
- imported ABI contamination if the symbol used is from another library
- accessing data using ABI conventions for the type
- imported ABI contamination if type is from another library
- functions internal to this object file can be optimized by not following system ABI
- for imported-inlined functions
- imported ABI contamination if from another library
- non-static local variable
- generated by code for the function at runtime
- imported ABI contamination if type from another library
- inlined runtime support library functions
- imported ABI contamination from language runtime
- generates data
- using ABI conventions for
- types
- size
- layout
- alignment
- imported ABI contamination if definition from another library
- global variable data
- visibility: outside object
- per executable or per thread
- static function data
- visibility: local to this object
- per executable or per thread
- maps source code names to object file symbols using ABI
- maps source code types to code using ABI
- imported ABI contamination if from another library
- linkage information
- external symbols referenced
- how to patch object to allow symbol to be used
- synthesizes function/code needed at runtime
- implicitly generated functions such as constructors/destructors
- template instantiation
- exception handlers
- initialization/finalization
- multiple versions of functions or blocks
- parallelization
- architecture feature enabled
- resolver functions to select "best" implementation at runtime
- partial inlining
- synthesizes data structures needed at runtime
- RTTI
- exceptions
- debug data if needed
- data to support link time optimization if needed
- macros
- textual replacement of a word in the source file
- happens before processing by compiler
- possibly parameterized
- value can be conditionally determined at compile time
- names are not present in object file, just artifacts of expansion
- imported ABI contamination if macro from another library is expanded
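Imported ABI contamination through macros, types and inlined functions can be illustrated in a single file. This is a sketch only: `libfoo.h`, `LIBFOO_BUFFER_SIZE`, `FooHeader` and the function names are hypothetical, and a real case would split the two halves between a library header and separately compiled user code.

```cpp
#include <cassert>
#include <cstddef>

// ---- would come from a hypothetical library header, libfoo.h ----
#define LIBFOO_BUFFER_SIZE 64            // macro: expanded textually
struct FooHeader { int version; int length; };
inline std::size_t foo_header_bytes() { return sizeof(FooHeader); }

// ---- user code compiled against that header ----
// The constant 64 is baked into this function; the macro name itself
// does not survive into the object file.
inline std::size_t user_buffer_bytes() {
    char buffer[LIBFOO_BUFFER_SIZE];
    return sizeof(buffer);
}

// The offset of 'length' within FooHeader is baked into this function.
inline int user_reads_length(const FooHeader* h) {
    assert(h != nullptr);
    return h->length;
}
```

If a later release of the library changed the macro value or the struct layout, user code built against the old header would keep the old constant and offset until it is recompiled.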
-
object file
- compiled source code
- linking information
- symbols defined
- have coarse type: function or object (data)
- have scope: local or global (exported)
- may be versioned
- symbols required from other objects/libraries
- where in memory to patch in the symbol's address
- no need for types other than symbol names
- debug information
- link time optimization (LTO) data
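How source-level linkage turns into local and global symbols can be sketched with a small translation unit. The names `helper`, `scale`, `scale_c` and `scale_calls` are invented for the example. On toolchains that follow the Itanium C++ ABI (GCC and Clang on Linux, for instance), `scale(int)` is expected to appear as the mangled symbol `_Z5scalei`, while the `extern "C"` function keeps its plain name.

```cpp
#include <cassert>

// Global data object: a defined symbol of the "object" kind.
int scale_calls = 0;

// Internal linkage: a local symbol, or no symbol at all if inlined.
static int helper(int x) { return x * 2; }

// External linkage, C++ naming: the symbol name encodes the parameter types.
int scale(int x) {
    ++scale_calls;
    return helper(x) + 1;
}

// External linkage, C naming: the symbol is the plain name "scale_c".
extern "C" int scale_c(int x) {
    const int result = scale(x);
    assert(scale_calls > 0);  // scale() has run at least once by now
    return result;
}
```

Compiling this to an object file and listing it with a tool such as `nm` shows only names, coarse kinds and scope; the parameter and return types of `scale_c` are not recorded in the symbol table.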
-
linker
- creates executable or library from object files and external libraries
- relocates object references to function and data
- combines from each object file
- code
- data
- debug information
- LTO data
- external symbols
- determines list of symbols to export
- symbols can be versioned based on object file symbols and map file
- external versioned symbols based on external library passed to linker
-
executable
- file that the OS kernel can load into memory to form a process
- may also be a library (rare)
- contains
- required symbols
- external libraries, identified by SONAME, to load to resolve needed symbols
-
library
- dynamic library or executable
- single object
- has an SONAME (library name and ABI version)
- static library
- collection of object files
- contains
- list of exported symbols
- required symbols
- external libraries, identified by SONAME, to load to resolve needed symbols
-
loader
- used if dynamic linking is required
- loads required libraries recursively based on needed SONAMEs
- initializes dynamic linking data structures
- patches data using resolved symbols and linkage data
- computes runtime determined symbol values
- transfers control to executable's entry point
-
problems
- if ABI doesn't cover feature of programming language or compiler
- if ABI doesn't cover architecture feature
- if tool chain does not follow system or kernel ABI correctly
-
issues
- information to validate library ABI is in neither the libraries nor the executable, as it is discarded by the compiler and linker
- needed symbols do not include the SONAME of the library that they should be provided by; symbols may be inadvertently provided by the incorrect library
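The first issue above can be made concrete with a layout change that leaves every function name untouched. In this sketch the namespaces `v1` and `v2` stand in for two releases of a hypothetical library; in a real library both releases would export the same symbol for `get_y`, so nothing in the binaries would reveal the mismatch.

```cpp
#include <cassert>
#include <cstddef>

namespace v1 {                        // layout the caller was compiled against
struct Point { int x; int y; };
inline int get_y(const Point* p) { assert(p != nullptr); return p->y; }
}

namespace v2 {                        // layout after a field was inserted
struct Point { int x; int z; int y; };
inline int get_y(const Point* p) { assert(p != nullptr); return p->y; }
}

// Offsets of 'y' that each release bakes into its compiled code.
inline std::size_t y_offset_v1() { return offsetof(v1::Point, y); }
inline std::size_t y_offset_v2() { return offsetof(v2::Point, y); }
```

A caller built against the first layout and linked with the second would read the wrong field, yet symbol resolution would succeed because only names are checked.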