feature/elf add dynamic segment parser #560


RisinT96

This partly addresses issue #557.

I decided to start slow and added some lower-level support for parsing the dynamic segment of loaded ELFs.

Let me know what you guys think.

@philipc
Contributor

philipc commented Jul 11, 2023

I think it's useful to add something at approximately this level.

Can you add an example that uses everything from this that you need, so that I have some way of testing it?

@RisinT96
Author

RisinT96 commented Jul 11, 2023

Here's a simple example of parsing the dynamic segment from a regular ELF file:

use object::{
    read::elf::{Dynamic, ElfFile64},
    Endianness, ObjectSymbol, ObjectSymbolTable, RelocationTarget,
};

fn main() {
    let file = std::fs::read("<path to some elf>").unwrap();
    let elf = ElfFile64::<Endianness>::parse(file.as_slice()).unwrap();
    let dynamic = Dynamic::new(elf.raw_header(), file.as_slice()).unwrap();

    let symbols = dynamic.symbols().symbols();
    for symbol in symbols {
        let name = symbol.name().unwrap();
    }

    let plt_relocations = dynamic.plt_relocations().unwrap().unwrap();
    for (addr, rel) in plt_relocations {
        if let RelocationTarget::Symbol(index) = rel.target() {
            let name = dynamic
                .symbols()
                .symbol_by_index(index)
                .unwrap()
                .name()
                .unwrap();
        }
    }

    let dyn_relocations = dynamic.dynamic_relocations().unwrap().unwrap();
    for (addr, rel) in dyn_relocations {
        if let RelocationTarget::Symbol(index) = rel.target() {
            let name = dynamic
                .symbols()
                .symbol_by_index(index)
                .unwrap()
                .name()
                .unwrap();
        }
    }
}

And here's some logic for parsing a loaded ELF:

use core::{
    ffi::c_void,
    slice::{self},
};

use nix::libc;
use object::{
    elf::{FileHeader64, ProgramHeader64},
    read::elf::{Dynamic, FileHeader, ProgramHeader},
    Endianness, ObjectSymbol, ObjectSymbolTable, RelocationTarget,
};

fn main() {
    let mut info = libc::Dl_info {
        dli_fname: core::ptr::null(),
        dli_fbase: core::ptr::null_mut(),
        dli_sname: core::ptr::null(),
        dli_saddr: core::ptr::null_mut(),
    };

    let result = unsafe { libc::dladdr(main as *const c_void, &mut info) };

    assert_ne!(0, result);

    let base = info.dli_fbase as usize;

    println!("Found own elf at address: {:p}", info.dli_fbase);

    let header = unsafe {
        slice::from_raw_parts(
            base as *const u8,
            core::mem::size_of::<FileHeader64<Endianness>>(),
        )
    };
    let header = <FileHeader64<Endianness> as FileHeader>::parse(header).unwrap();
    let endian = header.endian().unwrap();

    let page_headers = header.e_phoff(endian) as usize + base;
    let page_headers_num = header.e_phnum(endian);

    println!("The elf contains {page_headers_num} program headers!");

    let page_headers = unsafe {
        slice::from_raw_parts(
            page_headers as *const ProgramHeader64<Endianness>,
            page_headers_num as usize,
        )
    };

    let size = page_headers
        .iter()
        .map(|h| {
            let vaddr = h.p_vaddr(endian) as usize;
            let memsz = h.p_memsz(endian) as usize;
            vaddr + memsz
        })
        .max()
        .unwrap();

    println!("The elf's total size is {size:#x}");

    let data = unsafe { slice::from_raw_parts(base as *const u8, size) };

    let dyns = page_headers
        .iter()
        .find_map(|ph| ph.dynamic_loaded(endian, data).transpose())
        .unwrap()
        .unwrap();

    println!("Found dynamic segment!");

    let dynamic = Dynamic::new_loaded(base, header, data, Some(dyns))
        .map_err(|e| println!("{e}"))
        .unwrap();

    println!("Parsed dynamic segment!");

    let symbol_table = dynamic.symbols();

    let plt_relocations = dynamic.plt_relocations().unwrap().unwrap();
    for (offset, rel) in plt_relocations {
        let addr = offset as usize + base;
        if let RelocationTarget::Symbol(index) = rel.target() {
            let name = symbol_table.symbol_by_index(index).unwrap().name().unwrap();

            println!("\tFound PLT relocation: {:p} {}", addr as *const (), name);

            // On Android, the Rust compiler seems to favor `.rela.plt`.
            if name == "memset" {
                let memset = unsafe { *(addr as *const usize) };
                if memset == libc::memset as usize {
                    println!("\t\tmemset address matches the PLT! (obviously)");
                }
            }
        }
    }

    let dynamic_relocations = dynamic.dynamic_relocations().unwrap().unwrap();
    for (offset, rel) in dynamic_relocations {
        let addr = offset as usize + base;
        if let RelocationTarget::Symbol(index) = rel.target() {
            let name = symbol_table.symbol_by_index(index).unwrap().name().unwrap();

            println!(
                "\tFound dynamic relocation: {:p} {}",
                addr as *const (), name
            );

            // On Linux, the Rust compiler seems to favor `.rela.dyn`.
            if name == "memset" {
                let memset = unsafe { *(addr as *const usize) };
                if memset == libc::memset as usize {
                    println!("\t\tmemset address matches the GOT! (obviously)");
                }
            }
        }
    }
}

There seems to be a bug in parsing the GNU hash table on Android, but the code works if I only parse the regular hash table.

I'll try to figure out the bug if I have time, which I probably won't have for the next two weeks.

@RisinT96
Author

I finally have some free time. I can add some self-contained unit tests to the repository, which should cover all the changes.

The GNU Hash table bug should probably be investigated separately though.

@philipc
Contributor

philipc commented Aug 18, 2023

Thanks for the example usage, but I haven't had time to look at this in detail yet. There will be changes required, but I want to improve my understanding before making suggestions.

@philipc
Contributor

philipc commented Aug 27, 2023

I've finally had time to look into this some more, and I think that this depends too much on both the internals of the dynamic loader and the specifics of particular ELF files:

  • The ELF header and the program headers do not need to be part of a PT_LOAD segment.
  • File offsets do not need to be equal to address offsets relative to the load address.
  • The dynamic loader is free to modify the data that it uses for loading.
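
To make the second point concrete, here is a minimal sketch of translating between virtual addresses and file offsets using the PT_LOAD program headers. The `LoadSegment` type and both function names are hypothetical helpers written for this illustration and are not part of the `object` crate; the field names simply mirror the ELF program header fields.

```rust
/// One PT_LOAD program header, reduced to the fields needed here.
/// Hypothetical helper type for illustration; not part of the `object` crate.
#[derive(Debug, Clone, Copy)]
pub struct LoadSegment {
    pub p_offset: u64, // offset of the segment's data in the file
    pub p_vaddr: u64,  // virtual address of the segment (before adding the load base)
    pub p_filesz: u64, // number of bytes backed by the file
    pub p_memsz: u64,  // number of bytes occupied in memory (may exceed p_filesz)
}

/// Translate a virtual address to a file offset.
/// Returns `None` if no segment's file-backed range contains the address.
pub fn vaddr_to_file_offset(segments: &[LoadSegment], vaddr: u64) -> Option<u64> {
    segments.iter().find_map(|s| {
        let delta = vaddr.checked_sub(s.p_vaddr)?;
        if delta < s.p_filesz {
            s.p_offset.checked_add(delta)
        } else {
            None
        }
    })
}

/// Translate a file offset to a virtual address.
/// Returns `None` if no PT_LOAD segment maps that part of the file.
pub fn file_offset_to_vaddr(segments: &[LoadSegment], offset: u64) -> Option<u64> {
    segments.iter().find_map(|s| {
        let delta = offset.checked_sub(s.p_offset)?;
        if delta < s.p_filesz {
            s.p_vaddr.checked_add(delta)
        } else {
            None
        }
    })
}
```

With a segment whose `p_offset` is 0x1000 and whose `p_vaddr` is 0x2000, file offset 0x1010 corresponds to address 0x2010, so treating `base + file_offset` as an address would read the wrong bytes. Both functions return `None` for ranges that no PT_LOAD segment covers, which is the case the first point describes for headers that are not mapped.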

Examples of the problems these cause are the GNU hash table bug and the differences between Linux and Android.

This goes back to how I said that I didn't think this sort of thing was possible for ELF, and nothing I've learnt has changed my view of that.

As a result, anything that parses the contents of a loaded PT_DYNAMIC segment (and the things it points to, such as hash tables) is inherently system-specific and unreliable. It may be possible to implement a solution that works in a limited set of circumstances and is good enough for your needs, but I don't want to maintain such a solution in this crate.
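
For readers unfamiliar with what the dynamic segment contains, the following is a rough sketch of decoding its entries. It assumes a 64-bit little-endian ELF and treats `d_tag` as unsigned for simplicity; `DynEntry` and `parse_dynamic_le` are hypothetical names for this illustration, not the API proposed in this PR. The tag constants are the standard ELF values.

```rust
// Standard ELF dynamic tags (values from the ELF specification and GNU extensions).
pub const DT_NULL: u64 = 0;
pub const DT_HASH: u64 = 4;
pub const DT_STRTAB: u64 = 5;
pub const DT_SYMTAB: u64 = 6;
pub const DT_GNU_HASH: u64 = 0x6fff_fef5;

/// One 64-bit dynamic entry: a tag plus a value or address.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DynEntry {
    pub d_tag: u64,
    pub d_val: u64,
}

/// Read a little-endian u64 from an 8-byte slice.
fn read_u64_le(bytes: &[u8]) -> u64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    u64::from_le_bytes(buf)
}

/// Decode 16-byte dynamic entries until DT_NULL or the data runs out.
/// Any trailing partial entry is ignored.
pub fn parse_dynamic_le(data: &[u8]) -> Vec<DynEntry> {
    let mut entries = Vec::new();
    for chunk in data.chunks_exact(16) {
        let d_tag = read_u64_le(&chunk[0..8]);
        let d_val = read_u64_le(&chunk[8..16]);
        if d_tag == DT_NULL {
            break;
        }
        entries.push(DynEntry { d_tag, d_val });
    }
    entries
}
```

Decoding the entries is the easy part. For address-valued tags such as `DT_SYMTAB` or `DT_GNU_HASH`, how to interpret `d_val` in a loaded image depends on whether the loader has modified it, which is exactly the system-specific behaviour described above.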

@RisinT96
Author

Hi,

All valid points.

Perhaps it is indeed better if I take this to a separate crate.

I'll work on it when I have the time. Some changes to this crate may still be necessary (for example, to better expose some lower-level APIs), but I'll try to keep those to a minimum.

If such changes are necessary, I'll open a separate PR.

Thank you for your time :)

@RisinT96 RisinT96 closed this Oct 26, 2023