Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement ASLR #707

Open
jounathaen opened this issue Jun 12, 2024 · 7 comments
Open

Implement ASLR #707

jounathaen opened this issue Jun 12, 2024 · 7 comments

Comments

@jounathaen
Copy link
Member

As Hermit is relocatable, we should load the Kernel to a random address in the virtual address space. The starting point for this is here:

uhyve/src/vm.rs

Lines 182 to 200 in 9ba3c9d

// TODO: should be a random start address, if we have a relocatable executable
let kernel_start_address = object.start_addr().unwrap_or(0x400000) as usize;
let kernel_end_address = kernel_start_address + object.mem_size();
self.offset = kernel_start_address as u64;
if kernel_end_address > self.mem.memory_size - self.mem.guest_address.as_u64() as usize {
return Err(LoadKernelError::InsufficientMemory);
}
let LoadedKernel {
load_info,
entry_point,
} = object.load_kernel(
// Safety: Slice only lives during this fn call, so no aliasing happens
&mut unsafe { self.mem.as_slice_uninit_mut() }
[kernel_start_address..kernel_end_address],
kernel_start_address as u64,
);
self.entry_point = entry_point;

@n0toose
Copy link
Member

n0toose commented Jun 12, 2024

Duplicate of #323?

@n0toose
Copy link
Member

n0toose commented Jun 14, 2024

I came across a problem when trying to find the lowest possible barrier that applications can be loaded from. I spent the last hour investigating, and plan to bring out gdb and Valgrind after trying to get a better picture of where the problem exactly happens.

This is a long text, mostly for myself as I like "working in public" and, if I don't get any further, help any future contributors to this project (or perhaps myself).


So, we have memory, and we are looking for some place to load the kernel onto. There are currently three limitations:

  • We need enough space for the boot stack, as defined by KERNEL_STACK_SIZE.
    • So, the theoretical bare minimum for kernel_start_address should be 0x000000 + KERNEL_STACK_SIZE.
    • EDIT: Theoretical as in, "it should theoretically work although starting from the zero page is not a good idea, but it should work!". Using this to test things first might be a good idea to prevent problems later.
  • We have to make sure that there is enough space in between, also, that the space between the kernel_start_address and the kernel_end_address is appropriate.
    • Something like this:

      uhyve/src/vm.rs

      Line 187 in 9ba3c9d

      if kernel_end_address > self.mem.memory_size - self.mem.guest_address.as_u64() as usize {
  • Uhyve fails on odd memory sizes. #265

The size is assumed to be 0x8000:

pub const KERNEL_STACK_SIZE: u64 = 32_768;

uhyve/src/vm.rs

Lines 227 to 228 in 9ba3c9d

self.stack_address = (kernel_start_address as u64)
.checked_sub(KERNEL_STACK_SIZE)

I'd imagine that to be a bit large on its own, but nevertheless... let's set the start address to 0x000000 + KERNEL_STACK_SIZE. Uhyve fails silently when running cargo run -- -v data/x86_64/rusty_demo.

So, this is either an alignment problem or, perhaps, as the KERNEL_STACK_SIZE value is hard-coded, we need more space? It's probably the space, perhaps the hard-coded size is not enough, so we could try guessing things until things work.

I tried a few different values for 0x000000 + KERNEL_STACK_SIZE (as I was curious whether something else would break if I changed the KERNEL_STACK_SIZE but had enough space - the final bit is important for the paging):

KERNEL_STACK_SIZE with rusty_demo:

  • 0x100000: Works
  • 0x101110 and 0x101120: Works
  • 0x91120: Works
  • Decrementing further, 0x11120: Fails without any output.
  • 0x21120, 0x20000, ..., 0x13000, : Works
  • 0x12000: Fails without output.
  • 0x13000: Works

Again:

  • 0x12000 is the point where there is no output whatsoever.
  • Everything like 0x12100, 0x12200, 0x12900 returns:
   Compiling uhyve v0.3.0 (/home/user/ACS/uhyve)
    Finished dev [unoptimized + debuginfo] target(s) in 1.38s
     Running `target/debug/uhyve -v data/x86_64/rusty_demo`
[0][INFO] Welcome to Hermit 0.6.9
[0][INFO] Kernel starts at 12100
[0][INFO] BSS starts at 0xeadf0
[0][INFO] tls_info = Some(
    TlsInfo {
        start: 0xdde20,
        filesz: 0x30,
        memsz: 0xc9,
        align: 0x10,
    },
)
[0][PANIC] panicked at src/arch/x86_64/mm/paging.rs:319:58:
called `Result::unwrap()` on an `Err` value: ParentEntryHugePage
Number of interrupts
[0][INFO] shutting down with code 1

... which is fair. But worth a try anyway.

The lowest address from which Uhyve (rusty_demo) starts to work, starting from 0x000000 is 0x13000 (*). The assumed value of KERNEL_STACK_SIZE is 0x8000. So there is an additional "gap" of 0x13000 - 0x8000 = 0x5000. "OK", I said, "The boot stack takes up more space than expected, should I just bump the number to 0x13000 for now and trust that it can be used as a lower boundary?".

Here's the fun part! hello_c, which is presumably more lightweight, works if we set the start address (again, KERNEL_STACK_SIZE) from 0x12000 (gap: 0x12000 - 0x8000 = 0x4000) too! hello_world doesn't work at 0x12000.

Is something wrong with start_addr()? Why does the gap left of the defined kernel_start_address "change" depending on the application being run?

To be continued?

@mkroening
Copy link
Member

hello_c, which is presumably more lightweight, works if we set the start address (again, KERNEL_STACK_SIZE) from 0x12000 (gap: 0x12000 - 0x8000 = 0x4000) too!

Note that Hermit C images are not relocatable yet, so KernelObject::start_addr returns Some(addr), that we have to load to.

@n0toose
Copy link
Member

n0toose commented Jun 15, 2024

Note that Hermit C images are not relocatable yet, so KernelObject::start_addr returns Some(addr), that we have to load to.

Thank goodness that there's a clear explanation for this and that not all images are relocatable - even if that's the goal - but it seems convenient enough to just compare the values. Should've brought the debugger out, but my brain turned mushy.

(EDIT: I laid out a plan here, but now I am actually attempting the implementation and turned this already-large reply to a TODO list. My tree can be found here: https://github.com/n0toose/uhyve/tree/aslr-support-poc)

Preparation:

There's, without being sure if it's exclusively because of Uhyve or if anything else is reserving anything, not enough space.

  • Increase the constant to 0x13000. (or perhaps even further as a buffer, 0x20000?)

I am not sure what the consequences would be, given the value is only used to check if there's space before loading anything.

Implementation:

  • Generate a random_start_address starting from 0x100000 ("proof of concept" first, evaluation of doing it securely using a seed later), in a magical hacky way that doesn't make paging break or trigger any bugs that previously existed before.
    • Figure out the lower bound. (0x100000 is used for now)
    • Figure out the upper bound. (let end_address_upper_bound: u64 = self.mem.memory_size as u64 - self.mem.guest_address.as_u64();
    • Triple-check the soundness of the address range (especially the upper bound) and the way I process it.
  • Check the soundness of the generated addresses.
    • 0xFFFFF0 is the mask I am temporarily using to remove. I need to understand why it works or if I'm removing something that I shouldn't.
      • Answer: 0b111111111111111111111000 doesn't work.
        • Take a look at the bits used for the paging in x86_64 and aarch64 again when I'm less tired and can count properly, just to make sure I'm not cutting off an upper part.
          • Oops, I was cutting off an upper part! 0xFFFFFF0 it is, then. But I still have to check it. u64 can hold like 64 bits, I am not even sure why I am fighting with a number I can't always predict or not doing proper pointer arithmetic.
          • Or just use a mask for the entirety of those 64 bits despite virtual addresses not using 64 bits because I'm too tired and can't put the keyboard away.
        • Move this constant to arch/?
    • ThreadRng is used for randomness, which implements the CryptoRng trait, but sources randomness from OsRng. Ensure that this is OK.
    • It probably is.
    • What if the random number generator fails for some reason?
  • Call start_addr(), check whether Some(addr) == random_start_address or None.
    • Update: Not necessary. We only generate a random address if we get a None.
    • Store the result internally (debug infos) and alert the user.
  • Integration:
    • ASLR should be made optional for now.

Clean up: (will open in separate issues if the implementation lands to main)

  • Go over the cryptographic soundness of the implementation.
  • Write tests
    • Edge cases?
    • Some(addr) == random_start_address or Some(addr) != random_start_address
  • Document what you just said in the comments, the README and other places around hermit-os.

I am not sure if I can estimate the time this would take me.

I previously discussed using goblin to evaluate whether an ELF file is relocatable, but it was seen as too redundant because images should always be relocatable by definition. This still represents the project's direction and I believe in reducing bloat when developing software, but has been discussed before (#26 (comment)). I'll avoid using goblin, as hermit_entry is already doing what we'd otherwise need from goblin.

@mkroening
Copy link
Member

Note that Hermit C images are not relocatable yet, so KernelObject::start_addr returns Some(addr), that we have to load to.

FYI: Hermit C images are now relocatable. 🥳
At least once this PR is merged and published: hermit-os/hermit-entry#37

@n0toose
Copy link
Member

n0toose commented Aug 17, 2024

FYI: Hermit C images are now relocatable. 🥳

Thank you! Now I can finally sleep peacefully. 😄

That means that the changes where I tried to detect whether an image is relocatable will have to be reverted, as they should be relocatable by definition now. There was still an outstanding problem that a relocatable image would not boot if an address over 0x100000 is chosen, and that the way that I generate a "start address" is not as clean as I'd like it to be: #725

@mkroening
Copy link
Member

See #744

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging a pull request may close this issue.

3 participants