Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create sandbox for host file system access #783

Open
wants to merge 8 commits into
base: main
Choose a base branch
from
6 changes: 0 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,14 +5,8 @@
[![crates.io](https://img.shields.io/crates/v/uhyve.svg)](https://crates.io/crates/uhyve)
[![Zulip Badge](https://img.shields.io/badge/chat-hermit-57A37C?logo=zulip)](https://hermit.zulipchat.com/)

## Introduction

Uhyve is a small hypervisor specialized for the [Hermit kernel](https://github.com/hermitcore/kernel).

> [!WARNING]
> For the time being, Uhyve provides the unikernel full host file system access with the permissions of the user running Uhyve.
> Thus, it should not be used for applications which require isolation from the host system.

## Installation

1. Install the Rust toolchain. The Rust Foundation provides [installation instructions](https://www.rust-lang.org/tools/install).
Expand Down
12 changes: 12 additions & 0 deletions src/bin/uhyve.rs
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,16 @@ struct Args {
#[cfg(target_os = "linux")]
gdb_port: Option<u16>,

/// Paths that the kernel should be able to view, read or write.
///
/// Files and directories are separated using commas.
/// Desired mount paths must be explicitly defined after a colon.
///
/// Example: --file_map host_directory:/root/guest_directory,file.txt:/root/my_file.txt
#[arg(value_delimiter = ',')]
#[clap(long, env = "HERMIT_FILE_MAP")]
file_map: Option<Vec<String>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a better interface, if we have the separate mounts as separate options. Like uhvye -v a.txt:b.txt -v c.txt:/bla/blub.txt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thing with the separate options works by default. I removed the delimiter and the environment variable for now, and changed file_map into mount (while still referring to a file_map internally, leaving room for further extensions in the future).

As described in private (after a discussion with @mkroening), the middle ground that I found for this was to call the parameter --mount again without introducing a short. -v is already taken by --verbose (just like -m for --file-map, only -f isn't taken but that's a weird choice).


/// The kernel to execute
#[clap(value_parser)]
kernel: PathBuf,
Expand Down Expand Up @@ -243,6 +253,7 @@ impl From<Args> for Params {
},
#[cfg(target_os = "linux")]
gdb_port,
file_map,
kernel: _,
kernel_args,
} = args;
Expand All @@ -256,6 +267,7 @@ impl From<Args> for Params {
cpu_count,
#[cfg(target_os = "linux")]
pit,
file_map,
#[cfg(target_os = "linux")]
gdb_port,
#[cfg(target_os = "macos")]
Expand Down
50 changes: 42 additions & 8 deletions src/hypercall.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
use std::{
ffi::OsStr,
ffi::{CStr, CString, OsStr},
io::{self, Error, ErrorKind, Write},
os::unix::ffi::OsStrExt,
};
Expand All @@ -8,6 +8,7 @@ use uhyve_interface::{parameters::*, GuestPhysAddr, Hypercall, HypercallAddress,

use crate::{
consts::BOOT_PML4,
isolation::UhyveFileMap,
mem::{MemoryError, MmapMemory},
virt_to_phys,
};
Expand Down Expand Up @@ -84,13 +85,46 @@ pub fn unlink(mem: &MmapMemory, sysunlink: &mut UnlinkParams) {
}

/// Handles an open syscall by opening a file on the host.
pub fn open(mem: &MmapMemory, sysopen: &mut OpenParams) {
unsafe {
sysopen.ret = libc::open(
mem.host_address(sysopen.name).unwrap() as *const i8,
sysopen.flags,
sysopen.mode,
);
pub fn open(mem: &MmapMemory, sysopen: &mut OpenParams, file_map: &Option<UhyveFileMap>) {
// TODO: We could keep track of the file descriptors internally, in case the kernel doesn't close them.
let requested_path = mem.host_address(sysopen.name).unwrap() as *const i8;

// If the file_map doesn't exist, full host filesystem access will be provided.
if let Some(file_map) = file_map {
// Rust deals in UTF-8. C doesn't provide such a guarantee.
// In that case, converting a CStr to str will return a Utf8Error.
//
// See: https://nrc.github.io/big-book-ffi/reference/strings.html
let guest_path = unsafe { CStr::from_ptr(requested_path) }.to_str();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much prefer a checked safe variant here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any better way of doing this in https://doc.rust-lang.org/std/ffi/struct.CStr.html

I presumed to_str() to be "good enough" given that the type is a Result(&str, Utf8Error) and basically relies on Rust itself to protect against malicious input, no?


if let Ok(guest_path) = guest_path {
let host_path_option = file_map.get_host_path(guest_path);
if let Some(host_path) = host_path_option {
// This variable has to exist, as pointers don't have a lifetime
// and appending .as_ptr() would lead to the string getting
// immediately deallocated after the statement. Nothing is
// referencing it as far as the type system is concerned".
//
// This is also why we can't just have one unsafe block and
// one path variable, otherwise we'll get a use after free.
let host_path_c_string = CString::new(host_path.as_bytes()).unwrap();
let new_host_path = host_path_c_string.as_c_str().as_ptr();

unsafe {
sysopen.ret = libc::open(new_host_path, sysopen.flags, sysopen.mode);
}
} else {
error!("The kernel requested to open() a non-whitelisted path. Rejecting...");
sysopen.ret = -1;
}
} else {
error!("The kernel requested to open() a path that is not valid UTF-8. Rejecting...");
sysopen.ret = -1;
}
} else {
unsafe {
sysopen.ret = libc::open(requested_path, sysopen.flags, sysopen.mode);
}
}
}

Expand Down
101 changes: 101 additions & 0 deletions src/isolation.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
use std::{collections::HashMap, ffi::OsString, fs, path::PathBuf};

/// HashMap matching a path in the guest OS ([String]) a path in the host OS ([OsString]).
///
/// Using a list of parameters stored in a [Vec<String>], this function creates
/// a HashMap that can match a path on the host operating system given a path on
/// the guest operating system.
///
/// See [crate::hypercall::open] to see this in practice.
pub struct UhyveFileMap {
files: HashMap<String, OsString>,
}

impl UhyveFileMap {
/// Creates a UhyveFileMap.
///
/// * `parameters` - A list of parameters with the format `./host_path.txt:guest.txt`
pub fn new(parameters: &[String]) -> Option<UhyveFileMap> {
Some(UhyveFileMap {
files: parameters
.iter()
.map(String::as_str)
.map(Self::split_guest_and_host_path)
.map(|(guest_path, host_path)| {
(
guest_path,
fs::canonicalize(&host_path).map_or(host_path, PathBuf::into_os_string),
)
})
.collect(),
})
}

/// Separates a string of the format "./host_dir/host_path.txt:guest_path.txt"
/// into a guest_path (String) and host_path (OsString) respectively.
///
/// `parameter` - A parameter of the format `./host_path.txt:guest.txt`.
fn split_guest_and_host_path(parameter: &str) -> (String, OsString) {
let mut partsiter = parameter.split(":");

// Mind the order.
// TODO: Do this work using clap.
let host_path = OsString::from(partsiter.next().unwrap());
let guest_path = partsiter.next().unwrap().to_owned();
Copy link
Member Author

@n0toose n0toose Nov 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

--file-map "" results in:

thread 'main' panicked at src/isolation.rs:44:43:
called `Option::unwrap()` on a `None` value

This is "by design" for the time being, but suboptimal - there is definitely room for improvement here. Given that I'm targeting main, should this be improved later down the line or now?

Something that @cagatay-y brought up to me is that using a bunch of .unwrap()'s to make up for invalid inputs is not explicit enough and could result in future problems during a supposed refactoring.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More specifically, I think it is possible that if the function signature is changed to return an Option in the future, it is possible that a None caused by an error in parsing may get confused with the None variant that signifies unsandboxed filesystem access.


(guest_path, host_path)
}

/// Returns the host_path on the host filesystem given a requested guest_path, if it exists.
///
/// This function will look up the requested file in the UhyveFileMap and return
/// the corresponding path.
///
/// `guest_path` - The guest path. The file that the kernel is trying to open.
pub fn get_host_path(&self, guest_path: &str) -> Option<&OsString> {
self.files.get(guest_path)
}
}

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_split_guest_and_host_path() {
let host_guest_strings = vec![
"./host_string.txt:guest_string.txt",
"/home/user/host_string.txt:guest_string.md.txt",
":guest_string.conf",
":",
"exists.txt:also_exists.txt:should_not_exist.txt",
];

// Mind the inverted order.
let results = vec![
(
String::from("guest_string.txt"),
OsString::from("./host_string.txt"),
),
(
String::from("guest_string.md.txt"),
OsString::from("/home/user/host_string.txt"),
),
(String::from("guest_string.conf"), OsString::from("")),
(String::from(""), OsString::from("")),
(
String::from("also_exists.txt"),
OsString::from("exists.txt"),
),
];

for (i, host_and_guest_string) in host_guest_strings
.into_iter()
.map(UhyveFileMap::split_guest_and_host_path)
.enumerate()
{
assert_eq!(host_and_guest_string.0, results[i].0);
assert_eq!(host_and_guest_string.1, results[i].1);
}
}
}
1 change: 1 addition & 0 deletions src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ pub mod macos;
#[cfg(target_os = "macos")]
pub use macos as os;
mod hypercall;
mod isolation;
pub mod mem;
pub mod paging;
pub mod params;
Expand Down
8 changes: 5 additions & 3 deletions src/linux/x86_64/kvm_cpu.rs
Original file line number Diff line number Diff line change
Expand Up @@ -418,9 +418,11 @@ impl VirtualCPU for KvmCpu {
}
Hypercall::FileClose(sysclose) => hypercall::close(sysclose),
Hypercall::FileLseek(syslseek) => hypercall::lseek(syslseek),
Hypercall::FileOpen(sysopen) => {
hypercall::open(&self.parent_vm.mem, sysopen)
}
Hypercall::FileOpen(sysopen) => hypercall::open(
&self.parent_vm.mem,
sysopen,
&self.parent_vm.file_map,
),
Hypercall::FileRead(sysread) => {
hypercall::read(&self.parent_vm.mem, sysread)
}
Expand Down
8 changes: 5 additions & 3 deletions src/macos/aarch64/vcpu.rs
Original file line number Diff line number Diff line change
Expand Up @@ -190,9 +190,11 @@ impl VirtualCPU for XhyveCpu {
}
Hypercall::FileClose(sysclose) => hypercall::close(sysclose),
Hypercall::FileLseek(syslseek) => hypercall::lseek(syslseek),
Hypercall::FileOpen(sysopen) => {
hypercall::open(&self.parent_vm.mem, sysopen)
}
Hypercall::FileOpen(sysopen) => hypercall::open(
&self.parent_vm.mem,
sysopen,
&self.parent_vm.file_map,
),
Hypercall::FileRead(sysread) => {
hypercall::read(&self.parent_vm.mem, sysread)
}
Expand Down
8 changes: 5 additions & 3 deletions src/macos/x86_64/vcpu.rs
Original file line number Diff line number Diff line change
Expand Up @@ -737,9 +737,11 @@ impl VirtualCPU for XhyveCpu {
}
Hypercall::FileClose(sysclose) => hypercall::close(sysclose),
Hypercall::FileLseek(syslseek) => hypercall::lseek(syslseek),
Hypercall::FileOpen(sysopen) => {
hypercall::open(&self.parent_vm.mem, sysopen)
}
Hypercall::FileOpen(sysopen) => hypercall::open(
&self.parent_vm.mem,
sysopen,
&self.parent_vm.file_map,
),
Hypercall::FileRead(sysread) => {
hypercall::read(&self.parent_vm.mem, sysread)
}
Expand Down
4 changes: 4 additions & 0 deletions src/params.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,9 @@ pub struct Params {

/// Arguments to forward to the kernel
pub kernel_args: Vec<String>,

/// Paths that should be mounted on-device
pub file_map: Option<Vec<String>>,
}

#[allow(clippy::derivable_impls)]
Expand All @@ -51,6 +54,7 @@ impl Default for Params {
pit: false,
cpu_count: Default::default(),
gdb_port: Default::default(),
file_map: Default::default(),
kernel_args: Default::default(),
}
}
Expand Down
5 changes: 5 additions & 0 deletions src/vm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@ use crate::{
arch::{self, FrequencyDetectionFailed},
consts::*,
fdt::Fdt,
isolation::*,
mem::MmapMemory,
os::HypervisorError,
params::Params,
Expand Down Expand Up @@ -118,6 +119,7 @@ pub struct UhyveVm<VCpuType: VirtualCPU = VcpuDefault> {
pub virtio_device: Arc<Mutex<VirtioNetPciDevice>>,
#[allow(dead_code)] // gdb is not supported on macos
pub(super) gdb_port: Option<u16>,
pub(crate) file_map: Option<UhyveFileMap>,
_vcpu_type: PhantomData<VCpuType>,
}
impl<VCpuType: VirtualCPU> UhyveVm<VCpuType> {
Expand Down Expand Up @@ -149,6 +151,8 @@ impl<VCpuType: VirtualCPU> UhyveVm<VCpuType> {
"gdbstub is only supported with one CPU"
);

let file_map = params.file_map.as_deref().and_then(UhyveFileMap::new);

let mut vm = Self {
offset: 0,
entry_point: 0,
Expand All @@ -161,6 +165,7 @@ impl<VCpuType: VirtualCPU> UhyveVm<VCpuType> {
verbose: params.verbose,
virtio_device,
gdb_port: params.gdb_port,
file_map,
_vcpu_type: PhantomData,
};

Expand Down