Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Will memory alignment be supported? #360

Closed
Kerollmops opened this issue Sep 11, 2022 · 22 comments · Fixed by #521
Closed

Will memory alignment be supported? #360

Kerollmops opened this issue Sep 11, 2022 · 22 comments · Fixed by #521

Comments

@Kerollmops
Copy link
Contributor

Hey,

I was looking at redb and thought about one feature that could be very interesting and that is supported only by sanakirja right now. Memory alignment could bexcitingng when it comes to speed, for example, avoid having to copy memory could bring more performance.

@cberner
Copy link
Owner

cberner commented Sep 11, 2022

Very unlikely, because that could cause portability issues: type alignment is architecture dependent. I would also be surprised if the performance difference was significant. In most cases you can use functions like u64::from_le_bytes without penalty

@cberner cberner closed this as completed Sep 12, 2022
@Kerollmops
Copy link
Contributor Author

I was thinking more about a bigger data-structure like roaring bitmaps. In C they developed a "frozen format" that can be directly reinterpreted and is much faster than allocating and copying entire containers.

@cberner
Copy link
Owner

cberner commented Sep 13, 2022

I see. Supporting external libraries that require alignment isn't one that I had considered. Has anyone tried porting roaring bitmaps to Rust and removing the alignment requirements? All the benchmarks I've seen show that there's basically no different between aligned and unaligned access on modern CPUs.

@cberner cberner reopened this Sep 13, 2022
@Kerollmops
Copy link
Contributor Author

Kerollmops commented Sep 13, 2022

Thank you for reopening this issue.

Has anyone tried porting roaring bitmaps to Rust and removing the alignment requirements?

There indeed is a pure Rust port of the RoaringBitmap library that I maintain. There is no alignment required for the default de/serialization format, but it brings a lot of performance impact when you need to deserialize it in memory to possibly just do a big union of them (you could keep them frozen) and generate a new in-memory-mutable bitmap.

I was thinking about using the power of memory mapping to keep things where they are and avoid duplicating them elsewhere to be able to read them when this would be unnecessary if the memory were correctly aligned.

We talked a little bit about this format with Daniel Lemire in this issue. It can be hard to read, so I advise you only to read the last message:

In C, we use a distinct format (which we refer to as frozen) which provides the necessary alignment assuming that the starting pointer is itself 32-bit aligned.

@cberner
Copy link
Owner

cberner commented Sep 14, 2022

Ah yes, I can see why that's expensive. Is it possible to instead borrow the &[u8] that represents the roaring bitmap data, and read the byte as u32, u64...etc as needed, rather than deserializing it into a RoaringBitmap object?

I took that approach with my GroupedBitmap struct that is logically an &[u64], but instead it performs accesses into the &[u8] using from_le_bytes() to avoid having alignment requirements.

@Kerollmops
Copy link
Contributor Author

Is it possible to instead borrow the &[u8] that represents the roaring bitmap data, and read the byte as u32, u64...etc as needed, rather than deserializing it into a RoaringBitmap object?

It seems like a good idea, but I am not sure it will be easy to implement instead of just porting the frozen format C implementation to Rust. However, performance cost can be significant on some arch like the ARMv7 (Apple Mac M1 is an ARMv8) where an unaligned instruction generates four instructions instead of two when the bytes are unaligned.

I am just thinking about the implementation detail on your side. Wouldn't it be some padding bytes in front of the value to make it correctly aligned on the page? I understand that you must keep the length of the padding to be able to skip it when reading 🤔

@cberner
Copy link
Owner

cberner commented Sep 14, 2022

However, performance cost can be significant on some arch like the ARMv7 (Apple Mac M1 is an ARMv8) where an unaligned instruction generates four instructions instead of two when the bytes are unaligned.

Well, additional instructions doesn't necessarily translate into a difference in performance. Except for very tight loops, instructions should be practically free compared with the loads from memory/disk that redb has to do.

Oh yes, the implementation would be pretty straightforward. The thing that I'm worried about is portability between architectures. The alignment of the primitive types is not specified in Rust, and is platform dependent. I have not review every architecture that's supported to see if they have the same alignment, but I could imagine a 64bit platform on which say u32 has 8 byte alignment

@ousado
Copy link

ousado commented Nov 26, 2022

For using rkyv in redb, memory alignment would also be very interesting. I'd think it's sufficiently useful to make it available even if portability can't be ensured when using it.

@dignifiedquire
Copy link

Fyi trying to actually use rkyv with redb right now, and running into alignment issues with data returned from redb, so I would very much appreciate anything in this direction.

@casey
Copy link
Contributor

casey commented Jan 12, 2023

I was thinking that the simplest way to support alignment may be:

  • Only support aligning byte slices. Push doing deserialization from aligned byte slices to the user.
  • Only support aligning whole redb keys and redb values. (Not elements of tuples.)
  • Require that the user ask for the alignment they want, so that padding bytes aren't used when alignment isn't required, and so redb doesn't need to worry about different alignment on different architectures, it's up to the user to ask for the alignment they need.

In practice, this might look like a type like:

struct Aligned<const N: usize>;

Which can be used as a redb key or value, and whose value is a byte slice for which slice.as_ptr() % N == 0 is guaranteed when read from the database.

@dignifiedquire
Copy link

@casey that sounds very reasonable, and would make it straightforward to integrate with tools like rkyv and zerocopy to allow for custom zero copy types.

@cberner
Copy link
Owner

cberner commented Jan 13, 2023

@casey yep, that's how I'm thinking of implementing it to start. I'm going to try and make it flexible enough that if I end up making RedbKey & RedbValue public in the future it will be possible to specify alignment for custom types, and I'd also like it to be possible to implement aligned nested types inside tuples (not going to support that to start with though, since it'd be pretty annoying to implement)

@cberner
Copy link
Owner

cberner commented Jan 15, 2023

I think this is probably blocked on rust-lang/rust#76560

To ensure that the alignment is checked at compile time, I need code like this to work:
https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=bb5f314b6aa0e29f26566fb7d715d7eb
and I don't see a way to implement that on stable. If anyone has ideas lemme know!

@cberner
Copy link
Owner

cberner commented Jan 15, 2023

#490 adds alignment to the file format, so that it can be supported in the future without requiring a file format upgrade

@cberner
Copy link
Owner

cberner commented Feb 8, 2023

I did some work on this in the alignment branch. The next step from that branch is changing RedbValue::from_bytes() to take an AlignedSlice<Self::ALIGNMENT>

However, I'm not sure this is going to be possible in the near future with the Rust type system. Basically, I need code like this to work: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=00c17b86bc4d44cec05ffab93da10cc0 and it looks like that's been ruled out for the moment due to it not being const well-formed. I don't entirely understand const wf, but it looks like rust-lang/rust#76560 won't be sufficient to make that branch work.

Anyone have ideas for making this work?

@dignifiedquire
Copy link

If you make the size a generic parameter it could work, not sure this fits in the rest of your design: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a239c92b3f0225c0d472bf635deb32e7

@cberner
Copy link
Owner

cberner commented Feb 9, 2023

ya, I looked into that solution briefly, and good to see that it seems to work. However, I think it's probably not going to be a good fit for the rest of the redb API, because then those alignment generic parameters are going to pollute the whole API since nearly everything references RedbValue and would now have twice as many parameters. I'll play with it a bit more though.

@cberner
Copy link
Owner

cberner commented Feb 10, 2023

@dignifiedquire I tried that out, and unfortunately it's not going to work well. The const generic parameter would leak out into the whole API, requiring things like const MY_TABLE: TableDefinition<&str, 1, &str, 1> ... instead of TableDefinition<&str, &str>.

I think the best workaround is going to be to store the values that require alignment in a separate structure, such as a separate mmap'ed file, and then store the offset & length in redb to allow retrieving the value.

Or if you know anyone who works on the const generic type system, maybe you can pitch them on adding support for this case :)

@cberner
Copy link
Owner

cberner commented Feb 11, 2023

I added an example showing one way to workaround this. Please re-open if you find a way around the Rust associated const generic issue though! I do think this would be a nice feature to have in the core API, but don't want to add it if it requires adding const generic parameters to everything.

cberner added a commit that referenced this issue Feb 11, 2023
This feature is not implemented, and there is no clear path to
implementing it (see #360).
However, if it becomes possible to implement it in the future, this
design should be backward compatible
cberner added a commit that referenced this issue Feb 11, 2023
This feature is not implemented, and there is no clear path to
implementing it (see #360).
However, if it becomes possible to implement it in the future, this
design should be backward compatible
cberner added a commit that referenced this issue Feb 11, 2023
This feature is not implemented, and there is no clear path to
implementing it (see #360).
However, if it becomes possible to implement it in the future, this
design should be backward compatible
@Ralith
Copy link

Ralith commented Mar 5, 2023

Would it be possible to move alignment concerns out of the type system? I haven't studied redb's internals yet, but it seems like adding/skip padding could be dynamic, with redb's API guaranteeing that the argument to RedbValue::from_bytes matches the required alignment, and no other interface changes. Downstream code could use e.g. slice::align_to to leverage this safely. This would confer the practical benefits benefits without placing more demands on the language.

@cberner
Copy link
Owner

cberner commented Mar 5, 2023

Ya, that's definitely possible, but I don't want to implement it that way because it would be too easy to introduce bugs

@Ralith
Copy link

Ralith commented Mar 6, 2023

Nothing an assert couldn't guard against, surely?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants