[Feature Request] [mojo-lang] [proposal] Add Linear / Explicitly Destroyed Types #3848
Comments
Could we have some kind of __unsafe_destroy builtin as an escape hatch? For instance, there are reasons to destroy a coroutine without finishing its execution, if you already have the information you want and don't need it to run any more. I'd also like to see I think a
One feature I've found useful in Rust's Arc is the ability to unwrap the Arc and take ownership if the refcount is 1. Could we make use of conditional conformance to make arc into a linear type which returns
What about linear types held across yield points?
I'd lean towards keeping those as-is because we have no way of knowing if the inner type is actually properly initialized. Overall I'm very interested in this feature, but I want to make sure that we both have the tools to make it nice to use and some escape hatches to deal with bad API design or unorthodox requirements. |
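For readers who have not used the Rust feature mentioned in this comment, here is a minimal sketch of it. `Arc::try_unwrap` is the real standard-library call; the helper name `take_if_unique` is made up for illustration. It shows Rust's existing behavior only, not how a linear Arc in Mojo would be designed.

```rust
use std::sync::Arc;

/// Hypothetical helper: takes ownership of the inner value if `handle` is the
/// only strong reference, otherwise hands the Arc back unchanged.
fn take_if_unique(handle: Arc<String>) -> Result<String, Arc<String>> {
    Arc::try_unwrap(handle)
}

fn main() {
    // Refcount is 1, so unwrapping succeeds and we own the String.
    let only = Arc::new(String::from("payload"));
    assert_eq!(take_if_unique(only), Ok(String::from("payload")));

    // Refcount is 2, so the Arc is returned in the Err variant.
    let shared = Arc::new(String::from("payload"));
    let other = Arc::clone(&shared);
    let returned = take_if_unique(shared).unwrap_err();
    assert_eq!(Arc::strong_count(&returned), 2);
    drop(other);
}
```

The caller has to handle both outcomes, which is the property a linear variant would presumably want to keep: the value is either extracted or still owned by someone.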
I would prefer opting-in to the
|
in terms of |
Hello, looking forward to using linear types, this is great! |
+1 to this line of thinking, since library authors can't perfectly predict the future, and otherwise linear types would cut off their ability to
IIRC,
How we'll do Rc and linear types is an interesting topic that deserves its own entire proposal, but it would be good to address a question "Are linear types incompatible with any future RC?" I think not, because there are a few solutions:
I think either #1 or #2 would be sufficient to say Rc isn't incompatible with linear types, though we'd figure out later which we want (or both!).
I think that's fine, because coroutines are themselves linear, so must be continued until their end. Unless there's a problem I'm not seeing?
When I was implementing a basic sample LinearList (which I still have to post somewhere),
That's a pretty good argument. Though, it would be annoying for the user to have to specify Also, +1 for having an escape hatch for users to use when the library author didn't design their API correctly. It might even be worth having two keywords: |
Sorry, I meant
I'd lean towards both. I know that there was a prototype for having an optional weak pointer for
I'm considering whether the coroutine frame type (which after a lot of discussing back and forth with @nmsmith I think may be useful to expose to users) needs to be linear.
I can live with this if there is a way to say "I know this is empty, free the backing allocation while doing nothing else". I don't want the hot loop of an async executor which is throwing around pointers to coroutines constantly to need to do a non-zero-cost check. For io_uring, linear UnsafePointer would also mean that the most straightforward way of writing the executor (where you stuff the coroutine struct pointer into the user data field) would make every single completion event a linear type.
I think it would also produce issues for things like FFI if the default was linear type, since that means writing out a dtor for every trivial type.
Thank you! I'm generally in favor of having the person who writes |
I disagree with this. As I understand it, In fact, there's no reason that Taking this further: I'd like us to consider making the concept of implicitly going out of scope orthogonal to the concept of destruction. If we go in this direction, then
There's a second reason to move away from the name So my first suggestions for Evan's proposal are to consider introducing:
My next suggestion concerns
I agree that having the existing Going further, perhaps we should replace the Notably, Mojo doesn't allow trait authors to define default implementations (yet), but in the short term, we can just give the above traits special treatment in the compiler. IMO this is a promising way to make it ergonomic for Mojo programmers to opt-in to lifecycle methods. Also, it seems more principled than all of these random decorators. 😇 Note: The compiler would only generate default implementations of the above trait methods in the cases where all of the struct's fields conform to the trait as well. Otherwise, the struct author would be required to provide a custom implementation. For example, you'd only get a default implementation of In the absence of the
That's all of the feedback I have for now. Generally speaking, I love the proposal! I can't wait to see how the Mojo community makes use of this feature. |
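As a rough analogy for the rule described in this comment (a default implementation is generated only when every field conforms), Rust's `#[derive(Clone)]` behaves the same way today. The sketch below uses made-up types (`Socket`, `Point`, `Connection`) and says nothing about how the Mojo compiler would implement it.

```rust
// A field type that deliberately does not implement Clone.
struct Socket {
    fd: i32,
}

// Every field is Clone, so the compiler can generate the implementation.
#[derive(Clone, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `#[derive(Clone)]` would be rejected here because `Socket` is not Clone,
// so the struct author has to write the implementation by hand.
struct Connection {
    name: String,
    socket: Socket,
}

impl Clone for Connection {
    fn clone(&self) -> Self {
        // Hypothetical policy: a cloned connection gets a placeholder descriptor.
        Connection {
            name: self.name.clone(),
            socket: Socket { fd: -1 },
        }
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    assert_eq!(p.clone(), p);

    let c = Connection { name: String::from("db"), socket: Socket { fd: 7 } };
    let copy = c.clone();
    assert_eq!(copy.name, "db");
    assert_eq!(copy.socket.fd, -1);
    assert_eq!(c.socket.fd, 7);
}
```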
I think that Mojo probably does need to pick. Do we try to have sane defaults and opt-out of things that aren't commonly needed (meaning that move constructors are automatic), or do we do opt-in and have some more decorators ( I don't think doing it with traits will be doable until we have reflection, unless it's pure compiler magic. |
I have a longer response, but long story short, I'd like a way to specify a custom error (or warning) message when you try to copy a non-copyable anyways, so hopefully we can do something which allows that in the same way non-dels will |
That's exactly what I'm suggesting: the traits IMO this approach is no less principled than having the |
This! You've just clarified something that I think I've been trying to realize for a long time. The classic example I always use for this is a
Well said. Thinking out loud here, I want to make explicit a certain challenge Mojo faces: that we need to keep complexity down (of course), but if we need complexity, we need to incorporate it in a way that keeps the learning curve gradual as much as possible. It's the only way Python users will adopt a complex language like Mojo. Example: C#'s This is particularly challenging because high performance (and guarantees in general) works against us here. It's especially egregious with viral mechanisms like Ways to keep our learning curve down (if we can't avoid introducing complexity):
So, from that perspective, here's some more specific options we have w.r.t. linear types: A. Keep linearity out of the defaults. If a Python user ever encounters an error about As much as the software architect in me is wary of the last two, we must consider them, because we need to prioritize learning curve, truly and highly.
Just to reduce scope: whether "value"ness comes from I lean towards yes: types should by default be linear, because otherwise, a Python user might encounter a
I like this, because it makes the distinction between dropping and destroying clear. But do most Python users know that it has no relationship? If Python users are coming in with a misconception about Python, it'll make it harder for them to learn Mojo. I don't actually know. ...do most Python users even know about |
But, we need to steer library developers towards the more powerful mode, so they can write libraries that can be used in a variety of places.
In what way?
We can continue the argument on whether making async invisible makes things more or less complex elsewhere, but I agree in principle that abstractions which simplify things are good so long as you can always open up the curtain and gradually increase your complexity exposure in order to get more power/expressiveness.
Mutable xor alias brings a lot of benefits, and I don't think it's clear that allowing mutable aliasing is a good idea. We can have warnings for things which are probably a bad idea but you may still want to do, but I would prefer that if the word "unsafe" doesn't appear in my program (in function names, class names, etc) that I get most or all of the guarantees safe Rust has.
I think it should be
I don't think we want this. Linear types are often going to be used to represent resources, and those resources should be cleaned up as quickly as possible (within reason). Not reclaiming resources during execution means you can't use it for anything long-running.
If you prioritize the learning curve above all else, you get Go. Mojo is a systems language, so control of the system needs to come first, and then we can build "I don't care about that" interfaces on top.
I think value should be a composite of
It's not always a GC; if there are no cycles then it runs when the refcount hits zero, which often happens at the end of a scope. |
If we were just trying to make a better Rust, or a better C++, I'd agree with everything you just said. Especially this part:
But devil's advocate: if we don't ever prioritize learning curve, we get Rust, and that's not a good thing for us. I've seen veteran programmers (who were already fans of static typing) bounce off Rust's learning wall. Our challenge is even harder: appeal to Python programmers. Whether we succeed depends on how seriously we regard this problem. If one doesn't believe that this challenge is important, then my entire last post will make zero sense (and sound very ridiculous). Also note, I didn't say "above all else". We both know that language design is a balancing act between different tradeoffs. And if we're really good, we can even avoid the tradeoffs, and find clever best-of-all-worlds solutions. Here, I hope we can make Mojo have the most powerful type system of any systems programming language in the world, but also make it palatable to newcomers, especially Python users.
I appreciate that stance, since it strengthens the power and guarantees and benefits of linear types. However, it also increases linear types' cost, particularly to newcomers' learning curve. Specifically, I suspect that putting
I don't want this; it's against the spirit of linear types 😆 But if we value a good learning curve, newcomers might need this escape hatch to be within reach. +1 to everything else you mentioned, well said. |
I think we do have an inherent vs accidental complexity problem. There is an inherent level of complexity to "any allocation can fail", concurrency, and parallelism. Python chose to ignore all of those issues, but I don't think we can. We can do our best to have language features that let the compiler help users along, but I don't think we can totally eliminate the complexity of many things without making performance compromises. I think that trying to simplify complex problems can sometimes be dangerous, since you end up making important decisions for the user. Think about how many ML apps crash when they run out of VRAM instead of backing off and asking the user if they would like to quantize the model and try again, or how badly people wanted multithreading in Python when it was decided early on to close that door, such that the entire Python ecosystem is now built around having a global mutex. I don't want to allow programs we know are ill-formed in the name of UX; instead I would rather have the compiler guide users towards solving the problem. I think that, with the right messaging, users can be brought around to the idea of doing a lot of their debugging before the compiler lets them run their program ("It compiles it works"). This may be a philosophical difference between me and other people, but I generally prefer to have a fight with the borrow checker over a call at 2am.
The reason I want unsafe to be part of the keyword is because I'd like to be able to grep a codebase for "unsafe" and find the places where memory safety problems may have occurred. The ability to do scoped audits and not need to consider every single line of code for debugging like this is very valuable. I think that the Mojo community may need to come up with a testable definition of unsafe, since I am operating on the "able to cause UB" definition, which I know is strict. To me, this is an assertion that you have both upheld the invariants of the linear type and that you have decided, for whatever reason, that you know better than the author of the library. To me, that is a strong assertion to make, and not one that should be made lightly. If we could get something in the compiler that would allow emitting a warning for all instances of "unsafe constructs" (functions marked by some annotation or built into the definition in the compiler for builtins), then I can be fine with
We could build the escape hatch with globals and allow global dtors, and then let library authors decide if they want to do it for their library. Things which are not dangerous, just wasteful, to leak can be put there, but things which have actual invariants tied to them can be handled via |
+1 to pretty much everything you said. And thanks for bringing up how greppable
An extra +1 to this. I hope to write some blog posts on linear type best practices one day, and this should definitely go in there.
Perhaps. Swift and TypeScript are successfully catching up to / overtaking their predecessors... but Rust hasn't had as much success w.r.t. C/C++. I tend to assume it's because Rust hasn't really convinced the mainstream to fight the borrow checker as much as we would have hoped. Perhaps their messaging was indeed a factor. Our target demographic (Python users) is slightly different, but any thoughts on their messaging, or how ours could be different, to make borrow checking (and linear types) more palatable? |
There probably exists a more meaningful way to convey the idea of what "borrow checking" does, One idea is to not call it |
+1. This is simple, and consistent with how the To explicitly answer Evan's question: I think conforming to
Update: I was using the term "trivial" above, but that's not correct, because a struct that inherits from We should choose a name for As Evan has mentioned, we also need excellent error messages that ensure that if a learner forgets to conform to
I don't think there are many Python users who use Concerning escape hatches for linear types, Evan suggested:
IMO this is jumping too far ahead. It sounds like you're predicting that linear types will be so challenging to work with that we need to have easy-to-reach-for escape hatches with names that don't make people feel ashamed for reaching for them. But... maybe linear types won't actually be that difficult to use. I'd like to see how people use this feature in practice, and what the pain points are. Once we gather that information, that sounds like the best time to talk about what the escape hatches (if any) should be. In the meantime, offering a placeholder function named Said another way: it probably makes more sense to design linear types iteratively rather than trying to predict everybody's needs ahead-of-time, with zero usage data. At the very least, I personally can't plan that far ahead.
+1. This is definitely a feature that could be a source of disastrous bugs if used without careful consideration. We don't want to teach this as a "solution" that you should reach for when you're struggling with linear types. |
Agreed. I believe we need mutually exclusive traits to avoid impossible diamonds. This approach would also be beneficial for I don't think we should delve into discussions about naming or teachability just yet. We could start with reasonably acceptable placeholders ( |
Yeah, IMO the resyntaxing thread is a big lesson on "don't try to finalize the syntax of something until you've figured out all of the places it will be used in practice". You can spend an hour coming up with the perfect name for something on the assumption that it's going to be used in a certain way, and then find out weeks or months later that actually, it's going to be used a different way, and now that name no longer works well. That's not to say that we shouldn't talk about syntax. I think we should just be careful about how long we bikeshed it for. |
At the risk of being hypocritical—given my last comment—I'll share another thought on names that I just had. This proposal is currently entitled as one of:
The former name is obviously "academic" and we probably all agree that this is not how we should describe this proposal to others. The latter name is actually a bit misleading IMO, because this proposal is really about being able to define a data type that doesn't have a Given the above, I'd say this proposal is really about introducing support for "undroppable types". Notice I'm using my earlier-proposed "drop" terminology here. It's more accurate than "undeletable types", which is what the "del" terminology would lead us to. |
We should also investigate how trivial destructibility (the ability to get rid of a value without invoking Maybe we need both "droppable" and "forgettable" types. Maybe this will affect naming too. Perhaps "discardable" and "forgettable" is a better combination? The former adjective implies that you actually have to take an action ( |
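To make the "droppable" versus "forgettable" distinction concrete, here is a small sketch of how Rust separates the two today. `drop` and `std::mem::forget` are real standard-library functions; the `Noisy` type and `count_drops` helper are made up for illustration. `drop` runs the destructor, while `forget` discards the value without running it.

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical type whose destructor bumps a shared counter.
struct Noisy<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Returns how many times the destructor ran for a single value.
fn count_drops(forget: bool) -> u32 {
    let drops = Cell::new(0);
    let value = Noisy { drops: &drops };
    if forget {
        // The value is discarded without running its destructor.
        mem::forget(value);
    } else {
        // The destructor runs here.
        drop(value);
    }
    drops.get()
}

fn main() {
    assert_eq!(count_drops(false), 1);
    assert_eq!(count_drops(true), 0);
}
```

In Rust `mem::forget` is a safe function, which is one reason Rust types cannot rely on their destructor running; whether Mojo should allow the same for undroppable types is the open question in this comment.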
One way to think about it is also as an |
But, "Linear Types" leads you directly to a bunch of high-quality introductions (including a lot of stuff by @VerdagonModular). I don't think we need to go full "a monad is a monoid in the category of endofunctors", but for people coming from C, or older versions of C++ (since many students are taught pre-RAII C++), almost all types are "explicitly destroyed". I think that using the academic name may be a good idea here because either you already know what it is, or you go "what's that?" and have to read the introduction in the "Mojo Book", where we can far better explain them than a simple label.
While "undroppable" makes sense to those of us from RAII backgrounds, I'm not sure it makes sense to people from GC language backgrounds, which to me means that we've opted into both less precise wording which may cause confusion among those who think they know what it means (ex: "must leak" types) and a major target audience still having no idea what we mean, which to me is the worst of both worlds. |
Which traits are you suggesting should be mutually exclusive with "trivial" as a descriptor for things like this is deep enough in the systems programming lexicon that I don't think changing the words but meaning the same thing is a good idea, unless there are some weird warts on how C++ defines it. |
We can also separate this out a bit, so it's less of a blunt hammer than Rust has. There's a difference between "using printf is technically a race condition" unsafe (changing the locale while a function is in printf can cause problems), "you can violate the guarantees of linear types with this" unsafe and "this function takes the reference you gave it, goes 5 bytes past the referent, casts to a function pointer and calls it" unsafe.
I think Rust's main issue is that it has no easy migration path. It threw out OOP and there are many devs, especially those who entered the industry pre-2010, for whom OOP and "Clean Code" is how you stop an enterprise codebase from becoming a mess. Many large C++ codebases are such a tangled web of inheritance that moving to a language without that is nearly impossible. Mojo, in supporting both OOP and ADT, gives the "C with templates" people new things to work with, and gives C++ devs a place to go as far as familiar design patterns. Having to throw out 20+ years of design patterns is, I think, the hardest part about learning Rust for some people. So, some level of support for inheritance will help, but we also are grabbing a demographic which doesn't use OOP as heavily due to duck typing, so traits should make more sense to them. As far as the borrow checker, Mojo has a lot of the tools that Rust is just getting now, so the smarter borrow checker should help quite a bit. I think some "learning to love the borrow checker" docs would also help, where we give examples of nasty race conditions or bugs that the borrow checker catches (iterator invalidation, pointer/reference instability when appending to a collection, launching a lambda with a reference to the local stack frame on another thread then returning, etc). |
People subscribed to email notifications may have noticed me rambling about trivial types just now. Just ignore that. I've deleted those messages because they don't really have anything to do with Evan's proposal. 🙂 |
To follow-up: On Discord, Owen and I discussed the need to have a trivial equivalent of the |
And, |
Review Mojo's priorities
What is your request?
See Proposal for Linear / Explicitly Destroyed Types.
TL;DR: We'd upgrade the type system to be able to express "explicitly destroyed types"; a struct (or trait) definition could use an e.g. @explicit_destroy("Use some_method(a, b, c) to destroy") annotation to inform the user when they haven't disposed of the object correctly.
What is your motivation for this change?
This solves a bunch of problems, for example:
- Coroutine objects go out of scope, and that they should instead await them (and take ownership of the coroutine’s result).
- t: Thread object should never go out of scope, the user should only destroy it via t^.join() or t^.abandon().
- spaceship.land(landing_zone) instead of just letting the ship go out of scope.
For more examples, check out the proposal! Also take a look at this talk at the 2024 LLVM conference where I talked a bit about these concepts.
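As a rough illustration of the Thread example, here is how the "must call join() or abandon()" rule can be approximated in Rust today. Rust has no linear types, so this sketch can only enforce the rule at run time, by panicking in the destructor; the proposal would turn the same mistake into a compile-time error. The `MustJoin` wrapper and `answer` function are made up for illustration; only `std::thread` is real.

```rust
use std::thread;

/// Hypothetical wrapper that must be consumed by `join` or `abandon`.
struct MustJoin {
    handle: Option<thread::JoinHandle<i32>>,
}

impl MustJoin {
    fn spawn(work: fn() -> i32) -> Self {
        MustJoin { handle: Some(thread::spawn(work)) }
    }

    /// Consumes the wrapper and waits for the worker's result.
    fn join(mut self) -> i32 {
        let handle = self.handle.take().expect("handle already consumed");
        handle.join().expect("worker panicked")
    }

    /// Consumes the wrapper and detaches the worker thread.
    fn abandon(mut self) {
        // Dropping a JoinHandle detaches the thread.
        self.handle.take();
    }
}

impl Drop for MustJoin {
    fn drop(&mut self) {
        // Run-time stand-in for the compile-time check in the proposal.
        if self.handle.is_some() && !thread::panicking() {
            panic!("MustJoin went out of scope: call join() or abandon()");
        }
    }
}

fn answer() -> i32 {
    42
}

fn main() {
    let worker = MustJoin::spawn(answer);
    assert_eq!(worker.join(), 42);

    MustJoin::spawn(answer).abandon();
}
```

Both `join` and `abandon` take `self` by value, so the wrapper cannot be used afterwards. The destructor check only fires when neither was called.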
Any other details?
I previously implemented this feature in Vale, and I've implemented a proof-of-concept to verify this is possible in the Mojo compiler (proposal's steps 1-7), but some open questions still remain:
@explicit_destroy, destroy, ImplicitlyDestructible, etc.
None of this is final, so I'd love to get everyone's feedback!