Update C++ interop goal
tmandry authored Dec 17, 2024
1 parent 72dde22 commit 916ede6
Showing 1 changed file with 20 additions and 17 deletions.
@@ -1,4 +1,4 @@
# Enable incremental Rust adoption in C++ codebases
# Evaluate approaches for seamless use of C++ APIs from Rust

| Metadata | |
|----------|-----------------------|
@@ -8,7 +8,7 @@

## Summary

Seriously consider what it will take to enable incremental Rust adoption in large C++ codebases that want to introduce memory safety, and map out the space of long-term solutions. These solutions should enable interop between Rust and other languages in the future.
Seriously consider what it will take to enable Rust adoption in projects that must make use of large, rich C++ APIs. Map out the space of long-term solutions we are interested in. These solutions should enable interop between Rust and other languages in the future.

## Motivation

@@ -25,7 +25,7 @@ We should aim to spread the benefits of Rust and its underlying ideas as far as

Memory safety vulnerabilities are the most costly kinds of vulnerabilities, both for product owners and their users. These vulnerabilities and their costs have persisted despite the deployment of many mitigation measures in memory unsafe languages which often impose costs of their own.[^ag][^rust-in-android]

Experience has shown that regardless of the size of an existing codebase, incremental adoption of a memory safe language like Rust brings roughly linear benefits in terms of new memory safety vulnerabilities. This is because most vulnerabilities come from new code, not old code.[^android]
Experience has shown that, regardless of the size of an existing codebase, incrementally adopting a memory safe language like Rust in new code brings a roughly linear reduction in new memory safety vulnerabilities. **This is because most vulnerabilities come from new code, not old code.**[^android] Rust therefore has value even when it is adopted only in new code.

Given the growing recognition of this problem from within various technical communities, major technology companies, and major governmental bodies, there is increasing pressure to adopt memory safe languages across the board for all new code. As this proposal explains, this presents both a significant opportunity and a significant challenge for Rust.

@@ -39,7 +39,7 @@ Roughly speaking, there are three axes to adoption of memory safety: Social, Tec

For example, safety has become more socially desirable in many technical communities over the years, which has led to the development of mitigation measures and the adoption of languages like Rust. This has come partly as a result of the recognition of the economic costs of memory safety vulnerabilities.

For C/C++ this has led to an improvement along the technical front in terms of automated checking, in both static and dynamic tooling. However, this protracted effort has also revealed the limits of such an approach without language changes. While there have been growing calls for C++ to adopt memory safety features,[^safe-cpp] they have not gained traction within the body that controls the C++ standard for a combination of structural, technical, social, and economic reasons.[^corentin-profiles] The cost to adopting such features in C++ is too high for a combination of technical, social, and economic reasons.
For C/C++ this has led to an improvement along the technical front in terms of automated checking, in both static and dynamic tooling. However, this protracted effort has also revealed the limits of such an approach without language changes. While there have been calls for C++ to adopt memory safety features,[^safe-cpp] they have not gained traction within the C++ standards body for a combination of technical, social, and economic reasons.[^corentin-profiles]

[^safe-cpp]: https://safecpp.org/draft.html
[^corentin-profiles]: https://cor3ntin.github.io/posts/profiles
@@ -50,29 +50,31 @@ For C/C++ this has led to an improvement along the technical front in terms of a
[^oncd]: https://downloads.regulations.gov/ONCD-2023-0002-0020/attachment_1.pdf

Rust itself is a major technical breakthrough that enables safety from all kinds of undefined behavior, including spatial safety, temporal safety, and data race safety, with very high confidence.
Rust itself is a major technical breakthrough that enables safety from all kinds of undefined behavior, including spatial safety, temporal safety, and data race safety, with very high confidence. This makes it appealing for those looking to introduce safety to their codebase. Rust adoption is feasible in the following situations:

Rust adoption is feasible in the following situations:

##### Greenfield projects
##### Feasible: New codebases with Rust-only dependencies

This includes completely new projects as well as complete rewrites of existing projects, when such rewrites are socially and economically viable.

##### Interprocess boundaries
##### Feasible: Interprocess boundaries

Projects with a natural interprocess boundary are easily migrated incrementally to Rust. Microservice architectures with their RPC/HTTP boundaries are one example of this.
Projects with a natural interprocess boundary between components are more easily migrated to Rust. Because of the loose coupling enforced by the boundary, the project can be incrementally migrated one component at a time. Microservice architectures with their RPC/HTTP boundaries are one example of this.

##### Small, simple API surface
##### Feasible: Small, simple intraprocess API surface

Projects with a small, simple API surface that can be manually expressed in terms of the C ABI. This boundary, expressed and invoked in `unsafe` code, is prone to human error. It can be maintainable when the surface is small enough, but this also means that Rust adoption can *decrease* safety at the language boundary.
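
A minimal sketch of what such a hand-written boundary can look like. The function `sum_i32` is a hypothetical stand-in for a routine exported by a C or C++ library; it is defined in Rust here only so the example compiles on its own. The names and signature are assumptions for illustration, not part of any real API.

```rust
// Hypothetical stand-in for a function exported over the C ABI.
// A real project would declare it in an `extern "C"` block and link
// against the C or C++ library instead of defining it here.
unsafe extern "C" fn sum_i32(ptr: *const i32, len: usize) -> i32 {
    let mut total: i32 = 0;
    for i in 0..len {
        // The callee trusts the caller: `ptr` must be valid for `len` reads.
        total = total.wrapping_add(unsafe { *ptr.add(i) });
    }
    total
}

/// Safe wrapper around the raw call. The slice type guarantees that the
/// pointer and length passed across the boundary agree with each other.
pub fn sum(values: &[i32]) -> i32 {
    // SAFETY: `values.as_ptr()` is valid for `values.len()` reads of `i32`.
    unsafe { sum_i32(values.as_ptr(), values.len()) }
}
```

Every caller that bypasses `sum` and invokes `sum_i32` directly has to uphold the pointer and length contract by hand, and nothing checks that the Rust-side signature matches the foreign definition. That is the kind of human error the paragraph above refers to.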

##### Larger API surfaces with limited vocabulary
##### Feasible: Larger intraprocess API surface, but with limited vocabulary

Projects with a limited API vocabulary are able to use one of the existing interop tools like bindgen, cbindgen, or cxx.

The fact that all of these options exist and undergo active development is a testament to the value developers see in Rust adoption. However, they leave out a large portion of production code in use today: Projects that make rich use of a language such as C++ where comparatively limited interop support exists for Rust, and that link in enough code to make rewriting infeasible.
##### Infeasible: Everything else

The fact that all of these options exist and undergo active development is a testament to the value developers see in Rust adoption. However, they leave out a large portion of production use cases today: Projects that make rich use of an API in a language like C++ where comparatively limited interop support exists for Rust, and that link in enough code to make rewriting infeasible.

Furthermore, the limitations of current interop tooling are not simply a matter of adding features. Many of them stem from a mismatch in the expressiveness of the two languages along various axes. As one example, C++ and Java both support overloading while Rust does not. In some cases this mismatch is broadly accepted as a missing feature in Rust that will be added in time. In others, Rust's lack of expressiveness may be considered a feature in itself.

Furthermore, the limitations of current interop tools are not simply a matter of adding features. Many of them stem from a mismatch in the expressiveness of the two languages along various axes. As one example, C++ and Java both support overloading, while Rust does not. In some cases this mismatch is broadly accepted as a missing feature in Rust that will be added in time. In others, Rust's lack of expressiveness may be considered a feature in itself. These mismatches point to the limitations of such approaches.
These mismatches point to the limitations of such approaches. If we attempt to solve them one at a time, we may never reach the "shiny future" we are working towards.
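
To make the overloading example concrete, here is a hedged sketch of two ways a binding might surface an overloaded C++ function in Rust today. The C++ function `describe` and all Rust names are hypothetical, and the Rust bodies stand in for calls into foreign code.

```rust
// Hypothetical C++ API with two overloads:
//     std::string describe(int value);
//     std::string describe(double value);

// Option 1: one distinctly named Rust function per overload.
pub fn describe_int(value: i32) -> String {
    format!("int:{value}")
}

pub fn describe_double(value: f64) -> String {
    format!("double:{value}")
}

// Option 2: a trait selects the overload from the argument type.
pub trait Describe {
    fn describe(self) -> String;
}

impl Describe for i32 {
    fn describe(self) -> String {
        describe_int(self)
    }
}

impl Describe for f64 {
    fn describe(self) -> String {
        describe_double(self)
    }
}

/// Generic entry point that reads like the overloaded C++ call.
pub fn describe<T: Describe>(value: T) -> String {
    value.describe()
}
```

Neither option is a direct equivalent: the first changes the API's names, and the second needs a trait per overload set and does not cover overloads that differ in arity. Each workaround handles one mismatch in isolation.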

### The next 6 months

@@ -82,7 +84,7 @@ We do not propose any specific deliverables over the next six months. We only pr

It is essential that our industry adopts memory safety broadly. To realize this, Rust should be feasible to adopt in any application, particularly those which prioritize performance and reliability in addition to safety.

This includes making Rust feasible to incrementally adopt in applications that make rich use of memory unsafe languages like C++. To the extent possible incremental Rust adoption should only *increase* safety, never *decrease* it.
This includes making Rust feasible to adopt in both new and existing applications that make rich use of APIs in memory unsafe languages like C++. To the extent possible, incremental Rust adoption should only *increase* safety, never *decrease* it.

Given that this is a highly ambitious, multi-year project, we should begin with presenting the problem space as accurately as possible to the Rust language team as a way to receive guidance and build alignment on overall direction.

@@ -98,17 +100,18 @@ This goal adheres to the general design axioms in the interop initiative's [prob

In addition, it proposes the following axioms:

* Seek out solutions that, in principle, make 100% coverage possible. This means 100% of functions and methods defined in one language are callable in the other language. For some functions, this may require compromising on ergonomics or safety.
* Seek solutions that make 100% coverage possible. This means 100% of functions and methods defined in one language are callable in the other language. This may require some APIs to be unergonomic and/or unsafe to call.
* Minimize the potential for human error. Interop should leverage trusted, automated tooling wherever possible.
* Extend contracts between languages where possible. For example, a strongly typed interface in one language should be equally strongly typed in the other language, subject to the constraints imposed by that language.
* Introduce zero overhead when calling between languages.
* Prefer solutions that are general enough to apply to languages beyond C++.
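
As one illustration of how the "extend contracts" and "zero overhead" axioms can coexist, the sketch below wraps a raw integer handle in distinct `#[repr(transparent)]` newtypes. The handle types and the function `window_width` are hypothetical; the function is defined in Rust only so the example is self-contained.

```rust
use std::mem::{align_of, size_of};

// Hypothetical handle types. A C-style API might pass both as plain `u32`,
// which makes it easy to hand a buffer handle to a window function.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct WindowId(pub u32);

#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BufferId(pub u32);

// Hypothetical foreign function, defined here as a stand-in.
// It accepts only `WindowId`; passing a `BufferId` is a compile error.
pub extern "C" fn window_width(id: WindowId) -> u32 {
    100u32.wrapping_add(id.0)
}

/// Checks that the wrapper has the same size and alignment as `u32`.
pub fn layout_matches_u32() -> bool {
    size_of::<WindowId>() == size_of::<u32>()
        && align_of::<WindowId>() == align_of::<u32>()
}
```

Because `#[repr(transparent)]` gives the wrapper the same layout and ABI as its single field, the stronger typing adds no runtime cost at the call boundary.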

## Ownership and team asks

**Owner:** @baumanj and @tmandry

| Task | Owner(s) or team(s) | Notes |
|------------------------------|-----------------------------|--------------------------------------------------------------------------|
| ---------------------------- | --------------------------- | ------------------------------------------------------------------------ |
| Discussion and moral support | ![Team][] [lang] [compiler] | |
| Design meeting | ![Team][] [lang] [compiler] | 2 meetings expected; ideal audience is lang team with compiler team reps |
| Author design doc | @tmandry can drive | |