Merge pull request #32 from extendr/strings_chapter

Various fixes
extendr · Jun 30, 2024 · 25a9024
2 parents 2fa3993 + 083ca77
commit 25a9024

Showing 8 changed files with 18 additions and 19 deletions.
diff --git a/_freeze/changelog/execute-results/html.json b/_freeze/changelog/execute-results/html.json
diff --git a/_freeze/user-guide/complete-example/execute-results/html.json b/_freeze/user-guide/complete-example/execute-results/html.json
diff --git a/_freeze/user-guide/type-mapping/extendr-macro/execute-results/html.json b/_freeze/user-guide/type-mapping/extendr-macro/execute-results/html.json
diff --git a/_freeze/user-guide/type-mapping/scalars/execute-results/html.json b/_freeze/user-guide/type-mapping/scalars/execute-results/html.json
@@ -1,8 +1,8 @@
 {
-  "hash": "af8f7e730143ad4599e3a14572806e65",
+  "hash": "7d123922be79f518188a370054bcf10c",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: \"Scalar Type Mapping\"\n---\n\n::: {.cell}\n\n:::\n\n\nThis tutorial demonstrates some of the basics of passing scalar data types back\nand forth between Rust and R. We'll start with simple examples using explicit\nRust types but then move on to showing their extendr alternatives. Why does\nextendr have its own data types? For a number of reasons, of course, but the\nmost important reason is probably that Rust types do not allow for missing\nvalues, so no `NA`, `NaN`, `NULL`, or what have you. Fortunately, extendr types\nwill handle missing values for you. For this reason, **it is strongly\nrecommended that you work with the extendr types whenever possible.**\n\n## Scalar types\n\nA scalar type consists of a single value, and it can *only* consist of a single\nvalue, whether that value is a single character string, integer, or logical. As\nit happens, R doesn't have a way of representing a scalar value. That's because\neverything is a vector in R, and vectors can have any arbitrary length you want.\nSo, the closest thing to a scalar you will ever encounter in R is a vector that\njust so happens to have a length of one. In Rust, however, scalars are the\nbuilding blocks of everything, and they come in a bewildering variety, at least\nfor the traditional R user. Consider, for example, integers. R has just one way \nto represent this type of numeric value. Rust, on the other hand, has twelve!\n\nThe table below shows the most common R \"scalar\" types, along with their Rust\nand extendr equivalents. 
\n\n| R type         | extendr type | Rust type      |\n|----------------|--------------|----------------|\n| `integer(1)`   | `Rint`       | `i32`          |\n| `double(1)`    | `Rfloat`     | `f64`          |\n| `logical(1)`   | `Rbool`      | `bool`         |\n| `complex(1)`   | `Rcplx`      | `Complex<f64>` |\n| `character(1)` | `Rstr`       | `String`       |\n\nTo learn more about Rust types, see [section 3.2 of The\nBook](https://doc.rust-lang.org/book/ch03-02-data-types.html).\n\n## Sharing scalars\n\nTo see how scalars get passed back and forth between Rust and R, we'll first\nexplore Rust's `f64` value which is a 64-bit float. This is equivalent to R's\n`double(1)`. We'll write a very simple Rust function that prints the value of\nthe input and does not return anything.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_double(x: f64) { \n    rprintln!(\"The value of x is {x}\"); \n}\n```\n:::\n\n\nThrough the magic of extendr, we can now call this function in R and pass it a \nsingle double value.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_double(4.2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nThe value of x is 4.2\n```\n\n\n:::\n:::\n\n\nThere are several things to note about this example. First, in Rust, `x: f64` \ntells us that the type of `x` being passed to the function (`fn`) is a single \ndouble vector or \"float\" value. Second, `rprintln!(\"{}\", x);` is an extendr \nmacro that makes it easier to print information from Rust to the console in R. \nR users will perhaps notice that the syntax is vaguely `{glue}`-like in that the \nvalue of `x` is inserted into the curly brackets. 
Finally, if you are not \nworking inside of an extendr R package, you can create the `scalar_double()` \nfunction locally using `rextendr::rust_function()`.\n\n``` r\nrextendr::rust_function(\"\nfn scalar_double(x: f64) { \n    rprintln!(\"The value of x is {x}\"); \n}\n\")\n```\n\nNow, what if, rather than printing the value of `x` to the R console, we wanted\ninstead to return that value to R? To do that, we just need to let Rust know\nwhat type is being returned by our function. This is done with the `-> type`\nnotation. The extendr crate understands this notation and knows how to handle \nthe scalar `f64` type returned by the Rust function and pass it to R as double.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn return_scalar_double(x: f64) -> f64 { \n    x \n}\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- return_scalar_double(4.2)\n\ntypeof(x)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"double\"\n```\n\n\n:::\n\n```{.r .cell-code}\nx + 1.0\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 5.2\n```\n\n\n:::\n:::\n\n\n## Missing values\n\nAs noted above, Rust does not allow a scalar type to have a missing value, so \nyou cannot simply pass a missing value like `NA` to Rust and expect it to just \nwork. Here is a demonstration of this issue using a simple function which adds \n1.0 to `x`.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn plus_one(x: f64) -> f64 { \n    x + 1.0 \n}\n```\n:::\n\n\nYou will notice that this function expects `x` to be `f64`, not a missing value.\nPassing a missing value from R to this Rust function will, therefore, result in \nan error.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nplus_one(NA_real_)\n```\n\n::: {.cell-output .cell-output-error}\n\n```\nError in plus_one(NA_real_): Must not be NA.\n```\n\n\n:::\n:::\n\n\nFortunately, the extendr types are `NA`-aware, so you can, for instance, use \nextendr's `Rfloat` in place of `f64` to handle missing values without error. 
\nBelow, you will see that we have done this for the function `plus_one()`. \n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn plus_one(x: Rfloat) -> Rfloat { \n    x + 1.0 \n}\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nplus_one(NA_real_)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] NA\n```\n\n\n:::\n\n```{.r .cell-code}\nplus_one(4.2)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 5.2\n```\n\n\n:::\n:::\n\n\n## Additional examples\n\nHere are additional examples showing how to pass scalars to Rust and return them \nto R using Rust scalar types.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_integer(x: i32) -> i32 { x }\n\n#[extendr]\nfn scalar_logical(x: bool) -> bool { x }\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_integer(4L)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 4\n```\n\n\n:::\n\n```{.r .cell-code}\nscalar_logical(TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] TRUE\n```\n\n\n:::\n:::\n\n\nAnd here are the same examples with extendr scalar types.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_integer(x: Rint) -> Rint { x }\n\n#[extendr]\nfn scalar_logical(x: Rbool) -> Rbool { x }\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_integer(4L)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 4\n```\n\n\n:::\n\n```{.r .cell-code}\nscalar_logical(TRUE)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] TRUE\n```\n\n\n:::\n:::\n\n\nDid you notice that we didn't give an example with character strings? Yeah, well,\nthere's a good reason for that. You can find out what that is by heading over to\nthe tutorial on [Handling Strings](user-guide/type-mapping/characters.qmd).",
+    "markdown": "---\ntitle: \"Scalar Type Mapping\"\n---\n\n::: {.cell}\n\n:::\n\n\nThis tutorial demonstrates some of the basics of passing scalar data types back\nand forth between Rust and R. We'll start with simple examples using explicit\nRust types but then move on to showing their extendr alternatives. Why does\nextendr have its own data types? For a number of reasons, of course, but the\nmost important reason is probably that Rust types do not allow for missing\nvalues, so no `NA`, `NaN`, `NULL`, or what have you. Fortunately, extendr types\nwill handle missing values for you. For this reason, **it is strongly\nrecommended that you work with the extendr types whenever possible.**\n\n## Scalar types\n\nA scalar type consists of a single value, and it can *only* consist of a single\nvalue, whether that value is a single character string, integer, or logical. As\nit happens, R doesn't have a way of representing a scalar value. That's because\neverything is a vector in R, and vectors can have any arbitrary length you want.\nSo, the closest thing to a scalar you will ever encounter in R is a vector that\njust so happens to have a length of one. In Rust, however, scalars are the\nbuilding blocks of everything, and they come in a bewildering variety, at least\nfor the traditional R user. Consider, for example, integers. R has just one way \nto represent this type of numeric value. Rust, on the other hand, has twelve!\n\nThe table below shows the most common R \"scalar\" types, along with their Rust\nand extendr equivalents. 
\n\n| R type         | extendr type | Rust type      |\n|----------------|--------------|----------------|\n| `integer(1)`   | `Rint`       | `i32`          |\n| `double(1)`    | `Rfloat`     | `f64`          |\n| `logical(1)`   | `Rbool`      | `bool`         |\n| `complex(1)`   | `Rcplx`      | `Complex<f64>` |\n| `character(1)` | `Rstr`       | `String`       |\n\nTo learn more about Rust types, see [section 3.2 of The\nBook](https://doc.rust-lang.org/book/ch03-02-data-types.html).\n\n## Sharing scalars\n\nTo see how scalars get passed back and forth between Rust and R, we'll first\nexplore Rust's `f64` value which is a 64-bit float. This is equivalent to R's\n`double(1)`. We'll write a very simple Rust function that prints the value of\nthe input and does not return anything.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_double(x: f64) { \n    rprintln!(\"The value of x is {x}\"); \n}\n```\n:::\n\n\nThrough the magic of extendr, we can now call this function in R and pass it a \nsingle double value.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_double(4.2)\n#> The value of x is 4.2\n```\n:::\n\n\nThere are several things to note about this example. First, in Rust, `x: f64` \ntells us that the type of `x` being passed to the function (`fn`) is a single \ndouble vector or \"float\" value. Second, `rprintln!(\"{}\", x);` is an extendr \nmacro that makes it easier to print information from Rust to the console in R. \nR users will perhaps notice that the syntax is vaguely `{glue}`-like in that the \nvalue of `x` is inserted into the curly brackets. Finally, if you are not \nworking inside of an extendr R package, you can create the `scalar_double()` \nfunction locally using `rextendr::rust_function()`.\n\n``` r\nrextendr::rust_function(\"\nfn scalar_double(x: f64) { \n    rprintln!(\"The value of x is {x}\"); \n}\n\")\n```\n\nNow, what if, rather than printing the value of `x` to the R console, we wanted\ninstead to return that value to R? 
To do that, we just need to let Rust know\nwhat type is being returned by our function. This is done with the `-> type`\nnotation. The extendr crate understands this notation and knows how to handle \nthe scalar `f64` type returned by the Rust function and pass it to R as double.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn return_scalar_double(x: f64) -> f64 { \n    x \n}\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nx <- return_scalar_double(4.2)\n\ntypeof(x)\n#> [1] \"double\"\n\nx + 1.0\n#> [1] 5.2\n```\n:::\n\n\n## Missing values\n\nAs noted above, Rust does not allow a scalar type to have a missing value, so \nyou cannot simply pass a missing value like `NA` to Rust and expect it to just \nwork. Here is a demonstration of this issue using a simple function which adds \n1.0 to `x`.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn plus_one(x: f64) -> f64 { \n    x + 1.0 \n}\n```\n:::\n\n\nYou will notice that this function expects `x` to be `f64`, not a missing value.\nPassing a missing value from R to this Rust function will, therefore, result in \nan error.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nplus_one(NA_real_)\n#> Error in plus_one(NA_real_): Must not be NA.\n```\n:::\n\n\nFortunately, the extendr types are `NA`-aware, so you can, for instance, use \nextendr's `Rfloat` in place of `f64` to handle missing values without error. \nBelow, you will see that we have done this for the function `plus_one()`. 
\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn plus_one(x: Rfloat) -> Rfloat { \n    x + 1.0 \n}\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nplus_one(NA_real_)\n#> [1] NA\n\nplus_one(4.2)\n#> [1] 5.2\n```\n:::\n\n\n## Additional examples\n\nHere are some additional examples showing how to pass scalars to Rust and return them \nto R using Rust scalar types.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_integer(x: i32) -> i32 { x }\n\n#[extendr]\nfn scalar_logical(x: bool) -> bool { x }\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_integer(4L)\n#> [1] 4\n\nscalar_logical(TRUE)\n#> [1] TRUE\n```\n:::\n\n\nAnd here are the same examples with extendr scalar types.\n\n\n::: {.cell}\n\n```{.rust .cell-code}\n#[extendr]\nfn scalar_integer(x: Rint) -> Rint { x }\n\n#[extendr]\nfn scalar_logical(x: Rbool) -> Rbool { x }\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nscalar_integer(4L)\n#> [1] 4\n\nscalar_logical(TRUE)\n#> [1] TRUE\n```\n:::\n\n\nDid you notice that we didn't give an example with character strings? Yeah, well,\nthere's a good reason for that. You can find out what that is by heading over to\nthe tutorial on [Handling Strings](user-guide/type-mapping/characters.qmd).",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"