Replace buggy number conversions with tryfrom #736
Overall good work, but there are two main issues here:
- Some conversions are inherently safe and should not be wrapped into `TryFrom`.
- Other conversions repeat throughout the codebase. This increases cognitive load; you should create either simple helpers or trivial traits that can then be used. Basically, abstract away `TInto::try_into::<TFrom>(&TFrom).unwrap()` or something like this.
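As a rough illustration of the second point, the repeated `try_into().unwrap()` pattern could be hidden behind a small extension trait. The trait name `CheckedCast` and its method `cast` are hypothetical and do not exist in extendr; this is only a sketch of the idea.

```rust
use std::convert::TryFrom;
use std::fmt::Debug;

/// Hypothetical helper trait (not part of extendr): a checked numeric cast
/// that panics with a descriptive message instead of silently truncating.
pub trait CheckedCast: Sized + Copy + Debug {
    fn cast<T>(self) -> T
    where
        T: TryFrom<Self>,
    {
        T::try_from(self).unwrap_or_else(|_| {
            panic!(
                "value {:?} does not fit in {}",
                self,
                std::any::type_name::<T>()
            )
        })
    }
}

// Opt in only the source types that actually occur at call sites.
impl CheckedCast for i32 {}
impl CheckedCast for isize {}
impl CheckedCast for usize {}
```

A call site would then read `let n: usize = len.cast();`, which keeps the check while removing the visual noise.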
@@ -64,9 +64,10 @@ fn str_from_strsxp<'a>(sexp: SEXP, index: isize) -> &'a str {
let charsxp = STRING_ELT(sexp, index);
if charsxp == R_NaString {
<&str>::na()
} else if TYPEOF(charsxp) == CHARSXP as i32 {
let ptr = R_CHAR(charsxp) as *const u8;
let slice = std::slice::from_raw_parts(ptr, Rf_xlength(charsxp) as usize);
This one is interesting. `R_xlen_t` is `isize`, but we need `usize` here. If we trust R that lengths are never negative, then all valid (non-negative) `isize` values fit in `usize`. But OK, let's use the safe conversion here.
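To make the trade-off concrete, here is a minimal sketch of the checked conversion. `R_xlen_t` is declared locally as a stand-in for the libR-sys alias (assumed to be `isize`, as stated above), and `xlen_to_usize` is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Local stand-in for libR-sys's `R_xlen_t` (assumed to be `isize`).
#[allow(non_camel_case_types)]
type R_xlen_t = isize;

/// Hypothetical helper: convert an R length to `usize`.
/// Panics if R ever reports a negative length, instead of wrapping around.
fn xlen_to_usize(len: R_xlen_t) -> usize {
    usize::try_from(len).expect("R reported a negative length")
}
```

With a plain cast, `(-1_isize) as usize` wraps to `usize::MAX`, which would then be handed to `from_raw_parts` as the slice length; the checked version turns that into an immediate panic.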
@@ -87,14 +88,16 @@ impl Iterator for StrIter {
let i = self.i;
self.i += 1;
let vector = self.vector.get();
let vector_u32: u32 = TYPEOF(vector).try_into().unwrap();
Same logic here: `TYPEOF` returns a `c_int` from a known list. I'd leave it as `TYPEOF(vector) as u32`. But! Good catch on deduplicating the code; let's just opt for a better name: `vector_u32` -> `typeof_vector`.
} else if TYPEOF(vector) as u32 == NILSXP {
} else if vector_u32 == STRSXP {
Some(str_from_strsxp(vector, isize::try_from(i).unwrap()))
} else if vector_u32 == INTSXP && u32::try_from(TYPEOF(self.levels)).unwrap() == STRSXP
Same here. I suggest we introduce a helper function that promotes `TYPEOF()` from `c_int` to `u32`, matching the `*SXP` type.
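Such a helper might look roughly like the sketch below. To keep it self-contained, it takes the raw `c_int` that `TYPEOF()` would return rather than a `SEXP`, and the `*SXP` constants are local stand-ins carrying the values from R's `Rinternals.h`. The names `sexptype_as_u32` and `is_string_vector` are hypothetical.

```rust
use std::os::raw::c_int;

// Local stand-ins for the libR-sys constants (values from R's Rinternals.h).
#[allow(dead_code)]
const NILSXP: u32 = 0;
#[allow(dead_code)]
const INTSXP: u32 = 13;
const STRSXP: u32 = 16;

/// Hypothetical helper: promote the `c_int` returned by `TYPEOF()` to `u32`
/// so it can be compared directly against the `*SXP` constants.
/// `TYPEOF()` only yields values from a small non-negative list, so a plain
/// cast is used; the debug assertion documents that assumption.
fn sexptype_as_u32(raw: c_int) -> u32 {
    debug_assert!(raw >= 0, "TYPEOF returned a negative value: {}", raw);
    raw as u32
}

/// Example call site: the comparison no longer needs an inline conversion.
fn is_string_vector(raw: c_int) -> bool {
    sexptype_as_u32(raw) == STRSXP
}
```

In the real codebase the helper would presumably wrap the `TYPEOF(sexp)` call itself, so call sites never see the `c_int` at all.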
Some(str_from_strsxp(vector, isize::try_from(i).unwrap()))
} else if vector_u32 == INTSXP && u32::try_from(TYPEOF(self.levels)).unwrap() == STRSXP
{
let j: isize = (*(INTEGER(vector).add(i))).try_into().unwrap();
Can't find a reference to the `INTEGER` definition in `libR-sys`, but all integers in R are `i32`, so `i32` -> `isize` is always safe.
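One caveat worth spelling out: the standard library does not implement `From<i32> for isize`, because `isize` may be narrower than 32 bits on some targets, so the "always safe" claim holds only on 32-bit and 64-bit platforms. A sketch of how that assumption could be checked at compile time, with the hypothetical name `r_int_to_index`:

```rust
use std::mem::size_of;

// Compile-time check of the assumption that `isize` can hold every `i32`.
// This fails the build on a target with a pointer width below 32 bits.
const _: () = assert!(size_of::<isize>() >= size_of::<i32>());

/// Hypothetical helper: widen an R integer to `isize`.
/// Lossless given the compile-time check above.
fn r_int_to_index(value: i32) -> isize {
    value as isize
}
```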
}

/// Convert R's SEXPTYPE to extendr's Rtype.
pub fn sxp_to_rtype(sxptype: i32) -> Rtype {
use Rtype::*;
match sxptype as u32 {
match u32::try_from(sxptype).unwrap() {
This is also safe (but now I question why we have `i32` as input here).
let preservation = Rf_allocVector(VECSXP, INITIAL_PRESERVATION_SIZE as R_xlen_t);
let preservation = Rf_allocVector(
VECSXP,
R_xlen_t::try_from(INITIAL_PRESERVATION_SIZE).unwrap(),
This is also safe, since it is a compile-time constant.
@@ -157,7 +168,7 @@ impl Ownership {
unsafe fn garbage_collect(&mut self) {
// println!("garbage_collect {} {}", self.cur_index, self.max_index);
let new_size = self.cur_index * 2 + EXTRA_PRESERVATION_SIZE;
let new_sexp = Rf_allocVector(VECSXP, new_size as R_xlen_t);
let new_sexp = Rf_allocVector(VECSXP, R_xlen_t::try_from(new_size).unwrap());
So many calls to `R_xlen_t::try_from().unwrap()`. Introduce a helper method that does this.
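A minimal sketch of what such a helper could look like. `R_xlen_t` is again a local stand-in for the libR-sys alias (assumed to be `isize`), and the function name `to_xlen` is hypothetical.

```rust
use std::convert::TryFrom;

/// Local stand-in for libR-sys's `R_xlen_t` (assumed to be `isize`).
#[allow(non_camel_case_types)]
type R_xlen_t = isize;

/// Hypothetical helper: convert any integer length to `R_xlen_t`,
/// panicking with one consistent message if the value does not fit.
fn to_xlen<T>(n: T) -> R_xlen_t
where
    R_xlen_t: TryFrom<T>,
{
    R_xlen_t::try_from(n).unwrap_or_else(|_| panic!("length does not fit in R_xlen_t"))
}
```

The call in `garbage_collect` would then shrink to `Rf_allocVector(VECSXP, to_xlen(new_size))`.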
@@ -155,9 +155,12 @@ fn manifest(x: SEXP) -> SEXP {
single_threaded(|| unsafe {
Rf_protect(x);
let len = XLENGTH_EX(x);
let data2 = Rf_allocVector(TYPEOF(x) as u32, len as R_xlen_t);
let data2 = Rf_allocVector(
u32::try_from(TYPEOF(x)).unwrap(),
This is safe
Rf_protect(data2);
match TYPEOF(x) as u32 {
match u32::try_from(TYPEOF(x)).unwrap() {
And this, see comments above
Hello again! Unfortunately, I'm still not in favour of the changes proposed here. Similarly, the spot where you've got things like
@CGMossa I mostly agree with you. This PR is far from the ideal fix, and a lot of these things should be fixed by a refactor. But that is the point: a refactor requires knowing the program much better and also takes time, so if it can't be done in the short term, this PR would be good enough. As you say, there are already PRs that could solve this in a better way, which is great! I would have liked to know about them before spending my time on this particular case, so I could have worked on something else. Maybe we should just wait for the PRs you mention and then check this again, to see whether any unexpected truncation is left.
Dear @latot, again, thank you very much for your contribution and your attention to the development of extendr. I'd also like to encourage you to participate in reviewing PRs and replying to issues, as we have very few conversations that go in depth into all the aspects that need attention. While a community review doesn't grant approval on a PR, it shows the approval of the community, which in many cases is much, much more valuable. I'll make an effort to land a few more refactorings very soon.
Dear @latot, if I have overlooked a conversion or similar, or if there are more things you'd like to contribute to extendr, Godspeed!
This is the first part of replacing buggy number conversions. It covers most of the cases where we convert between numerical types and the conversion can fail.
There are still some cases like:
extendr-api/src/scalar/rint.rs:189:26
But they will not be included here.