-
Notifications
You must be signed in to change notification settings - Fork 284
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CA-400560: Support tilde in RPM version/release comparison #6092
CA-400560: Support tilde in RPM version/release comparison #6092
Conversation
changlei-li
commented
Oct 30, 2024
•
edited
Loading
edited
- Fix version segment division error.
- Support tilde in RPM version/release comparison.
ocaml/xapi/rpm.ml
Outdated
*) | ||
match (Astring.String.cut ~sep:"~" s1, Astring.String.cut ~sep:"~" s2) with | ||
| None, None -> | ||
compare_no_tilde_version_strings s1 s2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A general approach of parsing is preferred than the specific functions.
For example of 1.2.3~rc1~beta
"<" 1.2.3~rc1
, the segment
lists would be:
[Int 1; Int 2; Int 3; Tilde; Str "rc"; Int 1; Tilde; Str "beta"]
and
[Int 1; Int 2; Int 3; Tilde; Str "rc"; Int 1]
.
It would be easy to update the pattern matching in compare_segments
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accept
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I support the proposal to parse a string into tagged components and to define order over these lists. The reason is that order defined over numbers and strings is different; so it matters if a sequence of characters is treated as a number or a string; this extends to special cases like tilde.
:ring3 $ echo "1 10 2" | tr ' ' \\012 | sort -n
1
2
10
:ring3 $ echo "1 10 2" | tr ' ' \\012 | sort
1
10
2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the comment, I split the version string to segment of Int, Str and Tilde and fix some errors. I split the commits to fix segment division and support tilde. Because there is a big difference from the initial version, I push -f directly for not affecting the review experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if the tilde could be considered just a character in a Str
and you would get the desired order.
:ring3 $ echo "ab abc ~abc" | tr ' ' \\012 | sort
ab
~abc
abc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lindig, in your example, bash command sort
thinks ~abc
< abc
because the locale settings. For example in the en_US.UTF-8 locale, the tilde character is often treated as less than alphanumeric characters.
In man sort
*** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values.
And in OCaml, String.compare uses lexicographical order.
utop # List.sort String.compare ["ab"; "abc"; "~abc"];;
- : string list = ["ab"; "abc"; "~abc"]
Tilde is more complicated than just a character in a Str
. It may be along with letters or numbers, and even used separately. You can find more examples in the ut.
Generally, tilde (and the part after it) in the version string stands for pre-release:
- Compare the formal version part first then compare the pre-release version part.
- Version with tilde is earlier than the same version without tilde.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It the end, a total order needs to be defined between two arbitrary versions. Hence, we can't rely on "same versions without a tilde" because this is just a special case. So I would like to see this total order being defined over lists of Str
, Int
and Tilde
or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course. It's just a general statement to help comprehension. The function compare_segments
in the code define how to compare segment list.
Function compare_version_segment
define how to compare single segment.
type order = LT | EQ | GT
type version_segment = Int of int | Str of string | Tilde
let order_of_int = function 0 -> EQ | r when r > 0 -> GT | _ -> LT
let rec compare_segments l1 l2 =
match (l1, l2) with
| c1 :: t1, c2 :: t2 -> (
match compare_version_segment c1 c2 with
| EQ ->
compare_segments t1 t2
| r ->
r
)
| Tilde :: _, [] ->
LT
| [], Tilde :: _ ->
GT
| _ :: _, [] ->
GT
| [], _ :: _ ->
LT
| [], [] ->
EQ
let compare_version_segment s1 s2 =
match (s1, s2) with
| Int i1, Int i2 ->
Int.compare i1 i2 |> order_of_int
| Str s1, Str s2 ->
String.compare s1 s2 |> order_of_int
| Tilde, Tilde ->
EQ
| Int _, Str _ ->
GT
| Str _, Int _ ->
LT
| Tilde, _ ->
LT
| _, Tilde ->
GT
I think the code is clear to specify the order over segment lists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code is clear; this needs to be tested against the examples to show it has the desired order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The description is just like code :)
(1) Compare two segments:
For different kinds of segments:
Int > Str > Tilde
If they are the same kind, just compare the value (string or int)
(2) Compare two segment lists:
Compare segments in the two lists in order as (1).
If the order is determined (Not Eq) -> end.
If they are equal, compare the next segments from the lists recursively.
If one list is empty, then Str/Int > Empty > Tilde -> end
d5bd6ed
to
def777e
Compare
For example "1.2.3a" will be divided to [Int 1; Int 2; Str 3a] and "1.2.3" is divided to [Int 1; Int 2; Int 3]. It leads to "1.2.3" > "1.2.3a" which is incorrect. After fix, "1.2.3a" will be divided to [Int 1; Int 2; Int 3; Str a], then we can get the right compare result. Signed-off-by: Changlei Li <[email protected]>
def777e
to
ec44b29
Compare
GT | ||
|
||
let split_version_string = | ||
let r = Re.Posix.compile_pat {|[a-zA-Z]+|[0-9]+|~|} in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to detect when a version string contains illegal characters? For example, the above regex does not allow a underscore to be part of the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All characters expect for [alphabetic]+, [digit]+ and ~
are acted as delimiter and omitted in the results:
1.2#3_45abc -> ["1"; "2"; "3"; "45"; "abc"]
1.2.3α45 -> ["1"; "2"; "3"; "45"]
12汉字45 -> ["12"; "45"]
| "~" -> | ||
Tilde | ||
| s -> ( | ||
try Int (int_of_string s) with _ -> Str s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be aware that int_of_string "0xabc"
does not throw an exception as you might expect.,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. No need to test the string is all digits when convert string to Int as the original implementation, because the split method has promised the string is all digits or all alphabetic. There is no situation like "0b11" may mislead int_of_string
. So just use int_of_string
exception to distinguish Str and Int.
Due to the function has precondition that the input is from the split result, it is not a complete function to convert string
to version_segment
. So I just put it in the normalize
function scope.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative would be to use String.sscanf "%d"
. But you are correct that the problem does not arise because the split makes sure this is only digtits or only letters.
|
Tilde `~` used in RPM version stands for pre-release. So version with `~` is earlier than the same version without `~`. For example: 1.2.3~beta < 1.2.3 1.xs8 > 1.xs8~2_1 Signed-off-by: Changlei Li <[email protected]>
b138c57
to
a6187d5
Compare