Mismatch in output of onnx exported CharTokenizer model #477
There are 3 main independent issues that explain this behavior, and each would require an independent solution to make the CharTokenizer onnx export produce the same outputs from ML.NET, OnnxRunner, and ORT.

Offset between ML.NET columns and OnnxRunner/ORT columns

Because the outputs are Keys, this is affected by the same issue found in #428, i.e. that the output for ML.NET's Key columns in NimbusML is 0-based, whereas the output of the OnnxRunner and ORT is 1-based. It seems that this offset is caused because NimbusML automatically subtracts "1" from the Key values. Fixing #428 should fix this issue here. As discussed offline, it seems that solving that issue would require major changes in how NimbusML works with Keys.

NaN vs. 65535 (float vs. uint16)

The ORT output contains 65535's where the other outputs contain NaN. But why do we get NaN in the other outputs? This has to do with PR #267, "Add variable length vector support". What is relevant from that PR here is that NimbusML will take a variable-length uint16 vector column and cast it to float, also adding NaNs where values are "missing". The exact mechanism of how this PR works isn't clear to me, but I have been playing around by modifying what was introduced in that PR (particularly in files …). Since the output of ORT doesn't involve NimbusML, it doesn't make the described casts, nor does it fill missing values with NaNs; that's why it is uint16 and has the 65535's. On the other hand, since ML.NET (without NimbusML) has no problem working with columns of variable-size vectors, this issue isn't reproducible using only ML.NET (without NimbusML). Fixing this issue might require changing or reverting PR #267 (which might bring its own set of problems, given that the behavior of that PR was introduced for a reason), or modifying ML.NET's ….

float64 vs. float32

The output from ML.NET is float64 whereas the output of OnnxRunner is float32. Again, this is related to PR #267. Without using NimbusML, the output of the …:

NimbusML/src/NativeBridge/PythonInterop.cpp, lines 47 to 64 (commit 1b7c399)
Where it says that …. There doesn't seem to be any clear indication to treat …:

NimbusML/src/DotNetBridge/NativeDataInterop.cs, lines 141 to 151 (commit 1b7c399)
The exact mechanisms that explain all of the above castings would need further investigation. But perhaps this type mismatch between float64 and float32 isn't a blocking issue, and it wouldn't need to be fixed.
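Taken together, the three mismatches can be reproduced mechanically with plain numpy. This is a hedged sketch: the raw values, the use of 65535 (uint16 max) as a padding sentinel, and the exact cast order are assumptions inferred from the analysis above, not taken from NimbusML source.

```python
import numpy as np

# Hypothetical raw OnnxRunner/ORT output: 1-based key ids as uint16,
# with 65535 (uint16 max) padding the variable-length vector (assumption).
ort_out = np.array([9, 72, 102, 65535], dtype=np.uint16)

# Issue 1: key offset -- NimbusML reports 0-based keys, so subtract 1.
keys = ort_out.astype(np.int64) - 1

# Issue 2: cast to float and turn the padding sentinel into NaN,
# mimicking the variable-length-vector handling introduced by PR #267.
as_float = keys.astype(np.float32)
as_float[ort_out == np.iinfo(np.uint16).max] = np.nan

# Issue 3: widen to float64, the dtype observed in the ML.NET output.
mlnet_like = as_float.astype(np.float64)

print(mlnet_like.dtype, mlnet_like)
```

Applying these three transformations to the ORT output would make it comparable to what NimbusML reports for the ML.NET pipeline, which is consistent with the value and dtype differences seen in the repro below.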
The onnx export test for CharTokenizer is failing in the current tests, so it has been disabled (link). The outputs coming from ML.NET, OnnxRunner, and ORT in that test are different.

Here is a repro script, and its output. Notice the difference in both values and dtypes between the different outputs.
NOTE: The DataFrameTool is the one found here in the repository.
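Until the issues above are fixed, a repro or test could compare the outputs after normalizing the ORT side. The helper below is hypothetical (not NimbusML or ML.NET API); the 0-based/float64/NaN convention it targets is taken from the analysis in this issue, and the 65535 sentinel semantics are an assumption.

```python
import numpy as np

def normalize_ort_keys(raw):
    """Map raw ORT uint16 key output onto the convention NimbusML uses:
    0-based keys, float64 dtype, and NaN in place of the 65535 padding
    sentinel (sentinel semantics are an assumption from this issue)."""
    arr = np.asarray(raw, dtype=np.uint16)
    out = arr.astype(np.float64) - 1.0
    out[arr == np.iinfo(np.uint16).max] = np.nan
    return out

# Example: a raw ORT key vector with two padded slots.
print(normalize_ort_keys([9, 72, 65535, 65535]))
```

A comparison in the repro could then use `np.testing.assert_allclose(..., equal_nan=True)` between the normalized ORT output and the ML.NET output, isolating any remaining differences from the three known ones.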
Repro
Output