Random 0xfeff / ZERO WIDTH NO-BREAK SPACE being added to returned string values #1623
Comments
You haven’t provided any error messages or examples of the text containing a ZWNBSP, or said where in the string it appears. However, if it is at the start of your string, it is probably a byte order mark (BOM): a string that starts with 0xfeff. Since the byte-swapped value 0xfffe is defined as a noncharacter in Unicode, a reader can tell UTF-16LE from UTF-16BE by which of the two values it sees first. This is especially likely if the data is being pulled from lines of a Windows text file, like
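To make the byte-order-mark idea concrete, here is a minimal Go sketch that checks which BOM, if any, starts a byte slice. The function name `detectBOM` is hypothetical and not part of any library. Protobuf `string` fields are UTF-8, so on the wire U+FEFF would show up as the three bytes EF BB BF rather than as a two-byte UTF-16 mark.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectBOM is a hypothetical helper: it reports which byte order mark,
// if any, starts b. It does not distinguish UTF-32LE (FF FE 00 00) from
// UTF-16LE, which share the same first two bytes.
func detectBOM(b []byte) string {
	switch {
	case bytes.HasPrefix(b, []byte{0xEF, 0xBB, 0xBF}):
		return "UTF-8" // U+FEFF encoded as UTF-8
	case bytes.HasPrefix(b, []byte{0xFE, 0xFF}):
		return "UTF-16BE" // 0xfeff read in big-endian order
	case bytes.HasPrefix(b, []byte{0xFF, 0xFE}):
		return "UTF-16LE" // same mark with the bytes swapped
	default:
		return "" // no BOM found
	}
}

func main() {
	fmt.Println(detectBOM([]byte("\uFEFFhello")))         // UTF-8
	fmt.Println(detectBOM([]byte{0xFF, 0xFE, 'h', 0x00})) // UTF-16LE
}
```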
Sorry for the missing information. The data comes from a MySQL SELECT query. Essentially, there is a gRPC API server that node.GetWork calls, and it returns this value. I have confirmed that the character is not being added inside the server. It is being added somewhere in the encoding and transfer across the wire, or in the subsequent decode on the client side. Yes, I have been able to work around it by explicitly stripping the 0xfeff character from the string, but it seems odd that it is there in the first place. I also agree that this is likely a byte order mark, because it is at the very start of the string. What I find most odd is that it only happens sometimes, not for every string. It almost feels as if it is being used as padding for the encode/decode step of the wire transfer but is not being stripped off properly in all cases. I am happy to provide more detail, code, or debug output; I just was not sure what would be helpful, or whether this is some known expected behavior I was not aware of.
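The stripping workaround mentioned above could look roughly like the following Go sketch. The helper name `stripBOM` and the sample command are assumptions for illustration; the original client code is not shown in the thread. It removes a single leading U+FEFF and leaves the rest of the string untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// bom is U+FEFF (ZERO WIDTH NO-BREAK SPACE / byte order mark).
const bom = "\uFEFF"

// stripBOM is a hypothetical helper: it removes one leading U+FEFF,
// if present, and returns the string unchanged otherwise.
func stripBOM(s string) string {
	return strings.TrimPrefix(s, bom)
}

func main() {
	cmd := "\uFEFFls -la" // sample value, assumed for illustration
	fmt.Printf("%q\n", stripBOM(cmd))
}
```

This treats the symptom only; it does not explain where the character enters the data.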
Protobuf doesn’t typically use any padding, let alone 0xfeff specifically. Have you tried looking at the raw MySQL query values directly? Maybe someone is copy-pasting from a Windows text file somewhere? It can be a notoriously difficult character to notice because it is zero-width, and so it might not show up when the value is displayed normally. Maybe a short copy of an encoded
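Because the character is invisible when printed normally, one way to inspect a value is to print it with escapes. The sketch below uses Go's `%+q` verb, which escapes every non-ASCII rune, and also lists the byte offsets at which U+FEFF occurs. The helper names `reveal` and `zwnbspOffsets` and the sample string are hypothetical; the same check could be run on the value on the server side and again on the client side.

```go
package main

import "fmt"

// reveal is a hypothetical helper: it returns s as a quoted string with
// all non-ASCII runes escaped, so zero-width characters become visible.
func reveal(s string) string {
	return fmt.Sprintf("%+q", s)
}

// zwnbspOffsets is a hypothetical helper: it returns the byte offsets
// of every U+FEFF rune in s.
func zwnbspOffsets(s string) []int {
	var offsets []int
	for i, r := range s {
		if r == '\uFEFF' {
			offsets = append(offsets, i)
		}
	}
	return offsets
}

func main() {
	s := "\uFEFFls -la" // sample value, assumed for illustration
	fmt.Println(reveal(s))        // "\ufeffls -la"
	fmt.Println(zwnbspOffsets(s)) // [0]
}
```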
This may be expected behavior; I am not sure, and I cannot find any information about it. In some cases my protobuf strings are returned with a leading ZERO WIDTH NO-BREAK SPACE.
message Work {
  Identifer id = 1;
  WorkType work_type = 2;
  string command = 3 [
    (google.api.field_behavior) = OPTIONAL,
    (grpc.gateway.protoc_gen_openapiv2.options.openapiv2_field) = {
      title: "Work Request Command"
      description: "Command to perform the work request."
    }
  ];
}
node, err := a.srvClient.client.GetWork(ctx, &npb.GetWorkReq{
	Hardware: &npb.HardwareInfo{
		MacAddr: a.cfg.host.macAddr,
		IpAddr:  a.cfg.host.ipAddrs,
	},
})
if err != nil {
	return err
}
I would expect cmd.Command not to contain a random ZERO WIDTH NO-BREAK SPACE.
Any insights would be appreciated.