Random 0xfeff / ZERO WIDTH NO-BREAK SPACE being added to returned string values #1623

Open
agruetz opened this issue Jun 26, 2024 · 3 comments

@agruetz
agruetz commented Jun 26, 2024

This may be expected behavior; I am not sure, and I cannot find any information about it. In some cases my protobuf strings are returned with a leading ZERO WIDTH NO-BREAK SPACE (U+FEFF).

```protobuf
message Work {
  Identifer id = 1;
  WorkType work_type = 2;
  string command = 3 [
    (google.api.field_behavior) = OPTIONAL,
    (grpc.gateway.protoc_gen_openapiv2.options.openapiv2_field) = {
      title: "Work Request Command"
      description: "Command to perform the work request."
    }
  ];
}
```

```go
node, err := a.srvClient.client.GetWork(ctx, &npb.GetWorkReq{
	Hardware: &npb.HardwareInfo{MacAddr: a.cfg.host.macAddr, IpAddr: a.cfg.host.ipAddrs},
})
if err != nil {
	return err
}

for _, cmd := range node.Work {
	switch cmd.WorkType {
	case npb.WorkType_INSTALL:
		if a.cfg.agent.mode == dev {
			// TODO: fix the stutter in cmd.Id.Id (has to be fixed in the proto file)
			err = a.installPrimary(cmd.Id.Id, primaryDev)
			if err != nil {
				return err
			}
		} else {
			err = a.installPrimary(cmd.Id.Id, primary)
			if err != nil {
				return err
			}
		}
	case npb.WorkType_EXEC:
		// cmd.Command is the string that sometimes arrives with a leading U+FEFF.
		cmdWithArgs := strings.Split(strings.TrimSpace(cmd.Command), " ")

		err = a.execCmd(cmd.Id.Id, cmdWithArgs[0], cmdWithArgs[1:]...)
		if err != nil {
			return err
		}
	default:
		// TODO: log the bad work type
		return fmt.Errorf("unknown work type: %s", cmd.WorkType)
	}
}
```

I would expect `cmd.Command` not to contain a random ZERO WIDTH NO-BREAK SPACE.

Any insights would be appreciated.

@puellanivis
Collaborator

You haven’t provided any error messages or examples of the text containing a ZWNBSP or where in the string.

However, if this is happening at the start of your string, then this is probably a byte-order mark (BOM): the text starts with 0xfeff, and since 0xfffe is defined as a Unicode noncharacter, a reader can tell UTF-16LE from UTF-16BE by the order in which those two bytes arrive. This is especially likely if the data is being pulled from lines of a Windows text file, such as a .BAT file, since Windows tools are known to add a BOM to files saved as Unicode.
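For reference, a Go `string` holds UTF-8, so a leading BOM would show up there as the three bytes `EF BB BF` rather than as a two-byte UTF-16 sequence. A minimal sketch of checking for it, where `hasLeadingBOM` is a hypothetical helper name and not part of any library:

```go
package main

import (
	"fmt"
	"strings"
)

// bom is U+FEFF. In a Go string it is stored as the UTF-8 bytes EF BB BF.
const bom = "\uFEFF"

// hasLeadingBOM (hypothetical helper) reports whether s begins with U+FEFF.
func hasLeadingBOM(s string) bool {
	return strings.HasPrefix(s, bom)
}

func main() {
	s := bom + "ls -la"
	fmt.Println(hasLeadingBOM(s))        // true
	fmt.Printf("% x\n", s[:len(bom)])    // ef bb bf
	fmt.Println(hasLeadingBOM("ls -la")) // false
}
```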

@agruetz
Author

agruetz commented Jun 26, 2024

Sorry for the missing information. The data comes from a MySQL SELECT query. Essentially, there is a gRPC API server that node.GetWork calls, and that server returns this data.

I have confirmed that it is not being added inside the server. It is being added somewhere in the encoding, the transfer across the wire, or the subsequent decode on the client side.

Yes, I have been able to work around it by specifically stripping the 0xfeff character from the string, but it seems odd that it is there in the first place.

I also agree that this is likely the result of Byte-Order Marking because it is at the very start of the string.

What I find most odd is that it only happens sometimes; it is not every string. It almost feels as if it is being used as padding for the encode/decode step of the wire transfer but is not being stripped off properly in all cases.

I am happy to provide more detail, code, or debug output. I just was not sure what would be helpful, or whether this was some known, expected behavior I was not aware of.

@puellanivis
Collaborator

Protobuf doesn’t typically use any padding, let alone 0xfeff specifically.
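To illustrate, here is a hand-rolled sketch of how a string field such as `command = 3` is laid out on the wire: a tag varint, a length varint, then the raw UTF-8 bytes. This is not the protobuf library's own code, and `encodeStringField` is a hypothetical name. The point is that any `ef bb bf` in the encoded message has to already be in the string that was handed to the encoder:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeStringField (hypothetical, for illustration only) hand-encodes a
// protobuf string field: tag varint, length varint, then the raw bytes of s.
func encodeStringField(fieldNum int, s string) []byte {
	const wireTypeLen = 2 // wire type 2: length-delimited
	b := binary.AppendUvarint(nil, uint64(fieldNum)<<3|wireTypeLen)
	b = binary.AppendUvarint(b, uint64(len(s)))
	return append(b, s...)
}

func main() {
	// Field 3 with wire type 2 gives the tag byte 0x1a.
	fmt.Printf("% x\n", encodeStringField(3, "ls"))       // 1a 02 6c 73
	fmt.Printf("% x\n", encodeStringField(3, "\uFEFFls")) // 1a 05 ef bb bf 6c 73
}
```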

Have you tried looking at the raw MySQL query values directly? Maybe someone is copy-pasting in from a Windows text file somewhere? It can be a notoriously difficult character to notice because it’s zero-width, and thus might not seem to show up normally.
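One way to make the character visible is to log the value with `%+q`, which escapes everything outside ASCII, on the server right after the query and again on the client after the call. A minimal sketch, where `describe` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strings"
)

// describe (hypothetical helper) returns an ASCII-only quoted form of s and
// the byte index of the first U+FEFF in s, or -1 if there is none.
func describe(s string) (string, int) {
	return fmt.Sprintf("%+q", s), strings.IndexRune(s, '\uFEFF')
}

func main() {
	quoted, idx := describe("\uFEFFecho hi")
	fmt.Println(quoted, idx) // "\ufeffecho hi" 0

	quoted, idx = describe("echo hi")
	fmt.Println(quoted, idx) // "echo hi" -1
}
```

If the server-side log already shows `\ufeff`, the character is in the stored data rather than being introduced in transit.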

Could you share a short copy of an encoded Work message that triggers the issue? I wouldn’t necessarily jump straight to pasting an excerpt of the MySQL data for that message here, but it probably wouldn’t hurt either.
