-
Notifications
You must be signed in to change notification settings - Fork 578
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix buffer overflow and data corruption when a type has more than 5 layers of nesting #3203
Conversation
|
||
try writer.encode(tag: 1, value: message) | ||
|
||
assertBufferEqual(writer, """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test fails without the change.
here's the bytes after the fix, followed by the bytes before the fix. before the fix one of the bytes is actually missing - the 0A (Tag 1 | Length Delimited)
byte that preceeds the name2
string value
0A 34 0A 05 6E616D6531 12 2B 0A 05 6E616D6532 12 22 0A 05 6E616D6533 12 19 0A 05 6E616D65 3412 10 0A 05 6E616D6535 1207 0A 05 6E616D6536
0A 34 0A 05 6E616D6531 12 2B XX 05 6E616D6532 12 22 0A 05 6E616D6533 12 19 0A 05 6E616D65 3412 10 0A 05 6E616D6535 1207 0A 05 6E616D6536
\
\_ missing byte
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem to repro for me... did you have to do anything special? what toolchain were you using?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was using Xcode 16.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what part doesn't repro for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I can't seem to repro the corrupted data either. Perhaps I encountered this while testing some other fix? I'm not sure. I was copying those hex strings from the assertions that were failing 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming this was a false signal I encountered then its a relief that the actual data that was encoded was not impacted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what part doesn't repro for you?
the test still passes even if i run with the implementation change as it was vs the fixed version.
Assuming this was a false signal I encountered then its a relief that the actual data that was encoded was not impacted
agreed. but does the test actually fail then without the change? obviously good to have it regardless so that something is exercising the 'grow the buffer' path, but also makes me feel it might provide less safety than it might aspire to. ideally these tests should probably run in a mode with ASAN on as part of CI.
aaaeaaf
to
d498d08
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 great find!
cc @lickel for one more additional review.
@@ -778,7 +778,7 @@ public final class ProtoWriter { | |||
let newCapacity = messageStackCapacity * 2 | |||
|
|||
let newMessageStack = UnsafeMutablePointer<MessageFrame>.allocate(capacity: newCapacity) | |||
newMessageStack.moveInitialize(from: messageStack, count: messageStackIndex + 1) | |||
newMessageStack.moveInitialize(from: messageStack, count: messageStackCapacity) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this just be the stack index? (Eg just remove the +1)
Expanding the stack doesn't necessarily mean you're at the end of the stack, does it?
I think I'd want to see a few more test cases like:
Size is 4, write 2 bytes, then try to append a 4 byte message. I would guess that index is 2, and your change would shift the index to 4.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we don't want to use messageStackCapacity
since in theory it could still have uninitialized memory that we presumably should not read when copying it over into the new storage location.
IMO whatever the ultimate change, the code should probably be symmetric with the analogous logic in ProtoReader
(here), and that codepath should also probably get a comparable test case.
if the invariant we want is: 'messageStackIndex + 1
is always the number of initialized elements in the message stack buffer', then we should make sure that always holds. in this case maybe changing where the index increment & test happens would make the overall logic more straightforward & consistent. e.g. we'd change this:
wire/wire-runtime-swift/src/main/swift/ProtoCodable/ProtoWriter.swift
Lines 711 to 714 in 67b73b5
messageStackIndex += 1 | |
if messageStackIndex >= messageStackCapacity { | |
expandMessageStack() | |
} |
to something like what is done in ProtoReader
:
wire/wire-runtime-swift/src/main/swift/ProtoCodable/ProtoReader.swift
Lines 136 to 140 in 67b73b5
if messageStackIndex + 1 >= messageStackCapacity { | |
expandMessageStack() | |
} | |
messageStackIndex += 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had originally done it that way but was worried about some other issue cropping up. Since ProtoReader
has similar code I'll go back to that version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typically if working if an unsafe language we wouldn't really allow this type of random access with pointer and size tracking a bit scattered. It would be great if we could do the same here. Maybe we could encapsulate the message stack into it's own class which closes up access to its internals, it might help a bunch with a few of the problems enumerated by @jamieQ.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, that was a much bigger refactor than I was prepared to do at the moment. I'd like to fix this issue without requiring a big refactor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'd want to see a few more test cases like:
Size is 4, write 2 bytes, then try to append a 4 byte message. I would guess that index is 2, and your change would shift the index to 4.
This stack appears to be specific to length delimited fields, not bytes, and tracks the number of lengths that have been written or something along those lines. So in the test case it goes like this:
- Start writing the entire message
Nested 1
withtry writer.encode(tag: 1, value: message)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from -1 to 0. - Start writing the
nested
property ofNested1
viatry protoWriter.encode(tag: 2, value: self.nested)
inencode(to:)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from 0 to 1. - Start writing the
nested
property ofNested2
viatry protoWriter.encode(tag: 2, value: self.nested)
inencode(to:)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from 1 to 2. - Start writing the
nested
property ofNested3
viatry protoWriter.encode(tag: 2, value: self.nested)
inencode(to:)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from 2 to 3. - Start writing the
nested
property ofNested4
viatry protoWriter.encode(tag: 2, value: self.nested)
inencode(to:)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from 3 to 4. - Start writing the
nested
property ofNested5
viatry protoWriter.encode(tag: 2, value: self.nested)
inencode(to:)
- in
beginLengthDelimitedEncode
, incrementmessageStackIndex
from 4 to 5.
At this point we are at index 5 which is equal to the initial capacity set here:
private var messageStackCapacity: Int = 5 |
but we have not yet initialized that index. The number of bytes we have initialized is 5 (indices 0-4).
Now that messageStackIndex == 5
, we hit the call to expandMessageStack()
wire/wire-runtime-swift/src/main/swift/ProtoCodable/ProtoWriter.swift
Lines 712 to 713 in 67b73b5
if messageStackIndex >= messageStackCapacity { | |
expandMessageStack() |
In expandMessageStack()
it attempts to copy and deinitialize 6 bytes from the source buffer because messageStackIndex + 1 == 6
newMessageStack.moveInitialize(from: messageStack, count: messageStackIndex + 1) |
which copies garbage data into index 5. We then call initialize(to:)
at index 5. This is potentially an invalid use of initialize(to:)
, depending on whether the MessageFrame
type is considered a "trivial type" as mentioned in the docs:
The destination memory must be uninitialized or the pointer’s Pointee must be a trivial type
d498d08
to
e68fe4c
Compare
Fixes #3202