Tracepoint extension support #160
I wasn't sure how the zero panic verification guarantee can be checked - there isn't a how to section in the README about it. I tried to write code as panic-free as I could, but I may have missed some pieces.
Thanks for sending in this PR - I'm excited to dig in here! Unfortunately, I'm just about to leave for a vacation where I'll be completely AFK, so I'll only be able to take a look sometime in the range of ~Dec 20th to ~Dec 23rd. Just wanted to give you a heads up, so that you're not worried about the lack of movement here 😅
Hello again, and happy holidays!
I've finally found some time to sit down and give this PR a review.
Please find a plethora of comments attached.
Broadly speaking, I think this is an impressive chunk of work, implementing what appears to be a very annoying and non-trivial part of the GDB RSP. From an organizational and syntactical POV, there's nothing that a few review comments can't polish up, and the overall vision here appears to be consistent and well put together. Kudos!
That said, I do have some concerns about the amount of API surface area we're taking on here, and the feasibility of testing all of it. It's great that you've got some things working in the armv4t example, but from looking at the code (notably: obvious errors such as handlers which send responses with spaces delimiting various parts of the packet), it seems that you've got a lot of code here that theoretically works, but hasn't been directly validated.
That's not to say we shouldn't try to land this work!
I think we should certainly try to get this PR landed... but to temper expectations for end-users, I might suggest we land this code under a mod experimental, with some documentation mentioning that this code covers a lot of surface area, and may not be fully tested.
Alternatively, if you're so inclined, I'd be happy to see more investment into the armv4t example code, with some corresponding logs that show all these codepaths having been smoke-tested. Or, of course, logs from whatever project you're implementing this feature for (and ideally, a link to the implementation itself - assuming you're working on something open-source).
I wasn't sure how the zero panic verification guarantee can be checked - there isn't a how to section in the README about it. I tried to write code as panic-free as I could, but I may have missed some pieces.
CI has a check for this, but it seems that the check doesn't run if clippy fails. Oops.
I should probably re-jig CI a bit so that clippy failing still gives no_panic feedback... my apologies.
@@ -172,6 +172,11 @@ impl<T: Target, C: Connection> GdbStubImpl<T, C> {
}
}

if let Some(_ops) = target.support_tracepoints() {
there are quite a few other tracepoint-related features in the docs. could you explain why these are the only two that were enabled, and maybe leave a comment mentioning what other features may need to be enabled in the future if/when additional functionality is ever implemented?
I added documentation about the tracepoint extensions we aren't supporting yet in ce4acb4. I just dropped InstallInTrace for now - we wanted to support it internally for the project this work is in support of, but I think we're actually not going to do it, so not requiring implementers to support it either is fine until gdbstub builds out the rest of the extensions.
src/protocol/commands/_QTBuffer.rs
Outdated
// Our response has to be a hex encoded buffer that fits within
// our packet size, which means we actually have half as much space
// as our slice would indicate.
so, interestingly enough, this is not the approach gdbstub has taken thus-far when it comes to responses.
While we certainly have a fixed buffer for incoming packets (i.e: the buffer we are parsing from here), we assume that the GDB client is capable of accepting any amount of data we stream out as part of our responses. This aligns with my particular reading of the GDB spec, which discusses the size of packets the stub can accept... but doesn't make a judgement on the size of response packets the client can receive.
as such - I would suggest skipping this buffer-slicing step entirely, and instead offering the handler a callback function they can write an arbitrarily sized &[u8] into, which gdbstub can then simply stream out. If you poke around some of the other target APIs, you'll see examples of this callback-based / streaming-based pattern being employed to great effect.
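A minimal sketch of the callback-based pattern suggested above, not gdbstub's actual API. The names `HexStreamer`, `trace_buffer_request`, and `stream_trace_buffer` are hypothetical, and a `Vec<u8>` stands in for the connection:

```rust
use std::io::{self, Write};

/// Hypothetical response writer: hex-encodes bytes as they arrive and
/// forwards them to the connection, so no fixed-size response buffer is needed.
struct HexStreamer<W: Write> {
    conn: W,
}

impl<W: Write> HexStreamer<W> {
    fn write_hex(&mut self, data: &[u8]) -> io::Result<()> {
        const DIGITS: &[u8; 16] = b"0123456789abcdef";
        for &b in data {
            // Each raw byte becomes two ASCII hex digits on the wire.
            self.conn
                .write_all(&[DIGITS[(b >> 4) as usize], DIGITS[(b & 0xf) as usize]])?;
        }
        Ok(())
    }
}

/// Hypothetical target method: instead of filling a pre-sliced buffer,
/// the target calls `report` as many times as it likes.
fn trace_buffer_request(
    trace_data: &[u8],
    offset: usize,
    len: usize,
    report: &mut dyn FnMut(&[u8]) -> io::Result<()>,
) -> io::Result<()> {
    let end = offset.saturating_add(len).min(trace_data.len());
    let start = offset.min(end);
    // Report in small chunks to show the callback can be invoked repeatedly.
    for chunk in trace_data[start..end].chunks(2) {
        report(chunk)?;
    }
    Ok(())
}

/// Glue code on the stub side (an assumption about how the pieces would fit).
fn stream_trace_buffer(trace_data: &[u8], offset: usize, len: usize) -> io::Result<Vec<u8>> {
    let mut out = HexStreamer { conn: Vec::new() };
    trace_buffer_request(trace_data, offset, len, &mut |bytes: &[u8]| out.write_hex(bytes))?;
    Ok(out.conn)
}
```

With this shape the target never sees the packet buffer at all, so the "half as much space" bookkeeping disappears from the handler.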
alternatively, if you think that from a purely ergonomic POV, it's nicer to provide end-users with a buffer to write data into... let's make sure to pass along the entire buffer we have access to, and let downstream response-writer code stream out the corresponding hex bytes.
The tracepoint packet documentation specifically calling out
The reply consists as many hex-encoded bytes as the target can deliver in a packet
makes me think that limiting our response to one packet is important and we can't just stream. Some other places like e.g.
Any number of actions may be packed together in a single ‘QTDP’ packet, as long as the packet does not exceed the maximum packet length (400 bytes, for many stubs).
I could see it not being important there, however, since we're the stub and the QTDP packets are bidirectional... but QTBuffer responses are only ever returned by the stub.
The packet buffer size in general is kind of confusing to me. Like you said, in most other places gdbstub just ignores the concept of a packet size and assumes it can stream data and it works fine, but then we have documentation like these occasionally...?
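For comparison, the single-packet reading of the docs could be sketched as below. This is an illustration only: `max_raw_bytes` and `encode_one_packet` are hypothetical helpers, and it assumes the client asks again at a later offset for whatever did not fit.

```rust
/// Number of raw bytes whose hex encoding fits in a payload of
/// `packet_size` bytes (each raw byte becomes two ASCII hex digits).
fn max_raw_bytes(packet_size: usize) -> usize {
    packet_size / 2
}

/// Hex-encode as much of `data` as fits into `packet_buf` and return the
/// number of raw bytes consumed. Bytes beyond that are left for a
/// follow-up request (an assumption about client behaviour).
fn encode_one_packet(data: &[u8], packet_buf: &mut [u8]) -> usize {
    const DIGITS: &[u8; 16] = b"0123456789abcdef";
    let n = data.len().min(max_raw_bytes(packet_buf.len()));
    for (i, &b) in data[..n].iter().enumerate() {
        packet_buf[2 * i] = DIGITS[(b >> 4) as usize];
        packet_buf[2 * i + 1] = DIGITS[(b & 0xf) as usize];
    }
    n
}
```

A 5-byte buffer therefore carries only 2 raw bytes, which is the "half as much space" the original code comment describes.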
I think it's fair to say that both the GDB RSP and gdbstub aren't always consistent about how these sorts of variable-length packets are implemented...
There are many instances where gdbstub has opted to recycle the trailing packet buffer in order to provide the Target a &mut [u8] it can write data into... while other times, it has opted to hand a callback-based object to the Target for it to write data into. But even in the former case, as I alluded to earlier / elsewhere, the data written in that buffer is nonetheless streamed out via a sequence of what are essentially putc calls, without any bound on output packet size.
As I mull over this further, I'm realizing that I should really spin up a tracking issue to document and discuss potential solutions for these sorts of project-wide inconsistencies. It may even tie into #159, and possibly extend #88, in the sense that gdbstub may need to care more about dealing with backpressure on outbound write operations...
But in any case, for this PR and packet specifically - I'm not sure I'll push hard in either direction. Feel free to do what you think is best, and I'll add this API to the growing list of APIs that ought to be more consistent.
mut f: impl FnMut(&TracepointAction<'_, U>),
) -> Option<bool> {
let mut more = false;
let mut unparsed: Option<&mut [u8]> = Some(actions);
can't this loop be replaced with a split iterator, matching on b'S' | b'R' | b'M' | b'X' | b'-'?
split_inclusive_mut contains the needle as a terminator for each subslice, not the head.
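A small sketch of the difference, using the immutable `split_inclusive` from the standard library (the `_mut` variant places the needle the same way). The helper names and the sample input are made up for illustration, and the sketch ignores any case where one of these bytes could legitimately appear inside an action body:

```rust
fn is_action_head(b: u8) -> bool {
    matches!(b, b'S' | b'R' | b'M' | b'X' | b'-')
}

/// What `split_inclusive` yields: the matched byte ends each subslice.
fn split_terminated(actions: &[u8]) -> Vec<&[u8]> {
    actions.split_inclusive(|b| is_action_head(*b)).collect()
}

/// What the parser needs: each subslice starts with its action byte.
fn split_headed(actions: &[u8]) -> Vec<&[u8]> {
    let mut out = Vec::new();
    let mut start = 0;
    for (i, &b) in actions.iter().enumerate() {
        // A new head byte closes the slice that began at `start`.
        if i > start && is_action_head(b) {
            out.push(&actions[start..i]);
            start = i;
        }
    }
    if start < actions.len() {
        out.push(&actions[start..]);
    }
    out
}
```

For the input `R03M0,10,4`, `split_terminated` gives `R`, `03M`, `0,10,4`, while `split_headed` gives `R03`, `M0,10,4`. Only the latter keeps each action byte together with its arguments, which is why the manual loop was kept.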
src/stub/core_impl/tracepoints.rs
Outdated
let status = ops.trace_experiment_status().handle_error()?;
res.write_str("T")?;
res.write_dec(if status.running { 1 } else { 0 })?;
for explanation in status.explanations.iter() {
using ManagedSlice in the return value of the target method is certainly one approach... but can't we avoid the need to expose ManagedSlice in the API entirely by having trace_experiment_status accept a callback instead?
We need to write the T0/T1 running status before we write any of the explanations. If we pass the res writing callback to the target implementation, then it would need to have some mechanism of reporting the experiment state before it can run the explanation callback. We can't do something like an &mut FnOnce(ExperimentStatus)->TargetResult<&mut FnMut(ExperimentExplanation)->TargetResult<(),Self>, Self> because we can't return a borrow from the closure due to lifetime issues, and can't return a trait object by value due to it being unsized.
Thinking about it, one solution would be to split the API into trace_experiment_status and then trace_experiment_statistics or something? Does gdbstub particularly care about matching API surface 1:1 with gdb packets or would that be ok?
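A rough sketch of what such a split could look like. Everything here is hypothetical and heavily simplified (the trait `TraceStatus`, the method name `trace_experiment_explanations`, the three-variant enum, and `DemoTarget` are stand-ins, not the types in this PR); the field spellings follow my reading of the qTStatus reply in the GDB docs:

```rust
/// Simplified stand-in for the PR's status type.
struct ExperimentStatus {
    running: bool,
}

/// Simplified stand-in: only three of the possible explanations.
enum ExperimentExplanation {
    NotRun,
    Full,
    PassCount(u32),
}

trait TraceStatus {
    /// First call: just the running/stopped state.
    fn trace_experiment_status(&self) -> ExperimentStatus;
    /// Second call: zero or more explanations, reported through a callback.
    fn trace_experiment_explanations(&self, report: &mut dyn FnMut(ExperimentExplanation));
}

/// Stub side: the T0/T1 prefix is written before the target is asked for
/// explanations, so no nested closure or borrowed return value is needed.
fn write_status_reply(target: &dyn TraceStatus) -> String {
    let mut res = String::new();
    let status = target.trace_experiment_status();
    res.push('T');
    res.push(if status.running { '1' } else { '0' });
    target.trace_experiment_explanations(&mut |e: ExperimentExplanation| match e {
        ExperimentExplanation::NotRun => res.push_str(";tnotrun:0"),
        ExperimentExplanation::Full => res.push_str(";tfull:0"),
        ExperimentExplanation::PassCount(tp) => res.push_str(&format!(";tpasscount:{:x}", tp)),
    });
    res
}

/// Toy target used only to exercise the sketch.
struct DemoTarget;

impl TraceStatus for DemoTarget {
    fn trace_experiment_status(&self) -> ExperimentStatus {
        ExperimentStatus { running: false }
    }
    fn trace_experiment_explanations(&self, report: &mut dyn FnMut(ExperimentExplanation)) {
        report(ExperimentExplanation::NotRun);
    }
}
```

Because the ordering is enforced by the stub rather than the target, neither `ManagedSlice` nor a closure-returning-closure has to appear in the public API.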
Does gdbstub particularly care about matching API surface 1:1 with gdb packets or would that be ok?
Reflecting the underlying packet structure to end users is actually an anti-goal of gdbstub 😄
It just so happens that many times, the protocol is relatively trivial, so the resulting API ends up matching the packets 1:1... but there are other cases where gdbstub goes out of its way to expose a more "user-friendly" API, and then take on whatever heavy-lifting it needs to do in order to map those friendly end-user semantics onto the underlying protocol.
Implemented in dbeb9cc
src/stub/core_impl/tracepoints.rs
Outdated
let e = (|| -> Result<_, _> {
match desc {
FrameDescription::FrameNumber(n) => {
res.write_str("F ")?;
again, this sort of space isn't really a thing in the GDB RSP...
We include spaces in some of the templates for clarity; these are not part of the packet’s syntax. No GDB packet uses spaces to separate its components. For example, a template like ‘foo bar baz’ describes a packet beginning with the three ASCII bytes ‘foo’, followed by a bar, followed directly by a baz. GDB does not transmit a space character between the ‘foo’ and the bar, or between the bar and the baz.
https://sourceware.org/gdb/current/onlinedocs/gdb.html/Packets.html#Packets
This strongly implies to me that this codepath hasn't been tested...
I definitely missed that section of the docs! This codepath does actually work, however: in the armv4t example that's included in this MR I can do
(gdb) tfind 3
Found trace frame 3, tracepoint 1
#0 main () at test.c:10
10 in test.c
(gdb) tfind 2
Found trace frame 2, tracepoint 1
10 in test.c
Oh GDB... thou art truly exceptional software... /s
If we peek inside GDB and look at how it parses the packet response, we find this code
case 'F':
p = ++reply;
target_frameno = (int) strtol (p, &reply, 16);
(via https://github.com/bminor/binutils-gdb/blob/e16e638/gdb/remote.c#L14258-L14263)
And lo, reading the docs of strtol reveals this lovely feature:
Discards any whitespace characters (as identified by calling isspace) until the first non-whitespace character is found [...]
So that solves that mystery...
In any case, even though this "works", let's make sure gdbstub remains spec-compliant, in case other GDB RSP clients aren't quite so forgiving of bonus whitespace here.
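To make the difference concrete, here is a small sketch. The function names are hypothetical, and the reply layout (`F`, a hex frame number, `T`, a hex tracepoint number, nothing in between) follows my reading of the qTFrame documentation:

```rust
/// Spec-compliant reply: no separator between the components.
fn frame_reply(frame: u64, tracepoint: u64) -> String {
    format!("F{:x}T{:x}", frame, tracepoint)
}

/// A strict client: the hex digits must start immediately.
fn parse_strict(s: &str) -> Option<u64> {
    let end = s.find(|c: char| !c.is_ascii_hexdigit()).unwrap_or(s.len());
    u64::from_str_radix(&s[..end], 16).ok()
}

/// A lenient client in the style of `strtol`: leading whitespace is
/// skipped before the digits are read.
fn parse_lenient(s: &str) -> Option<u64> {
    parse_strict(s.trim_start())
}
```

Given the text after the `F` in a reply like `F 3`, `parse_lenient` still recovers frame 3 while `parse_strict` rejects it, which is the situation a less forgiving RSP client would be in.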
Apologies for assuming that you hadn't tested this code, hopefully you understand why I might've gotten that assumption from reading just this code + the corresponding spec 😅
Removed.
Thanks for the feedback! I'll look over and resolve them. For some background, we're using gdbstub in order to build out debugging introspection for a project we have. Our current tooling is via Python, and we're using gdbstub via some PyO3 bindings I hacked together (which I do intend to open source eventually, once I find the time), but that also isn't very "interesting" code. I'm working on this tracepoint support in tandem with building out the rest of the debugging stack, and so the API surface here was the minimal amount I needed in order to get gdb to not error out with "not supported" and implement the tracepoint functionality. We do have functionality using this, however, so none of it should be untested code (although some parts, like |
93a789d
to
4e7dcca
I added additional error handling to |
Hey! Just wanted to drop in and mention that I see all the commits flying here, and that I'll definitely try to find some time soon to look into them, hopefully at some point in the next couple days before the weekend. Things have gotten unexpectedly busy on my end now that the new year is back in full swing... but I'll try to give timely feedback here to keep the ball rolling. Thanks for all the work and iteration you're doing here!
Description
This PR adds basic tracepoint extension support to GDB stub. Closes #157.
API Stability
Checklist
rustdoc formatting looks good (via cargo doc)
examples/armv4t with RUST_LOG=trace + any relevant GDB output under the "Validation" section below
./example_no_std/check_size.sh before/after changes under the "Validation" section below
examples/armv4t
./example_no_std/check_size.sh)
Arch implementation
Validation
GDB output
loading section ".text" into memory from [0x55550000..0x55550078]
Setting PC to 0x55550000
Waiting for a GDB connection on "127.0.0.1:9001"...
(The start of the cargo run output is corrupted by binary data that's printed, so I cut out the portion of the output relevant for tracepoint packets)
Before/After `./example_no_std/check_size.sh` output
Before
After