Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Starlark debugger crash on duplicate name in locals #24919

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

fmeum
Copy link
Collaborator

@fmeum fmeum commented Jan 14, 2025

No description provided.

@fmeum fmeum requested review from hvadehra and removed request for tetromino and brandjon January 14, 2025 13:50
@github-actions github-actions bot added the awaiting-review PR is awaiting review from an assigned reviewer label Jan 14, 2025
Copy link
Contributor

@tetromino tetromino left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the obvious but wrong solution.

Consider

def foo(x):
    x += [[j(x) for x in i(x)] + h(x) for x in f(x) if g(x)]
    return k(x)

Now consider which value of which x should the debugger see

  • when f(x) is called
  • when g(x) is called
  • when h(x) is called
  • when i(x) is called
  • when j(x) is called
  • when k(x) is called

We saw the same bug reported internally and I have been scratching my head at it for a few days - it's very possible that to fix, we would need to revise how we represent the frame in debugger protos.

@iancha1992 iancha1992 added area-java-Starlark_API java_common, JavaInfo and other Starlark Java modules team-Starlark-Interpreter Issues involving the Starlark interpreter used by Bazel and removed area-java-Starlark_API java_common, JavaInfo and other Starlark Java modules labels Jan 14, 2025
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 15, 2025

Thanks for the instructive example, I didn't grasp the full complexity of this issue.

This is the obvious but wrong solution.

I would argue that the current state (the debugger crashes) is pretty bad. What do you think of handling conflicting locals by communicating a dedicated Value proto with a description such as <conflicting>? That would resolve the crash while still not showing inaccurate information.

We saw the same bug reported internally and I have been scratching my head at it for a few days - it's very possible that to fix, we would need to revise how we represent the frame in debugger protos.

I checked what PyCharm's debugger is doing in this case. It emits additional <listcomp> frame that holds the value of x as passed into g through j. The frame below it has x at the value passed into foo (and later updated via the +=).

@tetromino
Copy link
Contributor

tetromino commented Jan 15, 2025

I checked what PyCharm's debugger is doing in this case. It emits additional <listcomp> frame that holds the value of x as passed into g through j. The frame below it has x at the value passed into foo (and later updated via the +=).

A new frame is probably the right way to present this. Unfortunately, while this is easy in cpython (where I believe the intepreter physically pushes new frames for comprehensions), it's not easy in starlark (to save memory, the starlark resolver puts comprehension's variables into the enclosing function's frame).

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 15, 2025

I updated the PR to reflect what I proposed as an incremental improvement above.

@tetromino
Copy link
Contributor

Filed #24931 for the long-term fix

@tetromino
Copy link
Contributor

I think I see a better way of solving the crash: emit only the comprehension variable currently visible and hide the shadowed ones. Let me check if that's feasible.

@tetromino
Copy link
Contributor

I think my proposed fix will work: https://bazel-review.googlesource.com/c/bazel/+/269374

copybara-service bot pushed a commit that referenced this pull request Jan 15, 2025
The Starlark interpreter, for optimization reasons, stores the bindings of
comprehension variables in their enclosing function's locals array. This
means we cannot naively transform the locals array into a name -> value map
for debugger output: there may be multiple initialized variables having the
same name.

And since we have been building this map using buildOrThrow(), attempting
to debug a function with a comprehension variable having the same name as
a parameter or other local variable would crash Bazel.

The immediate fix would be to track comprehension bindings' lexical scope
(by saving a pointer to the comprehension AST node; note that a
comprehension's scope has a non-trivial shape - it has a "hole" punched out
for the first `for` clause), so we can ignore comprehension variables outside
their comprehension, and emit only the innermost one when inside a
comprehension (relying on the fact that the resolver places inner
comprehensions' variable bindings after the outer ones).

However, while this fixes the crash and the most obvious correctness problem,
it does not allow the user of the debugger to examine the value of the
shadowed variables.

The proper fix - to be implemented later - would be to push a new debugger
frame when debugging inside a comprehension.

See #24931 and discussion with @fmeum in #24919

PiperOrigin-RevId: 715908968
Change-Id: I3066e89f2e92d0fd0e483851a767b7c7d70ab555
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-review PR is awaiting review from an assigned reviewer team-Starlark-Interpreter Issues involving the Starlark interpreter used by Bazel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants