Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rule: multiple-statements-per-line #246

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

LiamPattinson
Copy link
Collaborator

Fixes #64

This was a lot harder to implement than I expected!

  • Adds a function that replaces all text in strings and comments with whitespace/'!' respectively, allowing text rules to avoid matching on characters in that context.
    • The difficulty here is catching multiline strings, and the fact that the following is valid:
"This is a character string &
  & going over &
  !!! surprise comment !!!
  & multiple lines"
  • Adds rule that finds semi-colons in code and replaces them with a newline and an appropriate amount of indentation.

I've left this as a draft as I'd like to add a companion rule, superfluous-semicolon, for a semicolon at the end of a line, and I still need to add tests.

@ZedThree
Copy link
Member

Could you do this without the blanking by something like: find ; nodes in the AST, then check if the next non-comment sibling is on the same line?

@LiamPattinson
Copy link
Collaborator Author

As far as I can tell ; doesn't show up in the AST. If you switch the search in check.rs from root_node.named_descendants() to just root_node.descendants() and print every node kind, things like ::, , and = show up, but there's no sign of ;. I'm not sure if I'm just missing something though.

@LiamPattinson
Copy link
Collaborator Author

While working on this further I've found something that might be a tree-sitter-fortran bug. The following code raises syntax errors, but as far as I know it's valid (albeit awful) Fortran:

program p
  implicit none
  integer :: i, j
  ! 1st, 2nd/3rd, and 4th semicolons are superfluous
  ! 3rd/2nd separates the two statements
  ;i = 1;; j = 2;
  write (*,*) i, j
end program p

It doesn't like a semicolon at the start of a line or multiples grouped together.

I can't see a single instance of either one of these scenarios in my limited collection of test projects, so this is a very low priority bug.

@LiamPattinson
Copy link
Collaborator Author

I've added a rule superfluous-semicolon, but ultimately I don't think it would be a good idea to merge these rules in their current state.

Blanking all comments and strings is a very expensive operation, and as we can't have one rule return multiple kinds of violation, it has to be done twice per file if both rules are active. Benchmarking suggests this slows the code by ~7%:

❯ hyperfine -i './target/release/fortitude check --output-format=concise --preview --ignore=S081,S082 ../geos-chem'
Benchmark 1: ./target/release/fortitude check --output-format=concise --preview --ignore=S081,S082 ../geos-chem
  Time (mean ± σ):      1.763 s ±  0.046 s    [User: 8.042 s, System: 0.396 s]
  Range (min … max):    1.684 s …  1.808 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 
❯ hyperfine -i './target/release/fortitude check --output-format=concise --preview ../geos-chem'
Benchmark 1: ./target/release/fortitude check --output-format=concise --preview ../geos-chem
  Time (mean ± σ):      1.883 s ±  0.075 s    [User: 8.973 s, System: 0.414 s]
  Range (min … max):    1.761 s …  1.996 s    10 runs
 
  Warning: Ignoring non-zero exit code.
 

The difference would be a lot less noticeable in a realistic IO-constrained run (geos-chem takes almost 10 seconds on my machine with full output!), but this still indicates that we need a less expensive solution to this problem.

@ZedThree Would it be possible to have tree-sitter recognise semicolons in the AST?

@ZedThree
Copy link
Member

Ah, I think the issue is that "end-of-statement" is an unnamed node and done entirely through an external scanner.

We could change the grammar to something like:

_end_of_statement: $ => choice(';', $._scan_end_of_statement),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid multiple statements on the same line
2 participants