Cabal exact print #65
From a quick glance at the mega-issue, the approach of using |
@andreabedini, it would be useful to have your thoughts on this as someone that's considered this stuff deeply before. |
Having reviewed everything, I see there's already a lot of work in this draft PR: https://github.com/haskell/cabal/pull/9436/files It would be good to clarify whether this proposal is to continue extending that work, to do something else inspired by it, or to do something else entirely. It looks like the first to me! If so, it would be good to clarify A) what remains to be done given that existing work, and B) whether this approach has been tested on what happens when we parse, modify, and then exact-print. Are there examples of this, and where in that code is the algorithm that "merges" the changes back into the exact-printed structure? Reviewing that PR, I recall that the code looked very promising, and it answers a lot of questions that the proposal itself isn't in-depth enough to get to. |
@gbaz
There are no tests for this right now. We should add them; please list which modifications are important and I'll add an automated test for each.
The current implementation will "improvise" and fall back to the pretty printer. It does something like this right now, but it's kind of ad hoc and isn't quite right: it will find negative relative rows and do nothing with that information, although it does indicate that those improvisations have occurred: https://github.com/haskell/cabal/pull/9436/files#diff-53cbe1fb815e26b11060bd5c78c372013a079192b8acad5c1c6044ecc647dcefR137 For comment placement you can apply the same relative shifts per row, but for columns you may need to keep shifting right until you find a free place (under modification). |
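To make the relative-row idea concrete, here is a minimal sketch. It is not the code from the linked PR: `Pos`, `Placed`, `rowDeltas`, `place` and `shiftRight` are hypothetical names invented for illustration. It assumes rows are stored relative to the previous item, that a negative delta after a modification triggers an "improvised" placement on the next row, and that a comment is shifted right until its start column is free.

```haskell
-- Hypothetical absolute source position (1-based row and column).
data Pos = Pos { posRow :: Int, posCol :: Int } deriving (Eq, Show)

-- Result of re-placing an item: either the recorded offset could be
-- honoured, or the printer had to improvise a row.
data Placed = Exact Int | Improvised Int deriving (Eq, Show)

-- Row offsets relative to the previous item (the first is relative to row 0).
rowDeltas :: [Pos] -> [Int]
rowDeltas ps = zipWith (-) rows (0 : rows)
  where
    rows = map posRow ps

-- Rebuild absolute rows from (possibly modified) deltas. A negative
-- delta cannot be honoured, so the item goes on the next row and the
-- improvisation is recorded instead of being silently dropped.
place :: [Int] -> [Placed]
place = go 0
  where
    go _ [] = []
    go prev (d : ds)
      | d >= 0    = Exact (prev + d) : go (prev + d) ds
      | otherwise = Improvised (prev + 1) : go (prev + 1) ds

-- Shift a comment's start column right until it no longer falls inside
-- any occupied (start, end) column span on that row. Only the start
-- column is checked; the comment's own width is ignored in this sketch.
shiftRight :: [(Int, Int)] -> Int -> Int
shiftRight occupied col =
  case [end | (start, end) <- occupied, start <= col, col <= end] of
    []   -> col
    ends -> shiftRight occupied (maximum ends + 1)
```

Returning `Improvised` rather than a bare row number is one way to keep the information that a fallback happened, so that a test or a caller can inspect it.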
Hi eric,
as far as I can tell it's the root type used for dealing with
I'm open to suggestions! Comments specifically are underdeveloped in its current form because it was the ZuriHac prototype, which I barely got working 😄
Invalidation works kinda great for exact positions themselves (eg for fields and sections). Perhaps using a map wasn't the best idea for comments, |
Thanks for the updates and comments. On the remaining work you added: I think a proposal we accept would need some sense of how we can architect things to solve those problems, since they're pretty fundamental:
Specifically, I think that adding support for common stanzas and conditionals is going to be both very tricky and very important. Meanwhile, adding support for braces can be out of scope -- i.e. if braces are translated away, I think that's livable -- they're a basically unused feature. And commas should be straightforward. |
It is great to see you pushing forward with this @jappeace. I know you have invested significant time on it already and we should not let that effort go unnoticed. I also liked working with you on this at ZuriHac, if only that weekend had been a few days longer! With respect to the proposal. Thanks for mentioning my
Replacing GenericPackageDescription (GPD) has never been on the cards. My approach, like yours, was to make the minimal change that is a step forward. Therefore what I have been working on was intentionally ... anticlimactic: forget about the exact-print of GPD and start with the exact-print of A value of type To make a long story short: my plan was to get the lexer to keep whitespace and comments so I could exactprint In any case, the two approaches are neither incompatible nor mutually exclusive. I believe you must have gone past the obstacles I have encountered, which now leaves me to decide what to do with my efforts :-)
Maybe I misunderstand the plan but I am not sure whether GPD can even be mapped to the concrete grammar. @gbaz rightly mentions the There are some constraints I think we should keep in mind.
All this might well be within reach; you just need to prove it :P
|
I strongly suspect this approach I'm taking will work, but I don't wish to discourage you.
Could you elaborate on how this deals with changes made? For example, if hls wants to add a dependency to a library.
Yes, it passes the tests. As mentioned in the proposal, however, it's not complete: not all comment positions are captured.
I'm not familiar with the cabal package structure, why is this desired?
Could you elaborate on this?
I already mentioned this I think.
I mentioned in the proposal it should not crash, because no increase at all seems a bit stringent. Please review:
Why do we want this? |
@gbaz I added some ideas for both common stanzas and conditionals. |
Thanks for the updates. Starting to shape up. |
@jappeace Even with relative offsets, the separate map idea sounds very hard to work with to me. I still strongly lean towards the |
We want operations to not get slower because cabal needs to parse the full tarball of cabal files to build its database to solve, etc. If parsing is slow, then it becomes amplified at scale, and makes end-users frustrated when common commands take longer. |
Regarding "Refactor GenericPackageDescription so that the parser no longer merges these sections and instead stores them as proper records. Then make the callsites smart enough to deal with common stanzas.": I think this will be very painful. There are a lot of callsites and a lot of usages, especially since so many fields can be within common stanzas. A remembering and "smart back-merge" algorithm seems much more lightweight. As for this approach versus fields, the advantage of the GPD approach that I see is that it would let people modify the GPD, which is a much more "natural" structure for the data, as opposed to modifying fields directly. So it's like letting people manipulate an actual data structure versus the JSON encoding of that structure. In fact, roughly, Fields : GPD :: Json : ADT |
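As a rough illustration of what a "remembering and back-merge" step could look like, here is a minimal sketch. It is not Cabal's API and not the algorithm from the PR: `RawField`, `Library`, `lookupTyped` and `backMerge` are hypothetical names, and the sketch assumes that each field remembers its original text and that only fields whose typed value changed are re-rendered.

```haskell
import Data.List (intercalate)

-- Hypothetical low-level view: one field exactly as written in the file.
data RawField = RawField
  { rfName  :: String    -- field name, e.g. "build-depends"
  , rfText  :: String    -- the original line, verbatim (spacing included)
  , rfValue :: [String]  -- the value as parsed from that line
  } deriving (Eq, Show)

-- Hypothetical typed view that callers modify (a stand-in for the GPD).
data Library = Library
  { buildDepends   :: [String]
  , exposedModules :: [String]
  } deriving (Eq, Show)

-- Look up the typed value that corresponds to a raw field name.
lookupTyped :: String -> Library -> Maybe [String]
lookupTyped "build-depends"   lib = Just (buildDepends lib)
lookupTyped "exposed-modules" lib = Just (exposedModules lib)
lookupTyped _                 _   = Nothing

-- Merge the modified typed value back into the remembered fields:
-- unchanged fields keep their original text byte for byte, changed
-- fields are re-rendered in a plain default layout.
backMerge :: Library -> [RawField] -> [String]
backMerge lib = map render
  where
    render rf =
      case lookupTyped (rfName rf) lib of
        Just new | new /= rfValue rf ->
          rfName rf ++ ": " ++ intercalate ", " new
        _ -> rfText rf
```

In this sketch, adding `containers` to `buildDepends` would re-render only the `build-depends` line and leave every other line untouched. Fields added to the typed value that have no remembered `RawField` are not handled here and would need a fallback to the pretty printer.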
I greatly appreciate the effort, so please consider my remarks below not as a critique but as an encouragement to make the proposal stronger. In a world of infinite resources, having an exact printer is better than not having one. But in the real world I'd suggest adopting a problem-based approach. What is a practical problem that an exact printer can solve and that cannot reasonably be solved by other, already available methods? There is no such business ask as "bidirectional parsing and printing", and I can hardly imagine an HF sponsor being immediately excited by such an achievement. The proposal lists automated addition of dependencies, expansion of modules and formatting. But we do have (maybe imperfect) automated tools for all of this already. Would it be easier to accomplish such tasks with an exact printer? How exactly? To pick a particular instance, what kind of impact would an exact printer have for As much as I can tell from a cursory glance, GHC exact printer needs lots of maintenance. Pretty much every release of GHC requires a non-trivial amount of work in this area. We'd better have a very good justification to invest in one for Cabal. |
It's kind of annoying because the reference implementation already uses this, but somehow people aren't connecting the dots.
I don't quite understand this, |
It looks like parsing to me because it's working directly on |
It would help me to see a specific example how |
the type we wish to expose isn't There are many ways of capturing the cruft of commas and spacing, the one I proposed here is just an imperfect but hopefully decent solution, |
rendered proposal