Consider using mypyc to speed things up like black does #159
Comments
Perf

Black is partly faster because its CST implementation (a fork of lib2to3) isn't doing as much as LibCST does when parsing, validating, and transforming the tree (e.g., it basically doesn't do any validation). There are ongoing efforts to improve the performance of LibCST, including an experimental PEG parser written in Rust, which can be enabled by setting …

The other part is that black doesn't need to materialize the tree if it hasn't modified it, while µsort materializes the updated tree every time. If black hasn't changed the tree, it just returns the original bytes, so the happy case is that much faster. µsort could probably be better about this, but it would require extra bookkeeping during sorting to determine whether anything has changed in any of the blocks of imports. It's not impossible, but it's probably not a quick change, and I would expect it to require more memory and CPU time to track and verify, especially for large files. I would be happy to answer questions or offer guidance if you are interested in making that happen.

Profiling

It's worth keeping in mind that µsort, like black, uses multiprocessing when formatting. black uses its own code wrapping the multiprocessing module, but µsort uses trailrunner, which handles walking and filtering any paths given, and running the appropriate sorting functions via multiprocessing. The reason you see 60%+ of time spent in trailrunner is that trailrunner is spawning the child process and waiting for it to finish sorting the given paths. This is to be expected, and it's not wasted time; that's the time spent actually sorting the requested files. If you are interested in a detailed breakdown of time spent in various pieces, please run µsort with the …
In actuality, we are only spending a few milliseconds in trailrunner when walking paths; the rest of the time is spent in child processes doing the actual sorting. The fewer files you are asking to sort, the more the overhead of spawning the child process(es) will affect your total runtime. Another place where µsort could be better is in handling multiple path arguments. Currently, we walk and sort those given paths in sequence, rather than walking/gathering all of the given paths up front. This means something like …
To be clear, we are not dismissing the idea or benefit of projects like mypyc, but I wanted to at least discuss the reasons why there is a performance difference between black and µsort, and where trailrunner fits into µsort. I believe @zsol has been investigating the possibility of mypyc for LibCST, and as the majority of time running µsort is spent in LibCST, that is probably the best place to focus efforts on improving the runtime speed of µsort.
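The "extra bookkeeping" mentioned in the comment above could look roughly like the following. This is only a toy sketch, not µsort's or black's actual code: `format_source` is a hypothetical name, the alphabetical sort stands in for µsort's real sorting rules, and plain lines stand in for a LibCST tree. It shows the shape of the idea: track whether any import block changed, and hand back the original bytes untouched when none did.

```python
from typing import List


def format_source(original: bytes) -> bytes:
    """Hypothetical formatter with a fast path for unchanged input.

    Assumption: a "block" is a run of consecutive lines starting with
    "import " or "from ", and sorting is plain case-insensitive ordering.
    """
    lines = original.decode("utf-8").split("\n")
    out: List[str] = []
    block: List[str] = []
    changed = False

    def flush() -> None:
        # Sort the pending block and record whether its order changed.
        nonlocal changed
        ordered = sorted(block, key=str.lower)
        if ordered != block:
            changed = True
        out.extend(ordered)
        block.clear()

    for line in lines:
        if line.startswith(("import ", "from ")):
            block.append(line)
        else:
            flush()
            out.append(line)
    flush()

    if not changed:
        # Happy case: skip re-rendering and return the caller's bytes as-is.
        return original
    return "\n".join(out).encode("utf-8")
```

In this sketch the bookkeeping is a single boolean; with a real syntax tree the comparison per block would likely be more expensive, which is the memory and CPU trade-off the comment raises.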
Thank you for the detailed response! @zsol, is there an open issue with discussion of using …
Black is significantly faster than usort, especially when it produces no changes.
I believe this performance difference may be due to black using mypyc: https://github.com/ichard26/black-mypyc-wheels
black is 4.25 times faster on check:
black is 2 times faster when making changes:
However, looking at a profiler run, I see that 3/5 of the time is taken by trailrunner (and nearly 90% of that time is spent in lock.acquire), even though usort was given only one file path:
pyinstrument --show-all -m usort format example.py
So clearly, there's a lot of room for optimizations even before compiling code with mypyc :)
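One way to see why a profile of the parent process can be dominated by waiting is the rough sketch below. It is not trailrunner's code: it uses `subprocess` as a stand-in for the multiprocessing pool, and `measure_parent_vs_child` is a hypothetical helper. It compares the wall-clock time of a run with the CPU time the parent itself consumed while a child did the work.

```python
import subprocess
import sys
import time
from typing import Tuple


def measure_parent_vs_child(busy_seconds: float = 0.3) -> Tuple[float, float]:
    """Run a CPU-bound child and return (wall_seconds, parent_cpu_seconds)."""
    # The child spins for roughly busy_seconds, standing in for real work.
    child_code = (
        "import time\n"
        f"end = time.perf_counter() + {busy_seconds!r}\n"
        "while time.perf_counter() < end:\n"
        "    pass\n"
    )
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # counts CPU time of this process only
    subprocess.run([sys.executable, "-c", child_code], check=True)
    wall = time.perf_counter() - wall_start
    parent_cpu = time.process_time() - cpu_start
    return wall, parent_cpu


if __name__ == "__main__":
    wall, parent_cpu = measure_parent_vs_child()
    print(f"wall: {wall:.3f}s, parent CPU: {parent_cpu:.3f}s")
```

The parent's own CPU time is typically a small fraction of the wall time, so a profiler attached only to the parent mostly records it blocking while the child runs.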