Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add revise argument to serve(...) #220

Merged
merged 4 commits into from
Oct 15, 2024

Conversation

frankier
Copy link
Contributor

Fixes #122

@JanisErdmanis This implementation is guided quite a bit by your ideas:

  • We guide the user towards loading Revise themselves
  • We warn the user in case they're code seems to not be running in a package. We let them continue anyway, since it should be possible to use includet, or its possible to only want updates from some other package you are developing.
  • Users can run without eager revisions if they want
  • There are locks around revise to prevent simultaneous eager and pre-request revisions

There's some differences too:

  • Revise is not enabled implicitly after loading, but a keyword option is required. I think this is less surprising. Some people load Revise in their startup files. Also it seems to me like having Revise in production could exacerbate security issues since now any change an attacked can write to code files are loaded immediately.

@ndortega I haven't added Revise as a dependency, since I think if Oxygen imports Revise it is could be too late. As a side benefit, people can keep their production Oxygen project completely Revise-free, which as said seems like it could be a useful defense-in-depth precaution.

@frankier frankier changed the title Add revise option to serve method Add revise argument to serve(...) Sep 18, 2024
@JanisErdmanis
Copy link
Contributor

JanisErdmanis commented Sep 18, 2024

I agree with many of your points, and the ability to switch between revised modes with a keyword argument is neat! I think perhaps naming eager and pre-request seems somewhat non-intuitive. Perhaps we could consider :background and :jit or some other pairs.

There are locks around revise to prevent simultaneous eager and pre-request revisions

This is a bit unclear to me. In the :eager mode, I thought that one only needs to listen to the revise condition, which shall trigger Revise.revise(). To prevent race condition also add Revise.revise at the request handler level as in the simple variant.

Revise is not enabled implicitly after loading, but a keyword option is required. I think this is less surprising. Some people load Revise in their startup files. Also it seems to me like having Revise in production could exacerbate security issues since now any change an attacked can write to code files are loaded immediately.

I don't see situations where one would load Revise in production or where it would be unintentionally left in Main. Also, source files in production should be immutable; thus, Revise would not have an effect. Hence, since the intention is quite strong regarding what users expect when they run using Revise in the main, I think it should be enabled implicitly.

@frankier
Copy link
Contributor Author

I agree with many of your points, and the ability to switch between revised modes with a keyword argument is neat! I think perhaps naming eager and pre-request seems somewhat non-intuitive. Perhaps we could consider :background and :jit or some other pairs.

I considered :jit, but I think it's a bit unclear given the multiple things it can be used to evoke. :prerequest. Probably the best combinations for consistency are :jit/:lazy/:eager on the one hand or :prerequest/:background on the other. I'm leaning towards the second, i.e. changing :eager => :background but I can be outvoted e.g. together with @ndortega.

I don't see situations where one would load Revise in production or where it would be unintentionally left in Main. Also, source files in production should be immutable; thus, Revise would not have an effect. Hence, since the intention is quite strong regarding what users expect when they run using Revise in the main, I think it should be enabled implicitly.

Nevertheless, the startup.jl file is loaded when programs are run as scripts see: JuliaLang/julia#46948 . So it would be possible through misconfiguration for someone to put a startup.jl file with Revise on a production server and Revise. Revise is commonly loaded in startup.jl (I don't personally in fact I think the startup file is a bit of an anti pattern). So now for defense in depth in production, we are asking people to set a special hidden no_revise_under_any_circumstances flag.

Yes they should be immutable. Security problems often happen when several shoulds become turned out they weren't -s i.e. assumptions get invalidated.

Also this forces the user to pick a Revise mode, and so sidesteps further discussion about the default.

Again I could be persuaded if @ndortega takes your side on this, but I'm fairly strongly convinced that this kind of implicit behaviour usually isn't worth it. In terms of inconvenience of writing a slightly longer entrypoint script, this doesn't have to be the last word: a generic entrypoint that does the correct thing could eventually be bundled with Oxygen; and a default-on behaviour could be added to some future "debug-mode" together with other features such as Infiltrator.jl integration.

This is a bit unclear to me. In the :eager mode, I thought that one only needs to listen to the revise condition, which shall trigger Revise.revise(). To prevent race condition also add Revise.revise at the request handler level as in the simple variant.

This was the old behaviour and also what is being done now. :eager does a superset of what :prerequest does.

You said earlier that:

With the race condition, I meant the effect of the Revise limitation of being unable to cancel running recompilation. Let's say T is the time it takes to recompile the project for some change. The probability that a change will happen within a recompilation time is proportional to P \propto T. If such a change happens, the user can wait 2T until they see their last change take effect. Hence, there is a tradeoff.

By locking the Revise, we linearise them. This means that each Revise can see the result of the previous Revise. This ensures that Revise does not duplicate work by starting two Revises at the same time. This is analagous to the https://en.wikipedia.org/wiki/Thundering_herd_problem where multiple processes try and recompute a new version of an invalidated cached value simultaneously. I thought this was what you were referring to here?

Or do you simply mean that Revise carries on Revising even when it is doing a stale update? Of course this is not ideal, but then again, in some cases Revise can reuse definitions the were unchanged since the last Revise meaning the subsequent Ts are not independent. I've been working with this eager mode and it seems to work well in practice, but it may depend on your exact file saving and code modification patterns. For that reason I think it's good that we have another Revise mode in case this one causes problems for some people.

@JanisErdmanis
Copy link
Contributor

I considered :jit, but I think it's a bit unclear given the multiple things it can be used to evoke. :prerequest. Probably the best combinations for consistency are :jit/:lazy/:eager on the one hand or :prerequest/:background on the other. I'm leaning towards the second, i.e. changing :eager => :background but I can be outvoted e.g. together with @ndortega.

I have found :lazy/:eager combination good as it is short. Perhaps we can reconsider this pair. I particularly don't like :prerequest as it is hard to understand what the :pre stands for.

Nevertheless, the startup.jl file is loaded when programs are run as scripts see: JuliaLang/julia#46948 . So it would be possible through misconfiguration for someone to put a startup.jl file with Revise on a production server and Revise. Revise is commonly loaded in startup.jl (I don't personally in fact I think the startup file is a bit of an anti pattern). So now for defense in depth in production, we are asking people to set a special hidden no_revise_under_any_circumstances flag

It seems like a bizarre thing to do, and it sounds highly unlikely to happen. It is often the case that HTTP servers are deployed with containers which also manage Julia installation and, by default, startup.jl is empty. I mean, there is a case in startup.jl that mocks up Oxygen.serve and adds HTTP /backdoor/ endpoint, or it can happen within some code hidden in Base itself. This supersedes this type of attack vector and makes a case for controlled deployments.

A few things can make implicit Revise deployment more transparent. The first one is that Oxygen can inform the user that a Revise handler has been added to the pipeline just after the splash screen so that it would appear in logs. If the user wants to test their code in a deployment scenario, they could do so with julia --startup-file=no. In addition, if necessary, users can specify an environment variable ENV["OXYGEN_REVISE_MODE"] in their startup file, which can take precedence over the implicit default of Oxygen.

By locking the Revise, we linearise them. This means that each Revise can see the result of the previous Revise. This ensures that Revise does not duplicate work by starting two Revises at the same time. This is analagous to the https://en.wikipedia.org/wiki/Thundering_herd_problem where multiple processes try and recompute a new version of an invalidated cached value simultaneously. I thought this was what you were referring to here?

I may be wrong, but I suspect Revise already has safeguards to ensure that one Revise.revise can run at a time. If there aren't, those locks would be best suited for Revise internals themselves.

Or do you simply mean that Revise carries on Revising even when it is doing a stale update? Of course this is not ideal, but then again, in some cases Revise can reuse definitions the were unchanged since the last Revise meaning the subsequent Ts are not independent.

This is what I have meant. Revision events, if Revise is not available, should be dropped when Revise is running, but there seems no harm in adding Revise.revise at the handler level to make sure that requests are served from the fresh code. In particular, that would always imply that :lazy Revise mode codes is also running in the :eager mode.

@frankier
Copy link
Contributor Author

I have found :lazy/:eager combination good as it is short. P

Done

[implicit vs explicit]

I take your points, but I do still stand behind my argument.

One specific counterpoint is that this would introduce environment variables as an alternative configuration method that's only supported for some arguments when currently everything is done through serve(...). Maybe it's better to keep things simple/consistent?

After Julia 1.12, it will become easier for Oxygen.jl to provide its own generic entrypoint. At this point command line based configuration method can be added.

Let's pause this point until we get a third opinion?

I may be wrong, but I suspect Revise already has safeguards to ensure that one Revise.revise can run at a time. If there aren't, those locks would be best suited for Revise internals themselves.

I don't see any. There are locks to do with requires and adding new packages to the watch list. Please do go ahead and do your own analysis and/or bring this up in the Revise issue tracker. Locking the whole of each full Revise would may be a big change change so would probably take a while including lengthly debate and might be a big enough change to need a new major release. Let's not block on this so we can 🚢

@ndortega
Copy link
Member

@frankier

Thanks for adding the implementation and some docs to go along with it!

Feedback:

  • I also like having an explicit keyword in the serve() function to integrate the revise handler. This approach is very similar t other web frameworks that have built in reloading.

Questions:

  • Does the@oxidise macro need to loaded in the target module to work? Or can It use the default global state from oxygen?

Ideas:

  • Based on what @JanisErdmanis said, I was wondering if we could generate this sandbox module so they don't have to write it manually.
    using Revise
    using Oxygen
    using MyModule
    MyModule.serve(revise=:eager)
    This should be possible since I've created modules on the fly inside the instances.jl file here. I know this is introducing some extra changes but this would make the api much closer to "FastAPI" and remove the need for a debug.jl script altogether.

@frankier
Copy link
Contributor Author

Does the @oxidise macro need to loaded in the target module to work? Or can It use the default global state from oxygen?

I've tested this, and we can either have something like this:

eager.jl

using Revise
using Oxygen
using ProjectBasedDemo2

Oxygen.serve(revise=:eager)

ProjectBasedDemo2.jl

module ProjectBasedDemo2

using Oxygen

@get("/") do req
    "home" 
end

end

In this case the route is not registered and so opening up / gives a 404. I guess this is to do with precompilation.

We can fix this by putting it in ___init__:

module ProjectBasedDemo2

using Oxygen

function __init__()
    @get("/") do req
        "home" 
    end
end

end

But then Revise doesn't see updates. Possibly because it has been executed in the context of the Main module which is not tracked by Revise.

Based on what @JanisErdmanis said, I was wondering if we could generate this sandbox module so they don't have to write it manually.

I'm unclear on what that would look like both from the user's perspective and from an implementation perspective. Is it that you want everything to be in a single file? While not knowing the details, I will make a guess that generated modules like this wouldn't be able to benefit from precompilation. Then again, perhaps if @JanisErdmanis knows a lot about this then he could make a follow-up PR after this one is merged to add this in.

In terms of removing boilerplate, one perspective is that by moving to the module format like this, eventually the entrypoint script can be removed in the simple case and replaced with a generic entrypoint which takes the target module as an argument. This puts us back to a single file.

In summary, my current perspective is that @oxidise works fine and I'm not sure there is a need for an alternative, but perhaps that alternative could be added later.

@JanisErdmanis
Copy link
Contributor

The idea I had about evaluating into a module that Revise could track is covered with includet provided by Revise. I had forgotten about its existence when proposing includeox and evaluation in the modules. I haven’t used includet and did jump straight into using package structure. If there are issues with includet then they should be addressed with Revise itself. It does not seem productive to emulate the behaviour within Oxygen itself.

Regarding the issue, I am still curious whether it can cause a race condition and crash Julia when Revise.revise() is executed in a multithreaded scenario. It seems so intrinsic that it should stop the scheduler and not allow anything else to happen when Revise runs. Hence, the lock is unnecessary here.

@frankier
Copy link
Contributor Author

frankier commented Sep 23, 2024

includet

The usage of includet is recommended against in the Revise documentation. They recommend using a package + module. That way you get optimizations like precompilation.

In addition this still requires 2 files. Now we have an entrypoint script, and the file with the routes targetted by includet.

However, in case this is something people are interested in supporting, they could test it out, add it to the documentation and detect this case and remove the warning for this case in a follow-up PR.

Hence, the lock is unnecessary here.

The normal usage of Revise e.g. in the interpreter is to be (automatically) called only in the interactive thread. The multithreaded behaviour of Revise is a bit beyond the scope of things here. I guess not many people understand its implementation well enough to answer exactly what cases are protected against.

The locks, however, are not there to protect data structures against access from multiple threads, they're to serialise revisions in order prevent duplicated work. This is relevant also for single-threaded execution where multiple tasks can still interleave. This is important, since I have read enough of Revise's source code to determine that any locking is at a more fine grained level and does not lock the whole call to Revise and so multiple threads could start the first stage of Revise, deciding which parts of the code need to be revised at the same time.

@frankier
Copy link
Contributor Author

Okay I've asked here: https://discourse.julialang.org/t/is-it-safe-to-run-revise-from-multiple-threads/119743

@frankier
Copy link
Contributor Author

Julian Samaroo from Slack #helpdesk agreed that there is probably an issue with Revise's internal data global state being manipulated. So I guess together with the thundering herd issue, there are two good reasons to keep the locks.

@frankier
Copy link
Contributor Author

Issue filed with Revise here: timholy/Revise.jl#845

@JanisErdmanis
Copy link
Contributor

I also looked into the discussion on the Slack. The lock on Revise.revise is indeed necessary as there are multiple global variables manipulated within the function. Thank you for exploring this possibility!

@frankier
Copy link
Contributor Author

frankier commented Sep 25, 2024

It looks like the next release of Revise will have internal locking of Revise.revise(...). I could remove the locks and then we could check the version of Revise, and if it's an old one error when revise is passed together with parallel=true.

@JanisErdmanis
Copy link
Contributor

error when revise is passed together with parallel=true

Excellent! We were drawn into details where we tried to fix the use of Revise in the multithreaded scenario, but ultimately, there does not seem to be a use case for multiple threads when developing a service.

I hope we can merge this PR. Further improvements to address a major use case do not seem to conflict with what has already been done here.

@frankier
Copy link
Contributor Author

there does not seem to be a use case for multiple threads when developing a service

I'm a bit surprised to hear your say this. I was definitely thinking you had this in mind when bringing up concurrency issues.

Anyway, we've made Revise add locks now, so I've just added a warning that it won't work with old Revise. It may be useful for speeding things up in development.

Let's merge?

@JanisErdmanis
Copy link
Contributor

My only complaint about the :eager mode regarding concurrency was that it could take 2T time instead of T when there are two quick subsequent changes in the codebase. I guess I mentioned the potential for threading issues, which is my bad 🧵

@frankier
Copy link
Contributor Author

@JanisErdmanis No worries! It resulted in me uncovering a few edge cases in Revise that will hopefully help with its reliability long-term and maybe someone benefits from parallel Revise in the future.

@ndortega I think the tests need to be re-run, but you need to be the one to do this.

@frankier
Copy link
Contributor Author

frankier commented Oct 8, 2024

@ndortega Friendly bump

@ndortega ndortega merged commit a07fea1 into OxygenFramework:master Oct 15, 2024
3 of 7 checks passed
@ndortega
Copy link
Member

@frankier
When you get a chance can you add tests for this code? I generally don't publish any code that has no test coverage - unless it's a high impact change. I'd say this does qualify, but I'd still like to get some basic coverage: starting a server with revise and hitting some endpoint's just to do a quick sanity check

@ndortega ndortega linked an issue Oct 17, 2024 that may be closed by this pull request
@frankier
Copy link
Contributor Author

frankier commented Oct 21, 2024

I did consider adding some kind of test test, but I guess I wasn't sure how to add a meaningful test here without mocking Revise or starting a new Julia process within the tests, both of which are quite a pain/potentially fragile.

The reason a new process would have to be started in testing to use real Revise is that the whole implementation is quite dependent on loading order of modules.

For mocking in Julia there is either SimpleMock.jl which doesn't work anymore or Mocking.jl which requires mock targets to by marked with @mock in the source code. This also means it has to be added as a direct dependency rather than a test dependency.

So it seems to me there's no nice option here, and that maybe leaving it untested for now was overall a lesser evil, but if you have some idea for a reasonably maintainable/non-fragile way then I could also help add the test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Programmatic server restart Hot module reloading
3 participants