RFC: Initialize runtime globals during __init__ #703
Nice, I haven't fully looked at the PR in detail, but I think we should go for something like this. I actually prototyped a similar idea beforehand over at https://github.com/musm/HDF5.jl/tree/refs; however, I used constant global refs, which were initialized at runtime in a dumb way, without the magic (which is super nice in terms of the overall user interface and ergonomics). I never fully pushed it through, but this PR got me interested in comparing how the different approaches affect package precompilation and load times. Here are the timings I get (using your gist for timings)
I'm not too concerned with the precompile times, since the PR here does more codegen. Comparing the package load times, the hit is really not too bad in general. Using const refs does seem to offer improved precompilation and package load times, which I assume comes from the fact that they are global constants, so their types are known before runtime initialization. In summary, the package load time hit is 15% using this PR and 5% using constant refs (https://github.com/musm/HDF5.jl/tree/refs). The constant ref approach does make using the constants less ergonomic, which is why I think I held off from it: I couldn't think of a clean interface for easily accessing the constants. EDIT: I should actually review the PR in detail before commenting, because this does use const refs; I just didn't realize it since I didn't look at the PR in detail 😄 I think in principle it should be possible to segfault HDF5 master by opening multiple HDF5.jl instances from different Julia processes, which would demonstrate why we need a PR like this that makes these constants runtime initialized. However, I haven't been able to do so successfully.
I just added a new commit which brings down both the precompile and load time just a little bit. The issue was that I was defining the
The SnoopCompile setup:
julia> using SnoopCompileCore
julia> invs = @snoopr using HDF5;
julia> using SnoopCompile
Before: (commit b6ab18b using singleton types)
julia> trees = invalidation_trees(invs)
1-element Vector{SnoopCompile.MethodInvalidations}:
 inserting getproperty(::Type{H5E}, sym::Symbol) in HDF5 invalidated:
   backedges: 1: superseding getproperty(x::Type, f::Symbol) in Base at Base.jl:28 with MethodInstance for getproperty(::DataType, ::Symbol) (1220 children)
   3 mt_cache
After: (commit f09463a using singleton instances)
julia> trees = invalidation_trees(invs)
SnoopCompile.MethodInvalidations[]
And timing precompile and load from master and the before/after singleton commits:
"""
macro defconstants(prefix::Symbol, expr::Expr)
    isexpr(expr, :block) || error("Expected block expression")
    stmts = expr.args
Maybe better to call `Base.remove_linenums!(expr1)` first?
Possibly. I don't know which is faster to execute. I was guessing the `isa`-`continue` check might be, since it wouldn't have to modify any arrays.
So in the new change |
Yes. |
Yeah I think this is pretty cool. In general, I'm comfortable with pushing through with this, because in principle it is the correct thing to do (runtime initialization) and the ergonomics of using these types is enhanced, albeit behind some chicanery, which I don't think is all that different from, say factorization using I wonder if we can block redefinitions of truly constant values or error them (without a runtime or compile time cost), because right now for non-refs things like |
It's definitely possible, but I'm a little worried that at some point the long chain of
Ah, yes, I should definitely add an error path to both |
I haven't thought about the implementation in great detail, but that sounds like it could work. I guess that means we would have to hide all of the lower-level functions in same 'private/api' module. Should we also fully commit to the hack and define
I'd been wondering the same thing. |
I guess we should make a decision on the type of 'public' interface we want to go ahead with in the future. If we commit to this style of using 'dot' to prefix the various HDF5 modules, instead of our current convention of using underscore, then this change is very desirable. If not, this change is still desirable in the sense that it improves initialization of globals and makes them much more ergonomic to use instead of trying to figure out which is a runtime global and having to use
I'm kind of interested to see whether any other packages have used 'hacks' that are similar in spirit (that feels a bit too strong a word here, but I think you get my point).
setfn = quote
    function Base.setproperty!(::$einnermod.$prefix, sym::Symbol, value)
        $(setbody...)
        error($(string(prefix) * "."), sym, " cannot be set")
We should probably change the error message to be consistent with what is printed in Base. Either:
Assigning a variable in another module:
"ERROR: cannot assign variables in other modules"
Trying to redefine a constant:
"ERROR: cannot declare x constant; it already has a value"
I like the second one because it's still technically correct.
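A minimal sketch of what such an error path could look like, assuming a singleton namespace object like the one this PR generates. The names `SetDemo`, `H5PNamespace`, `H5P`, and `_DEFAULT` are hypothetical and not taken from the PR, and the message only imitates the wording of the second Base error quoted above.

```julia
module SetDemo

# Hypothetical singleton type used purely for dispatch.
struct H5PNamespace end
const H5P = H5PNamespace()

# Hypothetical runtime value stored in a Ref.
const _DEFAULT = Ref{Int}(0)

# Reads are forwarded to the underlying storage.
function Base.getproperty(::H5PNamespace, sym::Symbol)
    sym === :DEFAULT && return _DEFAULT[]
    error("H5P.", sym, " is not defined")
end

# Every assignment through the namespace is rejected, with a message
# worded like Base's "cannot declare x constant" error.
function Base.setproperty!(::H5PNamespace, sym::Symbol, value)
    error("cannot declare H5P.", sym, " constant; it already has a value")
end

end # module
```

With this in place, `SetDemo.H5P.DEFAULT = 1` throws an `ErrorException` instead of silently rebinding anything, while `SetDemo.H5P.DEFAULT` still reads the stored value.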
* Hide all underlying constants within private module
Co-authored-by: Mustafa M <[email protected]>
This is crucial to avoid creating a huge number of invalidations in Base code. Using SnoopCompile (on Julia master ~v1.6):

```julia-repl
julia> using SnoopCompileCore

julia> invs = @snoopr using HDF5;

julia> using SnoopCompile
```

**Before:**

```julia-repl
julia> trees = invalidation_trees(invs)
1-element Vector{SnoopCompile.MethodInvalidations}:
 inserting getproperty(::Type{H5E}, sym::Symbol) in HDF5 invalidated:
   backedges: 1: superseding getproperty(x::Type, f::Symbol) in Base at Base.jl:28 with MethodInstance for getproperty(::DataType, ::Symbol) (1220 children)
   3 mt_cache
```

**After:**

```julia-repl
julia> trees = invalidation_trees(invs)
SnoopCompile.MethodInvalidations[]
```
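The difference between the two commits can be illustrated with a small sketch. It assumes the "before" version dispatched on the type itself (as the `getproperty(::Type{H5E}, ...)` signature in the log suggests) and the "after" version dispatches on an instance of a private singleton type. The names `InvalidationDemo`, `H5EType`, and `_DEFAULT` are hypothetical.

```julia
module InvalidationDemo

# Before (left commented out): a method on the *type* H5E. It is more
# specific than Base's getproperty(x::Type, f::Symbol), which is the method
# the invalidation log reports as superseded.
#
#   struct H5E end
#   Base.getproperty(::Type{H5E}, sym::Symbol) = ...

# After: dispatch on an *instance* of a singleton type. The new method only
# applies to values of a type that no previously compiled code has seen.
struct H5EType end
const H5E = H5EType()

const _DEFAULT = 0  # hypothetical placeholder for a real library constant

function Base.getproperty(::H5EType, sym::Symbol)
    sym === :DEFAULT && return _DEFAULT
    error("H5E.", sym, " is not defined")
end

end # module
```

User code is unchanged in both versions: `InvalidationDemo.H5E.DEFAULT` still reads like a module-qualified constant.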
This isn't necessarily a real proposal — I wanted to explore how far we could get with actually doing runtime initialization of the in-principle dynamic global "constants" from the HDF5 library while simultaneously not adding a burden on users to know whether a value is a true constant or a runtime-constant.
As it turns out, I was able to hide the difference from users by using (abusing?) `getproperty` defined on structs. The basic idea is:

* `getproperty` can evaluate an expression for each use of `name.symbol`, so a large `if-elseif` block can be used to hide whether a value is a true constant (and therefore return `symbolname`) or a dynamic runtime value stored in a `Ref` (and therefore return `symbolname[]`).
* `getproperty` cannot be defined on a module, so I've "abused" a singleton `struct` for method dispatch. The runtime values are actually stored in a [lightly] name-mangled bare module.
* … `Ref` containers. `setindex!` allows easily initializing the value during `__init__` no matter the internal implementation.

Pros:

* … `Float16` on similar footings to all the built-in data types: the datatype is constructed at initialization and just fills in its `Ref` container. No need for runtime `@eval`, which is otherwise necessary to avoid requiring users to do the Ref-indexing.
* … `Float16` to actually work. There are some changes to uses of `h5t_get_native_type` in "Added support for `Float16`" (#341) that I haven't replicated here, and not including those changes means the generic `read()` doesn't work due to a size mismatch between the actual Float16 datatype (2 bytes) and its native-ified datatype (4 bytes). I've just included the last commit to show that adding a new "constant" is actually very easy with this scheme.
* … `propertynames` also means you get very convenient tab completion as well.

Cons:

* … `Ref` containers at `__init__` must add a bit of load time on normal uses as well.
* … `getproperty`s to just the relevant expression, but I haven't checked prior versions for whether this is actually adding a large runtime if-else chain to every use.
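The scheme described above can be sketched by hand roughly as follows. This is an illustration under assumptions, not the macro-generated code from the PR: the module name `ConstDemo`, the namespace type `H5TNamespace`, the private module `_H5T`, and the integer placeholder values are all hypothetical stand-ins for real HDF5 identifiers.

```julia
module ConstDemo

# Private, lightly name-mangled bare module holding the underlying values.
baremodule _H5T
    using Base: Ref
    const NATIVE_INT8 = 1                # stand-in for a true constant
    const NATIVE_FLOAT16 = Ref{Int}(-1)  # stand-in for a runtime value
end

# getproperty cannot be overloaded for a module, so a singleton struct
# instance is used for dispatch instead.
struct H5TNamespace end
const H5T = H5TNamespace()

# The if-elseif chain hides whether a name is a plain constant or a Ref.
function Base.getproperty(::H5TNamespace, sym::Symbol)
    if sym === :NATIVE_INT8
        return _H5T.NATIVE_INT8
    elseif sym === :NATIVE_FLOAT16
        return _H5T.NATIVE_FLOAT16[]
    else
        error("H5T.", sym, " is not defined")
    end
end

# Listing the names enables tab completion on H5T.<TAB>.
Base.propertynames(::H5TNamespace) = (:NATIVE_INT8, :NATIVE_FLOAT16)

# Runtime values are filled in when the module is loaded.
function __init__()
    _H5T.NATIVE_FLOAT16[] = 42  # a real package would query the library here
end

end # module
```

Callers write `ConstDemo.H5T.NATIVE_INT8` and `ConstDemo.H5T.NATIVE_FLOAT16` without needing to know which of the two is backed by a `Ref`.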