Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add an offline precompile workload #1195

Merged
merged 10 commits into from
Nov 23, 2024
Merged

Conversation

IanButterworth
Copy link
Member

@IanButterworth IanButterworth commented Nov 20, 2024

Fixes #1194
Or at least makes decent progress.

Before

julia> using HTTP
   1732.9 ms  ✓ HTTP (precompile time)

julia> @time HTTP.get("http://google.com");
  4.109167 seconds (13.00 M allocations: 658.440 MiB, 2.90% gc time, 96.96% compilation time: 2% of which was recompilation)

julia> @time HTTP.get("http://google.com");
  0.155600 seconds (11.27 k allocations: 746.781 KiB, 12.46% compilation time)

julia> @time HTTP.get("http://google.com");
  0.122175 seconds (574 allocations: 196.016 KiB)

This PR

julia> using HTTP
   6780.5 ms  ✓ HTTP (precompile time)

julia> @time HTTP.get("http://google.com");
  0.678762 seconds (1.17 M allocations: 64.638 MiB, 1.65% gc time, 77.77% compilation time: 86% of which was recompilation)

julia> @time HTTP.get("http://google.com");
  0.174314 seconds (11.28 k allocations: 746.500 KiB, 10.45% compilation time)

julia> @time HTTP.get("http://google.com");
  0.137826 seconds (596 allocations: 196.641 KiB)

All the big precompiles are baked in and just one big recompile left.
Screenshot 2024-11-19 at 10 38 17 PM

#=  339.1 ms =# precompile(Tuple{HTTP.var"##request#32", Nothing, Array{UInt8, 1}, Nothing, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, typeof(HTTP.request), HTTP.MessageRequest.var"#makerequest#messagelayer##0"{HTTP.MessageRequest.var"#makerequest#1#messagelayer##1"{HTTP.RedirectRequest.var"#redirects#redirectlayer##0"{HTTP.RedirectRequest.var"#redirects#1#redirectlayer##1"{HTTP.HeadersRequest.var"#defaultheaders#headerslayer##0"{HTTP.HeadersRequest.var"#defaultheaders#1#headerslayer##1"{HTTP.CookieRequest.var"#managecookies#cookielayer##0"{HTTP.CookieRequest.var"#managecookies#1#cookielayer##1"{HTTP.RetryRequest.var"#manageretries#retrylayer##0"{HTTP.RetryRequest.var"#manageretries#1#retrylayer##1"{HTTP.ConnectionRequest.var"#connections#connectionlayer##0"{HTTP.ConnectionRequest.var"#connections#1#connectionlayer##1"{HTTP.TimeoutRequest.var"#timeouts#timeoutlayer##0"{HTTP.TimeoutRequest.var"#timeouts#1#timeoutlayer##1"{HTTP.ExceptionRequest.var"#exceptions#exceptionlayer##0"{HTTP.ExceptionRequest.var"#exceptions#1#exceptionlayer##1"{typeof(HTTP.StreamRequest.streamlayer)}}}}}}}}}}}}}}}}, String, String, Nothing, Array{UInt8, 1}, Nothing}) # recompile

Though there is an issue open about recompiles being incorrectly marked as such JuliaLang/julia#56155

Importantly, are these __init__()'s safe to call during precompilation?

@IanButterworth
Copy link
Member Author

IanButterworth commented Nov 20, 2024

With this PR, on 1.11 (because SnoopCompile is broken on julia master)

julia> using SnoopCompileCore

julia> invs = @snoop_invalidations using HTTP;

julia> using SnoopCompile, AbstractTrees

julia> trees = invalidation_trees(invs)
8-element Vector{SnoopCompile.MethodInvalidations}:
 inserting joinpath(uri::URIs.URI, parts::String...) @ URIs ~/.julia/packages/URIs/7Tg5r/src/URIs.jl:584 invalidated:
   mt_backedges: 1: signature Tuple{typeof(joinpath), Any, String} triggered MethodInstance for Artifacts.jointail(::Any, ::String) (0 children)

 inserting write(io::IO, s::Union{Base.AnnotatedString, SubString{<:Base.AnnotatedString}}) @ StyledStrings ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:255 invalidated:
   backedges: 1: superseding write(io::IO, s::AbstractString) @ Base strings/io.jl:196 with MethodInstance for write(::HTTP.Streams.Stream{HTTP.Messages.Response, HTTP.Connections.Connection{Sockets.TCPSocket}}, ::Base.AnnotatedString{String}) (1 children)

 inserting write(io::IO, c::Base.AnnotatedChar) @ StyledStrings ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:270 invalidated:
   backedges: 1: superseding write(io::IO, x) @ Base io.jl:792 with MethodInstance for write(::HTTP.Streams.Stream{HTTP.Messages.Response, HTTP.Connections.Connection{Sockets.TCPSocket}}, ::Base.AnnotatedChar{Char}) (1 children)

 inserting print(io::IO, c::Base.AnnotatedChar) @ StyledStrings ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:281 invalidated:
   backedges: 1: superseding print(io::IO, c::AbstractChar) @ Base char.jl:253 with MethodInstance for print(::IOContext{IOBuffer}, ::Base.AnnotatedChar{Char}) (1 children)
              2: superseding print(io::IO, c::AbstractChar) @ Base char.jl:253 with MethodInstance for print(::IOBuffer, ::AbstractChar) (1 children)
              3: superseding print(io::IO, c::AbstractChar) @ Base char.jl:253 with MethodInstance for print(::Base.AnnotatedIOBuffer, ::Base.AnnotatedChar{Char}) (1 children)

 inserting show(io::IO, c::Base.AnnotatedChar) @ StyledStrings ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:283 invalidated:
   backedges: 1: superseding show(io::IO, c::AbstractChar) @ Base char.jl:283 with MethodInstance for show(::IOBuffer, ::AbstractChar) (12 children)

 inserting typeinfo_implicit(::Type{Dates.Second}) @ Dates ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/Dates/src/periods.jl:23 invalidated:
   backedges: 1: superseding typeinfo_implicit(T) @ Base arrayshow.jl:550 with MethodInstance for Base.typeinfo_implicit(::Any) (22 children)
   1 mt_cache

 inserting kwcall(::NamedTuple, ::typeof(printstyled), io::Base.AnnotatedIOBuffer, msg...) @ StyledStrings.Legacy ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/legacy.jl:126 invalidated:
   backedges: 1: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::String, ::String, ::String) (1 children)
              2: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::String, ::String) (1 children)
              3: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::String, ::Vararg{String}) (15 children)
              4: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Int64}, ::typeof(printstyled), ::IO, ::String, ::String) (1 children)
              5: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Symbol}, ::typeof(printstyled), ::IO, ::String) (6 children)
              6: superseding kwcall(::NamedTuple, ::typeof(printstyled), io::IO, msg...) @ Base util.jl:141 with MethodInstance for Core.kwcall(::@NamedTuple{bold::Bool, italic::Bool, underline::Bool, blink::Bool, reverse::Bool, hidden::Bool, color::Int64}, ::typeof(printstyled), ::IO, ::String) (1 children)

 inserting print(io::IO, s::Union{Base.AnnotatedString, SubString{<:Base.AnnotatedString}}) @ StyledStrings ~/.julia/juliaup/julia-1.11.1+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/StyledStrings/src/io.jl:258 invalidated:
   backedges: 1: superseding print(io::IO, s::AbstractString) @ Base strings/io.jl:195 with MethodInstance for print(::IOContext{IOBuffer}, ::Base.AnnotatedString{String}) (56 children)

@Sixzero
Copy link

Sixzero commented Nov 20, 2024

Really nice, how fast this got fixed!

Removing TTFX for every julia session for simple .get request is really a valuable thing in my opinion.

Going to test the solution now!

@Sixzero
Copy link

Sixzero commented Nov 20, 2024

Okay, we are actually faster than CurlHTTP.jl! Also it fixed post request times!

This is insane! I don't know if precompilation of all julia packages took longer, but this is lifechanging, finally get/post requests are fast in julia!

@IanButterworth
Copy link
Member Author

@quinnj perhaps you're best positioned to review this?

src/precompile.jl Outdated Show resolved Hide resolved
Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this looks pretty great.

@IanButterworth
Copy link
Member Author

Great. What about the __init__'s?

I think they're robust to being called twice in a row.. i.e. push!(empty!
Except for ConnectionRequest.__init__() which just pushes? Perhaps that one's not needed here?
https://github.com/IanButterworth/HTTP.jl/blob/4510bd2a0cb818abe71d75fb71d2b57c01b0dfa4/src/clientlayers/ConnectionRequest.jl#L23-L31

@Sixzero
Copy link

Sixzero commented Nov 21, 2024

Things work well after 1 day of usage. I guess compat is also good.

@IanButterworth
Copy link
Member Author

@quinnj I don't have merge rights here

@timholy
Copy link
Contributor

timholy commented Nov 22, 2024

Calling __init__ from a precompile workload should be fine. The TTL increase is a little sad. (We really should start caching the results of invalidation checks.) Is there any part of this workload that you don't need precompiled and could be moved to a @setup block?

@KristofferC
Copy link
Contributor

The time to load increase makes the total latency bigger than it was before, or?

Regarding inits, those should already run, no?

@IanButterworth
Copy link
Member Author

Time to load is the same. I showed time to precompile.

@KristofferC
Copy link
Contributor

Aha, the using made me think it was just loading. All good then 👍. (I see @timholy made the same mistake heh).

@fredrikekre
Copy link
Member

It is interesting that this works so well given that the test case takes the https path while precompilation is for http.

@fredrikekre
Copy link
Member

julia> using HTTP

julia> @time HTTP.get("http://google.com");
  1.416276 seconds (1.73 M allocations: 83.683 MiB, 2.23% gc time, 66.31% compilation time)
julia> using HTTP

julia> @time HTTP.get("https://google.com");
  2.386149 seconds (3.09 M allocations: 153.946 MiB, 2.69% gc time, 95.02% compilation time)

@fredrikekre
Copy link
Member

Note also that HTTP compiles quite a lot depending on the input arguments, but this PR is still very benefitial for these cases too:

julia> using HTTP

julia> @time HTTP.get("https://google.com");
  2.792014 seconds (2.88 M allocations: 152.468 MiB, 1.08% gc time, 93.55% compilation time: 35% of which was recompilation)

julia> @time HTTP.get("https://google.com", ["User-Agent" => "julia"]);
  0.722978 seconds (34.91 k allocations: 1.862 MiB, 3 lock conflicts, 11.12% compilation time)

julia> @time HTTP.get("https://google.com"; retries = 5);
  0.440146 seconds (266.69 k allocations: 13.969 MiB, 3 lock conflicts, 78.77% compilation time)

julia> @time HTTP.get("https://google.com"; retry = false);
  0.447537 seconds (263.32 k allocations: 13.759 MiB, 3 lock conflicts, 77.86% compilation time)

julia> @time HTTP.get("https://google.com"; redirect_limit = 5);
  0.170655 seconds (43.77 k allocations: 2.449 MiB, 3 lock conflicts, 39.95% compilation time)

@IanButterworth
Copy link
Member Author

IanButterworth commented Nov 22, 2024

I know you're saying that its probably not necessary, but if I bake those in

julia> using HTTP
Precompiling HTTP...
Info Given HTTP was explicitly requested, output will be shown live
  2.267022 seconds (15.86 M allocations: 800.601 MiB, 4.71% gc time, 99.81% compilation time)
  0.020154 seconds (125.76 k allocations: 6.660 MiB, 96.36% compilation time)
  0.075205 seconds (334.63 k allocations: 17.019 MiB, 99.14% compilation time)
  0.075720 seconds (331.12 k allocations: 16.825 MiB, 99.05% compilation time)
  0.023594 seconds (108.50 k allocations: 5.781 MiB, 97.43% compilation time)
  1 dependency successfully precompiled in 7 seconds. 29 already precompiled.
  1 dependency had output during precompilation:
┌ HTTP
│  [Output was shown above]
└

julia> begin
           url = "https://google.com";
           @time HTTP.get(url);
           @time HTTP.get(url, ["User-Agent" => "julia"]);
           @time HTTP.get(url; retries = 5);
           @time HTTP.get(url; retry = false);
           @time HTTP.get(url; redirect_limit = 5);
       end;
  1.208905 seconds (3.10 M allocations: 154.791 MiB, 3.29% gc time, 84.32% compilation time)
  0.092161 seconds (13.77 k allocations: 853.656 KiB, 2 lock conflicts, 6.37% compilation time)
  0.094209 seconds (673 allocations: 185.328 KiB, 2 lock conflicts)
  0.097040 seconds (653 allocations: 255.398 KiB, 2 lock conflicts)
  0.095959 seconds (629 allocations: 183.906 KiB, 2 lock conflicts)

Without baking the kwarg ones in

  1.187503 seconds (3.10 M allocations: 154.793 MiB, 3.23% gc time, 85.06% compilation time)
  0.124479 seconds (134.01 k allocations: 6.998 MiB, 2 lock conflicts, 31.16% compilation time)
  0.346849 seconds (338.27 k allocations: 17.200 MiB, 0.98% gc time, 2 lock conflicts, 66.06% compilation time)
  0.313539 seconds (328.23 k allocations: 16.698 MiB, 1.03% gc time, 2 lock conflicts, 67.35% compilation time)
  0.148446 seconds (106.91 k allocations: 5.685 MiB, 2 lock conflicts, 29.42% compilation time)

Worth it?

Also, I assume it's not trivial to cover https in this offline workload?

@fredrikekre
Copy link
Member

I don't think it is worth it, I just picked some keywords at random, and there are lots of them.

Also, I assume it's not trivial to cover https in this offline workload?

You can probably do something like

# Set up server with self-signed cert
. If that is succesful it is probably not worth precompiling the http path.

@IanButterworth
Copy link
Member Author

Great, that works.

Now

julia> begin
               url = "https://google.com"
               @time HTTP.get(url);
               @time HTTP.get(url, ["User-Agent" => "julia"]);
               @time HTTP.get(url; retries = 5);
               @time HTTP.get(url; retry = false);
               @time HTTP.get(url; redirect_limit = 5);
       end;
  0.725483 seconds (1.74 M allocations: 83.781 MiB, 8.31% gc time, 70.41% compilation time)
  0.126748 seconds (134.04 k allocations: 6.999 MiB, 1.92% gc time, 2 lock conflicts, 34.05% compilation time)
  0.351043 seconds (338.45 k allocations: 17.206 MiB, 2 lock conflicts, 70.20% compilation time)
  0.322628 seconds (328.27 k allocations: 16.698 MiB, 1.44% gc time, 2 lock conflicts, 68.70% compilation time)
  0.151559 seconds (106.95 k allocations: 5.688 MiB, 2 lock conflicts, 34.52% compilation time)

@IanButterworth
Copy link
Member Author

New nightly failures are some unexpected allocations

HTTPVersion: Test Failed at /home/runner/work/HTTP.jl/HTTP.jl/test/httpversion.jl:19
  Expression: #= /home/runner/work/HTTP.jl/HTTP.jl/test/httpversion.jl:19 =# @allocated(HTTPVersion("1.1")) == 0
   Evaluated: 16 == 0

  Expression: #= /home/runner/work/HTTP.jl/HTTP.jl/test/httpversion.jl:20 =# @allocated(VersionNumber(HTTPVersion("1.1"))) == 0
   Evaluated: 64 == 0

@IanButterworth
Copy link
Member Author

Merge?

src/HTTP.jl Outdated
@@ -631,4 +631,9 @@ function Base.parse(::Type{T}, str::AbstractString)::T where T <: Message
return m
end

# only run if precompiling pkgimages
if VERSION >= v"1.9.0-0" && Base.JLOptions().use_pkgimages != 0 && ccall(:jl_generating_output, Cint, ()) == 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeesh; it'd be nice if PrecompileTools provided an API for all these conditions (isprecompiling?)

Comment on lines +14 to +17
# random port in the dynamic/private range (49152–65535) which are are
# least likely to be used by well-known services
_port = 57813

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# random port in the dynamic/private range (49152–65535) which are are
# least likely to be used by well-known services
_port = 57813

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This broke things because _port is an arg given to serve!, but surprisingly not 1.10, which I don't understand...

btw, I had tried do listenany without a port arg and got a method error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted the change. And I think the reason 1.10 didn't fail is that I think on 1.10 if coverage is requested pkgimages aren't used.

So I just made the precompile workload run whether or not pkgimages are requested, so that coverage isn't weird in tests.

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this @IanButterworth!

@quinnj quinnj merged commit 6aa6981 into JuliaWeb:master Nov 23, 2024
10 of 13 checks passed
@IanButterworth IanButterworth deleted the ib/precompile branch November 23, 2024 05:01
@timholy
Copy link
Contributor

timholy commented Nov 23, 2024

Late comment:

I don't think it is worth it, I just picked some keywords at random, and there are lots of them.

Typically, the answer to this is to add Base.@constprop :none in front of the function. Unless there's really some reason we think it's important to compile for each separate combination of boolean or symbol keyword arguments.

@IanButterworth
Copy link
Member Author

Thanks. I took a look into adding that but it's not clear where to, and my attempts were in vain. There's a lot of functions returning functions, so it's not particularly clear where to put the annotation.

@quinnj says he is close to the 2.0 rewrite being ready, so it might not be worth it, unless it jumps out to someone.

@timholy
Copy link
Contributor

timholy commented Nov 24, 2024

Gotcha. I assumed it was compiling many variants of get, but perhaps @fredrikekre's examples are instead meant to exercise additional code paths that are hidden behind runtime dispatch.

The use-case for @constprop :none is if you end up doing multiple expensive compiles for specific arg/kwarg combinations. Functions that take Int, Bool, and Symbol args are particularly at-risk for "dumb" recompiles. You can use MethodAnalysis.jl to count the number of MethodInstances for a specific function. This is also a use-case for pgds which lets you check a whole workload.

This should probably be used more heavily in packages than it is.

@DilumAluthge
Copy link
Member

I suspect that this PR broke precompilation of HTTP on Julia 1.9. See for example: https://github.com/JuliaRegistries/General/actions/runs/12001290587/job/33451587371

Can we skip the precompile workload on Julia 1.9?

@IanButterworth
Copy link
Member Author

Just a trailing note.. I was underestimating the improvement here because it turns out google.com is quite slow to resolve for me...

julia> using HTTP

julia> @time HTTP.get("https://julialang.org");
  0.052138 seconds (2.42 k allocations: 368.070 KiB, 49.11% compilation time)

julia> @time HTTP.get("https://julialang.org");
  0.009134 seconds (292 allocations: 212.633 KiB, 1 lock conflict)

whereas before this PR

julia> using HTTP

julia> @time HTTP.get("https://julialang.org");
  4.739799 seconds (14.82 M allocations: 740.448 MiB, 2.34% gc time, 99.51% compilation time)

julia> @time HTTP.get("https://julialang.org");
  0.009363 seconds (288 allocations: 212.633 KiB, 1 lock conflict)

@Sixzero
Copy link

Sixzero commented Nov 27, 2024

Love to see! Insane improvements guys!

@krynju
Copy link

krynju commented Nov 28, 2024

didn't see anyone showing first server response time changes

#### 1.10.10
server = HTTP.serve!("127.0.0.1", 8088; ) do req
    HTTP.Response(200)
end


julia> using HTTP

julia> server = HTTP.serve!("127.0.0.1", 8088; ) do req
           HTTP.Response(200)
       end
[ Info: Listening on: 127.0.0.1:8088, thread id: 1

shell> time curl localhost:8088
curl 'localhost:8088'  0.00s user 0.01s system 0% cpu 1.325 total

shell> time curl localhost:8088
curl 'localhost:8088'  0.01s user 0.00s system 91% cpu 0.006 total


#### 1.10.12

julia> using HTTP

julia> server = HTTP.serve!("127.0.0.1", 8088; ) do req
           HTTP.Response(200)
       end
[ Info: Listening on: 127.0.0.1:8088, thread id: 1

shell> time curl localhost:8088
curl 'localhost:8088'  0.00s user 0.00s system 0% cpu 0.479 total

shell> time curl localhost:8088
curl 'localhost:8088'  0.00s user 0.01s system 84% cpu 0.007 total

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TTFX time of request is high. MVP
8 participants