Potential for URI parsing performance improvement. #151
Comments
Hacking the benchmark script to use a pre-#135 version of HTTP.jl, and removing a few Unicode URLs that the old HTTP.jl could not handle, produces a very similar result.
So, this doesn't look like a regression caused by #135. The only advantage I can see in the slower |
- Remove args -> string -> parse -> URI round-trip from constructors & merge()
- Use parse_uri_reference() instead of slower http_parser_parse_url()
With this change URI parsing is 2x faster again: 0aef98a
Something in latest v0.7 has made the old http_parser_parse_url parser even slower.
HTTP.jl uses the http_parser_parse_url function to parse URLs.
HTTP.jl/src/urlparser.jl
Line 161 in 6ee7083
I believe this code is based on ngx_http_parse.c from NGINX. @quinnj is that right?
I recently added some more URI parsing tests based on https://github.com/cweb/url-testing/blob/master/urls.json and in the process of debugging made a simple regex pattern based on the regex from RFC 3986.
It turns out that the simple regex parser is faster than http_parser_parse_url.
Running test/uri_benchmark.jl shows that the regex parser runs in 47% of the time taken by http_parser_parse_url.
The regex parser is in URIs.jl here:
HTTP.jl/src/URIs.jl
Lines 101 to 121 in 6ee7083
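For anyone reading this without the linked source open, here is a minimal sketch of the general idea, using the generic regex printed in RFC 3986 Appendix B. This is not the pattern from URIs.jl, which may well differ; the names `RFC3986_APPENDIX_B` and `split_uri_reference` are made up for this illustration.

```julia
# Generic URI-reference regex from RFC 3986, Appendix B.
# Assumption: illustrative only, not the pattern HTTP.jl actually uses.
const RFC3986_APPENDIX_B =
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"

# Hypothetical helper: split a URI reference into its five components.
function split_uri_reference(s::AbstractString)
    m = match(RFC3986_APPENDIX_B, s)
    m === nothing && throw(ArgumentError("not a URI reference: $s"))
    # Optional groups that did not participate in the match are `nothing`.
    part(i) = m.captures[i] === nothing ? "" : String(m.captures[i])
    # Group numbers follow RFC 3986 Appendix B:
    # 2 = scheme, 4 = authority, 5 = path, 7 = query, 9 = fragment.
    return (scheme = part(2), authority = part(4), path = part(5),
            query = part(7), fragment = part(9))
end

parts = split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")
println(parts)
```

The whole split is a single `match` call, with no character-by-character state machine. Note that this generic pattern only splits the components; it does not validate them or break the authority into userinfo, host and port.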