Some APIs that accept batch requests return a sequence of separate JSON objects with no delimiter between them. The objects can still be told apart while parsing: once one complete JSON object has been parsed, the next non-whitespace character starts the next object.
For example, you might see a string like {"name":"Marco"} {"name":"Julia"}, representing two distinct JSON objects.
Currently, JSON.jl does not parse this correctly. It errors for the string case, and only parses the first object in the streaming case (without any indication that the stream was not exhausted).
Under the assumption that all JSON objects in the string have the same dicttype, I believe this can be extended to return a list of parsed objects. My first attempt is:
function parsemany(str::AbstractString;
                   dicttype=Dict{String,Any},
                   inttype::Type{<:Real}=Int64,
                   allownan::Bool=true,
                   null=nothing)
    out = Vector{dicttype}()
    pc = _get_parsercontext(dicttype, inttype, allownan, null)
    ps = MemoryParserState(str, 1)
    v = parse_value(pc, ps)
    push!(out, v)
    chomp_space!(ps)
    while hasmore(ps)
        pc = _get_parsercontext(dicttype, inttype, allownan, null)
        v = parse_value(pc, ps)
        push!(out, v)
        chomp_space!(ps)
    end
    out
end
Example:
julia> JSON.parsemany(s)
2-element Vector{Dict{String, Any}}:
 Dict("name" => "Marco")
 Dict("name" => "Julia")
# correctly errors on a malformed JSON object
julia> JSON.parsemany(s[1:end-1])
ERROR: Unexpected end of input
Line:0
Around:...":"Julia"... ^
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] _error(message::String, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:140
[3] byteat
@ ~/.julia/dev/JSON/src/Parser.jl:49 [inlined]
[4] parse_object(pc::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:233
[5] parse_value(pc::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:166
[6] parsemany(str::String; dicttype::Type, inttype::Type{Int64}, allownan::Bool, null::Nothing)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:472
[7] parsemany(str::String)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:464
[8] top-level scope
@ REPL[10]:1
# notice the first object is not properly closed
julia> s = "{\"name\":\"Marco\" {\"name\":\"Julia\"}"
"{\"name\":\"Marco\" {\"name\":\"Julia\"}"
# fails to parse
julia> JSON.parsemany(s)
ERROR: Expected ',' here
Line:0
Around:...{"name":"Marco" {"name":"Julia"}...^
Stacktrace:
[1] error(s::String)
@ Base ./error.jl:33
[2] _error(message::String, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:140
[3] _error_expected_char(c::UInt8, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:83
[4] skip!
@ ~/.julia/dev/JSON/src/Parser.jl:80 [inlined]
[5] parse_object(pc::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:234
[6] parse_value(pc::JSON.Parser.ParserContext{Dict{String, Any}, Int64, true, nothing}, ps::JSON.Parser.MemoryParserState)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:166
[7] parsemany(str::String; dicttype::Type, inttype::Type{Int64}, allownan::Bool, null::Nothing)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:467
[8] parsemany(str::String)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:464
[9] top-level scope
@ REPL[16]:1
# note the second one is not properly opened
julia> s = "{\"name\":\"Marco\"} \"name\":\"Julia\"}"
"{\"name\":\"Marco\"} \"name\":\"Julia\"}"
# fails, though this case should have a better error message in the final version
julia> JSON.parsemany(s)
ERROR: MethodError: Cannot `convert` an object of type String to an object of type Dict{String, Any}
Closest candidates are:
convert(::Type{T}, ::T) where T<:AbstractDict at abstractdict.jl:520
convert(::Type{T}, ::AbstractDict) where T<:AbstractDict at abstractdict.jl:522
convert(::Type{T}, ::T) where T at essentials.jl:205
...
Stacktrace:
[1] push!(a::Vector{Dict{String, Any}}, item::String)
@ Base ./array.jl:932
[2] parsemany(str::String; dicttype::Type, inttype::Type{Int64}, allownan::Bool, null::Nothing)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:473
[3] parsemany(str::String)
@ JSON.Parser ~/.julia/dev/JSON/src/Parser.jl:464
[4] top-level scope
@ REPL[18]:1
# works even with no space
julia> s = "{\"name\":\"Marco\"}{\"name\":\"Julia\"}"
"{\"name\":\"Marco\"}{\"name\":\"Julia\"}"
julia> JSON.parsemany(s)
2-element Vector{Dict{String, Any}}:
 Dict("name" => "Marco")
 Dict("name" => "Julia")
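For the case above where the second object is missing its opening brace, the MethodError comes from pushing a parsed String into a vector of dicts. One possible way to get a clearer message is to check the parsed value before pushing it. The sketch below is only an illustration: `push_checked!` is a hypothetical helper that does not exist in JSON.jl, and it uses nothing but Base so it can be tried on its own.

```julia
# Hypothetical helper, not part of JSON.jl: push `v` onto `out` only if it has
# the expected element type, otherwise report which top-level value was wrong.
function push_checked!(out::Vector{T}, v, index::Integer) where {T}
    if !(v isa T)
        throw(ArgumentError(
            "top-level value #$index is a $(typeof(v)), expected $T; " *
            "is an opening '{' missing?"))
    end
    push!(out, v)
    return out
end

out = Vector{Dict{String,Any}}()
push_checked!(out, Dict{String,Any}("name" => "Marco"), 1)

# A bare string, as produced by parsing `"name"` at top level, is rejected
# with an ArgumentError instead of a MethodError from `convert`.
err = try
    push_checked!(out, "name", 2)
    nothing
catch e
    e
end
```

Inside `parsemany`, the two `push!(out, v)` calls could be swapped for a check along these lines, with a counter supplying `index`. Whether the final version should report a position in the input instead is an open design question.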
Is this an acceptable addition to JSON.jl? One argument in its favor is that, while users could split the string themselves, doing so amounts to writing their own JSON parser: they have to correctly handle all of the edge cases, nesting, etc. in order to determine where the outermost opening and closing brackets are. Without access to the internal helper methods of JSON.jl, this is a big ask.
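To give a rough idea of what splitting by hand involves, here is a minimal sketch of a boundary finder. `toplevel_object_ranges` is a hypothetical name, not something in JSON.jl. It assumes every top-level value is an object, and it only tracks brace depth, string literals, and backslash escapes without validating the JSON in between.

```julia
# Hypothetical helper, not part of JSON.jl: return the index ranges of the
# top-level {...} objects in `str`. It only tracks brace depth, string
# literals, and backslash escapes; it does not validate the JSON itself.
function toplevel_object_ranges(str::AbstractString)
    ranges = UnitRange{Int}[]
    depth = 0          # current brace nesting level
    start = 0          # index of the '{' that opened the current object
    in_string = false  # true while inside a "..." literal
    escaped = false    # true right after a backslash inside a string
    for (i, c) in pairs(str)
        if in_string
            if escaped
                escaped = false
            elseif c == '\\'
                escaped = true
            elseif c == '"'
                in_string = false
            end
        elseif c == '"'
            in_string = true
        elseif c == '{'
            depth == 0 && (start = i)
            depth += 1
        elseif c == '}'
            depth == 0 && throw(ArgumentError("unmatched '}' at index $i"))
            depth -= 1
            depth == 0 && push!(ranges, start:i)
        end
    end
    (depth == 0 && !in_string) || throw(ArgumentError("unterminated object"))
    return ranges
end

# Braces and escaped quotes inside string values must not affect the split.
s = "{\"name\":\"Mar}co\"} {\"a\":{\"b\":\"\\\"{\"}}"
chunks = [s[r] for r in toplevel_object_ranges(s)]
```

Even this much state handling only finds the boundaries. Each chunk still has to be parsed, and malformed input inside an object goes unnoticed until then, which is why doing the work inside the parser seems preferable.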