Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple selective extraction of fields #16

Open
vinipsmaker opened this issue Oct 31, 2017 · 10 comments
Open

Simple selective extraction of fields #16

vinipsmaker opened this issue Oct 31, 2017 · 10 comments

Comments

@vinipsmaker
Copy link
Contributor

vinipsmaker commented Oct 31, 2017

Hey,

a friend of mine ( @skhaz ) showed me a very interesting JSON library for C: https://github.com/cesanta/frozen

I thought it could be useful as an inspiration design for higher-level abstractions.

@breese
Copy link
Owner

breese commented Oct 31, 2017

Interesting approach to use a printf/scanf-like API. Something similar would be fairly straightforward to create with trial.protocol.

Thanks for the reference.

@vinipsmaker
Copy link
Contributor Author

Another interesting C library to parse JSONs:

But this time it doesn't serve as an inspiration to Trial.Protocol. Trial.Protocol already has integration with Boost.Serialization, so nothing missing on our side.

@vinipsmaker
Copy link
Contributor Author

Another interesting C++ library. This time, the interesting bit is the format side: https://github.com/bcsanches/JsonCreator.

@vinipsmaker
Copy link
Contributor Author

Now that #23 was closed, I think it'll be even easier to implement this one.

I think I'll implement this next (but it'll take a few weeks as I'll allocate some time to work on other stuff). What do you guys think about the following interface?

int x;
boost::variant<int, std::string, boost::none_t> y;
boost::optional<std::vector<int>> z;
auto h = [](const json::reader &reader) { /* a callback */ };
boost::string_view lit;

if (!json::scan(json_view,
                "foo", x,
                "bar.baz", y,
                "bar.a_list", z,
                "bar", h,
                "bar", lit)) {
  std::cerr << "missing field" << std::endl;
}

Every argument that is not boost::optional or boost::variant<boost::none_t, ...> is treated as required (i.e. json::scan will return false if this field is not found in the input json_view). If value field in json_field is null, then it is treated as if it was not there (and then only boost::optional<T> and boost::variant<boost::none_t, ...> can accept the (absence of this) value).

You can pass functors receiving const json::reader& if you want a callback called when this token is reached.

If you want a literal of some value, just pass a string_view.

The same field can be extracted into two or more targets (then you can, for instance, get a list of strings and the literal associated with this list at once).

More complex types like std::vector<boost::variant<boost::none_t, int>> are also supported.

My greatest concern is related to JSON objects like { "foo.bar": 42 } (i.e. if the user wants to read a field that has a dot). For such cases, I was thinking about using a tuple as an alternative path spec:

json::scan(json_view,
           "bar.foo", x,
           std::make_tuple("bar", "foo"), y,
           std::make_tuple("foo.bar"), z);

I like this idea because the common case (no . in fields) is less verbose and no escape syntax is required for the user to learn (but I'd probably write a json::make_path so it'd be easier for me to delve into the required TMP to make it work).

Oh, of course, trial::dynamic::variable is also supported, but I'll wait for the tree parsing feature #21 (and, as I said, I will take time to work on other stuff, so really plenty of time before I start this and no rush to merge branches).

Also, one last thing: I think I'd write a partial::scan and implement scan in terms of partial::scan.

I'll resist the temptation to propose the name magic_scan.

@vinipsmaker
Copy link
Contributor Author

Just an update on my previous idea.

  • I thought that a callback should return bool. On false, json::scan would consider as if a value is incompatible (e.g. trying to read to a string, but an int was found) and abort the operation and return false.
  • Possibility to read value to multiple targets at once raise too many questions (e.g. calling order if callback is present).
  • Possibility to read value to multiple targets at once is advanced usage.
  • User may unintentionally try to read a value to multiple targets, but it'd be a typo.

Therefore, I thought that this feature should be turned down. I have an update to my suggestion.

Add two functions:

  • partial::read_to(json::reader&, T).
  • partial::read_to_many(json::reader&, T...).

partial::read_to can be easily implemented in terms of the latter.

If user wants to read some field to multiple targets at once, he'll have to opt in explicitly like this:

boost::optional<int> x;
boost::optional<std::string> y;
json::scan(json_view,
           "foo.bar", [&] (json::reader &reader) {
               return partial::read_to_many(reader, x, y);
           });

Also, callback now receives non-const reader reference. json::scan will store a pointer to current token before calling the callback. Then the callback can choose to optionally consume the token itself or return the control w/o consuming token to let json::scan itself consume it.

One more feature:

std::string current_key;
json::scan(json_view,
           "foo.bar", x,
           std::make_tuple("foo", current_key), [&](json::reader&) {
               return !current_key.starts_with("_");
           });

It should be self-explanatory. Wildcard match on fields not matched by other keys on nesting level of the path spec. std::string& should be allowed only on the tail of the path spec.

Also, this feature (callback and matching on any nesting level or specific key I want) means I no longer see a use for #26.

I'll open separate issues to track partial::read_to_many(json::reader&, T...) and partial::read_to(json::reader&, T). I can use them to ease implementation, then I'll implement them first.

@breese
Copy link
Owner

breese commented Mar 17, 2018

I still haven't given much thought to this feature, but here is some arbitrary feedback...

Why should the callback signal an error with a boolean return value rather than by throwing an exception?

While I do agree that we could avoid the dotted-string syntax, we should be careful not to create a solution that could be confused with JSONPath or JSON Pointer. These are better handled with recursive iterators on dynamic::variable.

How does read_to() differ from reader::value()?

@vinipsmaker
Copy link
Contributor Author

Why should the callback signal an error with a boolean return value rather than by throwing an exception?

Using exceptions here does make sense. I'll of course change json::scan to use exceptions instead a boolean return as well. Good point.

How does read_to() differ from reader::value()?

read_to() reads a JSON value, not a JSON token:

screenshot-2018-3-24 json

The other difference is just cosmetic (result passed as a by-ref arg instead of return value).

@breese
Copy link
Owner

breese commented Mar 24, 2018

So how does read_to() differ from json::iarchive?

@vinipsmaker
Copy link
Contributor Author

So how does read_to() differ from json::iarchive?

I'm not familiar with Boost.Serialization to be honest. I went to take a more serious look now that you mentioned. It seems to me that json::iarchive should do.

Sorry about the late answer.

@vinipsmaker
Copy link
Contributor Author

I'm gonna focus on finishing Boost.Http parser now (expose chunk extensions and then add parsers combinators), so it'll be some time until I contribute to this project again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants