Skip to content

Conversation

@the-moisrex
Copy link
Member

@the-moisrex the-moisrex commented Aug 16, 2025

This is essentially the same as .get(output) -> error_code, but with one different, it does not throw when there's an error, and simply sets the error when a reference to it is given.

So this becomes possible:

.value() >> car.model >> error;

It doesn't change much more than that, it's a syntactic sugar.

struct Car {
  std::string make{};
  std::string model{};
  int year{};
  std::vector<double> tire_pressure{};

  friend error_code tag_invoke(simdjson::deserialize_tag, auto &val, Car &car) {
    simdjson::ondemand::object obj;
    error_code error;
    val.get_object() >> obj >> error;
    if (error) {
      return error;
    }
    // Instead of repeatedly obj["something"], we iterate through the object
    // which we expect to be faster.
    for (auto field : obj) {
      simdjson::ondemand::raw_json_string key;
      field.key() >> key >> error;
      if (key == "make") {
         field.value() >> car.make >> error;
      } else if (key == "model") {
        field.value() >> car.model >> error;
      } else if (key == "year") {
        field.value() >> car.year >> error;
      } else if (key == "tire_pressure") {
        field.value() >> car.tire_pressure >> error;
      }
      if (error) {
        return error;
      }
    }
    return simdjson::SUCCESS;
  }
};

We can't do the same with document and object and others that have .get(output) because they don't hold the error in themselves, but they return it after trying to set things.

So, no val >> obj >> error (in above example) for us.

@the-moisrex the-moisrex marked this pull request as ready for review August 16, 2025 10:30
@lemire
Copy link
Member

lemire commented Aug 19, 2025

@the-moisrex Ok. So I am totally open to improving our exception-free API but I don't see how what you are proposing helps.

The first principle is that we want to make it very hard to accidentally forget to check the error code. That's why we work this way...

object obj;
error_code code = something.get_object().get(objt);

We force people to capture the error code.

Previously, we let people do the following:

auto [value, error] = ...

We counted on people's good sense to always check the error code, but we got bug reports from people who simply decided to consume the values without checking the error codes.

Now, even with our new API, people still go around it and call value_unsafe() which we discourage. But, at the very least, if they call value_unsafe(), they have the word 'unsafe' in their code.

At a glance, your proposal would encourage the following usage pattern:

simdjson::ondemand::object obj;
val.get_object() >> obj;

I might be misreading your code. If it is disallowed, then fine, but if it is allowed, it is a step back. We want to really, really, really push the users to look at the error codes, we want to remove opportunities for them to skip that test.

(Exceptions work fine too, obviously.)

@the-moisrex
Copy link
Member Author

I see.

SIMDJSON gives the control of the errors to the callers, after the calls, and doesn't even store it in itself; that's forcing our hands in API design.

This idea was supposed to be part of a bigger idea that we essentially would be able to describe what we want, and where we want them, and then it would loop around and put things where they needed to be (maybe in a destructor of some returned type).

In-loop error handling style of SIMDJSON makes me think maybe we need a Transaction mechanism as well, because the users most-likely would be doing half of their application logic while they're in the loop, even though the input may not be a correct JSON, but we wouldn't know about it. The user has to manually handle any rollback on error.

I guess I shouldn't work on this idea either:

from(...)["one", "two", three"] >> one >> two >> three >> error;
auto [one, two, three, error] = from(...)["one", "two", "three"];

@the-moisrex
Copy link
Member Author

Maybe we could throw an error if error is not moved out of it in destructor?

Seems hacky, but possible.

@lemire
Copy link
Member

lemire commented Aug 19, 2025

@the-moisrex

SIMDJSON gives the control of the errors to the callers, after the calls, and doesn't even store it in itself; that's forcing our hands in API design.

The model is very similar to std::expected:

https://en.cppreference.com/w/cpp/utility/expected.html

I guess I shouldn't work on this idea either:

I disagree. We have two open PRs on this topic:

#2247

#2274

If you read the latter PR, I write "I am observing a very significant performance regression compared to manually provided code."

It is a common pattern to be extracting multiple keys and it would be handy to do it more conveniently and maybe more quickly, but it is not a great thing to do it more slowly.

@lemire
Copy link
Member

lemire commented Aug 19, 2025

Maybe we could throw an error if error is not moved out of it in destructor?

Right. Or you flip the script and go result >> error >> value.

But what is the purpose ?

Your code looks simpler because you have refactored the error handling:

Capture d’écran, le 2025-08-19 à 14 21 27

Your code is nicer, no question about it, but I don't think it is nicer because of the use of operator>>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants