davidkunz | 6 months ago

A little bit unrelated, but how do people deal with the absence of payloads in Zig errors? For example, when parsing a JSON string, the error `UnexpectedToken` is not very helpful. Are libraries typically designed to accept an optional input to store potential errors?

quantummagic|6 months ago

The idiomatic way in Zig is to return the simple unadorned error, but return detailed error data through a pointer argument passed into the function, allowing the function to fill in extra information before returning an error.

    const MyError = error{ FileNotFound, PermissionDenied };

    fn readFile(path: []const u8, outErrInfo: *ErrorInfo) ![]const u8 {
        if (fileMissing) {
            if (outErrInfo) |info| {
                info.* = ErrorInfo{
                    .code = MyError.FileNotFound,
                    .message = "File missing",
                    .line = @src().line,
                };
            }
            return MyError.FileNotFound;
        }
        return data; // success
    }

The advantage of this is that everything is explicit, and it is up to the caller to arrange memory usage for error data; i.e., the compiler does not trigger any implicit memory allocation to accommodate error returns. This is a fundamental element of Zig's design: there are no hidden or implicit memory allocations.

nextaccountic|6 months ago

> The advantage of this is that everything is explicit, and it is up to the caller to arrange memory usage for error data

Likewise, this has the disadvantage that the caller must allocate space for the error payload, even if the error is very unlikely.

randyrand|6 months ago

IMO error objects should’ve included a pointer to some extra preallocated memory.

In my C code I always allocate my error objects first, usually with 1024 bytes just for error strings.

In cases where I don’t care for error strings, I allocate 0 bytes for them.

I have a simple function to append error strings, and it checks for space. So all the code is agnostic about whether this extra space exists.

Works wonderfully.
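
For illustration, here is a rough Zig translation of that idea, not the C code itself. `ErrorText`, `append`, `text`, and `parseDigit` are made-up names; the only point is that the append helper checks for space, so callers that hand in zero bytes run the exact same code path.

```zig
const std = @import("std");

// Hypothetical type: caller-provided space for error strings.
const ErrorText = struct {
    buf: []u8, // may be zero-length if the caller does not want messages
    len: usize = 0,

    // Appends as much of `msg` as fits and silently drops the rest.
    fn append(self: *ErrorText, msg: []const u8) void {
        const room = self.buf.len - self.len;
        const n = @min(room, msg.len);
        @memcpy(self.buf[self.len..][0..n], msg[0..n]);
        self.len += n;
    }

    fn text(self: *const ErrorText) []const u8 {
        return self.buf[0..self.len];
    }
};

const ParseError = error{BadDigit};

// Hypothetical function that reports details through `err`.
fn parseDigit(c: u8, err: *ErrorText) ParseError!u8 {
    if (c < '0' or c > '9') {
        err.append("not a digit");
        return ParseError.BadDigit;
    }
    return c - '0';
}

test "error text is optional extra space" {
    var storage: [64]u8 = undefined;
    var verbose = ErrorText{ .buf = &storage };
    try std.testing.expectError(ParseError.BadDigit, parseDigit('x', &verbose));
    try std.testing.expectEqualStrings("not a digit", verbose.text());

    // Zero bytes of space: same call, the message is simply dropped.
    var quiet = ErrorText{ .buf = storage[0..0] };
    try std.testing.expectError(ParseError.BadDigit, parseDigit('x', &quiet));
    try std.testing.expectEqual(@as(usize, 0), quiet.text().len);
}
```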

dnautics|6 months ago

should be ?*ErrorInfo in the header there =D
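
With the optional pointer in the signature, a self-contained version of the parent's snippet could look roughly like this. The `ErrorInfo` fields and the "empty path means missing file" condition are assumptions made only so the sketch compiles; `@src().line` is used because Zig has no `@line()` builtin.

```zig
const std = @import("std");

const MyError = error{ FileNotFound, PermissionDenied };

// Hypothetical payload type; the parent snippet does not define it.
const ErrorInfo = struct {
    code: MyError,
    message: []const u8,
    line: u32,
};

// An empty `path` stands in for the "file missing" condition.
fn readFile(path: []const u8, out_err_info: ?*ErrorInfo) MyError![]const u8 {
    if (path.len == 0) {
        // Only fill in details when the caller supplied storage for them.
        if (out_err_info) |info| {
            info.* = .{
                .code = MyError.FileNotFound,
                .message = "File missing",
                .line = @src().line,
            };
        }
        return MyError.FileNotFound;
    }
    return "file contents"; // placeholder for real data
}

test "caller opts in to error details" {
    var info: ErrorInfo = undefined;
    try std.testing.expectError(MyError.FileNotFound, readFile("", &info));
    try std.testing.expectEqualStrings("File missing", info.message);

    // Passing null skips the payload entirely.
    try std.testing.expectError(MyError.FileNotFound, readFile("", null));
}
```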

metaltyphoon|6 months ago

So… pretty much how C does it.

maleldil|6 months ago

> Are libraries typically designed to accept an optional input to store potential errors?

Yes. Stdlib's JSON module has a separate diagnostics object [1]. IMO, this is the weakest part of Zig's error handling story, although the reasons for this are understandable.

[1] https://ziglang.org/documentation/master/std/#std.json.Scann...

AndyKelley|6 months ago

I'd like to note that std.json, as it currently stands, is not a good example of proper error handling. Unless you use that awkward lower level Scanner API, if you get a schema mismatch it reports some failure code and does not populate a diagnostics struct, which is painful and useless.

On the other hand the std.zon author did not make this mistake, i.e. `std.zon.parse.fromSlice` takes an optional Diagnostics struct which gives you all the information you need (including a handy format method for printing human readable messages).

dnautics|6 months ago

I wrote an article about one possible pattern which is a concrete realization of your question -- though with more ceremony and complexity since the pathway is fully compiled out if you don't use it (vs a nullable pointer strategy):

> Are libraries typically designed to accept an optional input to store potential errors?

https://zig.news/ityonemo/sneaky-error-payloads-1aka

if you prefer video form:

https://www.youtube.com/watch?v=aFeqWWJP4LE

The answer is no, libraries are not typically designed with a standardized convention for payload return.

davidkunz|6 months ago

Thank you all for these great and detailed explanations, I've learned a lot! I like the approach with an optional pointer; it fits Zig's philosophy quite well. There is a bit of a disconnect, though, between the unadorned error and the corresponding data struct. I could imagine it requires care when the data struct is a union, as one needs to know which error corresponds to which variant.

jmull|6 months ago

I think the idea is errors are for control flow. If you have other information to return from a function, you can just return it — whether directly as the return value or through an “out” parameter or setting it in some context.

hansvm|6 months ago

At a practical level, most of the language doesn't care about the distinction between errors and other types. You mostly just have to consider `try/catch/errdefer`. Your question then, mildly restated, is "how do people deal with cases where they want to use `try/catch/errdefer` but also want to return a payload?"

It's worth asking, at least a little, how often you want that in the first place.

Contrasting with Rust as an example, suppose you want Zig's "try" functionality with arbitrary payloads. Both functions need a compatible error type (a notable source of minor refactors bubbling into whole-project changes), or else you can accept a little more boilerplate and box everything with a library like `anyhow`. That's _fine_, but does it help you solve real problems? Opinions vary, but I think it mostly makes your life harder. You have stack unwinding available if you really need to see the source of a thing, and since the whole point of `try` is to bubble things up to callers who don't have the appropriate context to handle them, they likely don't really care about the metadata you're tacking on.

Suppose you want Zig's "catch" functionality with arbitrary payloads. That's just a `union` type. If you actually expect callers to inspect and care about the details of each possible return branch, you should provide a return type allowing them to do stuff with that information.
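
As a minimal sketch of that "just a union" option (`ParseResult` and `parseNumber` are hypothetical names, and overflow on long inputs is not handled), the function returns a tagged union instead of an error, and the caller switches on it:

```zig
const std = @import("std");

// Hypothetical result type: each branch carries its own payload.
const ParseResult = union(enum) {
    ok: u32,
    unexpected_char: struct { index: usize, found: u8 },
    empty_input,
};

// Parses decimal digits; no overflow handling in this sketch.
fn parseNumber(input: []const u8) ParseResult {
    if (input.len == 0) return .empty_input;
    var value: u32 = 0;
    for (input, 0..) |c, i| {
        if (c < '0' or c > '9') {
            return .{ .unexpected_char = .{ .index = i, .found = c } };
        }
        value = value * 10 + (c - '0');
    }
    return .{ .ok = value };
}

test "caller inspects each branch" {
    switch (parseNumber("12x4")) {
        .unexpected_char => |e| {
            try std.testing.expectEqual(@as(usize, 2), e.index);
            try std.testing.expectEqual(@as(u8, 'x'), e.found);
        },
        else => return error.TestUnexpectedResult,
    }
    try std.testing.expectEqual(@as(u32, 42), parseNumber("42").ok);
}
```

The trade-off is the one discussed next: nothing here is an error as far as the language is concerned, so `try` and `errdefer` do not apply to the failure branches.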

The odd duck out is `errdefer`. IMO it's reasonably common for libraries to want to do some sort of cleanup on "error" conditions, where that cleanup often doesn't depend on which error you hit, and you lose that functionality if you just return a union type. My usual workaround (in the few cases where I actually want that information returned and also have to do some sort of cleanup) is to have a private inner function and a public outer function. The inner function has some sort of `out` parameter where it sticks that unioned metadata. The outer function executes the code which might have to be cleaned up on errors, calls the inner function, and figures out what to do from there. Result location semantics make it as efficient as hand-rolled code for release builds. Not everything fits into that paradigm, but the exceptions are rare enough that the extra boilerplate really isn't bad on average (especially when comparing to an already very verbose language).
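
One possible reading of that inner/outer split, as a sketch only: `Failure`, `fillInner`, and `load` are hypothetical names, and a zero byte stands in for whatever makes the input invalid. The inner function records metadata through an `out` parameter and returns a plain error, so the outer function can still rely on `errdefer` for cleanup.

```zig
const std = @import("std");

// Hypothetical unioned metadata filled in by the inner function.
const Failure = union(enum) {
    none,
    bad_byte: struct { index: usize },
};

const LoadError = error{ Invalid, OutOfMemory };

// Private: details go through `out`, control flow through the error.
fn fillInner(dest: []u8, src: []const u8, out: *Failure) LoadError!void {
    for (src, 0..) |c, i| {
        if (c == 0) {
            out.* = .{ .bad_byte = .{ .index = i } };
            return LoadError.Invalid;
        }
        dest[i] = c;
    }
}

// Public: owns the allocation, so `errdefer` frees it on any error.
pub fn load(gpa: std.mem.Allocator, src: []const u8, out: *Failure) LoadError![]u8 {
    out.* = .none;
    const dest = try gpa.alloc(u8, src.len);
    errdefer gpa.free(dest);
    try fillInner(dest, src, out);
    return dest;
}

test "cleanup runs and the payload survives" {
    var failure: Failure = .none;
    // std.testing.allocator reports a leak if the errdefer did not run.
    try std.testing.expectError(
        LoadError.Invalid,
        load(std.testing.allocator, "ab\x00d", &failure),
    );
    try std.testing.expectEqual(@as(usize, 2), failure.bad_byte.index);
}
```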

Depending on the API, your proposal of having a dedicated `out` parameter exposed further up the chain to callers might be appropriate. I'm sure somebody has done so.

Something I also do in a fair amount of my code is let the caller specify my return type, and I'll avoid work if they don't request a certain payload (e.g., not adding parse failure line numbers if not requested). It lets you write a reasonably generic API without a ton of code complexity, still allowing callers to get the information they want.
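
A rough sketch of what "let the caller specify the return type" could look like, under assumptions: `NoInfo`, `LineInfo`, `Checked`, and `scan` are invented names, and a `!` byte stands in for a parse failure. Line counting is only compiled in for callers that ask for `LineInfo`.

```zig
const std = @import("std");

// Hypothetical payload types a caller can request.
const NoInfo = struct {};
const LineInfo = struct { line: usize = 1 };

fn Checked(comptime Info: type) type {
    return union(enum) { ok: usize, bad: Info };
}

// Returns the input length, or `bad` at the first '!' byte.
fn scan(comptime Info: type, input: []const u8) Checked(Info) {
    var info: Info = .{};
    for (input) |c| {
        // Comptime-known condition: this branch does not exist for NoInfo.
        if (Info == LineInfo) {
            if (c == '\n') info.line += 1;
        }
        if (c == '!') return .{ .bad = info };
    }
    return .{ .ok = input.len };
}

test "payload only when requested" {
    const detailed = scan(LineInfo, "a\nb\n!c");
    try std.testing.expectEqual(@as(usize, 3), detailed.bad.line);

    const plain = scan(NoInfo, "a\nb\n!c");
    try std.testing.expect(plain == .bad);
    try std.testing.expectEqual(@as(usize, 0), @sizeOf(NoInfo));
}
```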

Ar-Curunir|6 months ago

> suppose you want Zig's "try" functionality with arbitrary payloads. Both functions need a compatible error type (a notable source of minor refactors bubbling into whole-project changes), or else you can accept a little more boilerplate and box everything with a library like `anyhow`. That's _fine_, but does it help you solve real problems? Opinions vary, but I think it mostly makes your life harder.

This is not true; you simply need to add a single new variant to the caller's error type, and either a `From` impl or a manual conversion at the call site.