top | item 40310165

(no title)

I'm working towards my own language, and I'm sort of stuck on this issue, too:

Namely:

1. Should the "default" (syntactically-sugar'd) arrays in the language (e.g. `int[]`) be restricted to simple contiguous storage, or should they also hold (potentially negative) stride, etc?

2. Should I have multidimensional arrays? (to be clear, I'm talking about contiguous blocks of data, not arrays-holding-pointers-to-arrays)

3. Should slicing result in a view, or a copy? (naturally, some of these options conflict)

I'm honestly erring towards:

1. "fat" arrays;

2. multidimensional arrays;

3. slicing being views.

This allows operations like reversing, "skipping slices" (e.g. `foo[::2]` in Python), even broadcasts (`stride=0`) to be represented by the same datastructure. It's ostensibly even possible to transpose matrices and such, all without actual copying of the data.

But there are good arguments for the reverse, too. E.g. "fat" arrays need 1 more `size_t` of storage per dimension (to store the stride), compared to "thin" arrays. And a compiler can optimize thin ones much better due to more guarantees (for example, a copy can use vector instructions instead of having to check stride first). Plus "thin" arrays are definitely the more common scenario and it simplifies the language.

So, yeah, still not quite sure despite liking the concept. One obvious option would be for `int[]` to be a thin array, but offering `DataView<int>` [1]. The problem with that is that people implementing APIs will default to the easy thing (`int[]`) instead of the more-appropriate-but-more-letters `DataView<int>`. Ah, dilemmas.

[1] (not actual syntax, but I figured I'd use familiar syntax to get the point across)

discuss

No comments yet.