Lua uses the table type to represent both dictionaries (hash tables) and arrays of values. This seems to have been predicated on keeping the language “simple” with a minimal number of defined types. A laudable goal.
However, arrays of a single type are just enormously common in applications. Support for arrays is pretty much ubiquitous in other languages, including ones that are in the same general dynamic space.
Internally, Lua already handles arrays on their own code path to keep performance reasonable. There is also some user-facing special syntax for arrays. Arrays should be part of the core language: some learning overhead for the newcomer, but worth it.
I think the real issue here is not whether there is a separate type for tables and arrays, but whether the arrays are homogeneous (all elements must have the same type). In most dynamic languages, the arrays are heterogeneous. For example, Python has a separate list type, but it is heterogeneous; if you want homogeneous arrays you have to reach for the array module or something like numpy.
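To make the homogeneous/heterogeneous distinction concrete, here is a small sketch using only Python's standard library. The helper `try_append` is a made-up name for illustration; `array.array` with type code `"d"` stores C doubles and rejects anything else.

```python
from array import array

# A Python list is heterogeneous: each slot can refer to any kind of object.
mixed = [1, "two", 3.0]

# array.array is homogeneous: one type code ("d" = C double) covers every element.
doubles = array("d", [1.0, 2.0, 3.0])

def try_append(arr, value):
    """Hypothetical helper: report whether value fits the array's element type."""
    try:
        arr.append(value)
        return True
    except TypeError:
        # The typed array refuses values that do not match its type code.
        return False

accepted_float = try_append(doubles, 4.0)
accepted_str = try_append(doubles, "five")
```

The list accepts the mixed values silently, while the typed array refuses the string outright instead of converting itself to a more general representation.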
I wonder: in practice, if a Lua program uses large (consecutive) arrays, won't their values likely all have the same type? At the very least it is a common use case: large arrays of only strings, only numbers, etc.
Wouldn’t it make sense to (also) optimize for just this case with a flag and a single type tag? It is simple, and it would optimize memory use for, say, 98% of use cases.
The main catch is that if the optimization guesses wrong and a value of a different type is inserted into the table afterwards, it would incur an O(n) operation to transfer all the data to a deoptimized table.
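A rough sketch of that trade-off, written in Python rather than in Lua's C internals. Everything here (`TaggedArray`, `deopt_copies`, the `"number"` tag) is hypothetical and only meant to show where the O(n) copy comes from; it is not how Lua implements tables.

```python
from array import array

class TaggedArray:
    """Hypothetical sketch: one shared type tag while every element is a float."""

    def __init__(self):
        self.tag = "number"        # single tag for the whole array
        self.packed = array("d")   # 8 bytes per element, no per-slot tag
        self.generic = None        # created only after deoptimization
        self.deopt_copies = 0      # how many elements deoptimization had to move

    def append(self, value):
        if self.generic is None and isinstance(value, float):
            self.packed.append(value)   # fast path: value matches the shared tag
            return
        if self.generic is None:
            self._deoptimize()
        self.generic.append(value)

    def _deoptimize(self):
        # O(n): every existing element is copied into the generic representation.
        self.generic = list(self.packed)
        self.deopt_copies += len(self.packed)
        self.packed = None
        self.tag = None

    def __len__(self):
        return len(self.generic if self.generic is not None else self.packed)

    def __getitem__(self, i):
        store = self.generic if self.generic is not None else self.packed
        return store[i]

arr = TaggedArray()
for i in range(1000):
    arr.append(float(i))
arr.append("oops")   # a single mismatched value forces the full copy
```

After the last line, all 1000 floats have been copied once, which is the cost being described: cheap as a one-off, but surprising if it happens on a hot path.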
Another caveat is that Lua can have more than one internal representation for the same type, and those have different type tag variants. For instance: strings can be represented internally as either short or long strings; functions can be Lua closures, C closures, or perhaps even an object with a __call metamethod; objects can be either tables or userdata.
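To illustrate why variants complicate a single shared tag, here is a sketch that packs a base type and a variant into one small integer. The bit layout and constants are illustrative assumptions, loosely modeled on the idea of variant tags and not copied from Lua's source.

```python
# Illustrative base type codes (assumed values, for the sketch only).
BASE_STRING, BASE_FUNCTION = 4, 6

def make_variant(base, variant):
    """Pack a base type (low 4 bits) and a variant number (higher bits) into one tag."""
    return base | (variant << 4)

def base_type(tag):
    """Strip the variant bits, leaving only the base type."""
    return tag & 0x0F

SHORT_STRING = make_variant(BASE_STRING, 0)
LONG_STRING = make_variant(BASE_STRING, 1)
LUA_CLOSURE = make_variant(BASE_FUNCTION, 0)
C_CLOSURE = make_variant(BASE_FUNCTION, 2)

def fits_exact(array_tag, value_tag):
    # Strict check: a long string does not fit an array tagged "short string".
    return array_tag == value_tag

def fits_base(array_tag, value_tag):
    # Relaxed check: any string variant fits an array tagged with a string type.
    return base_type(array_tag) == base_type(value_tag)
```

With the strict check, an array of short strings would deoptimize as soon as one long string shows up. The relaxed check avoids that, but presumably only works when the variant can still be recovered from the stored value itself.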
This seems likely to create some inexplicable performance elbows where you have 1000 strings, but there's one code path that replaces one with a number, and now the whole array needs to be copied. Tracking that down won't be fun.
"However, this attribute is a gcc extension not present in ISO C. Moreover, even in gcc it is not guaranteed to work [3]. As portability is a hallmark of Lua, this almost magical solution is a no-go."
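For context on what a packed layout would change, here is a sketch using Python's `struct` module to compare slot sizes. The slot layout (an 8-byte value plus a 1-byte tag) is an assumption for illustration, not Lua's actual definition; `"@"` asks for native alignment, the trailing `0d` pads the end as a C array stride would, and `"="` disables alignment padding, which is similar in effect to packing.

```python
import struct

# Hypothetical slot: an 8-byte double followed by a 1-byte type tag.
# "@dB0d" = native alignment, with the end padded to a double's alignment,
# which approximates the stride of a C array of such structs.
padded_slot = struct.calcsize("@dB0d")

# "=dB" = standard sizes with no alignment padding (roughly what packing gives).
packed_slot = struct.calcsize("=dB")

def array_bytes(n, slot_size):
    """Total bytes for n slots of the given size."""
    return n * slot_size

n = 1000
wasted = array_bytes(n, padded_slot) - array_bytes(n, packed_slot)
```

On typical 64-bit platforms the padded slot is 16 bytes against 9 for the packed one, so the padding costs close to half of the array's memory. The exact numbers depend on the platform's alignment rules.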
`__attribute__((packed))` wouldn't help here since the issue is about Lua's array/hash hybrid table design and memory allocation strategy, not C struct padding.
ufo|8 months ago
https://github.com/lua/lua/blob/f71156744851701b5d5fabdda506...
nzzn|8 months ago
ufo|8 months ago
marhee|8 months ago
ufo|8 months ago
tedunangst|8 months ago
Jyaif|8 months ago
The Lua folks want a simple codebase, so they (knowingly) leave a lot of performance on the table in favor of simplicity.
kzrdude|8 months ago
Jyaif|8 months ago
Irresponsible of them not to advertise this as an option in luaconf.h.
sfpotter|8 months ago
ethan_smith|8 months ago