top | item 21823272

(no title)

sjustinas | 6 years ago

The indexing operator `[]` performs bounds checks; the method `get_unchecked()` does not. If you need the extra speed at the expense of safety, you can have it, but you have to choose this trade-off consciously: calling `get_unchecked()` requires an `unsafe` block. Unlike in C, safety in Rust is opt-out.
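A minimal sketch of the two access styles, assuming a plain slice of `u32`. The function names `checked`, `checked_option` and `unchecked` are made up for illustration; only `[]`, `get()` and `get_unchecked()` come from the standard library.

```rust
/// Bounds-checked access: panics if `i >= v.len()`.
pub fn checked(v: &[u32], i: usize) -> u32 {
    v[i]
}

/// Bounds-checked access that returns `None` instead of panicking.
pub fn checked_option(v: &[u32], i: usize) -> Option<u32> {
    v.get(i).copied()
}

/// Unchecked access: the bounds check is skipped.
///
/// # Safety
/// The caller must guarantee `i < v.len()`; anything else is undefined behavior.
pub unsafe fn unchecked(v: &[u32], i: usize) -> u32 {
    // SAFETY: the caller upholds `i < v.len()` (see the contract above).
    unsafe { *v.get_unchecked(i) }
}
```

The opt-out is visible at the call site too: `unchecked(&v, 2)` only compiles inside an `unsafe` block, while the first two functions can be called from ordinary safe code.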


scoutt|6 years ago

But in this case the overhead may come from calling get_unchecked() for every array access, is this correct? Unless the function is inlined and does a quantifiable amount of work (for a given arch), but this may also have an impact on code size. One way or another, there should be a trade-off for accessing array elements (a very common operation).

I don't know how all of this fits with the embedded ecosystem we are used to working with.

steveklabnik|6 years ago

So, here's a fun example of how this works out in practice: https://godbolt.org/z/DXo25P

Here, rustc is able to see that the element is always present, so it removes the unchecked function entirely and reuses the body of the "checked" one, which ends up with no checks.

For some reason, I can't get it to show the assembly for just the two functions; it always optimizes everything out. Putting it on the Rust playground gives this:

  playground::access_first_element: # @playground::access_first_element
  # %bb.0:
    pushq %rax
    cmpq $2, %rsi
    jb .LBB5_2
  # %bb.1:
    movq %rdi, %rax
    addq $4, %rax
    popq %rcx
    retq

  .LBB5_2:
    movq %rsi, %rdx
    leaq .L__unnamed_2(%rip), %rdi
    movl $1, %esi
    callq *core::panicking::panic_bounds_check@GOTPCREL(%rip)
    ud2
    # -- End function

  playground::access_first_unchecked: # @playground::access_first_unchecked
  # %bb.0:
    leaq 4(%rdi), %rax
    retq
    # -- End function

As you can see, the unchecked version gets fully inlined and is down to a single address computation, while the bounds-checked version carries a length comparison and a panic path.
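The source behind that listing is not reproduced in the thread, so the following is only a guess at what could produce it. The function names come from the assembly labels; the `u32` element type and the index `1` are assumptions inferred from the 4-byte offset and the `cmpq $2` length check.

```rust
/// Bounds-checked access to index 1: panics if the slice has fewer than two
/// elements. This would account for the compare-and-branch to
/// `panic_bounds_check` in the first listing.
#[inline(never)]
pub fn access_first_element(v: &[u32]) -> &u32 {
    &v[1]
}

/// Unchecked access to index 1: plain pointer arithmetic, no comparison.
///
/// # Safety
/// The caller must guarantee `v.len() >= 2`.
#[inline(never)]
pub unsafe fn access_first_unchecked(v: &[u32]) -> &u32 {
    // SAFETY: the caller upholds `v.len() >= 2`.
    unsafe { v.get_unchecked(1) }
}
```

`#[inline(never)]` is added here only so that each function is more likely to keep its own symbol in the output; whether the original used it is unknown.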

paavohtl|6 years ago

get_unchecked() is marked as an #[inline] function (like most other low-level primitives), and through the magic of compiler optimization it should have exactly zero call overhead. The function should most likely compile down to a single CPU instruction.
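To make the "single instruction" intuition concrete, here is a rough sketch of what such a primitive boils down to. `my_get_unchecked` and `sum_first_two` are hypothetical names, and this is not the standard library's actual source, just the same idea written with raw pointer arithmetic.

```rust
/// Hypothetical stand-in for `get_unchecked`: offset the base pointer by `i`
/// elements and hand back a reference, with no length comparison.
///
/// # Safety
/// The caller must guarantee `i < v.len()`.
#[inline]
pub unsafe fn my_get_unchecked(v: &[u32], i: usize) -> &u32 {
    // SAFETY: the caller guarantees `i < v.len()`, so the pointer stays in bounds.
    unsafe { &*v.as_ptr().add(i) }
}

/// Safe wrapper: one explicit length check up front, then unchecked reads.
pub fn sum_first_two(v: &[u32]) -> u32 {
    assert!(v.len() >= 2);
    // SAFETY: the assert above guarantees indices 0 and 1 are in bounds.
    unsafe { *my_get_unchecked(v, 0) + *my_get_unchecked(v, 1) }
}
```

Since the helper is small and marked `#[inline]`, an optimizing build would normally be expected to fold it into the caller, leaving only the address computation and the load.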