wren6991 | 5 months ago

You still have the overhead of a function call. If you just use the / and % operators and you don't have the RISC-V M extension, the compiler inserts a call to the libgcc or compiler-rt routine, and those routines compute only the quotient or only the remainder. Using stdlib for integer division seems like an odd choice.

If stdlib div() were promoted to a builtin one day (it currently is not in GCC afaict), and its implementation were inlined, then the compiler would recognise the common case of one side of the struct being dead, and you'd still end up with a single div/rem instruction.
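To illustrate the dead-field point, here is a minimal sketch. `QuotRem` and `inline_div` are hypothetical stand-ins for an inlinable div(), not real library names. Once the helper is inlined, each caller uses only one field, so an optimising compiler is free to drop the other computation.

```cpp
#include <cstdint>

// Hypothetical stand-in for an inlined div(); not part of any real library.
struct QuotRem {
  int64_t quot;
  int64_t rem;
};

// Defined in the header so the optimiser can see through the call.
inline QuotRem inline_div(int64_t n, int64_t d) {
  return {n / d, n % d};
}

// Only .quot is read, so the remainder is dead after inlining.
int64_t quotient_only(int64_t n, int64_t d) {
  return inline_div(n, d).quot;
}

// Only .rem is read, so the quotient is dead after inlining.
int64_t remainder_only(int64_t n, int64_t d) {
  return inline_div(n, d).rem;
}
```

Whether the result really collapses to a single div or rem instruction depends on the target and optimisation level, so it is worth checking the generated assembly rather than assuming it.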

cpgxiii|5 months ago

Interesting: this is a case where GCC and Clang are "dumb" and MSVC does a better job. For this code:

  #include <cstdlib>
  #include <cstdint>
  #include <utility>
  
  std::pair<int64_t, int64_t> LibDivWithRemainder(int64_t numerator, int64_t denominator) {
    const auto res = std::div(numerator, denominator);
    return std::make_pair(res.quot, res.rem);
  }
  
  std::pair<int64_t, int64_t> ManDivWithRemainder(int64_t numerator, int64_t denominator) {
    const int64_t quot = numerator / denominator;
    const int64_t rem = numerator % denominator;
    return std::make_pair(quot, rem);
  }

GCC (x86-64 trunk @ -O2) produces

  "LibDivWithRemainder(long, long)":
    sub     rsp, 8
    call    "ldiv"
    add     rsp, 8
    ret
  "ManDivWithRemainder(long, long)":
    mov     rax, rdi
    cqo
    idiv    rsi
    ret

Clang (x86-64 @ -O2) produces

  LibDivWithRemainder(long, long):
    jmp     ldiv@PLT
  ManDivWithRemainder(long, long):
    mov     rax, rdi
    mov     rcx, rdi
    or      rcx, rsi
    shr     rcx, 32
    je      .LBB1_1
    cqo
    idiv    rsi
    ret
  .LBB1_1:
    xor     edx, edx
    div     esi
    ret

while MSVC (x64 @ /O2) produces

  mov     rax, rdx
  cqo
  idiv    r8
  mov     QWORD PTR [rcx], rax
  mov     rax, rcx
  mov     QWORD PTR [rcx+8], rdx
  ret     0

for both functions.
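For reference, the two versions are meant to be interchangeable in result, which is why emitting a single idiv for both is a valid optimisation. A small sketch of that equivalence follows; `div_matches_operators` is a hypothetical helper name, and it assumes `int64_t` maps to `long` or `long long` so that a `std::div` overload applies.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: true when std::div agrees with the built-in
// / and % operators for the given operands (d must be non-zero).
bool div_matches_operators(int64_t n, int64_t d) {
  const auto res = std::div(n, d);
  return res.quot == n / d && res.rem == n % d;
}
```

Both forms truncate the quotient toward zero, so the remainder takes the sign of the numerator in either case.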