
Efficient Integer Overflow Checking in LLVM


(Here’s some optional background reading material.)

We want fast integer overflow checking. Why? First, if the undefined behavior sanitizers run faster, then testing goes faster. Second, when overhead drops below a certain level, people will become willing to use UBSan to harden production code against integer overflows. This is already being done in parts of Android. It isn’t the kind of thing to do casually: some of LLVM’s sanitizers, such as ASan, will increase an application’s attack surface. Even UBSan in trapping mode (which doesn’t increase attack surface, as far as I know) could easily enable remote DoS attacks. But that is beside the point here.

Let’s start with this function:

int foo(int x, int y) {
  return x + y; 
}

When compiled with trapping integer overflow checks (as opposed to checks that provide diagnostics and optionally continue executing), Clang 3.8 at -O2 gives:

foo:
        addl    %esi, %edi
        jo      .LBB0_1
        movl    %edi, %eax
        retq
.LBB0_1:
        ud2

This code is efficient; here Chandler Carruth explains why. On the other hand, signed integer overflow checking slows down SPEC CINT 2006 by 11.8% overall, with slowdowns ranging from negligible (GCC, Perl, OMNeT++) to about 20% (Sjeng, H264Ref) to about 40% (HMMER).

Why do fast overflow checks add overhead? The increase in object code size due to overflow checking is less than 1%, so there isn’t going to be much trouble in the icache. Looking at HMMER, for example, we find that it spends >95% of its execution time in a function called P7Viterbi(). This function can be partially vectorized, but the version with integer overflow checks doesn’t get vectorized at all. In other words, most of the slowdown comes from integer overflow checks interfering with loop optimizations. In contrast, GCC and Perl probably don’t benefit much from advanced loop optimizations in the first place, hence the lack of slowdown there.

Here I’ll take a moment to mention that I had to hack SPEC a bit so that signed integer overflows wouldn’t derail my experiments. The changes appear to be performance-neutral. Only GCC, Perl, and H264Ref have signed overflows. Here’s the patch for SPEC CPU 2006 V1.2. All performance numbers in this post were taken on an i7-5820K (6-core Haswell-E at 3.3 GHz), running Ubuntu 14.04 in 64-bit mode, with frequency scaling disabled.

Now for the fun part: making code faster by removing overflow checks that provably can’t fire. At -O0 the SPEC INT 2006 benchmarks contain 67,678 integer overflow checks, whereas at -O3 there are 38,527. So that’s nice: LLVM 3.8 can already get rid of 43% of naively inserted checks.

Let’s look at some specific functions. First, the good news: all of the overflow checks in these functions are optimized away by LLVM:

int foo2(int x, int y) {
  return (short)x + (short)y;
}

int foo3(int x, int y) {
  return (long)x + (long)y;
}

int foo4(int x, int y) {
  return (x >> 1) + (y >> 1);
}

int32_t foo5(int32_t x, int32_t y) {
  const int32_t mask = ~(3U << 30);
  return (x & mask) + (y & mask);
}

int32_t foo6(int32_t x, int32_t y) {
  const int32_t mask = 3U << 30;
  return (x | mask) + (y | mask);
}

int32_t foo7(int32_t x, int32_t y) {
  const int32_t mask = 1U << 31;
  return (x | mask) + (y & ~mask);
}

See for yourself here. The common theme across these functions is that each overflow check can be seen to be unnecessary by looking at the high-order bits of its inputs. The code for these optimizations is here.

On the other hand, LLVM is unable to see that these functions don’t require overflow checks:

int foo8(int x) {
  return x < INT_MAX ? x + 1 : INT_MAX;
}

int foo9(int x, int y) {
  if (x < 10 && x > -10 && y < 10 && y > -10)
    return x + y;
  return 0;
}

Here’s another one where the check doesn’t get eliminated:

void foo10(char *a) {
  for (int i = 0; i < 10; i++)
    a[i] = 0;
}

The good news: Sanjoy Das has a couple of patches that, together, solve this problem. Their overall effect on SPEC is to reduce the overhead of signed integer overflow checking to 8.7%.

Finally, this function should get one overflow check rather than three:

unsigned foo11(unsigned *a, int i) {
  return a[i + 3] + a[i + 5] + a[i + 2];
}

This example (or something like it) is from Nadav Rotem on Twitter; his redundant overflow check elimination pass for Swift removes this sort of thing at the SIL level. I’ve done a bit of work on bringing these ideas to LLVM, and hopefully will have more to write about that later on.

In summary, signed integer overflow checks in LLVM are fast, but they get in the way of the optimizers, which haven’t yet been systematically taught to see through them or eliminate them. There’s plenty of low-hanging fruit, and hopefully we can get the overhead down to the point where people can turn on overflow checking in production code without thinking too hard about the tradeoffs.

Addenda:

I haven’t forgotten about Souper! We’ve taught it to remove integer overflow checks. I’ll write about that later too.

See Dan Luu’s article on this topic.
