The C and C++ language standards say that overflow of a signed value is undefined behaviour. In the C99 standard this is in section 6.5. In the C++98 standard it is in section 5 [expr], paragraph 5. This means that a correct C/C++ program must never generate signed overflow when computing an expression. It also means that a compiler may assume that a program will never generate signed overflow.
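As a minimal illustration (my example, not from the standard), this function has undefined behaviour whenever it is called with the largest int value:

int increment(int x)
{
  /* If x == INT_MAX, x + 1 overflows a signed int, and the
     behaviour of the entire program is undefined. */
  return x + 1;
}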
gcc has taken advantage of this fact for a long time when optimizing. For example, consider this function:
int f(int x) { return 0x7ffffff0 < x && x + 32 < 0x7fffffff; }
If signed arithmetic simply wraps on overflow, then this is equivalent to 0x7ffffff0 < x, since adding 32 to such a value will yield a negative number. However, even gcc 2.95.3, released in 2001, will optimize this function into code which simply returns 0. This is valid because the compiler may assume that signed overflow does not occur in a correct program, and getting a negative number from x + 32 can only happen from signed overflow.
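This is easy to see for yourself. Here is a small driver (mine; the exact behaviour depends on compiler version and options):

#include <stdio.h>

int f(int x) { return 0x7ffffff0 < x && x + 32 < 0x7fffffff; }

int main(void)
{
  /* 0x7ffffff8 is greater than 0x7ffffff0.  With -fwrapv,
     0x7ffffff8 + 32 wraps to a negative value and f returns 1.
     Under the default rules the compiler may simplify f to
     return 0 unconditionally. */
  printf("%d\n", f(0x7ffffff8));
  return 0;
}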
Recently gcc has started using undefined signed overflow to implement better bounds tests. In particular, consider code like this:
int f()
{
  int i;
  int j = 0;
  for (i = 1; i > 0; i += i)
    ++j;
  return j;
}
This code assumes that if it keeps doubling a positive number, it will eventually get a negative number. That is, it expects signed overflow to behave in a certain way. Current mainline gcc will see that signed overflow cannot occur in a correct program, and will compile this code into an infinite loop.
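If the intent was to count how many times a positive int can be doubled before it would overflow, one well defined way to write it is to do the doubling in unsigned arithmetic, which is defined to wrap (this is a sketch of one possible fix, not the only one):

#include <limits.h>

int f(void)
{
  unsigned int i;
  int j = 0;
  /* Loop while i is still representable as a positive int; the
     first doubling past INT_MAX ends the loop.  Unsigned
     arithmetic is defined to wrap, so there is no undefined
     behaviour here. */
  for (i = 1; i <= (unsigned int) INT_MAX; i += i)
    ++j;
  return j;
}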
Some gcc users were surprised by this optimization, and there was an outcry around the beginning of 2007. For a while there was a suggestion that gcc compilations should always use the -fwrapv option. -fwrapv tells the compiler to treat signed overflow as wrapping. That is how Java defines signed overflow. The option was introduced in 2003 for gcc 3.3.
The disadvantage of -fwrapv is that it inhibits optimizations. Most programs do not produce signed overflow, and, as we have seen, no correct programs do. The compiler can generate better code when it can assume that signed overflow can not occur. This particularly arises in loops. When a loop uses a signed integer index, the compiler can do much better when it doesn’t have to consider the possibility that the index will overflow.
Consider this trivial example:
int f(int i) { int j, k = 0; for (j = i; j < i + 10; ++j) ++k; return k; }
What does this function return? When the compiler can assume that signed overflow is undefined, this function is compiled into code which simply returns 10. With the -fwrapv option, the code is not so simple, since i might happen to have a value like 0x7ffffff8 which will overflow during the loop. While this is a trivial example, it is clear that loops like this occur in real code all the time.
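An easy way to watch this happen is to compare the assembly gcc generates with and without -fwrapv (f.c here is a hypothetical file containing the function above):

gcc -O2 -S f.c           # default rules: the loop can be folded into "return 10"
gcc -O2 -fwrapv -S f.c   # wrapping semantics: the compiler must allow for overflow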
However, gcc does need to respond to the concerns of the user community. I introduced two new options in gcc 4.2. The first is -fno-strict-overflow. This tells the compiler that it may not assume that signed overflow is undefined. The second is -Wstrict-overflow. This tells the compiler to warn about cases where it is using the fact that signed overflow is undefined to implement an optimization. With these options it is possible to detect cases where signed overflow occurs in a program, and it is possible to disable optimizations which rely on signed overflow until the program is fixed. The -Wstrict-overflow warning even found one minor case where gcc itself relied on wrapping signed overflow, in the handling of division by the constant 0x80000000.
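To experiment with -Wstrict-overflow, note that it only fires while the relevant optimizations are running, so optimization must be enabled (f.c is again a stand-in for the file being compiled):

gcc -O2 -Wstrict-overflow -S f.c

Which constructs are reported, and the exact wording of the warnings, varies from one gcc version to the next.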
This naturally leads to the question: what is the difference between -fwrapv and -fno-strict-overflow? There is a clear difference on a processor which does not use ordinary two's-complement arithmetic: -fwrapv requires two's-complement overflow, and -fno-strict-overflow does not. However, no such processor is in common use today. In practice, I think that the code generated by the two options will always behave the same. However, they affect the optimizers differently. For example, this code
int f(int x) { return (x + 0x7fffffff) * 2; }
is compiled differently with -fwrapv and -fno-strict-overflow. The difference occurs because -fwrapv precisely specifies how the overflow should behave. -fno-strict-overflow merely says that the compiler should not optimize away the overflow. With the current compiler, with -fwrapv (and -O2 -fomit-frame-pointer), I see this:
movl $1, %eax        # eax = 1
subl 4(%esp), %eax   # eax = 1 - x
addl %eax, %eax      # eax = 2 - 2*x
negl %eax            # eax = 2*x - 2 (modulo 2^32)
ret
With -fno-strict-overflow I see this:
movl 4(%esp), %eax        # eax = x
leal -2(%eax,%eax), %eax  # eax = x + x - 2 = 2*x - 2
ret
Same result, different algorithm: both sequences compute 2*x - 2, which is what the expression yields under two's-complement wrapping.