Optimization and Wraparound

Next: Signed Overflow Advice, Previous: Signed Overflow Examples, Up: Integer Overflow

12.2.3 Optimizations That Break Wraparound Arithmetic

Compilers sometimes generate code that is incompatible with wraparound integer arithmetic. A simple example is an algebraic simplification: a compiler might translate (i * 2000) / 1000 to i * 2 because it assumes that i * 2000 does not overflow. The translation is not equivalent to the original when overflow occurs: e.g., in the typical case of 32-bit signed two's complement wraparound int, if i has type int and value 1073742, the original expression returns −2147483 but the optimized version returns the mathematically correct value 2147484.

More subtly, loop induction optimizations often exploit the undefined behavior of signed overflow. Consider the following contrived function sumc:

     int
     sumc (int lo, int hi)
     {
       int sum = 0;
       int i;
       for (i = lo; i <= hi; i++)
         sum ^= i * 53;
       return sum;
     }

To avoid multiplying by 53 each time through the loop, an optimizing compiler might internally transform sumc to the equivalent of the following:

     int
     transformed_sumc (int lo, int hi)
     {
       int sum = 0;
       int hic = hi * 53;
       int ic;
       for (ic = lo * 53; ic <= hic; ic += 53)
         sum ^= ic;
       return sum;
     }

This transformation is allowed by the C standard, but it is invalid for wraparound arithmetic when INT_MAX / 53 < hi, because then the overflow in computing expressions like hi * 53 can cause the expression i <= hi to yield a different value from the transformed expression ic <= hic.

For this reason, compilers that use loop induction and similar techniques often do not support reliable wraparound arithmetic when a loop induction variable like ic is involved. Since loop induction variables are generated by the compiler, and are not visible in the source code, it is not always trivial to say whether the problem affects your code.

Hardly any code actually depends on wraparound arithmetic in cases like these, so in practice these loop induction optimizations are almost always useful. However, edge cases in this area can cause problems. For example:

     int j;
     for (j = 1; 0 < j; j *= 2)
       test (j);

Here, the loop attempts to iterate through all powers of 2 that int can represent, but the C standard allows a compiler to optimize away the comparison and generate an infinite loop, under the argument that behavior is undefined on overflow. As of this writing this optimization is not done by any production version of GCC with -O2, but it might be performed by other compilers, or by more aggressive GCC optimization options, and the GCC developers have not decided whether it will continue to work with GCC and -O2.