Last Update: 8 Nov., 2017

Changing Rounding Mode and Compile Options

Masahide Kashiwagi

1. Introduction

The kv library provides the following three types of interval arithmeric by default: Here, as for the former two (double, dd), there are various methods of changing rounding mode, and they are controlled by the compile option. This document summarizes the methods of changing rounding mode and compile options corresponding to them.

2. Changing Rounding Mode and Compile Options

No Compile Option

First, with no compile option, kv library changes rounding mode using: See kv/hwround.hpp for implementation details. If the compiler is C99 compilant, fenv.h should be able to be used, so it should work as it is in many environments.

The reason of different implementation only for Visual C++ is as follows:

-DKV_FASTROUND

Intel's CPU has two types of arithmetic units, FPU and SSE2, and the register for rounding mode is different for each unit. Therefore, fesetround and _controlfp_s change the rounding mode of both registers. However, since only SSE2 unit is generally used under Intel CPU's 64bit mode, speed up is expected by changing only SSE2 register for rounding mode.

Also, fesetround and _controlfp_s do the following operations:

Here, if it is assumed that the rest bits other than the bits corresponding to the rounding mode of the control register does not change during program execution, speed up is expected by the following:

Using -DKV_FASTROUND speeds up using these two methods. But, It can only be used in Intel CPU's 64bit mode.

See kv/hwround.hpp for implementation details.

-DKV_NOHWROUND

Using this option, kv library no longer changes ronuding mode. Kv library uses emulation of directed rounding using twosum and twoproduct. Since it works without changing the rounding mode by hardware and only in the nearest rounding mode, it is expected to work in very wide environments. The speed is not so fast.

For details of the algorithm, see emu.pdf (Jun 9, 2014, in Japanese).

See

for implementation details.

[Note] Emulation of directed rounding by -DKV_NOHWROUND is designed on the premise that the operation of double conforms to IEEE 754, so it works in many standard environments, but in Intel CPU's 32bit mode it may not work correctly. In the 32bit mode of the Intel CPU, calculation is executed by FPU which has internal 80bit register, so the calculation result based on IEEE 754 may not be obtained in some cases.

-DKV_USE_AVX512

On some Intel's latest CPUs, the function AVX-512 is installed. As of November 2017, the following CPUs have AVX-512: AVX-512 provides vectorization with 512bit registers, but in addition to it, AVX-512 provides "addition/subtraction/multiplication/division/sqrt including the rounding mode specification to the command itself". Using this function, it is possible to implement interval arithmetic without overhead of rounding mode change. Of course, it only works with the latest CPU with AVX-512. Also, compile option is often required to generate AVX-512 instructions. As of November 2017, gcc requires -mavx512f .

See

for implementation details.

3. Implemented Code for Changing Roundging Mode

As an exmaple, implemented codes of "add_up" (addition with rounding upward) are shown below:

No Compile Option

#include <fenv.h>

static double add_up(const double& x, const double& y) {
    volatile double r, x1 = x, y1 = y;
    fesetround(FE_UPWARD);
    r = x1 + y1;
    fesetround(FE_TONEAREST);
    return r;
}

-DKV_FASTROUND

#include <emmintrin.h>

unsigned long int reg = _mm_getcsr();
unsigned long int nearest = (reg & ~0x00006000) | 0x00000000;
unsigned long int down = (reg & ~0x00006000) | 0x00002000;
unsigned long int up =   (reg & ~0x00006000) | 0x00004000;
unsigned long int chop = (reg & ~0x00006000) | 0x00006000;

static double add_up(const double& x, const double& y) {
    volatile double r, x1 = x, y1 = y;
    _mm_setcsr(up);
    r = x1 + y1;
    _mm_setcsr(nearest);
    return r;
}

-DKV_NOHWROUND

#include <cmath>
#include <limits>

static void twosum(const double& a, const double& b, double& x, double& y) {
    double tmp;

    x = a + b;
    if (std::fabs(a) > std::fabs(b)) {
        tmp = x - a;
        y = b - tmp;
    } else {
        tmp = x - b;
        y = a - tmp;
    }
}

static double succ(const double& x) {
    static const double th1 = std::ldexp(1., -969);
    static const double th2 = std::ldexp(1., -1021);
    static const double c1 = std::ldexp(1., -53) + std::ldexp(1., -105);
    static const double c2 = std::ldexp(1., -1074);
    static const double c3 = std::ldexp(1., 53);
    static const double c4 = std::ldexp(1., -53);

    double a, c, e;

    a = std::fabs(x);
    if (a >= th1) return x + a * c1;
    if (a < th2) return x + c2;
    c = c3 * x;
    e = c1 * std::fabs(c);
    return (c + e) * c4;
}

static double add_up(const double& x, const double& y) {
    double r, r2;

    twosum(x, y, r, r2);
    if (r == std::numeric_limits<double>::infinity()) {
        return r;
    } else if (r == -std::numeric_limits<double>::infinity()) {
        if (x == -std::numeric_limits≤double>::infinity() || y == -std::numeric_limits::infinity()) {
            return r;
        } else {
            return -(std::numeric_limits≤double>::max)();
        }
    }

    if (r2 > 0.) {
        return succ(r);
    }

    return r;
}

-DKV_USE_AVX512

#include <immintrin.h>

static double add_up(const double& x, const double& y) {
    volatile __m128d x1, y1, z1;
    double r;
    x1 = _mm_set_sd(x);
    y1 = _mm_set_sd(y);
    z1 = _mm_add_round_sd(x1, y1, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
    _mm_store_sd(&r, z1);
    return r;
}

4. Using FMA for twoproduct

The twoproduct algorithm is used in the emulation of directed rounding and double-double arithmeric. It is known that the twoproduct speeds up using FMA instruction. By using -DKV_USE_TPFMA option, kv library replace the original twoproduct by the fast implementation of twoproduct using FMA instruction. Implemented Codes are shown below:

without -DKV_USE_TPFMA

static void split(const double& a, double& x, double& y) {
    static const double sigma = (double)((1L << 27) + 1);
    double tmp;

    tmp = a * sigma;
    x = tmp - (tmp - a);
    y = a - x;
}

static void twoproduct(const double& a, const double& b, double& x, double& y) {
    static const double th = std::ldexp(1., 996);
    static const double c1 = std::ldexp(1., -28);
    static const double c2 = std::ldexp(1., 28);
    static const double th2 = std::ldexp(1., 1023);

    double na, nb, a1, a2, b1, b2;

    x = a * b;
    if (std::fabs(a) > th) {
        na = a * c1;
        nb = b * c2;
    } else if (std::fabs(b) > th) {
        na = a * c2;
        nb = b * c1;
    } else {
        na = a;
        nb = b;
    }
    split(na, a1, a2);
    split(nb, b1, b2);
    if (std::fabs(x) > th2) {
        y = ((((a1 * 0.5) * b1 - (x * 0.5)) * 2 + a2 * b1) + a1 * b2) + a2 * b2;
    } else {
        y = (((a1 * b1 - x) + a2 * b1) + a1 * b2) + a2 * b2;
    }
}

with -DKV_USE_TPFMA

static void twoproduct(const double& a, const double& b, double& x, double& y) {
    x = a * b;
    y = fma(a, b, -x);
}
This option seems to be usable for Intel CPUs after Haswell and OS/compiler which are not so old. The usability of the option can be checked by the following procedure:

5. Benchmark

Using example/test-nishi.cc as a sample program that heavily used interval<double> or interval<dd>, we examined the calculation speed of the above compile options. Note that, for test-nishi.cc, all calculations are done in dd precision by adding -DTEST_DD .

The computer environment used is as follows:

Intel Core i9 7900X 3.3GHz/4.3GHz, Memory 128G
Ubuntu 16.04 LTS
gcc 5.4.0

As explained in the previous section, in some cases the calculation speed will be improved with the -DKV_USE_TPFMA option. Calculation time of both is shown in that cases.

precision (type of endpoint) compile option calculation time
without -DKV_USE_TPFMA with -DKV_USE_TPFMA
double -O3 10.81 sec
-O3 -DKV_FASTROUND 8.40 sec
-O3 -DKV_NOHWROUND 19.91 sec 12.68 sec
-O3 -DKV_USE_AVX512 -mavx512f 5.55 sec
dd -O3 54.10 sec 44.94 sec
-O3 -DKV_FASTROUND 46.08 sec 36.72 sec
-O3 -DKV_NOHWROUND 185.7 sec 118.9 sec
-O3 -DKV_USE_AVX512 -mavx512f 21.42 sec

The calculation time is the average measured 3 times.