Compilers

Different compilers for the same language all follow the language standard, but they differ in how well they generate optimized code and in the options and pragmas they offer for controlling it. Below is a collection of the most important options and pragmas for the Intel compiler and the GNU compiler.

Options and their Meaning

| Meaning | Intel compiler | GNU compiler |
| --- | --- | --- |
| Compiler invocation | icc (C), icpc (C++) | gcc (C), g++ (C++) |
| Basic optimization | -O | -O |
| Compile only | -c | -c |
| Vectorizing reductions |  | -ffast-math |
| Vectorizing AVX (256-bit) | -xavx | -mavx |
| Vectorizing AVX (512-bit) | -xCOMMON-AVX512 | -mavx512f -mavx512cd |
| Reports | -qopt-report=1 -qopt-report-phase=vec | -ftree-vectorize -fopt-info-loop-optimized |
| Assembler code | -S | -S |
| Optimizing function calls | -ipo | -flto |
| OpenMP | -qopenmp | -fopenmp |
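
To illustrate the "Vectorizing reductions" entry: floating-point addition is not associative, so the GNU compiler will typically not split a sum across vector lanes unless it is allowed to reorder the additions, which -ffast-math permits. The following is a minimal sketch of such a reduction loop; the function name `sum_array` is hypothetical and not taken from this page.

```c
#include <stddef.h>

/* Hypothetical example: sum of a float array.
   The loop carries a dependency on `sum`, so vectorizing it means
   adding the elements in a different order than written. */
float sum_array(const float *x, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += x[i];   /* reduction into a single scalar */
    return sum;
}
```

Compiling this file once with and once without -ffast-math, together with the report options listed above, is one way to check whether the loop was vectorized. Note that the reordered sum can differ from the sequential one in the last bits.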

Pragmas

| Meaning | Intel compiler | GNU compiler |
| --- | --- | --- |
| Ignore potential dependencies | #pragma ivdep | #pragma GCC ivdep |
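
As a sketch of how the pragma is used: it is placed directly in front of a loop and tells the compiler to assume that there are no loop-carried dependencies, for example because the pointers involved do not overlap. The function name `add_arrays` and the macro-based compiler selection are assumptions made for this illustration.

```c
#include <stddef.h>

/* Hypothetical kernel: writes b[i] + c[i] into a[i].
   The caller must guarantee that a does not overlap b or c;
   the pragma only passes this promise on to the compiler. */
void add_arrays(size_t n, float *a, const float *b, const float *c)
{
/* __INTEL_COMPILER is defined by the classic Intel compiler (icc/icpc).
   It is tested first because the Intel compiler also defines __GNUC__. */
#if defined(__INTEL_COMPILER)
#pragma ivdep
#elif defined(__GNUC__)
#pragma GCC ivdep
#endif
    for (size_t i = 0; i < n; ++i)
        a[i] = b[i] + c[i];
}
```

If the arrays do overlap, the pragma makes the promise false and the vectorized loop may produce wrong results, so it should only be used where the data layout is known.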

AVX specific options

The Intel compiler provides several AVX-specific options, introduced in different compiler versions. Their differences are not obvious from the documentation but show up in the generated code. In all cases we saw the best results with the option -axCOMMON-AVX512, which is documented as the option for any Intel processor that supports AVX-512 instructions.

The option -axCORE-AVX512 is documented as the option for the Intel Xeon Processor Scalable Family, which includes, for example, all Skylake processors with AVX-512 instructions, so the difference between the two options is not really clear. However (see AOS vs. SOA), with -axCORE-AVX512 the compiler does not seem able to load elements in an AOS data layout into AVX-512 registers, while with -axCOMMON-AVX512 it does.

-axCOMMON-AVX512 is therefore our recommendation.
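
For readers unfamiliar with the two data layouts mentioned above, here is a rough sketch of what AOS (array of structures) and SOA (structure of arrays) look like in C. All type names, function names and the size `NPOINTS` are hypothetical and only serve the illustration; they are not the benchmark code behind the observation.

```c
#include <stddef.h>

#define NPOINTS 1024  /* hypothetical problem size */

/* AOS: the x, y and z of one point are adjacent in memory. */
struct point {
    float x, y, z;
};

/* SOA: all x values are contiguous, likewise all y and all z values. */
struct points {
    float x[NPOINTS];
    float y[NPOINTS];
    float z[NPOINTS];
};

/* AOS access: consecutive x values are sizeof(struct point) bytes apart,
   so filling a vector register requires strided (gather-like) loads. */
void scale_x_aos(struct point *p, size_t n, float s)
{
    for (size_t i = 0; i < n; ++i)
        p[i].x *= s;
}

/* SOA access: consecutive x values are contiguous (unit stride),
   which maps directly onto plain vector loads and stores. */
void scale_x_soa(struct points *p, size_t n, float s)
{
    for (size_t i = 0; i < n; ++i)
        p->x[i] *= s;
}
```

Both functions compute the same result; they differ only in the memory access pattern the compiler has to vectorize.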

From version 7.0 onwards, the GNU compiler is capable of generating AVX-512 code. Several options are required to achieve good vectorization. The most important one from a performance point of view seems to be link-time optimization (-flto). Without this option, it appears that calls to functions in different files are not properly optimized and information about non-aliased pointers is not properly handled. For performance we recommend the options:

-O3 -ftree-vectorize -mavx512f -mavx512cd -ffast-math -march=skylake-avx512 -flto
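
The situation described above can be sketched as follows. The two functions are shown in one block so that the example compiles on its own; in the scenario discussed here they would live in separate source files (the names `axpy_elem`, `axpy` and the file names in the comments are hypothetical). Without -flto the compiler would then see only the declaration of `axpy_elem` while compiling the loop.

```c
#include <stddef.h>

/* Would live in its own file, e.g. a hypothetical elem.c.
   Across files, its body is only visible to the optimizer with -flto. */
float axpy_elem(float alpha, float x, float y)
{
    return alpha * x + y;
}

/* Would live in a second file, e.g. a hypothetical kernel.c.
   `restrict` promises that x and y do not alias, which is the kind of
   pointer information the vectorizer needs for this loop. */
void axpy(size_t n, float alpha, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = axpy_elem(alpha, x[i], y[i]);  /* call must be inlined to vectorize */
}
```

When the functions are split across files, -flto has to be passed both when compiling each file and when linking, so that the call can be inlined at link time.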