March Sibi Still ?

For -O0, whether -march=native or -march= is the default still specifies the same family, so both are perfectly compatible with -O0; and whenever another optimization level is specified, -march=native is beneficial to performance. So, for me, the fact that -O0 is the default doesn't matter for -march's default.

Using -march also gives you more options for linking against third-party closed-source code. You should be able to link -mcpu=cortex-r5 code with -march=armv7-r code; this is fine in one direction only, though, so the tools may complain.

What are the differences and tradeoffs between -march=haswell, -march=core-avx2, and -mavx2 for compiling avx2 intrinsics? I know that -mavx2 is a flag and -march=haswell/core-avx2 are architectures which just translate to a bunch of flags. So -mavx2 is a subset of the other two. But beyond that, how do I choose the right one for my application?

As I understand it, -march=native will detect the ISA and extensions to use from cpuid (which includes model, family and stepping information). -march=xxx will use a baseline ISA and a baseline set of extensions. There are a lot of possible combinations of extensions, so only the most relevant were chosen (e.g. skylake-avx512 was added to reflect an important extension of some Skylake parts). -march ...

An internet search for "-march=armv8.2-a+i8mm" turns up almost nothing helpful. Either build_aar.sh is asking for an arch that doesn't make sense, or I need to plug in a version of clang that supports that arch.

-march: generate instructions for a specific machine type. Defaults to x86-64-v3 on AMD64 and armv8-a on AArch64. Use -march=compatibility for best compatibility, or -march=native for best performance if a native executable is deployed on the same machine or on a machine with the same CPU features. To list all available machine types, use ...

I'm compiling my C++ app using GCC 4.3. Instead of manually selecting the optimization flags I'm using -march=native, which in theory should add all optimization flags applicable to the hardware I'm

On x86 processors, just use -march=native. GCC will handle the rest by setting arch and tune to the same value. ARM is trickier, since GCC sometimes segfaults when using -march=native. You should also use a modern GCC, or maybe Clang. Clang generates better code than GCC for some SIMD source code. You will need to benchmark to determine which performs best for your code.