The X86_NATIVE_CPU Kconfig build-time option has been merged during the Linux 6.16 merge window as an easy way to apply the “-march=native” compiler behavior on AMD and Intel processors, optimizing kernel builds for your system’s local CPU architecture/family. Anyone who wants “-march=native” for local kernel builds on AMD/Intel x86_64 processors can now simply enable the new CONFIG_X86_NATIVE_CPU option. The option is honored when compiling the x86_64 kernel with GCC, or with LLVM Clang 19 or later; older versions are excluded because of a compiler bug. In addition to setting “-march=native” for kernel C code, enabling this option also sets “-Ctarget-cpu=native” for Rust kernel code.
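
As a minimal sketch, assuming a Linux 6.16 or later source tree configured on an x86_64 machine, an enabled option would appear in the generated .config as a single line:

```
# Assumes Linux 6.16+ on x86_64, built with GCC or Clang 19 or later
CONFIG_X86_NATIVE_CPU=y
```

With that line set, the build passes “-march=native” to the C compiler and “-Ctarget-cpu=native” to the Rust compiler, as described above.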

  • signofzeta@lemmygrad.ml
    3 days ago

    This is a compile-time option that will tell the compiler to optimize for the CPU in your computer, rather than any CPU.

    By default, the x86_64 kernel will build itself so that it can boot and run on any 64-bit Intel or AMD processor. This means it may have to ignore or check for newer instruction sets like (let’s say, totally at random) AVX512:

    if (CPU supports AVX512)
        do_efficient_avx512_thing (a, b, c)
    else
        a = something()
        b = some_nonavx512_prep_work()
        c = some_other_old_way_of_doing_things()
        do_nonavx512_thing (a, b, c)
    

    So, if you have an AVX512-capable CPU, it still has to check before using that instruction. Plus, your compiled kernel will be slightly larger because it needs to contain both ways of doing the thing.

    Using this option tells the compiler to compile code optimized for your current processor:

    do_efficient_avx512_thing (a, b, c)
    

    This is a gross oversimplification. The compiler also takes other things into account, such as which instruction sets are available and how best to schedule instructions for your particular CPU model.
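
    For anyone who wants to poke at the difference in userspace, here is a rough C sketch of the two shapes described above. It is not kernel code: `sum_fast`, `sum_generic` and `sum` are hypothetical stand-ins, and both "implementations" are plain loops so the example runs anywhere. `__builtin_cpu_supports` is a GCC/Clang builtin on x86, and `__AVX512F__` is the macro those compilers define when AVX-512 is enabled at compile time (for example via -march=native on a capable CPU).

    ```c
    #include <stdio.h>

    /* Hypothetical stand-in for the "efficient AVX-512" path.
     * A real version would use AVX-512 instructions; this is a plain loop. */
    static int sum_fast(const int *v, int n)
    {
        int s = 0;
        for (int i = 0; i < n; i++)
            s += v[i];
        return s;
    }

    /* Hypothetical stand-in for the fallback path that runs on any CPU. */
    static int sum_generic(const int *v, int n)
    {
        int s = 0;
        for (int i = 0; i < n; i++)
            s += v[i];
        return s;
    }

    static int sum(const int *v, int n)
    {
    #if defined(__AVX512F__)
        /* AVX-512 enabled at compile time (e.g. -march=native on a capable
         * CPU): no runtime check, and the fallback is never called. */
        return sum_fast(v, n);
    #elif defined(__x86_64__) || defined(__i386__)
        /* Generic x86 build: ask the CPU at run time, keep both paths. */
        if (__builtin_cpu_supports("avx512f"))
            return sum_fast(v, n);
        return sum_generic(v, n);
    #else
        /* Non-x86 target: only the generic path applies. */
        return sum_generic(v, n);
    #endif
    }

    int main(void)
    {
        int v[] = {1, 2, 3, 4};
        printf("%d\n", sum(v, 4)); /* prints 10 on every path */
        return 0;
    }
    ```

    Building it once normally and once with -march=native, then comparing the disassembly of `sum`, should show whether the runtime check survived on your machine.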

    But the tl;dr is that optimized code is smaller, faster, and maybe a teensy bit more power efficient.

    The downside? If you try to boot this optimized code on an older CPU (or rarely, a newer CPU), it will eventually say “illegal instruction” and crash.