[llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev

Hi all,

 

I'd like to discuss the LLVM SW mitigation for the Jump Conditional Code Erratum in this thread. The patch was submitted to Phabricator at https://reviews.llvm.org/D70157. There were many review comments about its performance and code size impact, and some suggestions on how to make the patch more generic so that it applies to other scenarios.

 

Let's start with the performance, code size, and build time impact. Below is the data we got from the LLVM test-suite.

LLVM test-suite        Baseline       sw_prefix      hw             hw_sw_prefix
compile_time           0.276          0.282          0.276          0.282
exec_time              286.465        285.017        291.294        287.766
code_size              3.868          3.889          3.868          3.889

LLVM test-suite        Baseline       sw_prefix      hw             hw_sw_prefix
compile_time           1.000          1.021          1.000          1.021
exec_time              1.000          0.995          1.017          1.005
code_size              1.000          1.005          1.000          1.005

 

Test date: 2019/11/25

System Configuration:
OS: Red Hat* 8.0 x86_64
Memory: 191 GB
CPUCount: 2
CoreCount: 40
Intel HyperThreading: yes
CPU Model: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Microcode w/o hw mitigation: 0x200005e
Microcode with hw mitigation: 0x2000063

  1. Baseline means the system without the MCU (microcode update) mitigation and without the SW mitigation.
  2. SW_prefix means the prefix mitigation is applied on a system without the MCU.
  3. HW means the MCU mitigation is applied without the SW mitigation.
  4. HW+SW (hw_sw_prefix) means both the MCU and the prefix mitigations are applied.
  5. The data in the second table is normalized as a ratio against the baseline. Smaller is better.
  6. The test was done on an engineering build plus the SW mitigation patch. Results may vary from build to build.

 

The data indicates a performance penalty of 1.7% with the HW mitigation alone, which is reduced to 0.5% when the prefix mitigation is also applied. The code size increase in the test suite is about 0.5%, and the compile time increase is about 2%.
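
As a cross-check, the normalized table can be recomputed from the raw one. The sketch below is not part of the original measurement scripts; the dictionary layout and the names `raw`, `normalize`, and `ratios` are assumptions made for illustration. Because the raw values are already rounded, a recomputed ratio can differ from the posted one in the last digit (compile_time comes out as 1.022 rather than 1.021).

```python
# Raw LLVM test-suite numbers copied from the first table in this post.
# The dict layout and variable names are illustrative assumptions.
raw = {
    "compile_time": {"Baseline": 0.276, "sw_prefix": 0.282,
                     "hw": 0.276, "hw_sw_prefix": 0.282},
    "exec_time": {"Baseline": 286.465, "sw_prefix": 285.017,
                  "hw": 291.294, "hw_sw_prefix": 287.766},
    "code_size": {"Baseline": 3.868, "sw_prefix": 3.889,
                  "hw": 3.868, "hw_sw_prefix": 3.889},
}


def normalize(rows):
    """Return each value as a ratio against that row's Baseline column."""
    return {
        metric: {cfg: value / cols["Baseline"] for cfg, value in cols.items()}
        for metric, cols in rows.items()
    }


ratios = normalize(raw)
for metric, cols in ratios.items():
    print(metric, " ".join(f"{cfg}={ratio:.3f}" for cfg, ratio in cols.items()))
```

The exec_time row reproduces the 1.7% (hw) and 0.5% (hw_sw_prefix) penalties quoted above.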

 

More data will follow.

 

Thanks,

Annita

 

For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.  For specific information and notices/disclaimers regarding the Jump Conditional Code Erratum, visit https://www.intel.com/content/dam/support/us/en/documents/processors/mitigations-jump-conditional-code-erratum.pdf.

_______________________________________________
LLVM Developers mailing list
[hidden email]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev
Refined and resent. I hope this version is more readable.
 
Hi all,
 
I'd like to discuss the LLVM SW mitigation for the Jump Conditional Code Erratum in this thread. The patch was submitted to Phabricator at https://reviews.llvm.org/D70157. There were many review comments about its performance and code size impact, and some suggestions on how to make the patch more generic so that it applies to other scenarios.

Let's start with the performance, code size, and build time impact. Below is the data we got from the LLVM test-suite.
LLVM test-suite        Baseline       sw_prefix      hw             hw_sw_prefix
compile_time           0.276          0.282          0.276          0.282
exec_time              286.465        285.017        291.294        287.766
code_size              3.868          3.889          3.868          3.889

LLVM test-suite        Baseline       sw_prefix      hw             hw_sw_prefix
compile_time           1.000          1.021          1.000          1.021
exec_time              1.000          0.995          1.017          1.005
code_size              1.000          1.005          1.000          1.005
 
Test date:
              2019/11/25
 
System Configuration:
OS: Red Hat* 8.0 x86_64
Memory: 191 GB
CPUCount: 2
CoreCount: 40
Intel HyperThreading: yes
CPU Model: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Microcode w/o hw mitigation: 0x200005e
Microcode with hw mitigation: 0x2000063
 
  1.  Baseline means the system without the MCU (microcode update) mitigation and without the SW mitigation.
  2.  SW_prefix means the prefix mitigation is applied on a system without the MCU.
  3.  HW means the MCU mitigation is applied without the SW mitigation.
  4.  HW+SW (hw_sw_prefix) means both the MCU and the prefix mitigations are applied.
  5.  The data in the second table is normalized as a ratio against the baseline. Smaller is better.
  6.  The test was done on an engineering build plus the SW mitigation patch. Results may vary from build to build.
 
The data indicates a performance penalty of 1.7% with the HW mitigation alone, which is reduced to 0.5% when the prefix mitigation is also applied. The code size increase in the test suite is about 0.5%, and the compile time increase is about 2%.
 
More data will follow.
 
Thanks,
Annita
 

Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev

Below are the performance and code size ratios for SPEC2017.

 

Table 1 shows the observed performance impact of the MCU on the specint2017 and specfp2017 benchmark suites when compiled with the LLVM compiler. The column labeled HW shows a performance loss of about 2% as measured by the geomean. The loss on individual components was observed to be as high as 7%.

 

Software-based tools to mitigate these effects are being developed and are outlined below. In our experiments, recompiling the benchmarks with the SW mitigation recovered the geomean performance to at least 99% of the originally observed performance, and reduced the maximum performance loss on any SPEC benchmark to 4%.

 

We also measured the increase in code size due to the addition of padding to instructions to align branches correctly (Table 2). The geomean increase in code size is 2-3% with individual outliers of up to 4%.
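
To make "align branches correctly" concrete: the option name listed under Compiler options, -x86-branches-within-32B-boundaries, suggests keeping each branch inside a single 32-byte block. A minimal sketch of that check follows, assuming (per the Intel document linked in this thread) that a branch is affected when it crosses or ends on a 32-byte boundary. The function names and example addresses are hypothetical, and this is not the code from D70157.

```python
BOUNDARY = 32  # bytes; taken from the option name, an assumption of this sketch


def touches_boundary(addr: int, length: int) -> bool:
    """True if the instruction occupying [addr, addr + length) crosses
    or ends on a 32-byte boundary (the assumed erratum condition)."""
    end = addr + length  # address of the first byte after the instruction
    crosses = addr // BOUNDARY != (end - 1) // BOUNDARY
    ends_on = end % BOUNDARY == 0
    return crosses or ends_on


def padding_needed(addr: int, length: int) -> int:
    """Bytes to insert before the instruction so that it starts at the
    next 32-byte boundary, or 0 if it does not touch a boundary."""
    if not touches_boundary(addr, length):
        return 0
    return (-addr) % BOUNDARY


# Hypothetical (address, length) pairs for three branch instructions.
for addr, length in [(0x10, 2), (0x1E, 2), (0x1D, 6)]:
    print(hex(addr), length, touches_boundary(addr, length),
          padding_needed(addr, length))
```

Whether the padding bytes are emitted as NOPs or as redundant prefixes on earlier instructions is a separate choice that this sketch does not model.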

 

Table 1 - SPEC2017 SW/HW mitigation vs. baseline performance ratio:

test                SW vs baseline    HW vs baseline    HW+SW vs baseline
500.perlbench_r     0.97              0.97              0.96
502.gcc_r           1.00              0.99              0.99
505.mcf_r           1.00              0.97              1.01
520.omnetpp_r       1.00              0.99              0.99
523.xalancbmk_r     0.99              0.99              0.99
525.x264_r          1.00              0.96              0.99
531.deepsjeng_r     1.00              0.98              0.99
541.leela_r         1.00              1.00              1.00
557.xz_r            1.02              0.95              1.02
SIR                 1.00              0.98              0.99

508.namd_r          1.00              0.99              1.00
510.parest_r        1.00              1.00              1.01
511.povray_r        1.02              0.96              1.02
519.lbm_r           1.00              1.01              1.00
526.blender_r       0.99              0.93              1.00
538.imagick_r       0.99              0.99              1.00
544.nab_r           1.00              0.98              1.00
FIR                 1.00              0.98              1.00
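
The SIR and FIR rows appear to be geometric means of the per-benchmark ratios (Table 2 labels the corresponding rows "Geomean"). The sketch below recomputes them from the HW column of Table 1. The variable names are illustrative, and since the inputs are rounded to two decimals the result only approximates the posted rows.

```python
import math

# "HW vs baseline" ratios copied from Table 1.
hw_int = [0.97, 0.99, 0.97, 0.99, 0.99, 0.96, 0.98, 1.00, 0.95]
hw_fp = [0.99, 1.00, 0.96, 1.01, 0.93, 0.99, 0.98]


def geomean(values):
    """Geometric mean: exp of the average of the logs."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


print(f"SIR HW geomean: {geomean(hw_int):.2f}")
print(f"FIR HW geomean: {geomean(hw_fp):.2f}")
print(f"worst single loss: {1 - min(hw_int + hw_fp):.0%}")
```

Both geomeans round to 0.98, consistent with the roughly 2% loss stated above, and the worst single ratio (0.93) corresponds to the 7% figure.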

 

Table 2 - SPEC2017 SW/HW mitigation vs. baseline Code Size ratio:

test                baseline    SW
500.perlbench_r     1           1.04
502.gcc_r           1           1.04
505.mcf_r           1           1.02
520.omnetpp_r       1           1.04
523.xalancbmk_r     1           1.03
525.x264_r          1           1.02
531.deepsjeng_r     1           1.02
541.leela_r         1           1.03
557.xz_r            1           1.03
SIR Geomean         1           1.03

508.namd_r          1           1.01
510.parest_r        1           1.03
511.povray_r        1           1.02
519.lbm_r           1           1.01
526.blender_r       1           1.03
538.imagick_r       1           1.03
544.nab_r           1           1.03
SFR Geomean         1           1.02

 

Test date: 2019/11/10

System Configuration:
OS: Red Hat* 8.0 x86_64
Memory: 191 GB
CPUCount: 2
CoreCount: 40
Intel HyperThreading: yes
CPU Model: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
Microcode w/o hw mitigation: 0x200005e
Microcode with hw mitigation: 0x2000063

Compiler options:
unmitigated:      -march=skylake-avx512 -mfpmath=sse -Ofast -funroll-loops -flto
mitigated:        -march=skylake-avx512 -mfpmath=sse -Ofast -funroll-loops -flto -Wl,-plugin-opt=-x86-branches-within-32B-boundaries

 

  1.  Baseline means the system without the MCU (microcode update) mitigation and without the SW mitigation.
  2.  SW means the prefix mitigation is applied on a system without the MCU.
  3.  HW means the MCU mitigation is applied without the SW mitigation.
  4.  HW+SW means both the MCU and the prefix mitigations are applied.
  5.  The LLVM measurements are limited to the C/C++ benchmarks. All Fortran benchmarks are excluded.
  6.  The test was done on an engineering build plus the SW mitigation patch. Results may vary from build to build.

 


Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev

I will reply to those comments tomorrow.



Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev
Annita, do you have measurements for inserting NOPs instead of redundant segment prefixes? The code review asked for performance data showing that prefixes are better than NOPs.

~Craig


On Wed, Dec 4, 2019 at 5:40 AM Zhang, Annita via llvm-dev <[hidden email]> wrote:

I will reply those comments tomorrow.


Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev
On Wed, Dec 4, 2019 at 8:41 AM Craig Topper via llvm-dev
<[hidden email]> wrote:
>
> Annita, do you have measurements with inserting NOPs instead of redundant segment prefixes. The code review was asking for performance data to show that prefixes are better than NOPs.
>
> ~Craig

Please also test with -ffunction-sections. With more sections, it may be
easier to expose non-convergence problems. The compile time may not be
slowed much because functions will be smaller. The code size may increase
if more sections need over-alignment.
Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

Chris Lattner via llvm-dev
Yes, we will test it. I think -ffunction-sections will increase the number of sections, but not the number of functions. The code size may increase because every code section has to be aligned to 32 bytes for the padding.
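
The code size effect of per-section alignment can be illustrated with simple arithmetic. This is a simplified model, not a measurement: the function sizes are made up and the helper name `padded_size` is hypothetical. It only shows how rounding each section up to a 32-byte multiple adds more padding when the same code is split across more sections.

```python
ALIGN = 32  # bytes, the section alignment mentioned above


def padded_size(sizes, align=ALIGN):
    """Total size when every section is padded up to a multiple of `align`."""
    return sum(size + (-size) % align for size in sizes)


# Hypothetical function sizes in bytes.
functions = [70, 45, 130, 21]

one_section = padded_size([sum(functions)])  # all functions in a single section
per_function = padded_size(functions)        # one section per function

print(one_section, per_function)
```

With these made-up sizes, one padded section takes 288 bytes while four separately padded sections take 352 bytes.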


-----Original Message-----
From: Fāng-ruì Sòng <[hidden email]>
Sent: Friday, December 6, 2019 4:21 PM
To: Zhang, Annita <[hidden email]>
Cc: [hidden email]; Craig Topper <[hidden email]>
Subject: Re: [llvm-dev] Discuss about the LLVM SW mitigation to Jump Conditional Code Erratum

On Wed, Dec 4, 2019 at 8:41 AM Craig Topper via llvm-dev <[hidden email]> wrote:
>
> Annita, do you have measurements with inserting NOPs instead of redundant segment prefixes. The code review was asking for performance data to show that prefixes are better than NOPs.
>
> ~Craig

Also -ffunction-sections. With more sections, it may be easier to exhibit non-convergence problems. The compile time may not be slowed much because functions will be smaller. The code size may increase if more sections need over-alignment.