Hyper-threading consists of a single CPU core presenting as two separate CPUs which the OS and applications see as having completely separate state. It is possible to turn off hyper-threading system-wide, but here is how to do it on an application-by-application basis using the numactl utility.

CPU topology representation in Linux is described in these kernel pages. There is some useful information on how to interpret this also in this article by AWS as well as on the use of the numactl utility in this article by nuralmagic.

The fundamental approach is straightforward:

  1. Determine which logical CPUs share a core either by parsing /sys/devices/system/cpu/cpu*/topology/thread_siblings_list directory structure or by the using lstopo utility

  2. Run an application on subset of available CPUs using the numactl program with --physcpubind= option to avoid the logical CPUs that share single physical core

Example test

Here are some example tests on a laptop CPU. The benchmark program is simply the 7zip inbuilt bechmark which can be run with 7zz b.

These tests are purely illustrative of the technique using numactl. It is essential to perform your own tests on your own target hardware and applications! The whole point of this article is that effect of hyperthreading is not easy to predict!

Topology of test machine

The test machine has an i7-1260p CPU with 4 hyper-threaded performance cores and 8 economy cores for a total 16 threads. A graphical representation made with lstopo is below:

Topology of the Intel i7-1260P processor

The thread siblings on performance cores are neighbouring CPU numbers:

$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list
0-1

Summary of results

On this CPU hyper-threading slows down the 7zip benchmark by about 15%. If avoiding hyperthreading, using two performance cores gives double the performance. Economy cores do not support hyper-threading and using two cores gives about double the performance regardless of the combination of CPUs used.

It is essential to do tests of your own applications on your own hardware as many factors will change the impact of hyperthreading!

Detailed results

Results on Performance Cores

Single “Performance” core benchmark

numactl --physcpubind=0 7zz b

7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1415  1521  1721  2190  2190  2158  1944

RAM size:   31796 MB,  # CPU hardware threads:   1 / 16 : 0001
RAM usage:    437 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       4100   100   3981   3989  |      27426   100   2342   2342
23:       4092   100   4185   4170  |      27280   100   2359   2361
24:       3499   100   3763   3763  |      28183   100   2478   2474
25:       2981   100   3408   3404  |      26446   100   2353   2354
----------------------------------  | ------------------------------
Avr:      3668   100   3834   3831  |      27334   100   2383   2383
Tot:             100   3108   3107

Benchmark avoiding hyperthreading

I avoid hyperthreading by pinning the two benchmark threads to CPUs which are on separate cores, i.e., CPUs 0 and 2.

$ numactl --physcpubind=0,2 7zz b

7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1632  1173  1440  2171  2126  2045  2062
1T CPU Freq (MHz): 100% 2009   100% 1640  

RAM size:   31796 MB,  # CPU hardware threads:   2 / 16 : 0005
RAM usage:    444 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       7543   168   4367   7338  |      45814   200   1956   3912
23:       8319   173   4891   8477  |      48361   200   2094   4186
24:       8079   181   4806   8687  |      39063   200   1718   3429
25:       5865   181   3701   6697  |      37978   200   1692   3380
----------------------------------  | ------------------------------
Avr:      7452   176   4441   7800  |      42804   200   1865   3727
Tot:             188   3153   5763

Forcing hyperthreading

Here I am selecting sibling CPUs 0 and 1, which share a single core and the performance is 10% less than in the non-hyperthreaded scenario:

 numactl --physcpubind=0,1 7zz b
7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1798  2130  1862  2104  2150  2137  2125
1T CPU Freq (MHz):  99% 2114   100% 2152  

RAM size:   31796 MB,  # CPU hardware threads:   2 / 16 : 0003
RAM usage:    444 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       6511   164   3854   6335  |      41278   200   1763   3524
23:       6648   168   4025   6774  |      37844   200   1639   3276
24:       6481   173   4019   6969  |      40424   200   1775   3549
25:       6309   176   4101   7204  |      39108   200   1740   3481
----------------------------------  | ------------------------------
Avr:      6487   170   4000   6820  |      39663   200   1729   3457
Tot:             185   2865   5139

Results on Economy cores

Economy cores are CPUs 8-15 and are not hyper-threaded.

Single Core

numactl --physcpubind=8 7zz b

7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1318  1586  1452  2157  1721  2047  2149

RAM size:   31796 MB,  # CPU hardware threads:   1 / 16 : 0100
RAM usage:    437 MB,  # Benchmark threads:      1

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       3782   100   3690   3680  |      29019   100   2485   2478
23:       3299    99   3392   3362  |      29115   100   2530   2520
24:       3052   100   3281   3282  |      29324   100   2574   2574
25:       2475    99   2849   2827  |      28162    99   2524   2507
----------------------------------  | ------------------------------
Avr:      3152   100   3303   3288  |      28905   100   2528   2520
Tot:             100   2916   2904

Neighbouring two CPUs

Economy cores are not hyperthreaded so neighbouring CPUs (in this case 8 and 9) are expected to have about the same joint performance as separated CPUs.

numactl --physcpubind=8,9 7zz b

7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1519  1494  1703  2193  2192  2107  2055
1T CPU Freq (MHz): 101% 2109    98% 1912  

RAM size:   31796 MB,  # CPU hardware threads:   2 / 16 : 0300
RAM usage:    444 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       6476   163   3854   6300  |      46604   199   2000   3979
23:       5790   167   3534   5900  |      45188   199   1966   3911
24:       6280   172   3930   6752  |      45620   198   2020   4005
25:       5889   172   3913   6724  |      43859   199   1963   3904
----------------------------------  | ------------------------------
Avr:      6108   169   3808   6419  |      45318   199   1987   3950
Tot:             184   2897   5184

Not adjacent two CPUs

numactl --physcpubind=8,10 7zz b

7-Zip (z) 22.01 (x64) : Copyright (c) 1999-2022 Igor Pavlov : 2022-07-15
 64-bit locale=en_GB.UTF-8 Threads:16

Compiler: 12.2.0 GCC 12.2.0
Linux : 6.6.56-1-longterm : #1 SMP PREEMPT_DYNAMIC Thu Oct 10 15:20:52 UTC 2024 (67435e5) : x86_64
PageSize:4KB THP:always hwcap:2 hwcap2:2
12th Gen Intel(R) Core(TM) i7-1260P (906A3) 

1T CPU Freq (MHz):  1536  1449  1444  2187  2193  2164  1965
1T CPU Freq (MHz):  99% 2108   100% 1978  

RAM size:   31796 MB,  # CPU hardware threads:   2 / 16 : 0500
RAM usage:    444 MB,  # Benchmark threads:      2

                       Compressing  |                  Decompressing
Dict     Speed Usage    R/U Rating  |      Speed Usage    R/U Rating
         KiB/s     %   MIPS   MIPS  |      KiB/s     %   MIPS   MIPS

22:       6584   161   3981   6405  |      47833   199   2050   4084
23:       6151   168   3722   6268  |      47703   199   2071   4129
24:       5967   171   3747   6416  |      43583   198   1929   3826
25:       6234   173   4115   7118  |      45644   199   2041   4063
----------------------------------  | ------------------------------
Avr:      6234   168   3891   6552  |      46191   199   2023   4026
Tot:             184   2957   5289