.. _summary_performance_change:

Performance change
==================

This page compares the performance of the various FINUFFT releases and the latest commit on the master branch. The graphs show how the average duration of each step (makeplan, setpts, and execute) changes across versions. The parameter sets represent input variations that FINUFFT users mentioned in a GitHub discussion.

The results of this test are generated automatically by a GitHub Action. All library versions are recompiled with the ``cmake`` flags ``-DFINUFFT_BUILD_TESTS=ON -DCMAKE_BUILD_TYPE=Release`` and tested in a single run. This ensures that all tests are optimized by the same compiler and executed on the same CPU. However, the exact CPU and compiler version depend on the runner.

FINUFFT can use two backend libraries for the Fast Fourier Transform: FFTW and DUCC. Although the plots for both backends are generated with the same parameters, they should not be compared directly, because a separate runner executes the tests for each backend. The header of each backend section summarizes the CPU characteristics and compiler flags of its runner. Within each backend, plots are grouped by transform type and dimensionality.

The reported speedup label (e.g. ``1.10x``) means *faster than the baseline by that factor*. The baseline is the leftmost bar (the oldest version, or master in PR comparisons).

.. PERFTEST_BACKENDS_BELOW

Performance (FFTW backend)
-----------------------------------

CPU name: ``AMD EPYC 7763 64-Core Processor``.
Arch: ``X86_64``.
Core count: ``2``.
ISA extensions present: ``3dnowext, 3dnowprefetch, abm, adx, aes, aperfmperf, apic, arat, avx, avx2, bmi1, bmi2, clflush, clflushopt, clwb, clzero, cmov, cmp_legacy, constant_tsc, cpuid, cr8_legacy, cx16, cx8, de, decodeassists, erms, extd_apicid, f16c, flushbyasid, fma, fpu, fsgsbase, fsrm, fxsr, fxsr_opt, ht, hypervisor, invpcid, lahf_lm, lm, mca, mce, misalignsse, mmx, mmxext, movbe, msr, mtrr, nonstop_tsc, nopl, npt, nrip_save, nx, osvw, osxsave, pae, pat, pausefilter, pcid, pclmulqdq, pdpe1gb, pfthreshold, pge, pni, popcnt, pse, pse36, rdpid, rdpru, rdrand, rdrnd, rdseed, rdtscp, rep_good, sep, sha, sha_ni, smap, smep, sse, sse2, sse4_1, sse4_2, sse4a, ssse3, svm, syscall, topoext, tsc, tsc_known_freq, tsc_reliable, tsc_scale, umip, user_shstk, v_vmsave_vmload, vaes, vmcb_clean, vme, vmmcall, vpclmulqdq, xgetbv1, xsave, xsavec, xsaveerptr, xsaveopt, xsaves``.
Compiler version: ``c++ (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0``.
Compiler flags: ``-march=native``.

1D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_d83ca6fb-7fee-4ca8-8b5e-e9a5adf63e50.png
   :alt: pics/perftestci_d83ca6fb-7fee-4ca8-8b5e-e9a5adf63e50.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_64414738-dc45-41ee-a299-83e5472bc6e2.png
   :alt: pics/perftestci_64414738-dc45-41ee-a299-83e5472bc6e2.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_d0c80369-f9d7-4b32-820a-67a9bea03ce0.png
   :alt: pics/perftestci_d0c80369-f9d7-4b32-820a-67a9bea03ce0.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_5b957cca-f8e5-4ca2-84d0-3207d8f129a6.png
   :alt: pics/perftestci_5b957cca-f8e5-4ca2-84d0-3207d8f129a6.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_ff3c8211-f280-488f-b033-c34d5291e140.png
   :alt: pics/perftestci_ff3c8211-f280-488f-b033-c34d5291e140.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_90caeca7-23f6-41b5-a811-714de6fb1424.png
   :alt: pics/perftestci_90caeca7-23f6-41b5-a811-714de6fb1424.png
   :width: 100%

2D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_ea93ca12-089b-49ae-968d-80f34fa10656.png
   :alt: pics/perftestci_ea93ca12-089b-49ae-968d-80f34fa10656.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_291196e5-5d9c-4d49-8428-4872a43cff20.png
   :alt: pics/perftestci_291196e5-5d9c-4d49-8428-4872a43cff20.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_d8181b63-8275-4a36-86fa-e25863fdd631.png
   :alt: pics/perftestci_d8181b63-8275-4a36-86fa-e25863fdd631.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_31e4bd5b-7d01-45e3-bf0d-14b228c76605.png
   :alt: pics/perftestci_31e4bd5b-7d01-45e3-bf0d-14b228c76605.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_cb01aeb2-c3ce-4081-99b7-3563e864a715.png
   :alt: pics/perftestci_cb01aeb2-c3ce-4081-99b7-3563e864a715.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_437e973f-0dbc-4884-b235-709bd666f479.png
   :alt: pics/perftestci_437e973f-0dbc-4884-b235-709bd666f479.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_5625b87a-bf23-4a8d-986b-013e3bf8e07d.png
   :alt: pics/perftestci_5625b87a-bf23-4a8d-986b-013e3bf8e07d.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_ffa86523-9f67-4387-ab34-190b8639c3f9.png
   :alt: pics/perftestci_ffa86523-9f67-4387-ab34-190b8639c3f9.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_6152358c-5560-4121-a402-300c1031b1b8.png
   :alt: pics/perftestci_6152358c-5560-4121-a402-300c1031b1b8.png
   :width: 100%

3D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_cc3c78fd-e41d-46df-b964-03f3627fb9fc.png
   :alt: pics/perftestci_cc3c78fd-e41d-46df-b964-03f3627fb9fc.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_1c104f6b-cba2-46ac-8f69-169064f5ec42.png
   :alt: pics/perftestci_1c104f6b-cba2-46ac-8f69-169064f5ec42.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_68d6af2e-8fec-4ff3-8542-62a737f33680.png
   :alt: pics/perftestci_68d6af2e-8fec-4ff3-8542-62a737f33680.png
   :width: 100%

Performance (DUCC backend)
-----------------------------------

CPU name: ``AMD EPYC 7763 64-Core Processor``.
Arch: ``X86_64``.
Core count: ``2``.
ISA extensions present: ``3dnowext, 3dnowprefetch, abm, adx, aes, aperfmperf, apic, arat, avx, avx2, bmi1, bmi2, clflush, clflushopt, clwb, clzero, cmov, cmp_legacy, constant_tsc, cpuid, cr8_legacy, cx16, cx8, de, decodeassists, erms, extd_apicid, f16c, flushbyasid, fma, fpu, fsgsbase, fsrm, fxsr, fxsr_opt, ht, hypervisor, invpcid, lahf_lm, lm, mca, mce, misalignsse, mmx, mmxext, movbe, msr, mtrr, nonstop_tsc, nopl, npt, nrip_save, nx, osvw, osxsave, pae, pat, pausefilter, pcid, pclmulqdq, pdpe1gb, pfthreshold, pge, pni, popcnt, pse, pse36, rdpid, rdpru, rdrand, rdrnd, rdseed, rdtscp, rep_good, sep, sha, sha_ni, smap, smep, sse, sse2, sse4_1, sse4_2, sse4a, ssse3, svm, syscall, topoext, tsc, tsc_known_freq, tsc_reliable, tsc_scale, umip, user_shstk, v_vmsave_vmload, vaes, vmcb_clean, vme, vmmcall, vpclmulqdq, xgetbv1, xsave, xsavec, xsaveerptr, xsaveopt, xsaves``.
Compiler version: ``c++ (Ubuntu 13.3.0-6ubuntu2~24.04.1) 13.3.0``.
Compiler flags: ``-march=native``.

1D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_6b56a0b3-9f0b-42bd-b434-c91b41b7adc2.png
   :alt: pics/perftestci_6b56a0b3-9f0b-42bd-b434-c91b41b7adc2.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_3173eebc-bc9e-48b6-9c8a-430731176a49.png
   :alt: pics/perftestci_3173eebc-bc9e-48b6-9c8a-430731176a49.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_9300ca79-e3cc-42b0-91a2-abec78958b9f.png
   :alt: pics/perftestci_9300ca79-e3cc-42b0-91a2-abec78958b9f.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_b74d5b10-f1bd-4f34-a66c-87121caa1eda.png
   :alt: pics/perftestci_b74d5b10-f1bd-4f34-a66c-87121caa1eda.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.002``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_19110b46-bd8e-4cf9-b597-43c896d072e6.png
   :alt: pics/perftestci_19110b46-bd8e-4cf9-b597-43c896d072e6.png
   :width: 100%

Parameters: ``prec:d N1:10000.0 N2:1 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_40c3fd95-bf12-4f96-9819-0c36fc94b8ef.png
   :alt: pics/perftestci_40c3fd95-bf12-4f96-9819-0c36fc94b8ef.png
   :width: 100%

2D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_de01817f-b221-4e9d-b454-e4f132f2e482.png
   :alt: pics/perftestci_de01817f-b221-4e9d-b454-e4f132f2e482.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_80dee282-0f3b-42a1-8807-2ac552ebb42c.png
   :alt: pics/perftestci_80dee282-0f3b-42a1-8807-2ac552ebb42c.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_eda5bdc5-e49b-4119-a171-8eca9507803d.png
   :alt: pics/perftestci_eda5bdc5-e49b-4119-a171-8eca9507803d.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_dd3f3a66-e689-40b5-9f83-81ebdef5f74f.png
   :alt: pics/perftestci_dd3f3a66-e689-40b5-9f83-81ebdef5f74f.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_0d13fd2d-333f-4d2c-a5ab-7924fd5cd23c.png
   :alt: pics/perftestci_0d13fd2d-333f-4d2c-a5ab-7924fd5cd23c.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_b64dd647-1ad3-42c0-bfb7-8859e39dd4d1.png
   :alt: pics/perftestci_b64dd647-1ad3-42c0-bfb7-8859e39dd4d1.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_7294ae98-8782-4f6a-961c-b8d20c1d3df3.png
   :alt: pics/perftestci_7294ae98-8782-4f6a-961c-b8d20c1d3df3.png
   :width: 100%

Parameters: ``prec:d N1:320 N2:320 N3:1 ntransf:1 threads:1 M:10000000.0 tol:1e-09``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_3b9dbab9-1c4f-4c85-b27d-b7ea8d876ec9.png
   :alt: pics/perftestci_3b9dbab9-1c4f-4c85-b27d-b7ea8d876ec9.png
   :width: 100%

Parameters: ``prec:f N1:320 N2:320 N3:1 ntransf:1 threads:0 M:10000000.0 tol:0.0001``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_11ab115d-0f06-44e3-8611-d63bbb3368d7.png
   :alt: pics/perftestci_11ab115d-0f06-44e3-8611-d63bbb3368d7.png
   :width: 100%

3D Transforms
~~~~~~~~~~~~~~~~~~~~~

Type 1
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_e7b4999d-b160-4cfe-ba08-f2895073687e.png
   :alt: pics/perftestci_e7b4999d-b160-4cfe-ba08-f2895073687e.png
   :width: 100%

Type 2
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_355fb114-d2db-4ad6-a43b-a70fd21ccc8e.png
   :alt: pics/perftestci_355fb114-d2db-4ad6-a43b-a70fd21ccc8e.png
   :width: 100%

Type 3
^^^^^^^^^^^^^^^^

Parameters: ``prec:d N1:192 N2:192 N3:128 ntransf:1 threads:0 M:10000000.0 tol:1e-07``

.. image:: https://raw.githubusercontent.com/flatironinstitute/finufft/perftest-results/docs/pics/perftestci_454ec675-8710-4c33-b53b-32a835ae1415.png
   :alt: pics/perftestci_454ec675-8710-4c33-b53b-32a835ae1415.png
   :width: 100%