Example usage from C++ and C¶
Quick-start example in C++¶
Here’s how to perform a 1D type-1 transform in double precision from C++, using STL complex vectors. First include our header, and some others needed for the demo:
#include "finufft.h"
#include <vector>
#include <complex>
#include <stdlib.h>
We need nonuniform points x
and complex strengths c
. Let’s create random ones for now:
int M = 1e7; // number of nonuniform points
vector<double> x(M);
vector<complex<double> > c(M);
complex<double> I = complex<double>(0.0,1.0); // the imaginary unit
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1); // uniform random in [-pi,pi)
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
With N
as the desired number of Fourier mode coefficients,
allocate their output array:
int N = 1e6; // number of output modes
vector<complex<double> > F(N);
Now do the NUFFT (with default options, indicated by the NULL
in the following call). Since the interface is
C-compatible, we pass pointers to the start of the arrays (rather than
C++-style vector objects), and also pass N
:
int ier = finufft1d1(M,&x[0],&c[0],+1,1e-9,N,&F[0],NULL);
This fills F
with the output modes, in increasing ordering
with the integer frequency indices from -N/2
up to N/2-1
(since N
is even; for odd is would be -(N-1)/2
up to (N-1)/2
).
The transform (\(10^7\) points to \(10^6\) modes) takes 0.4 seconds on a laptop.
The index is thus offset by N/2
(this is integer division in the odd case), so that frequency k
is output in
F[N/2 + k]
.
Here +1
sets the sign of \(i\) in the exponentials
(see definitions),
1e-9
requests 9-digit relative tolerance, and ier
is a status output
which is zero if successful (otherwise see error codes).
Note
FINUFFT works with a periodicity of \(2\pi\) for type 1 and 2 transforms; see definitions. For example, nonuniform points \(x=\pm\pi\) are equivalent. Points must lie in the input domain \([-3\pi,3\pi)\), which allows the user to assume a convenient periodic domain such as \([-\pi,\pi)\) or \([0,2\pi)\). To handle points outside of \([-3\pi,3\pi)\) the user must fold them back into this domain before passing to FINUFFT. FINUFFT does not handle this case, for speed reasons. To use a different periodicity, linearly rescale your coordinates.
If instead you want to change some options, first
put default values in a finufft_opts
struct,
make your changes, then pass the pointer to FINUFFT:
finufft_opts* opts = new finufft_opts;
finufft_default_opts(opts);
opts->debug = 1; // prints timing/debug info
int ier = finufft1d1(M,&x[0],&c[0],+1,tol,N,&F[0],opts);
Warning
Without the
finufft_default_opts
call, options may take on arbitrary values which may cause a crash.Note that, as of version 2.0,
opts
is passed as a pointer in both places.
See examples/simple1d1.cpp
for a simple full working demo of the above, including a test of the math. If you instead use single-precision arrays,
replace the tag finufft
by finufftf
in each command; see examples/simple1d1f.cpp
.
From the examples/
directory, to compile on a linux/GCC system, linking to the static library, use eg:
g++ simple1d1.cpp -o simple1d1 -I../include ../lib-static/libfinufft.a -lfftw3_omp -lfftw3
Executing ./simple1d1
should now work. Better is instead to link to the dynamic shared (.so
) library, via eg:
g++ simple1d1.cpp -o simple1d1 -I../include -Wl,-rpath,$FINUFFT/lib/ -lfinufft
where $FINUFFT
must be replaced by (or be an environment variable set to) the absolute install path for this repository.
Notice how rpath
was used to make an executable that may be called from, or moved to, anywhere.
The examples
and test
directories are good places to see further
usage examples. The documentation for all 18 simple interfaces,
and the more flexible guru interface, is further down this page.
Quick-start example in C¶
The FINUFFT C++ interface is intentionally also C-compatible, for simplity.
Thus, to use from C, the above example only needs to replace the C++
vector
with C-style array creation. Using C99 style, the
above code, with options setting, becomes:
#include <finufft.h>
#include <stdlib.h>
#include <complex.h>
int M = 1e7; // number of nonuniform points
double* x = (double *)malloc(sizeof(double)*M);
double complex* c = (double complex*)malloc(sizeof(double complex)*M);
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1); // uniform random in [-pi,pi)
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
int N = 1e6; // number of modes
double complex* F = (double complex*)malloc(sizeof(double complex)*N);
finufft_opts opts; // make an opts struct
finufft_default_opts(&opts); // set default opts (must do this)
opts.debug = 2; // more debug/timing to stdout
int ier = finufft1d1(M,x,c,+1,1e-9,N,F,&opts);
// (now do something with F here!...)
free(x); free(c); free(F);
See examples/simple1d1c.c
and examples/simple1d1cf.c
for
double- and single-precision C examples, including the math check to insure
the correct indexing of output modes.
2D example in C++¶
We assume Fortran-style contiguous multidimensional arrays, as opposed to C-style arrays of pointers; this allows the widest compatibility with other languages. Assuming the same headers as above, we first create points \((x_j,y_j)\) in the square \([-\pi,\pi)^2\), and strengths as before:
int M = 1e7; // number of nonuniform points
vector<double> x(M), y(M);
vector<complex<double> > c(M);
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
y[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
Let’s say we want N1=1000
by N2=2000
2D Fourier coefficients.
We allocate and do the (default options) transform thus:
int N1=1000, N2=2000;
vector<complex<double> > F(N1*N2);
int ier = finufft2d1(M,&x[0],&y[0], &c[0], +1, 1e-6, N1, N2, &F[0], NULL);
This transform takes 0.6 seconds on a laptop.
The modes have increasing ordering
of integer frequency indices from -N1/2
up to N1/2-1
in the fast (x
) dimension,
then indices from -N2/2
up to N2/2-1
in the slow (y
) dimension
(since both N1
and N2
are even).
So, the output frequency (k1,k2)
is found in
F[N1/2 + k1 + (N2/2 + k2)*N1]
.
See opts.modeord
in Options
to instead use FFT-style mode ordering, which
simply differs by an “fftshift” (as it is commonly called).
See examples/simple2d1.cpp
for an example with a math check, to
insure that the mode indexing is correctly understood.
Vectorized interface example¶
A common use case is to perform a stack of identical transforms with the
same size and nonuniform points, but for new strength vectors.
(Applications include interpolating vector-valued data, or processing
MRI images collected with a fixed set of k-space sample points.)
Because it amortizes sorting, FFTW planning, and FFTW plan lookup,
it can be faster to use a “vectorized”
interface (which does the entire stack in one call)
than to repeatedly call the above “simple” interfaces.
This is especially true for many small problems.
Here we show how to do a stack of ntrans=10
1D type 1 NUFFT transforms, in C++,
assuming the same headers as in the first example above.
The strength data vectors are taken to be contiguous (the whole
first vector, followed by the second, etc, rather than interleaved.)
Ie, viewed as a matrix in Fortran storage, each column is a strength vector.
int ntrans = 10; // how many transforms
int M = 1e7; // number of nonuniform points
vector<double> x(M);
vector<complex<double> > c(M*ntrans); // ntrans strength vectors
complex<double> I = complex<double>(0.0,1.0); // the imaginary unit
for (int j=0; j<M; ++j)
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
for (int j=0; j<M*ntrans; ++j) // fill all ntrans vectors...
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
int N = 1e6; // number of output modes
vector<complex<double> > F(N*trans); // ntrans output vectors
int ier = finufft1d1(M,&x[0],&c[0],+1,1e-9,N,&F[0],NULL); // default opts
This takes 2.6 seconds on a laptop, around 1.4x faster than
making 10 separate “simple” calls.
The frequency index k
in transform number t
(zero-indexing the transforms) is in F[k + (int)N/2 + N*t]
.
See examples/many1d1.cpp
and test/finufft?dmany_test.cpp
for more examples.
Guru interface examples¶
If you want more flexibility than the above, use the “guru” interface: this is similar to that of FFTW3, and to the main interface of NFFT3. It lets you change the nonuniform points while keeping the same pointer to an FFTW plan for a particular number of stacked transforms with a certain number of modes. This avoids the overhead (typically 0.1 ms per thread) of FFTW checking for previous wisdom which would be significant when doing many small transforms. You may also send in a new set of stacked strength data (for type 1 and 3, or coefficients for type 2), reusing the existing FFTW plan and sorted points. Now we redo the above 2D type 1 C++ example with the guru interface.
One first makes a plan giving transform parameters, but no data:
// (assume x, y, c are filled, and F allocated, as in the 2D code above...)
int type=1, dim=2, ntrans=1;
int64_t Ns[] = {1000,2000}; // N1,N2 as 64-bit int array
// step 1: make a plan...
finufft_plan plan;
int ier = finufft_makeplan(type, dim, Ns, +1, ntrans, 1e-6, &plan, NULL);
// step 2: send in pointers to M nonuniform points (just x, y in this case)...
finufft_setpts(plan, M, &x[0], &y[0], NULL, 0, NULL, NULL, NULL);
// (user should not change x, y nonuniform point arrays here!)
// step 3: do the planned transform to the c strength data, output to F...
finufft_execute(plan, &c[0], &F[0]);
// ... you could now send in new points, and/or do transforms with new c data
// ...
// step 4: when done, free the memory used by the plan...
finufft_destroy(plan);
This writes the Fourier coefficients to F
just as in the earlier 2D example.
One difference from the above simple and vectorized interfaces
is that the int64_t
type (aka long long int
)
is needed since the Fourier coefficient dimensions are passed as an array.
Warning
You must not change the nonuniform point arrays (here x
, y
) between passing them to finufft_setpts
and performing finufft_execute
. The latter call expects these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.
Warning
You must destroy a plan before making a new plan using the same plan object, otherwise a memory leak results.
The complete code with a math test is in examples/guru2d1.cpp
, and for
more examples see examples/guru1d1*.c*
Using the guru interface to perform a vectorized transform (multiple 1D type 1
transforms each with the same nonuniform points) is demonstrated in
examples/gurumany1d1.cpp
. This is similar to the single-command vectorized
interface, but allowing more control (changing the nonuniform points without
re-planning the FFT, etc).
Thread safety for single-threaded transforms, and global state¶
It is possible to call FINUFFT from within multithreaded code, e.g. in an
OpenMP parallel block. In this case opts.nthreads=1
should be set, otherwise
a segfault will occur. This is useful if you don’t want to synchronize
independent transforms.
For demos of this “parallelize over single-threaded transforms” use case, see
the following, which are built as part of the make examples
task:
examples/threadsafe1d1
which runs a 1D type-1 separately on each thread, checking the math, andexamples/threadsafe2d2f
which runs a 2D type-2 on each “slice” (in the MRI language), parallelized over slices with an OpenMP parallel for loop. (In this code there is no math check, just status check.)
However, if you have multiple transforms with the same nonuniform points for
each transform, it is probably much faster to use the vectorized interface,
and do all these transforms with a single such multithreaded FINUFFT call
(see examples/many1d1.cpp
and examples/gurumany1d1.cpp
).
This may be less convenient if you want to leave your slices unsynchronized.
Note
A design decision of FFTW is to have a global state which stores
wisdom and settings. Such global state can cause unforeseen effects on other
routines that also use FFTW. In contrast, FINUFFT uses pointers to plans to store
its state, and does not have a global state (other than one static
flag used as a lock on FFTW initialization in the FINUFFT plan
stage). This means different FINUFFT calls should not affect each other,
although they may affect other codes that use FFTW via FFTW’s global state.