# Example usage from C++ and C¶

## Quick-start example in C++¶

Here’s how to perform a 1D type-1 transform in double precision from C++, using STL complex vectors. First include our header, and some others needed for the demo:

```
#include "finufft.h"
#include <vector>
#include <complex>
#include <stdlib.h>
```

We need nonuniform points `x`

and complex strengths `c`

. Let’s create random ones for now:

```
int M = 1e7; // number of nonuniform points
vector<double> x(M);
vector<complex<double> > c(M);
complex<double> I = complex<double>(0.0,1.0); // the imaginary unit
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1); // uniform random in [-pi,pi)
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
```

With `N`

as the desired number of Fourier mode coefficients,
allocate their output array:

```
int N = 1e6; // number of output modes
vector<complex<double> > F(N);
```

Now do the NUFFT (with default options, indicated by the `NULL`

in the following call). Since the interface is
C-compatible, we pass pointers to the start of the arrays (rather than
C++-style vector objects), and also pass `N`

:

```
int ier = finufft1d1(M,&x[0],&c[0],+1,1e-9,N,&F[0],NULL);
```

This fills `F`

with the output modes, in increasing ordering
with the integer frequency indices from `-N/2`

up to `N/2-1`

(since `N`

is even; for odd is would be `-(N-1)/2`

up to `(N-1)/2`

).
The transform (\(10^7\) points to \(10^6\) modes) takes 0.4 seconds on a laptop.
The index is thus offset by `N/2`

(this is integer division in the odd case), so that frequency `k`

is output in
`F[N/2 + k]`

.
Here `+1`

sets the sign of \(i\) in the exponentials
(see definitions),
`1e-9`

requests 9-digit relative tolerance, and `ier`

is a status output
which is zero if successful (otherwise see error codes).

Note

FINUFFT works with a periodicity of \(2\pi\) for type 1 and 2 transforms; see definitions. For example, nonuniform points \(x=\pm\pi\) are equivalent. Points must lie in the input domain \([-3\pi,3\pi)\), which allows the user to assume a convenient periodic domain such as \([-\pi,\pi)\) or \([0,2\pi)\). To handle points outside of \([-3\pi,3\pi)\) the user must fold them back into this domain before passing to FINUFFT. FINUFFT does not handle this case, for speed reasons. To use a different periodicity, linearly rescale your coordinates.

If instead you want to change some options, first
put default values in a `finufft_opts`

struct,
make your changes, then pass the pointer to FINUFFT:

```
finufft_opts* opts = new finufft_opts;
finufft_default_opts(opts);
opts->debug = 1; // prints timing/debug info
int ier = finufft1d1(M,&x[0],&c[0],+1,tol,N,&F[0],opts);
```

Warning

Without the

`finufft_default_opts`

call, options may take on arbitrary values which may cause a crash.Note that, as of version 2.0,

`opts`

is passed as a pointer in both places.

See `examples/simple1d1.cpp`

for a simple full working demo of the above, including a test of the math. If you instead use single-precision arrays,
replace the tag `finufft`

by `finufftf`

in each command; see `examples/simple1d1f.cpp`

.

Then to compile on a linux/GCC system, linking to the double-precision static library, use eg:

```
g++ simple1d1.cpp -o simple1d1 -I$FINUFFT/include $FINUFFT/lib-static/libfinufft.a -fopenmp -lfftw3_omp -lfftw3 -lm
```

where `$FINUFFT`

denotes the absolute path of your FINUFFT installation.
Better is instead link to the dynamic shared (`.so`

) library, via eg:

```
g++ simple1d1.cpp -o simple1d1 -I$FINUFFT/include -L$FINUFFT/lib -lfinufft -lm
```

The `examples`

and `test`

directories are good places to see further
usage examples. The documentation for all 18 simple interfaces,
and the more flexible guru interface, follows below.

## Quick-start example in C¶

The FINUFFT C++ interface is intentionally also C-compatible, for simplity.
Thus, to use from C, the above example only needs to replace the C++
`vector`

with C-style array creation. Using C99 style, the
above code, with options setting, becomes:

```
#include <finufft.h>
#include <stdlib.h>
#include <complex.h>
int M = 1e7; // number of nonuniform points
double* x = (double *)malloc(sizeof(double)*M);
double complex* c = (double complex*)malloc(sizeof(double complex)*M);
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1); // uniform random in [-pi,pi)
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
int N = 1e6; // number of modes
double complex* F = (double complex*)malloc(sizeof(double complex)*N);
finufft_opts opts; // make an opts struct
finufft_default_opts(&opts); // set default opts (must do this)
opts.debug = 2; // more debug/timing to stdout
int ier = finufft1d1(M,x,c,+1,1e-9,N,F,&opts);
// (now do something with F here!...)
free(x); free(c); free(F);
```

See `examples/simple1d1c.c`

and `examples/simple1d1cf.c`

for
double- and single-precision C examples, including the math check to insure
the correct indexing of output modes.

## 2D example in C++¶

We assume Fortran-style contiguous multidimensional arrays, as opposed to C-style arrays of pointers; this allows the widest compatibility with other languages. Assuming the same headers as above, we first create points \((x_j,y_j)\) in the square \([-\pi,\pi)^2\), and strengths as before:

```
int M = 1e7; // number of nonuniform points
vector<double> x(M), y(M);
vector<complex<double> > c(M);
for (int j=0; j<M; ++j) {
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
y[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
}
```

Let’s say we want `N1=1000`

by `N2=2000`

2D Fourier coefficients.
We allocate and do the (default options) transform thus:

```
int N1=1000, N2=2000;
vector<complex<double> > F(N1*N2);
int ier = finufft2d1(M,&x[0],&y[0], &c[0], +1, 1e-6, N1, N2, &F[0], NULL);
```

This transform takes 0.6 seconds on a laptop.
The modes have increasing ordering
of integer frequency indices from `-N1/2`

up to `N1/2-1`

in the fast (`x`

) dimension,
then indices from `-N2/2`

up to `N2/2-1`

in the slow (`y`

) dimension
(since both `N1`

and `N2`

are even).
So, the output frequency `(k1,k2)`

is found in
`F[N1/2 + k1 + (N2/2 + k2)*N1]`

.

See `opts.modeord`

in Options
to instead use FFT-style mode ordering, which
simply differs by an “fftshift” (as it is commonly called).

See `examples/simple2d1.cpp`

for an example with a math check, to
insure that the mode indexing is correctly understood.

## Vectorized interface example¶

A common use case is to perform a stack of identical transforms with the
same size and nonuniform points, but for new strength vectors.
(Applications include interpolating vector-valued data, or processing
MRI images collected with a fixed set of k-space sample points.)
Because it amortizes sorting, FFTW planning, and FFTW plan lookup,
it can be faster to use a “vectorized”
interface (which does the entire stack in one call)
than to repeatedly call the above “simple” interfaces.
This is especially true for many small problems.
Here we show how to do a stack of `ntrans=10`

1D type 1 NUFFT transforms, in C++,
assuming the same headers as in the first example above.
The strength data vectors are taken to be contiguous (the whole
first vector, followed by the second, etc, rather than interleaved.)
Ie, viewed as a matrix in Fortran storage, each column is a strength vector.

```
int ntrans = 10; // how many transforms
int M = 1e7; // number of nonuniform points
vector<double> x(M);
vector<complex<double> > c(M*ntrans); // ntrans strength vectors
complex<double> I = complex<double>(0.0,1.0); // the imaginary unit
for (int j=0; j<M; ++j)
x[j] = M_PI*(2*((double)rand()/RAND_MAX)-1);
for (int j=0; j<M*ntrans; ++j) // fill all ntrans vectors...
c[j] = 2*((double)rand()/RAND_MAX)-1 + I*(2*((double)rand()/RAND_MAX)-1);
int N = 1e6; // number of output modes
vector<complex<double> > F(N*trans); // ntrans output vectors
int ier = finufft1d1(M,&x[0],&c[0],+1,1e-9,N,&F[0],NULL); // default opts
```

This takes 2.6 seconds on a laptop, around 1.4x faster than
making 10 separate “simple” calls.
The frequency index `k`

in transform number `t`

(zero-indexing the transforms) is in `F[k + (int)N/2 + N*t]`

.

See `examples/many1d1.cpp`

and `test/finufft?dmany_test.cpp`

for more examples.

## Guru interface example¶

If you want more flexibility than the above, use the “guru” interface: this is similar to that of FFTW3, and to the main interface of NFFT3. It lets you change the nonuniform points while keeping the same pointer to an FFTW plan for a particular number of stacked transforms with a certain number of modes. This avoids the overhead (typically 0.1 ms per thread) of FFTW checking for previous wisdom which would be significant when doing many small transforms. You may also send in a new set of stacked strength data (for type 1 and 3, or coefficients for type 2), reusing the existing FFTW plan and sorted points. Now we redo the above 2D type 1 C++ example with the guru interface.

One first makes a plan giving transform parameters, but no data:

```
// (assume x, y, c are filled, and F allocated, as in the 2D code above...)
int type=1, dim=2, ntrans=1;
int64_t Ns[] = {1000,2000}; // N1,N2 as 64-bit int array
// step 1: make a plan...
finufft_plan plan;
int ier = finufft_makeplan(type, dim, Ns, +1, ntrans, 1e-6, &plan, NULL);
// step 2: send in pointers to M nonuniform points (just x, y in this case)...
finufft_setpts(plan, M, &x[0], &y[0], NULL, 0, NULL, NULL, NULL);
// (user should not change x, y nonuniform point arrays here!)
// step 3: do the planned transform to the c strength data, output to F...
finufft_execute(plan, &c[0], &F[0]);
// ... you could now send in new points, and/or do transforms with new c data
// ...
// step 4: when done, free the memory used by the plan...
finufft_destroy(plan);
```

This writes the Fourier coefficients to `F`

just as in the earlier 2D example.
One difference from the above simple and vectorized interfaces
is that the `int64_t`

type (aka `long long int`

)
is needed since the Fourier coefficient dimensions are passed as an array.

Warning

You must not change the nonuniform point arrays (here `x`

, `y`

) between passing them to `finufft_setpts`

and performing `finufft_execute`

. The latter call expects these arrays to be unchanged. We chose this style of interface since it saves RAM and time (by avoiding unnecessary duplication), allowing the largest possible problems to be solved.

Warning

You must destroy a plan before making a new plan using the same plan object, otherwise a memory leak results.

The complete code with a math test is in `examples/guru2d1.cpp`

, and for
more examples see `examples/guru1d1*.c*`

## Thread safety and global state¶

It is possible to call FINUFFT from within multithreaded code, e.g. in an
OpenMP parallel block. In this case `opts.nthreads=1`

should be set,
and FINUFFT must have been compiled with the `-DFFTW_PLAN_SAFE`

flag,
making it thread-safe.
For demos of this, see

`examples/threadsafe1d1`

which runs a 1D type-1 separately on each thread, checking the math, and`examples/threadsafe2d2f`

which runs a 2D type-2 on each slice, which are parallelized over via an OpenMP parallel for loop (without any math check, just dummy inputs)

which are both built by `make examples`

if the above flag has been set.

Note: A design decision of FFTW is to have a global state which stores
wisdom and settings. Such global state can cause unforeseen effects on other routines that also use FFTW. In contrast, FINUFFT uses pointers to plans to store
its state, and does not have a global state (other than one `static`

flag used as a lock on FFTW initialization in the FINUFFT plan
stage). This means different FINUFFT calls should not affect each other,
although they may affect other codes that use FFTW via FFTW’s global state.

Alternatively it is possible to achieve thread safety without compiling with `-DFFTW_PLAN_SAFE`

. This can lead to higher performance.
This way to achieve thread safety is to use the guru interface and split initialization and transformation into two steps. The initialization step is not thread safe, but the transformation step is. The initialization step can be parallelized over threads, but the transformation step cannot. See `examples/threadsafe1d1`

for an example of this.
In particular:

```
finufft_makeplan(...args...) // not thread safe
```

All the other functions are thread safe. With OpenMP is possible to use:

```
#pragma omp critical
finufft_makeplan(...args...) // not thread safe
```

to achieve thread safety.