c++ - 3D FFT Using Intel MKL with Zero Padding -
i want compute 3d fft
using intel mkl
of array has about 300×200×200
elements. 3d array stored 1d array of type double
in columnwise fashion:
for( int k = 0; k < nk; k++ ) // loop through height. for( int j = 0; j < nj; j++ ) // loop through rows. for( int = 0; < ni; i++ ) // loop through columns. { ijk = + ni * j + ni * nj * k; my3darray[ ijk ] = 1.0; }
i want perform not-in-place
fft on input array , prevent getting modified (i need use later in code) , backward computation in-place
. want have 0 padding.
my questions are:
- how can perform zero-padding?
- how should deal size of arrays used
fft
functions when 0 padding included in computation? - how can take out 0 padded results , actual result?
here attempt problem, absolutely thankful comment, suggestion, or hint.
#include <stdio.h> #include "mkl.h" int max(int a, int b, int c) { int m = a; (m < b) && (m = b); (m < c) && (m = c); return m; } void fft3d_r2c( // real complex 3d fft. double *in, int nrowsin , int ncolsin , int nheightsin , double *out ) { int n = max( nrowsin , ncolsin , nheightsin ); // round next highest power of 2. unsigned int n = (unsigned int) n; // compute next highest power of 2 of 32-bit n. n--; n |= n >> 1; n |= n >> 2; n |= n >> 4; n |= n >> 8; n |= n >> 16; n++; /* strides describe data layout in real , conjugate-even domain. */ mkl_long rs[4], cs[4]; // dfti descriptor. dfti_descriptor_handle fft_desc = 0; // variables needed out-of-place computations. mkl_complex16 *in_fft = new mkl_complex16 [ n*n*n ]; mkl_complex16 *out_fft = new mkl_complex16 [ n*n*n ]; double *out_zeropadded = new double [ n*n*n ]; /* compute strides */ rs[3] = 1; cs[3] = 1; rs[2] = (n/2+1)*2; cs[2] = (n/2+1); rs[1] = n*(n/2+1)*2; cs[1] = n*(n/2+1); rs[0] = 0; cs[0] = 0; // create dfti descriptor. mkl_long sizes[] = { n, n, n }; dfticreatedescriptor( &fft_desc, dfti_double, dfti_real, 3, sizes ); // configure dfti descriptor. dftisetvalue( fft_desc, dfti_conjugate_even_storage, dfti_complex_complex ); dftisetvalue( fft_desc, dfti_placement, dfti_not_inplace ); // out-of-place transformation. dftisetvalue( fft_desc, dfti_input_strides , rs ); dftisetvalue( fft_desc, dfti_output_strides , cs ); dfticommitdescriptor( fft_desc ); dfticomputeforward ( fft_desc, in , in_fft ); // change strides compute backward transform. dftisetvalue ( fft_desc, dfti_input_strides , cs); dftisetvalue ( fft_desc, dfti_output_strides, rs); dfticommitdescriptor( fft_desc ); dfticomputebackward ( fft_desc, out_fft, out_zeropadded ); // printing 0 padded 3d fft result. for( long long = 0; < (long long)n*n*n; i++ ) printf("%f\n", out_zeropadded[i] ); /* don't know how take out 0 padded results , save actual result in variable named "out" */ dftifreedescriptor ( &fft_desc ); delete[] in_fft; delete[] out_zeropadded ; } int main() { int n = 10; double *a = new double [n*n*n]; // array real. double *afft = new double [n*n*n]; // fill array 'real' numbers. for( int = 0; < n*n*n; i++ ) a[ ] = 1.0; // calculate fft. fft3d_r2c( a, n, n, n, afft ); printf("fft results:\n"); for( int = 0; < n*n*n; i++ ) printf( "%15.8f\n", afft[i] ); delete[] a; delete[] afft; return 0; }
just few hints:
power of 2 size
- i don't way computing size
- so let
nx,ny,nz
size of input matrix and
nx,ny,nz
size of padded matrixfor (nx=1;nx<nx;nx<<=1); (ny=1;ny<ny;ny<<=1); (nz=1;nz<nz;nz<<=1);
now 0 pad memset 0 first , copy matrix lines
- padding
n^3
instead ofnx*ny*nz
can result in big slowdowns - if
nx,ny,nz
not close each other
output complex
- if right
a
input real matrix - and
afft
output complex matrix - so why not allocate space correctly?
double *afft = new double [2*nx*ny*nz];
- complex number real+imaginary part 2 values per number
- that goes final print of result
- and
"\r\n"
after lines viewing
- if right
3d dfft
- i not use nor know dfft library
- i use mine own, anyway 3d dfft can done 1d dfft
- if lines ... see 2d dfct 1d dfft
- in 3d same need add 1 pass , different normalization constant
- this way can have single line buffer
double lin[2*max(nx,ny,nz)];
- and make 0 padding on run (so no need have bigger matrix in memory)...
- but involves coping lines on each 1d dfft ...
Comments
Post a Comment