pycudafft

FFT library for PyCuda
Download

pycudafft Ranking & Summary

Advertisement

  • Rating:
  • License:
  • MIT/X Consortium Lic...
  • Price:
  • FREE
  • Publisher Name:
  • Bogdan Opanchuk
  • Publisher web site:
  • http://github.com/Manticore

pycudafft Tags


pycudafft Description

FFT library for PyCuda pycudafft is an FFT library for PyCuda, written in Python.Developer commentsAt the moment of writing this, PyCuda experienced problems with CUFFT caused by nVidia's architectural decisions. As a result, this piece of software was written. It contains implementation of batched FFT, ported from Apple's OpenCL implementation. OpenCL's ideology of constructing kernel code on the fly maps perfectly on PyCuda, and variety of Python's templating engines makes code generation simpler. I used mako templating engine, simply because of the personal preference. The code can be easily changed to use any other engine.In addition, project repository contains wrapper, which adds batch support for CUFFT and its mapping to PyCuda. They were added for testing purposes and will not be actively supported.Quick StartThe usage is quite simple. First, import pycudafft and numpy: >>> import pycuda.autoinit >>> import pycuda.gpuarray as gpuarray >>> from pycudafft import FFTPlan >>> import numpyThen the plan must be created. The creation is not very fast, mainly because of the compilation speed. But, fortunately, PyCuda caches compiled sources, so if you use the same plan for each run of your program, it will be created pretty fast. >>> plan = FFTPlan(16, 16)Now, let's prepare simple test array and try to execute plan over it: >>> data = numpy.ones((16, 16), dtype=numpy.complex64) >>> gpu_data = gpuarray.to_gpu(data) >>> plan.execute(gpu_data) >>> result = gpu_data.get() >>> print result # doctest: +ELLIPSIS ... ]As expected, we got array with the first non-zero element, equal to array size. Let's now perform the inverse transform. >>> plan.execute(gpu_data, inverse=True) >>> result = gpu_data.get()Since data is non-integer, we cannot simply compare it. We will just calculate error instead. >>> error = numpy.sum(numpy.abs(data) - numpy.abs(result)) / 256 >>> error < 1e-6 TrueThat's good enough for single precision numbers.ReferenceFFTPlanClass, containing precalculated FFT plan.Arguments: FFTPlan(x, y=None, z=None, split=False, dtype=numpy.complex64)x, y, z: Problem size. If z and/or y are not defined, 2D or 1D plan will be created. Warning: 2D and 3D plans with y == 1 or z == 1 are not supported at the moment.split: If True, the plan will support separate data arrays with real and imaginary parts instead of interleaved arrays.dtype: Data type to use. At the moment only complex64 (single precision) is supported.FFTPlan.execute()Execute plan for interleaved data arrays.Arguments: execute(data_in, data_out=None, inverse=False, batch=1)data_in: Input array.data_out: Output array. If not defined, the execution will be performed in-place and the results will be stored in data_in.inverse: If True, inverse transform will be performed.batch: Number of data sets to process. They should be located successively in data_in.FFTPlan.executeSplit()Execute plan for split data arrays.Arguments: executeSplit(data_in_re, data_in_im, data_out_re=None, data_out_im=None, inverse=False, batch=1)data_in_re, data_in_im: Input arrays with real and imaginary data parts correspondingly.data_out_re, data_out_im: Output arrays. If not defined, the execution will be performed in-place and the results will be stored in data_in_re and data_in_im.inverse: If True, inverse transform will be performed.batch: Number of data sets to process. They should be located successively in data_in. Requirements: · Python


pycudafft Related Software