org.apfloat.internal

## Class FloatTwoPassFNTStrategy

• All Implemented Interfaces:
ParallelNTTStrategy, NTTStrategy

public class FloatTwoPassFNTStrategy
extends FloatParallelFNTStrategy
Fast Number Theoretic Transform that uses a "two-pass" algorithm to calculate a very long transform on data that resides on a mass storage device. The storage medium should preferably be a solid state disk for good performance; on normal hard disks performance is usually inadequate.

The "two-pass" algorithm only needs to do two passes through the data set. In comparison, a basic FFT algorithm of length 2^n needs to do n passes through the data set. Although the algorithm is close to optimal in terms of the amount of data transferred between mass storage and main memory, the mass storage access is not linear but done in small, noncontiguous pieces, so disk seek times can make the performance quite poor.

When the data to be transformed is considered to be an n1 x n2 matrix of data, instead of a linear array, the two passes go as follows:

1. Do n2 transforms of length n1 by transforming the matrix columns. Do this by fetching n1 x b blocks into memory so that the blocks are as large as possible but still fit in main memory.
2. Then do n1 transforms of length n2 by transforming the matrix rows. Do this by fetching b x n2 blocks into memory so that the blocks just fit in the available memory.

The algorithm requires reading blocks of b elements from the mass storage device. The less memory there is relative to the transform length, the smaller b becomes. Reading very short blocks of data from hard disks can be prohibitively slow.
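The relationship between transform length, available memory and b can be sketched as follows. This is only an illustration under stated assumptions: the class `TwoPassPlan`, its fields and the choice n1 = 2^(floor(log2(length) / 2)) are hypothetical and not taken from the apfloat source, and memory is measured in elements rather than bytes.

```java
/** Hypothetical helper: picks the n1 x n2 split and the block widths b for both passes. */
final class TwoPassPlan {
    final long n1;          // number of matrix rows (length of each column transform)
    final long n2;          // number of matrix columns (length of each row transform)
    final long columnBlock; // b in pass 1: columns fetched together as an n1 x b block
    final long rowBlock;    // b in pass 2: rows fetched together as a b x n2 block

    private TwoPassPlan(long n1, long n2, long columnBlock, long rowBlock) {
        this.n1 = n1;
        this.n2 = n2;
        this.columnBlock = columnBlock;
        this.rowBlock = rowBlock;
    }

    static TwoPassPlan of(long length, long maxElementsInMemory) {
        if (length < 4 || Long.bitCount(length) != 1) {
            throw new IllegalArgumentException("length must be a power of two >= 4");
        }
        int log2 = Long.numberOfTrailingZeros(length);
        long n1 = 1L << (log2 / 2);   // assumed split: n1 <= n2
        long n2 = length / n1;
        if (maxElementsInMemory < n2) {
            throw new IllegalArgumentException("memory must hold at least one row and one column");
        }
        // Largest power-of-two b such that an n1 x b (or b x n2) block fits in memory.
        long columnBlock = Math.min(n2, Long.highestOneBit(maxElementsInMemory / n1));
        long rowBlock = Math.min(n1, Long.highestOneBit(maxElementsInMemory / n2));
        return new TwoPassPlan(n1, n2, columnBlock, rowBlock);
    }

    public static void main(String[] args) {
        TwoPassPlan plan = TwoPassPlan.of(1L << 30, 1L << 20);
        System.out.println(plan.n1 + " x " + plan.n2
                + ", b = " + plan.columnBlock + " (columns), " + plan.rowBlock + " (rows)");
    }
}
```

With 2^30 elements and room for 2^20 elements in memory, this sketch yields a 32768 x 32768 matrix with b = 32 in both passes, which shows how quickly the contiguous reads shrink as the transform outgrows memory.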

When reading the column data to be transformed, the data can be transposed to rows by reading the b-length blocks into the proper locations in memory and then transposing the resulting b x b blocks.
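A minimal in-memory sketch of this read-and-transpose step is shown below. It assumes a row-major n1 x n2 matrix held in a plain `int[]` and requires b to divide n1. The class and method names are hypothetical; the library's own entry point for this is the `DataStorage.getTransposedArray(int,int,int,int)` method referenced on this page, whose implementation is not reproduced here.

```java
/** Hypothetical sketch: fetch b columns of a row-major matrix so that each column becomes a row. */
final class ColumnBlockTranspose {

    /** Returns a b x n1 row-major array whose row r holds matrix column (startColumn + r). */
    static int[] readColumnsAsRows(int[] matrix, int n1, int n2, int startColumn, int b) {
        if (n1 % b != 0 || startColumn + b > n2) {
            throw new IllegalArgumentException("b must divide n1 and the block must fit in the matrix");
        }
        int[] block = new int[b * n1];
        // Step 1: copy each b-length piece of matrix row i into square sub-block k = i / b,
        // at row j = i % b of that sub-block.
        for (int i = 0; i < n1; i++) {
            int k = i / b;
            int j = i % b;
            System.arraycopy(matrix, i * n2 + startColumn, block, j * n1 + k * b, b);
        }
        // Step 2: transpose every b x b sub-block in place.
        for (int k = 0; k < n1 / b; k++) {
            transposeSquare(block, k * b, n1, b);
        }
        return block;
    }

    /** In-place transpose of the b x b square starting at offset in an array with the given row stride. */
    static void transposeSquare(int[] data, int offset, int stride, int b) {
        for (int r = 0; r < b; r++) {
            for (int c = r + 1; c < b; c++) {
                int p = offset + r * stride + c;
                int q = offset + c * stride + r;
                int tmp = data[p];
                data[p] = data[q];
                data[q] = tmp;
            }
        }
    }
}
```

After the call, element `block[r * n1 + i]` equals `matrix[i * n2 + startColumn + r]`, so each column to be transformed lies contiguously in memory.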

In a convolution algorithm the data elements can remain in any order after the transform, as long as the inverse transform can transform them back. The convolution's element-by-element multiplication does not depend on the order of the elements.
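The order-independence can be checked with a toy example. The sketch below is not the library's transform: it uses a naive length-4 number theoretic transform modulo 17 (with 4 as a primitive 4th root of unity), stores the forward output in a scrambled order, and lets the inverse read it back in that same order. All names and constants are chosen purely for illustration.

```java
import java.util.Arrays;

/** Toy cyclic convolution whose forward transform leaves its output in scrambled order. */
final class ScrambledConvolution {
    static final int P = 17;                  // small prime modulus (illustrative)
    static final int W = 4;                   // 4 has multiplicative order 4 modulo 17
    static final int N = 4;                   // transform length
    static final int[] ORDER = {0, 2, 1, 3};  // output element k is stored at index ORDER[k]

    static int pow(int base, int exp) {
        int result = 1;
        for (int i = 0; i < exp; i++) {
            result = result * base % P;
        }
        return result;
    }

    static int[] forward(int[] x) {
        int[] out = new int[N];
        for (int k = 0; k < N; k++) {
            int sum = 0;
            for (int j = 0; j < N; j++) {
                sum = (sum + x[j] * pow(W, j * k % N)) % P;
            }
            out[ORDER[k]] = sum;              // scrambled placement
        }
        return out;
    }

    static int[] inverse(int[] y) {
        int invW = pow(W, N - 1);             // W^-1, because W^N = 1
        int invN = pow(N, P - 2);             // N^-1 by Fermat's little theorem
        int[] out = new int[N];
        for (int j = 0; j < N; j++) {
            int sum = 0;
            for (int k = 0; k < N; k++) {
                sum = (sum + y[ORDER[k]] * pow(invW, j * k % N)) % P;
            }
            out[j] = sum * invN % P;          // divide by the transform length
        }
        return out;
    }

    static int[] convolve(int[] a, int[] b) {
        int[] fa = forward(a);
        int[] fb = forward(b);
        int[] product = new int[N];
        for (int i = 0; i < N; i++) {
            product[i] = fa[i] * fb[i] % P;   // element-by-element, order does not matter
        }
        return inverse(product);
    }

    static int[] directCyclic(int[] a, int[] b) {
        int[] out = new int[N];
        for (int i = 0; i < N; i++) {
            for (int j = 0; j < N; j++) {
                out[(i + j) % N] = (out[(i + j) % N] + a[i] * b[j]) % P;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4};
        int[] b = {5, 6, 7, 8};
        System.out.println(Arrays.equals(convolve(a, b), directCyclic(a, b)));
    }
}
```

Because both operands are scrambled identically, the pointwise products line up, and the inverse only needs to know the same ordering to restore the result.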

This algorithm is parallelized so that the row transforms are done in parallel using multiple threads if the number of processors reported by ApfloatContext.getNumberOfProcessors() is greater than one.

This transform uses the maximum amount of memory available, as retrieved from ApfloatContext.getMaxMemoryBlockSize(). All memory access is synchronized on the shared memory lock retrieved from ApfloatContext.getSharedMemoryLock().
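The threading and locking pattern described above might look roughly like the following standard-library sketch. It does not call apfloat: the `numberOfProcessors` and `sharedMemoryLock` parameters stand in for the values the strategy would obtain from `ApfloatContext`, and `RowTransform` is a hypothetical callback in place of the real row transform.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: transform the rows of an in-memory block, in parallel when possible. */
final class ParallelRows {

    /** Stand-in for a per-row transform operating on data[offset .. offset + length - 1]. */
    interface RowTransform {
        void apply(int[] data, int offset, int length);
    }

    static void transformRows(int[] block, int rows, int rowLength,
                              int numberOfProcessors, Object sharedMemoryLock,
                              RowTransform transform) {
        // The large memory block is only touched while holding the shared lock.
        synchronized (sharedMemoryLock) {
            if (numberOfProcessors <= 1) {
                for (int r = 0; r < rows; r++) {
                    transform.apply(block, r * rowLength, rowLength);
                }
                return;
            }
            ExecutorService pool = Executors.newFixedThreadPool(numberOfProcessors);
            try {
                List<Future<?>> pending = new ArrayList<>();
                for (int r = 0; r < rows; r++) {
                    final int offset = r * rowLength;   // rows are disjoint, so no data races
                    pending.add(pool.submit(() -> transform.apply(block, offset, rowLength)));
                }
                for (Future<?> future : pending) {
                    future.get();                       // wait for completion, surface failures
                }
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException("Row transform failed", e);
            } finally {
                pool.shutdown();
            }
        }
    }
}
```

Rows are independent of each other, which is why they can be handed to separate threads without further coordination once the block is in memory.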

Version:
1.5.1
Author:
Mikko Tommila
See Also:
DataStorage.getTransposedArray(int,int,int,int)
• ### Constructor Detail

• #### FloatTwoPassFNTStrategy

public FloatTwoPassFNTStrategy()
Default constructor.
• ### Method Detail

• #### transform

public void transform(DataStorage dataStorage,
int modulus)
throws ApfloatRuntimeException
Description copied from interface: NTTStrategy
Perform a forward transform on the data.

Multiple moduli can be used, if the convolution algorithm uses the Chinese Remainder Theorem to calculate the final result.

Specified by:
transform in interface NTTStrategy
Overrides:
transform in class FloatTableFNTStrategy
Parameters:
dataStorage - The data to be transformed.
modulus - Index of the modulus to use (in case the transform supports multiple moduli).
Throws:
ApfloatRuntimeException
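To illustrate why several moduli are useful, the sketch below combines residues from three separate moduli into one value with the Chinese Remainder Theorem. It is a generic `BigInteger` illustration, not apfloat's implementation, and the primes in `main` are small hypothetical examples rather than the library's actual moduli.

```java
import java.math.BigInteger;

/** Generic Chinese Remainder Theorem combination for pairwise coprime moduli. */
final class CrtCombine {

    /** Returns the unique x in [0, product of moduli) with x mod moduli[i] == residues[i]. */
    static BigInteger combine(long[] residues, long[] moduli) {
        BigInteger product = BigInteger.ONE;
        for (long m : moduli) {
            product = product.multiply(BigInteger.valueOf(m));
        }
        BigInteger result = BigInteger.ZERO;
        for (int i = 0; i < moduli.length; i++) {
            BigInteger m = BigInteger.valueOf(moduli[i]);
            BigInteger partial = product.divide(m);      // product of the other moduli
            BigInteger inverse = partial.modInverse(m);  // exists because moduli are coprime
            result = result.add(BigInteger.valueOf(residues[i]).multiply(partial).multiply(inverse));
        }
        return result.mod(product);
    }

    public static void main(String[] args) {
        long[] moduli = {97, 193, 257};                  // hypothetical small primes
        long value = 1234567;                            // smaller than 97 * 193 * 257
        long[] residues = {value % 97, value % 193, value % 257};
        System.out.println(combine(residues, moduli));   // reconstructs the original value
    }
}
```

In a convolution, each modulus would yield one residue per result element, and a combination step of this kind recovers the full-size element from them.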
• #### inverseTransform

public void inverseTransform(DataStorage dataStorage,
int modulus,
long totalTransformLength)
throws ApfloatRuntimeException
Description copied from interface: NTTStrategy
Perform an inverse transform on the data.

Multiple moduli can be used, if the convolution algorithm uses the Chinese Remainder Theorem to calculate the final result.

Specified by:
inverseTransform in interface NTTStrategy
Overrides:
inverseTransform in class FloatTableFNTStrategy
Parameters:
dataStorage - The data to be transformed.
modulus - Index of the modulus to use (in case the transform supports multiple moduli).
totalTransformLength - Total transform length; the final result elements are divided by this value.
Throws:
ApfloatRuntimeException