CMSIS-DSP  Version 1.1.0
CMSIS DSP Software Library
Finite Impulse Response (FIR) Sparse Filters

Functions

void arm_fir_sparse_f32 (arm_fir_sparse_instance_f32 *S, float32_t *pSrc, float32_t *pDst, float32_t *pScratchIn, uint32_t blockSize)
 Processing function for the floating-point sparse FIR filter.
void arm_fir_sparse_init_f32 (arm_fir_sparse_instance_f32 *S, uint16_t numTaps, float32_t *pCoeffs, float32_t *pState, int32_t *pTapDelay, uint16_t maxDelay, uint32_t blockSize)
 Initialization function for the floating-point sparse FIR filter.
void arm_fir_sparse_init_q15 (arm_fir_sparse_instance_q15 *S, uint16_t numTaps, q15_t *pCoeffs, q15_t *pState, int32_t *pTapDelay, uint16_t maxDelay, uint32_t blockSize)
 Initialization function for the Q15 sparse FIR filter.
void arm_fir_sparse_init_q31 (arm_fir_sparse_instance_q31 *S, uint16_t numTaps, q31_t *pCoeffs, q31_t *pState, int32_t *pTapDelay, uint16_t maxDelay, uint32_t blockSize)
 Initialization function for the Q31 sparse FIR filter.
void arm_fir_sparse_init_q7 (arm_fir_sparse_instance_q7 *S, uint16_t numTaps, q7_t *pCoeffs, q7_t *pState, int32_t *pTapDelay, uint16_t maxDelay, uint32_t blockSize)
 Initialization function for the Q7 sparse FIR filter.
void arm_fir_sparse_q15 (arm_fir_sparse_instance_q15 *S, q15_t *pSrc, q15_t *pDst, q15_t *pScratchIn, q31_t *pScratchOut, uint32_t blockSize)
 Processing function for the Q15 sparse FIR filter.
void arm_fir_sparse_q31 (arm_fir_sparse_instance_q31 *S, q31_t *pSrc, q31_t *pDst, q31_t *pScratchIn, uint32_t blockSize)
 Processing function for the Q31 sparse FIR filter.
void arm_fir_sparse_q7 (arm_fir_sparse_instance_q7 *S, q7_t *pSrc, q7_t *pDst, q7_t *pScratchIn, q31_t *pScratchOut, uint32_t blockSize)
 Processing function for the Q7 sparse FIR filter.

Description

This group of functions implements sparse FIR filters. Sparse FIR filters are equivalent to standard FIR filters except that most of the coefficients are equal to zero. Sparse filters are used for simulating reflections in communications and audio applications.

There are separate functions for Q7, Q15, Q31, and floating-point data types. The functions operate on blocks of input and output data, and each call processes blockSize samples through the filter. pSrc and pDst point to the input and output arrays, respectively, each containing blockSize values.

Algorithm:
The sparse filter instance structure contains an array of tap indices, pTapDelay, which specifies the locations of the nonzero coefficients. This is in addition to the coefficient array b. The implementation skips the multiplications by zero, which leads to an efficient realization.
   
     y[n] = b[0] * x[n-pTapDelay[0]] + b[1] * x[n-pTapDelay[1]] + b[2] * x[n-pTapDelay[2]] + ...+ b[numTaps-1] * x[n-pTapDelay[numTaps-1]]    
 
Figure: Sparse FIR filter. b[n] represents the filter coefficients.
pCoeffs points to a coefficient array of size numTaps; pTapDelay points to an array of nonzero indices and is also of size numTaps; pState points to a state array of size maxDelay + blockSize, where maxDelay is the largest offset value that is ever used in the pTapDelay array. Some of the processing functions also require temporary working buffers.
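The difference equation above can be evaluated directly, visiting only the nonzero taps. The following is a minimal reference sketch in portable C, not the library's implementation: the function name sparse_fir_ref is hypothetical, the whole input signal is assumed to be available in one array, and samples before x[0] are assumed to be zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference routine (not part of CMSIS-DSP).
 * Computes y[n] = sum over k of coeffs[k] * x[n - tapDelay[k]].
 * Delays are assumed to be non-negative; samples before x[0] count as zero. */
static void sparse_fir_ref(const float *coeffs, const int32_t *tapDelay,
                           uint16_t numTaps, const float *x, float *y,
                           size_t length)
{
    for (size_t n = 0; n < length; n++) {
        float acc = 0.0f;
        for (uint16_t k = 0; k < numTaps; k++) {
            size_t d = (size_t)tapDelay[k];
            if (d <= n) {                 /* skip taps that reach before x[0] */
                acc += coeffs[k] * x[n - d];
            }
        }
        y[n] = acc;
    }
}
```

The inner loop runs numTaps times per output sample regardless of how large the delays are, which is where the saving over a dense FIR filter of length maxDelay + 1 comes from.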
Instance Structure
The coefficients and state variables for a filter are stored together in an instance data structure. A separate instance structure must be defined for each filter. Coefficient and offset arrays may be shared among several instances while state variable arrays cannot be shared. There are separate instance structure declarations for each of the 4 supported data types.
Initialization Functions
There is also an associated initialization function for each data type. The initialization function performs the following operations:
  • Sets the values of the internal structure fields.
  • Zeros out the values in the state buffer.
Use of the initialization function is optional. However, if the initialization function is used, then the instance structure cannot be placed into a const data section. To place an instance structure into a const data section, the instance structure must be manually initialized. Set the values in the state buffer to zeros before static initialization. The code below statically initializes each of the 4 different data type filter instance structures:
    
arm_fir_sparse_instance_f32 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};    
arm_fir_sparse_instance_q31 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};    
arm_fir_sparse_instance_q15 S = {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};    
arm_fir_sparse_instance_q7 S =  {numTaps, 0, pState, pCoeffs, maxDelay, pTapDelay};    
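As a more concrete illustration of the static initializers above, the sketch below fills in the arrays for a three-tap floating-point filter. So that it compiles without arm_math.h, it uses a stand-in structure, sparse_instance_f32_sketch, whose field order is assumed from the initializer list above (the second field is assumed to be stateIndex). The sizes, coefficients, and delays are illustrative values only.

```c
#include <stdint.h>

#define NUM_TAPS   3u
#define MAX_DELAY  40u
#define BLOCK_SIZE 32u

/* Stand-in for arm_fir_sparse_instance_f32 (assumed field order). */
typedef struct {
    uint16_t numTaps;     /* number of nonzero coefficients */
    uint16_t stateIndex;  /* starts at 0 */
    float   *pState;      /* maxDelay + blockSize elements */
    float   *pCoeffs;     /* numTaps elements */
    uint16_t maxDelay;    /* largest value in pTapDelay */
    int32_t *pTapDelay;   /* numTaps elements */
} sparse_instance_f32_sketch;

static float   coeffs[NUM_TAPS]   = {1.0f, 0.5f, 0.25f};
static int32_t tapDelay[NUM_TAPS] = {0, 17, 40};

/* Objects with static storage duration are zero-initialized in C,
 * so the state buffer starts out cleared without an explicit loop. */
static float state[MAX_DELAY + BLOCK_SIZE];

static sparse_instance_f32_sketch S = {
    NUM_TAPS, 0, state, coeffs, MAX_DELAY, tapDelay
};
```

With the real library, the stand-in type would be replaced by arm_fir_sparse_instance_f32 and float by float32_t.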
 
Fixed-Point Behavior
Care must be taken when using the fixed-point versions of the sparse FIR filter functions. In particular, the overflow and saturation behavior of the accumulator used in each function must be considered. Refer to the function specific documentation below for usage guidelines.

Function Documentation

void arm_fir_sparse_f32 ( arm_fir_sparse_instance_f32 * S,
float32_t * pSrc,
float32_t * pDst,
float32_t * pScratchIn,
uint32_t  blockSize 
)
Parameters:
[in] *S points to an instance of the floating-point sparse FIR structure.
[in] *pSrc points to the block of input data.
[out] *pDst points to the block of output data.
[in] *pScratchIn points to a temporary buffer of size blockSize.
[in] blockSize number of input samples to process per call.
Returns:
none.

References arm_circularRead_f32(), arm_circularWrite_f32(), blockSize, arm_fir_sparse_instance_f32::maxDelay, arm_fir_sparse_instance_f32::numTaps, arm_fir_sparse_instance_f32::pCoeffs, arm_fir_sparse_instance_f32::pState, arm_fir_sparse_instance_f32::pTapDelay, and arm_fir_sparse_instance_f32::stateIndex.

void arm_fir_sparse_init_f32 ( arm_fir_sparse_instance_f32 * S,
uint16_t  numTaps,
float32_t * pCoeffs,
float32_t * pState,
int32_t *  pTapDelay,
uint16_t  maxDelay,
uint32_t  blockSize 
)
Parameters:
[in,out] *S points to an instance of the floating-point sparse FIR structure.
[in] numTaps number of nonzero coefficients in the filter.
[in] *pCoeffs points to the array of filter coefficients.
[in] *pState points to the state buffer.
[in] *pTapDelay points to the array of offset times.
[in] maxDelay maximum offset time supported.
[in] blockSize number of samples that will be processed per block.
Returns:
none

Description:

pCoeffs holds the filter coefficients and has length numTaps. pState holds the filter's state variables and must be of length maxDelay + blockSize, where maxDelay is the maximum number of delay line values. blockSize is the number of samples processed by the arm_fir_sparse_f32() function.
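To see why the state buffer needs maxDelay + blockSize elements, consider the block-based sketch below. It keeps the most recent maxDelay input samples at the front of the buffer and appends the current block behind them, so every delayed sample a tap can reach is in the buffer. This is an illustrative technique only: it uses a linear buffer that is shifted after each block, whereas the reference lists for the library functions mention circular read and write helpers. The function name sparse_fir_block_sketch is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical block-processing sketch (not the CMSIS-DSP implementation).
 * state must hold maxDelay + blockSize floats and be zeroed before first use.
 * Every tapDelay[k] is assumed to lie in the range 0..maxDelay. */
static void sparse_fir_block_sketch(float *state, uint16_t maxDelay,
                                    const float *coeffs,
                                    const int32_t *tapDelay, uint16_t numTaps,
                                    const float *src, float *dst,
                                    uint32_t blockSize)
{
    /* state[0 .. maxDelay-1] is the history, oldest first; append the block. */
    memcpy(state + maxDelay, src, blockSize * sizeof(float));

    for (uint32_t n = 0; n < blockSize; n++) {
        float acc = 0.0f;
        for (uint16_t k = 0; k < numTaps; k++) {
            /* Current sample n sits at index maxDelay + n. */
            acc += coeffs[k] * state[maxDelay + n - (uint32_t)tapDelay[k]];
        }
        dst[n] = acc;
    }

    /* Keep the newest maxDelay samples as history for the next call.
     * Source and destination can overlap, hence memmove. */
    memmove(state, state + blockSize, maxDelay * sizeof(float));
}
```

Calling the function repeatedly on consecutive blocks gives the same output as filtering the whole signal at once, because the history carried in the state buffer bridges the block boundaries.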

References arm_fir_sparse_instance_f32::maxDelay, arm_fir_sparse_instance_f32::numTaps, arm_fir_sparse_instance_f32::pCoeffs, arm_fir_sparse_instance_f32::pState, arm_fir_sparse_instance_f32::pTapDelay, and arm_fir_sparse_instance_f32::stateIndex.

void arm_fir_sparse_init_q15 ( arm_fir_sparse_instance_q15 * S,
uint16_t  numTaps,
q15_t * pCoeffs,
q15_t * pState,
int32_t *  pTapDelay,
uint16_t  maxDelay,
uint32_t  blockSize 
)
Parameters:
[in,out] *S points to an instance of the Q15 sparse FIR structure.
[in] numTaps number of nonzero coefficients in the filter.
[in] *pCoeffs points to the array of filter coefficients.
[in] *pState points to the state buffer.
[in] *pTapDelay points to the array of offset times.
[in] maxDelay maximum offset time supported.
[in] blockSize number of samples that will be processed per block.
Returns:
none

Description:

pCoeffs holds the filter coefficients and has length numTaps. pState holds the filter's state variables and must be of length maxDelay + blockSize, where maxDelay is the maximum number of delay line values. blockSize is the number of samples processed by the arm_fir_sparse_q15() function.

References arm_fir_sparse_instance_q15::maxDelay, arm_fir_sparse_instance_q15::numTaps, arm_fir_sparse_instance_q15::pCoeffs, arm_fir_sparse_instance_q15::pState, arm_fir_sparse_instance_q15::pTapDelay, and arm_fir_sparse_instance_q15::stateIndex.

void arm_fir_sparse_init_q31 ( arm_fir_sparse_instance_q31 * S,
uint16_t  numTaps,
q31_t * pCoeffs,
q31_t * pState,
int32_t *  pTapDelay,
uint16_t  maxDelay,
uint32_t  blockSize 
)
Parameters:
[in,out] *S points to an instance of the Q31 sparse FIR structure.
[in] numTaps number of nonzero coefficients in the filter.
[in] *pCoeffs points to the array of filter coefficients.
[in] *pState points to the state buffer.
[in] *pTapDelay points to the array of offset times.
[in] maxDelay maximum offset time supported.
[in] blockSize number of samples that will be processed per block.
Returns:
none

Description:

pCoeffs holds the filter coefficients and has length numTaps. pState holds the filter's state variables and must be of length maxDelay + blockSize, where maxDelay is the maximum number of delay line values. blockSize is the number of samples processed by the arm_fir_sparse_q31() function.

References arm_fir_sparse_instance_q31::maxDelay, arm_fir_sparse_instance_q31::numTaps, arm_fir_sparse_instance_q31::pCoeffs, arm_fir_sparse_instance_q31::pState, arm_fir_sparse_instance_q31::pTapDelay, and arm_fir_sparse_instance_q31::stateIndex.

void arm_fir_sparse_init_q7 ( arm_fir_sparse_instance_q7 * S,
uint16_t  numTaps,
q7_t * pCoeffs,
q7_t * pState,
int32_t *  pTapDelay,
uint16_t  maxDelay,
uint32_t  blockSize 
)
Parameters:
[in,out] *S points to an instance of the Q7 sparse FIR structure.
[in] numTaps number of nonzero coefficients in the filter.
[in] *pCoeffs points to the array of filter coefficients.
[in] *pState points to the state buffer.
[in] *pTapDelay points to the array of offset times.
[in] maxDelay maximum offset time supported.
[in] blockSize number of samples that will be processed per block.
Returns:
none

Description:

pCoeffs holds the filter coefficients and has length numTaps. pState holds the filter's state variables and must be of length maxDelay + blockSize, where maxDelay is the maximum number of delay line values. blockSize is the number of samples processed by the arm_fir_sparse_q7() function.

References arm_fir_sparse_instance_q7::maxDelay, arm_fir_sparse_instance_q7::numTaps, arm_fir_sparse_instance_q7::pCoeffs, arm_fir_sparse_instance_q7::pState, arm_fir_sparse_instance_q7::pTapDelay, and arm_fir_sparse_instance_q7::stateIndex.

void arm_fir_sparse_q15 ( arm_fir_sparse_instance_q15 * S,
q15_t * pSrc,
q15_t * pDst,
q15_t * pScratchIn,
q31_t * pScratchOut,
uint32_t  blockSize 
)
Parameters:
[in] *S points to an instance of the Q15 sparse FIR structure.
[in] *pSrc points to the block of input data.
[out] *pDst points to the block of output data.
[in] *pScratchIn points to a temporary buffer of size blockSize.
[in] *pScratchOut points to a temporary buffer of size blockSize.
[in] blockSize number of input samples to process per call.
Returns:
none.

Scaling and Overflow Behavior:

The function is implemented using an internal 32-bit accumulator. The 1.15 x 1.15 multiplications yield a 2.30 result and these are added to a 2.30 accumulator. Thus the full precision of the multiplications is maintained but there is only a single guard bit in the accumulator. If the accumulator result overflows it will wrap around rather than saturate. After all multiply-accumulates are performed, the 2.30 accumulator is truncated to 2.15 format and then saturated to 1.15 format. In order to avoid overflows the input signal or coefficients must be scaled down by log2(numTaps) bits.
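The scaling rule and the final conversion step can be sketched in portable C as follows. This is an illustration under stated assumptions, not library code: the helper names are hypothetical, log2(numTaps) is rounded up to a whole number of bits when numTaps is not a power of two, and the truncation from 2.30 to 2.15 is modeled as an arithmetic right shift.

```c
#include <stdint.h>

/* Smallest s with (1 << s) >= numTaps, i.e. log2(numTaps) rounded up. */
static uint32_t headroom_bits(uint16_t numTaps)
{
    uint32_t s = 0;
    while (((uint32_t)1 << s) < numTaps) {
        s++;
    }
    return s;
}

/* Scale Q15 coefficients down in place by 'bits' bits.
 * Division is used so the result does not depend on how the compiler
 * shifts negative values; it rounds toward zero. */
static void scale_coeffs_q15(int16_t *coeffs, uint16_t numTaps, uint32_t bits)
{
    for (uint16_t k = 0; k < numTaps; k++) {
        coeffs[k] = (int16_t)(coeffs[k] / (int32_t)((uint32_t)1 << bits));
    }
}

/* Convert a 2.30 accumulator value to 1.15: drop 15 fraction bits (2.15),
 * then saturate to the 16-bit range. Assumes >> on a negative int32_t is
 * an arithmetic shift, which is implementation-defined in C. */
static int16_t q15_from_acc_2_30(int32_t acc)
{
    int32_t v = acc >> 15;
    if (v > 32767) {
        v = 32767;
    }
    if (v < -32768) {
        v = -32768;
    }
    return (int16_t)v;
}
```

For example, a filter with 5 nonzero taps would need 3 bits of headroom under this rounding, so either the coefficients or the input would be scaled down by a factor of 8.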

References __SIMD32, arm_circularRead_q15(), arm_circularWrite_q15(), blockSize, arm_fir_sparse_instance_q15::maxDelay, arm_fir_sparse_instance_q15::numTaps, arm_fir_sparse_instance_q15::pCoeffs, arm_fir_sparse_instance_q15::pState, arm_fir_sparse_instance_q15::pTapDelay, and arm_fir_sparse_instance_q15::stateIndex.

void arm_fir_sparse_q31 ( arm_fir_sparse_instance_q31 * S,
q31_t * pSrc,
q31_t * pDst,
q31_t * pScratchIn,
uint32_t  blockSize 
)
Parameters:
[in] *S points to an instance of the Q31 sparse FIR structure.
[in] *pSrc points to the block of input data.
[out] *pDst points to the block of output data.
[in] *pScratchIn points to a temporary buffer of size blockSize.
[in] blockSize number of input samples to process per call.
Returns:
none.

Scaling and Overflow Behavior:

The function is implemented using an internal 32-bit accumulator. The 1.31 x 1.31 multiplications are truncated to 2.30 format. This leads to loss of precision on the intermediate multiplications and provides only a single guard bit. If the accumulator result overflows, it wraps around rather than saturating. In order to avoid overflows, the input signal or coefficients must be scaled down by log2(numTaps) bits.
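The truncated multiplication and the wrap-around accumulation can be illustrated with the sketch below. A 1.31 x 1.31 product occupies 2.62 format in 64 bits; keeping only the upper 32 bits leaves 2.30. The helper names are hypothetical and the code is an illustration of the arithmetic described above, not the library source. It assumes an arithmetic right shift for negative 64-bit values and two's-complement conversion from uint32_t to int32_t, both of which are implementation-defined in C.

```c
#include <stdint.h>

/* 1.31 x 1.31 -> 2.62 in 64 bits; keep the upper 32 bits -> 2.30.
 * The 32 discarded low bits are the precision loss mentioned in the text. */
static int32_t mult_q31_to_2_30(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* 32-bit accumulation that wraps on overflow instead of saturating.
 * The addition is done in unsigned arithmetic to avoid undefined behavior. */
static int32_t acc_add_wrap(int32_t acc, int32_t product)
{
    return (int32_t)((uint32_t)acc + (uint32_t)product);
}
```

With a single guard bit, two products of magnitude 1.0 in 2.30 format are already enough to wrap the accumulator, which is why the text recommends scaling the input or the coefficients.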

References arm_circularRead_f32(), arm_circularWrite_f32(), blockSize, arm_fir_sparse_instance_q31::maxDelay, arm_fir_sparse_instance_q31::numTaps, arm_fir_sparse_instance_q31::pCoeffs, arm_fir_sparse_instance_q31::pState, arm_fir_sparse_instance_q31::pTapDelay, and arm_fir_sparse_instance_q31::stateIndex.

void arm_fir_sparse_q7 ( arm_fir_sparse_instance_q7 * S,
q7_t * pSrc,
q7_t * pDst,
q7_t * pScratchIn,
q31_t * pScratchOut,
uint32_t  blockSize 
)
Parameters:
[in] *S points to an instance of the Q7 sparse FIR structure.
[in] *pSrc points to the block of input data.
[out] *pDst points to the block of output data.
[in] *pScratchIn points to a temporary buffer of size blockSize.
[in] *pScratchOut points to a temporary buffer of size blockSize.
[in] blockSize number of input samples to process per call.
Returns:
none.

Scaling and Overflow Behavior:

The function is implemented using a 32-bit internal accumulator. Both coefficients and state variables are represented in 1.7 format and multiplications yield a 2.14 result. The 2.14 intermediate results are accumulated in a 32-bit accumulator in 18.14 format. There is no risk of internal overflow with this approach and the full precision of intermediate multiplications is preserved. The accumulator is then converted to 18.7 format by discarding the low 7 bits. Finally, the result is truncated to 1.7 format.
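The accumulator arithmetic for one output sample can be sketched as below. Each 1.7 x 1.7 product has magnitude at most 2^14, and numTaps is a uint16_t, so the sum stays below 2^30 in magnitude and fits in 32 bits. The function name is hypothetical, the array x is assumed to already hold the delayed samples x[n - pTapDelay[k]], and the final clamp to the 8-bit range is an assumption about how the 18.7 value is reduced to 1.7; the text above only states that the result is truncated.

```c
#include <stdint.h>

/* Hypothetical sketch of one Q7 output sample (not library code).
 * x[k] is assumed to hold the already delayed input sample for tap k. */
static int8_t sparse_mac_q7_sketch(const int8_t *x, const int8_t *coeffs,
                                   uint16_t numTaps)
{
    int32_t acc = 0;                      /* 18.14 accumulator */
    for (uint16_t k = 0; k < numTaps; k++) {
        acc += (int32_t)x[k] * (int32_t)coeffs[k];   /* 1.7 x 1.7 -> 2.14 */
    }

    /* Discard the low 7 bits: 18.14 -> 18.7. Assumes an arithmetic
     * right shift for negative values (implementation-defined in C). */
    int32_t v = acc >> 7;

    /* Assumption: reduce to 1.7 by clamping to the int8_t range. */
    if (v > 127) {
        v = 127;
    }
    if (v < -128) {
        v = -128;
    }
    return (int8_t)v;
}
```

For instance, two taps with samples and coefficients all equal to 0.5 (64 in Q7) accumulate 8192 in 18.14 format, which becomes 64, or 0.5, after the shift.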

References __PACKq7, __SIMD32, arm_circularRead_q7(), arm_circularWrite_q7(), blockSize, arm_fir_sparse_instance_q7::maxDelay, arm_fir_sparse_instance_q7::numTaps, arm_fir_sparse_instance_q7::pCoeffs, arm_fir_sparse_instance_q7::pState, arm_fir_sparse_instance_q7::pTapDelay, and arm_fir_sparse_instance_q7::stateIndex.