Programmer Guide/SPU Reference/ADASEG: Difference between revisions
From STX Wiki
Latest revision as of 10:21, 28 April 2011
ADASEG - RMS based automatic segmentation
Automatic segmentation using adaptive background amplitude estimation.
The segmentation algorithm works as follows. The input spectrum is truncated to the selected frequency range (fmin, fmax) and, if aw=1, the A-weighting function is applied. Then the long-time RMS value PL and the short-time RMS value PS are computed from the input spectra (see note 1). A segment (event) is detected if the short-time RMS PS is higher than PL+lsignal for a duration of at least tmin seconds. The segment time values and some parameters are stored in the output table.
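The detection rule above can be sketched in a few lines of Python. This is only an illustration, not the SPU's implementation: the function name `detect_segments`, the per-frame dB tracks `ps_db` and `pl_db`, and the convention that a segment ends at the first frame that falls back below the threshold are all assumptions.

```python
def detect_segments(ps_db, pl_db, dt, lsignal=10.0, tmin=3.0):
    """Return (begin, end, duration) tuples in seconds.

    ps_db, pl_db: per-frame short-time and long-time RMS levels in dB
    (hypothetical inputs; the SPU derives them from the spectra).
    dt: frame hop size in seconds.
    """
    segments = []
    start = None
    n = len(ps_db)
    # Iterate one step past the end so a run that reaches the last frame is closed.
    for i in range(n + 1):
        above = i < n and ps_db[i] > pl_db[i] + lsignal
        if above and start is None:
            start = i                      # run begins
        elif not above and start is not None:
            duration = (i - start) * dt    # run ends at frame i (assumed convention)
            if duration >= tmin:           # keep only runs of at least tmin seconds
                segments.append((start * dt, i * dt, duration))
            start = None
    return segments


# Two runs above PL+lsignal: 8 frames (4 s, kept) and 3 frames (1.5 s, dropped).
ps = [0, 0] + [20] * 8 + [0, 0] + [20] * 3 + [0, 0]
pl = [0] * len(ps)
print(detect_segments(ps, pl, dt=0.5))
```

With the default tmin of 3 seconds, only the first run is reported.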
Notes
- Measurement method for PL and PS: The RMS values computed from the preprocessed spectra are delayed and buffered for the specified time (tla for PL, tsa for PS). The value PL (PS) is set to the level where plrms (psrms) percent of the delayed RMS values are greater than PL (PS).
- Signal regions with a short-time RMS PS that is higher than PL+lpause are not included in the long-time level measurement.
- During the long-time measurement, a long-time spectrum amplitude AL is also computed. For each input spectrum included in the long-time measurement, the amplitude aL is set to the spectral amplitude where plamp percent of the spectrum amplitudes are greater than aL. The amplitude AL is the running average of all aL over the time tla. For example, this value can be used as an offset level (floor) for the computation of centroid and spread of the center-segment(s).
- The center-segment is the part of the segment where the RMS level is higher than p01-oamax. The center-segment boundaries (tbc and tec) are detected by scanning the RMS track backward and forward, starting at the maximum RMS value inside the segment.
- The RMS percentage levels pXX are computed for each channel over the whole RMS track of the segment. The value pXX is the RMS level where XX percent of the RMS values are higher than pXX.
- All computations of RMS values are performed on the squared energy track and then converted to dB.
- Stereo signal processing is performed if the input A2 is connected. For the segmentation algorithm, both channels are mixed, but the center-segment detection and parameter extraction are performed for each channel separately.
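The percentage levels and the center-segment scan described in the notes can be illustrated as follows. This is a minimal sketch under stated assumptions, not the SPU's code: the helper names `percent_level` and `center_segment` are hypothetical, the rounding of the rank index is a guess, and the returned boundaries are inclusive frame indices relative to the segment start.

```python
def percent_level(values, percent):
    """Level that roughly `percent` percent of the values exceed."""
    ordered = sorted(values, reverse=True)
    # Rank of the level in descending order; rounding down is an assumption.
    k = min(int(percent * len(ordered) / 100), len(ordered) - 1)
    return ordered[k]


def center_segment(rms_db, p01, oamax=10.0):
    """Scan backward and forward from the RMS maximum while the level
    stays above p01 - oamax; returns inclusive frame indices (begin, end)."""
    threshold = p01 - oamax
    peak = max(range(len(rms_db)), key=rms_db.__getitem__)
    begin = peak
    while begin > 0 and rms_db[begin - 1] > threshold:
        begin -= 1
    end = peak
    while end < len(rms_db) - 1 and rms_db[end + 1] > threshold:
        end += 1
    return begin, end


levels = list(range(1, 101))           # 1, 2, ..., 100 dB (synthetic data)
print(percent_level(levels, 95))       # level exceeded by 95 of the 100 values
print(center_segment([40, 52, 45, 58, 60, 57, 49, 55, 41], p01=60))
```

In the second call the frames at 52 dB and 55 dB lie above the threshold of 50 dB but are not part of the center-segment, because the scan stops at the first frame at or below the threshold on each side of the maximum.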
This SPAtom was developed for the NOIDESc project in 2006.
Usage:
ADASEG A1 [ A2 ] TABPAR TABDAT TABSEG
Inputs:
- A1
- The FFT amplitude spectrum (linear) of the 1st channel.
- A2
- The FFT amplitude spectrum (linear) of the 2nd channel.
- TABPAR
- The control table. This is an extended or parameter table with 1 numeric field and at least 2 rows.
Row/Column Index | Name | Description | Default Value |
[0,0] | dt | frame hopsize in seconds | no default |
[1,0] | df | FFT frequency resolution in Hz | no default |
[2,0] | aw | enable (1) or disable (0) spectral A weighting | 1 |
[3,0] | aref | reference amplitude (for dB conversions) | 20e-6 |
[4,0] | fmin | lower boundary of analysis band in Hz | 0 |
[5,0] | fmax | upper boundary of analysis band in Hz | 8000 |
[6,0] | tla | long-time average time in seconds | 60 |
[7,0] | plrms | percentage for long-time RMS in % | 95 |
[8,0] | plamp | percentage for long-time amplitude in % | 25 |
[9,0] | tsa | short-time average time in seconds | 1 |
[10,0] | psrms | percentage for short-time RMS in % | 95 |
[11,0] | lsignal | signal offset level in dB | 10 |
[12,0] | lpause | pause offset level in dB | 6 |
[13,0] | tmin | minimum segment duration in seconds | 3 |
[14,0] | oamax | offset level for segment center in dB | 10 |
- The following conditions apply:
- df > 0, 0 ≤ fmin < fmax < nA*df (nA = length of A1, A2)
- dt > 0, tsa ≥ 20*dt, tla ≥ 10*tsa, tmin ≥ 2*tsa
- 1 ≤ plrms ≤ 99, 1 ≤ psrms ≤ 99
- lpause ≥ 3, lsignal ≥ lpause, oamax ≥ 3
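Before filling TABPAR it can help to check a parameter set against these conditions. The sketch below assumes that the comparisons read "at least" and "at most" (for example tsa at least 20*dt, plrms between 1 and 99), which is consistent with the default values in the table. The names `DEFAULTS` and `violated_conditions` are hypothetical and not part of STX.

```python
# Default values taken from the TABPAR table; dt and df have no default.
DEFAULTS = {
    "aw": 1, "aref": 20e-6, "fmin": 0.0, "fmax": 8000.0,
    "tla": 60.0, "plrms": 95.0, "plamp": 25.0,
    "tsa": 1.0, "psrms": 95.0,
    "lsignal": 10.0, "lpause": 6.0, "tmin": 3.0, "oamax": 10.0,
}


def violated_conditions(dt, df, n_bins, **overrides):
    """Return the list of conditions (as strings) that the parameters violate.

    n_bins is nA, the length of the input spectra A1 and A2.
    The direction of each inequality is an assumption (see the lead-in).
    """
    p = {**DEFAULTS, **overrides, "dt": dt, "df": df}
    checks = [
        ("df > 0", p["df"] > 0),
        ("0 <= fmin < fmax < nA*df", 0 <= p["fmin"] < p["fmax"] < n_bins * p["df"]),
        ("dt > 0", p["dt"] > 0),
        ("tsa >= 20*dt", p["tsa"] >= 20 * p["dt"]),
        ("tla >= 10*tsa", p["tla"] >= 10 * p["tsa"]),
        ("tmin >= 2*tsa", p["tmin"] >= 2 * p["tsa"]),
        ("1 <= plrms <= 99", 1 <= p["plrms"] <= 99),
        ("1 <= psrms <= 99", 1 <= p["psrms"] <= 99),
        ("lpause >= 3", p["lpause"] >= 3),
        ("lsignal >= lpause", p["lsignal"] >= p["lpause"]),
        ("oamax >= 3", p["oamax"] >= 3),
    ]
    return [name for name, ok in checks if not ok]


print(violated_conditions(dt=0.01, df=10.0, n_bins=1025))   # defaults pass
print(violated_conditions(dt=0.1, df=10.0, n_bins=1025))    # tsa too short for dt
```

Returning the violated conditions as a list, rather than raising on the first one, makes it easier to report all problems of a control table at once.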
- TABSEG
- The segment output table. This is an extended or parameter table with 10 numeric fields and must be empty on initialization.
- General segment parameters for the i-th segment:
[i,0] | tb | segment begin (frame index) in seconds |
[i,1] | te | segment end time in seconds |
[i,2] | tl | segment duration in seconds |
[i,3] | pl | long time RMS in dB |
[i,4] | al | long time spectral cut-off amplitude in dB |
- Center-segment parameters for channel 1 (always computed):
(for each channel ch = 1..n)
[i,5] | tbc | begin time in seconds |
[i,6] | tbe | end time in seconds |
[i,7] | tcl | duration in seconds |
[i,8] | toc | offset to segment begin (tbc-tb) |
[i,9] | p95 | RMS level reached by 95% of the segment frames (dB). See note 5. |
[i,10] | p05 | RMS level reached by 5% of the segment frames (dB). |
[i,11] | p01 | RMS level reached by 1% of the segment frames (dB). |
[i,12] | pamax | The average energy level of the center-segment (dB). |
[i,13] | peq | The average energy level for the whole segment (dB). |
- Center-segment parameters for channel 2 (computed only if A2 is connected).
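The average energy levels pamax and peq are described as averages of energy that are converted to dB afterwards, which differs from averaging dB values. A small sketch of that order of operations follows. It assumes that the frame energy is a squared amplitude and that levels are referenced to aref squared; the function names are hypothetical.

```python
import math


def energy_to_db(energy, aref=20e-6):
    """Convert an energy (squared amplitude) to dB relative to aref**2 (assumed reference)."""
    return 10.0 * math.log10(energy / (aref * aref))


def average_level_db(energy_track, aref=20e-6):
    """Average on the energy scale first, then convert the mean to dB."""
    mean_energy = sum(energy_track) / len(energy_track)
    return energy_to_db(mean_energy, aref)


# Two frames at 10 dB and 30 dB (aref = 1 for readability):
track = [10.0, 1000.0]
print(average_level_db(track, aref=1.0))   # energy average, dominated by the louder frame
print(sum(energy_to_db(e, 1.0) for e in track) / len(track))   # plain dB average, for comparison
```

Applied to all frames of a segment this would correspond to peq, and applied to the frames of the center-segment only it would correspond to pamax.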