DOCTYPE
Although the flexibility of XML is a great advantage, it is often also necessary to define exactly how an XML document can be structured. A data definition language serves this purpose: there are tools for validating HTML against a DOCTYPE and XML against an XML Schema. STx has its own tool, which performs basic validation and is also called a DOCTYPE.
You can define an STx DOCTYPE using the SET xmlfile DOCTYPE command. Once an XML shell file has a DOCTYPE associated with it, operations which are invalid according to the DOCTYPE will in most cases fail. If you need to validate a complete document, you can use the command SET xmlfile VALIDATE.
In order to guarantee data integrity in the STx DataSet, a DOCTYPE has been defined in the stxconfig.xml file and is associated with the DataSet when it is loaded.
The STXDataSet DOCTYPE
STx includes a DOCTYPE for the segment metadata and project XML structure. It is defined in the stxconfig.xml file and associated with the project when the project is opened in STx. Here is a summary of the defined elements and their relationship to each other.
AFile : ASet
The AFile element is associated with a sound file. It is derived from the ASet element. In addition to the ASet attributes, it has the following attributes:
File - the absolute path to the sound file.
ASet
The ASet element is short for 'audio set'. It is derived from the Set element. In addition to the Set attributes, it has the following attributes:
SR - the sampling rate.
CH - the number of channels.
In addition to the elements a Set may contain, an ASet may contain one or more of the following elements:
ASeg elements
An ASet may not be contained within an ASet. The ASet itself is not directly used in the DataSet. Rather, its derived elements (e.g. AFile and ASequence) are used.
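The containment rule above can be illustrated with a short Python sketch. It is not part of STx. It assumes that the derived elements appear in the XML under their own tag names (AFile, ASequence) and that the rule applies to them as well as to ASet itself. The sample fragment, its file paths and the find_nested_asets helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Tag names treated as ASet or ASet-derived (an assumption based on the text).
ASET_TAGS = {"ASet", "AFile", "ASequence"}

def find_nested_asets(root):
    """Return (outer_tag, inner_tag) pairs where an ASet-like element
    is contained, at any depth, within another ASet-like element."""
    violations = []

    def walk(elem, enclosing):
        is_aset = elem.tag in ASET_TAGS
        if is_aset and enclosing is not None:
            violations.append((enclosing, elem.tag))
        for child in elem:
            # Children of an ASet-like element see it as their enclosing set.
            walk(child, elem.tag if is_aset else enclosing)

    walk(root, None)
    return violations

# Hypothetical fragment; paths and attribute values are illustrative only.
SAMPLE = """
<Set>
  <AFile File="/data/a.wav" SR="44100" CH="2">
    <ASeg ID="s1" P="0" L="44100" CH="1"/>
    <AFile File="/data/b.wav" SR="44100" CH="1"/>
  </AFile>
</Set>
"""

violations = find_nested_asets(ET.fromstring(SAMPLE))
print(violations)
```

Here the second AFile is nested inside the first, so the sketch reports one violation. In STx itself, such a document would be rejected by the DOCTYPE rather than by external code.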
ASeg
The ASeg element is short for 'audio segment' and stores the data specifying and describing an audio segment. The ASeg element has the following attributes:
ID - a string uniquely identifying the segment within the parent element.
P - an integer specifying the beginning of the segment in samples, as an offset from the beginning of the file.
L - an integer specifying the length of the segment in samples.
CH - an integer specifying which channel this segment addresses.
The following elements may be contained in an ASeg element:
APar elements
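Because P and L are given in samples, converting them to seconds requires the sampling rate. The following Python sketch shows that arithmetic under some assumptions: the ASeg elements are direct children of an AFile, the SR attribute is given in Hz, and attribute values are plain decimal numbers. The fragment, the file path and the segment_times helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the path and all values are illustrative only.
SAMPLE = """
<AFile File="/data/a.wav" SR="44100" CH="2">
  <ASeg ID="word1" P="22050" L="44100" CH="1"/>
  <ASeg ID="word2" P="88200" L="11025" CH="2"/>
</AFile>
"""

def segment_times(afile):
    """Map each segment ID to (start, length) in seconds.

    P and L are sample counts, so dividing by the sampling rate
    of the enclosing element gives seconds.
    """
    sr = float(afile.get("SR"))
    times = {}
    for seg in afile.findall("ASeg"):
        start_s = int(seg.get("P")) / sr
        length_s = int(seg.get("L")) / sr
        times[seg.get("ID")] = (start_s, length_s)
    return times

times = segment_times(ET.fromstring(SAMPLE))
for seg_id, (start_s, length_s) in times.items():
    print(f"{seg_id}: starts at {start_s} s, lasts {length_s} s")
```

With a sampling rate of 44100 Hz, a segment with P="22050" and L="44100" starts at 0.5 s and lasts 1.0 s.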
APar
The APar element contains parameters calculated for a specific audio segment. An APar must be the child of an ASeg element. The APar element has the following attributes:
type - the type of method. The following types exist:
- F0 - Fundamental Frequency
- RMS - Signal Energy
- Phase - Phase Spectrum
- ASpec - Amplitude Spectrum
- Fof - Formant Frequencies
- Cep - Cepstrum
- LAT - Log attack time
- CLPC - LPC coefficient
- TFA - Amplitude Spectrogram
- TFP - Phase Spectrogram
method - the method. The following methods exist:
- FO - Formants
- RMSB - Frequency Band Energy
- RMS - Signal Energy (RMS)
- F0A - Fundamental Frequency (SIFT)
- F0B - Fundamental Frequency (autocorr.)
- F0H - Fundamental Frequency (harmonic grid)
- ALPHA - LPC - Error Energy
- IRRSP - Irrelevance Amplitude Spectrum
- IRRTH - Irrelevance Threshold
- FFT - FFT Amplitude Spectrum
- FFTP - FFT Phase Spectrum
- CEPST - Cepstrum Smoothed Spectrum
- LOFAR - Spectrum - Cepstrum
- WAVE - Wavelet Amplitude Spectrum
- WAVEP - Wavelet Phase Spectrum
- STFT - FFT Amplitude Spectrogram
- STFTP - FFT Phase Spectrogram
- STWT - Wavelet Amplitude Spectrogram
- STWTP - Wavelet Phase Spectrogram
- LPC - LPC Smoothed Spectrum
- LPAI - LPC Inverse Filter Coefficients
- LPRC - LPC Reflection Coefficients
- LPAR - LPC Area Coefficients
nx - the number of x-axis values.
ny - the number of y data vectors.
CH - the channel this parameter was calculated for.
lfrm - the frame length in milliseconds.
sfrm - the frame shift in milliseconds.
nfrm - the number of frames.
date - the date the parameter was saved.
time - the time the parameter was saved.
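As a rough illustration of how these attributes fit together, the Python sketch below checks the type and method attributes of an APar element against the values listed above and estimates the time span covered by its frames. The span formula, (nfrm - 1) * sfrm + lfrm, is an assumption about how frames are laid out and is not taken from the STx documentation. The sample elements and the check_apar and frame_span_ms helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Values copied from the type and method lists above.
APAR_TYPES = {"F0", "RMS", "Phase", "ASpec", "Fof", "Cep", "LAT",
              "CLPC", "TFA", "TFP"}
APAR_METHODS = {"FO", "RMSB", "RMS", "F0A", "F0B", "F0H", "ALPHA",
                "IRRSP", "IRRTH", "FFT", "FFTP", "CEPST", "LOFAR",
                "WAVE", "WAVEP", "STFT", "STFTP", "STWT", "STWTP",
                "LPC", "LPAI", "LPRC", "LPAR"}

def check_apar(apar):
    """Return a list of problems found in the type and method attributes."""
    problems = []
    if apar.get("type") not in APAR_TYPES:
        problems.append(f"unknown type: {apar.get('type')}")
    if apar.get("method") not in APAR_METHODS:
        problems.append(f"unknown method: {apar.get('method')}")
    return problems

def frame_span_ms(apar):
    """Estimate the time covered by all frames, in milliseconds.

    Assumes consecutive frames start sfrm ms apart and each lasts lfrm ms.
    """
    lfrm = float(apar.get("lfrm"))
    sfrm = float(apar.get("sfrm"))
    nfrm = int(apar.get("nfrm"))
    return (nfrm - 1) * sfrm + lfrm

# Hypothetical elements; all attribute values are illustrative only.
good = ET.fromstring(
    '<APar type="F0" method="F0A" CH="1" lfrm="40" sfrm="10" nfrm="101"/>')
bad = ET.fromstring(
    '<APar type="Pitch" method="F0A" CH="1" lfrm="40" sfrm="10" nfrm="101"/>')

print(check_apar(good), frame_span_ms(good))
print(check_apar(bad))
```

Under the stated assumption, 101 frames of 40 ms shifted by 10 ms cover 1040 ms. The second element is flagged because "Pitch" is not among the listed types.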