I just discovered that the version of mpg123 on the latest Red Hat, which is what I use to decode mpegs into .wav, is actually not mpg123 but a link to mpg321, which silently ignores the options I give it telling it to downsample by 2 and convert to mono. So all the pfiles (and hence the .htk files) are twice as big as they should be.
The annoying thing is that I'm not quite sure whether the anchor nets were trained on these files or on files that were correctly downsampled. It looks OK: the nets are dated July 31, 2002, which means they were trained at NEC. I looked there, and those pfiles (dated from that July as well) are OK, i.e. half the size of the ones I have now on blush.
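A quick sanity check along these lines would probably have caught this earlier. It is only a rough sketch: the function names are made up for illustration, and the 44100 Hz source rate is an assumption about the original mpegs, not something I have verified. It resolves what the `mpg123` command really points at, then reads the header of a decoded .wav to see whether it came out mono at half the source rate.

```python
import os
import shutil
import wave


def resolve_decoder(name="mpg123"):
    """Return the real path behind a command name, or None if it is not on PATH."""
    found = shutil.which(name)
    return os.path.realpath(found) if found else None


def wav_format(path):
    """Return (channels, sample_rate) as stored in the WAV header."""
    with wave.open(path, "rb") as w:
        return w.getnchannels(), w.getframerate()


def correctly_decoded(path, source_rate=44100):
    """True if the file is mono at half the ASSUMED source sample rate."""
    channels, rate = wav_format(path)
    return channels == 1 and rate == source_rate // 2
```

If `resolve_decoder()` returns a path ending in `mpg321`, or `correctly_decoded()` is False for a freshly decoded file, the downsample and mono options were most likely ignored.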
Posted by madadam at March 27, 2003 02:02 AM

a: I finally got a great mp3 decoder with easy source (the others were too optimized or convoluted to adapt): libmad, an integer decoder. There's a simple hook you can use to run stuff against the PCM buffers. You can also get at the coeffs in the .mp3 chunks.