stretch(1) Sound Data Time Stretching stretch(1)

stretch - time stretching tool for sound data

stretch [options] [input [output]]

The stretch utility lengthens or shortens the duration of an audio data stream without affecting the pitch of the sampled data. It employs phase vocoding to achieve the effect, which usually leads to better results than time-domain techniques. Input audio data can consist of multiple channels, in which case the output is multichannel as well.

Both input and output can be '-', indicating that signal data is to be read from stdin or written to stdout, respectively (you can also omit the file names for the same effect, starting with output). In either case, the sample format is signed 16-bit little-endian (aka CD format).
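When generating audio programmatically for stdin, the byte layout matters. The following is a minimal sketch of packing samples as signed 16-bit little-endian data with Python's struct module. It assumes channels are interleaved frame by frame, as is usual for raw PCM; the helper name pack_s16le is hypothetical and not part of the tool.

```python
import struct

def pack_s16le(frames):
    """Pack frames (tuples of per-channel integers) as raw PCM bytes.

    Assumption: samples are interleaved channel by channel within each frame.
    """
    out = bytearray()
    for frame in frames:
        for sample in frame:
            # '<h' is a little-endian signed 16-bit integer;
            # struct.error is raised if the value does not fit.
            out += struct.pack("<h", sample)
    return bytes(out)

# Two stereo frames: (left, right)
raw = pack_s16le([(0, -1), (32767, -32768)])
```

Each stereo frame occupies four bytes, so the two frames above yield eight bytes of raw data.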

For physical file access, the utility makes use of the sndfile library. The tool tries to guess the intended output file format from the filename extension; in addition, you can specify the output bit depth (quantization) with the -b (--bits) parameter.

The tool is not capable of directly replacing the contents of the input file.

The simplest invocation of the tool is:

stretch -t 1.5 infile.wav outfile.wav

This example stretches infile.wav to 150 % of its original duration and writes the result to outfile.wav.

sets the ratio by which the duration of the sound signal is scaled. Extreme values can cause audible artefacts in the output signal, usually perceived as cyclic amplitude modulation. To achieve an exact time expansion/compression ratio, the product of this and the -o (--overlap) parameter needs to be an integer value.
sets the linear amplification factor for the output signal. The default value is 1.0.
makes the utility be silent about what it does.
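The integer-product condition for an exact stretch ratio can be checked before running the tool. The sketch below uses exact rational arithmetic so that decimal ratios such as 1.2 are not distorted by binary floating point. The function name is hypothetical, and the ratio is passed as a string for that reason.

```python
from fractions import Fraction

def is_exact_ratio(ratio, overlap):
    """Return True if ratio * overlap is an integer.

    ratio   -- stretch ratio as a decimal string, e.g. "1.2"
    overlap -- overlap value in samples, as an integer
    """
    product = Fraction(ratio) * overlap
    return product.denominator == 1

print(is_exact_ratio("1.5", 128))  # 1.5 * 128 = 192
print(is_exact_ratio("1.2", 128))  # 1.2 * 128 = 153.6
print(is_exact_ratio("1.2", 125))  # 1.2 * 125 = 150
```

With the default overlap of 128 samples, a ratio of 1.5 satisfies the condition while 1.2 does not; an overlap of 125 would make 1.2 exact.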

sets the length of the DFT (discrete Fourier transform), which should be a power of two for best results. Odd values may cause the utility to throw up. Larger DFT sizes imply better frequency resolution at the cost of increased computational demand. The default value is 1024 samples.
controls the interval at which recomposition of the signal takes place. Smaller values imply a smoother result in exchange for increased computational demand. The default value is 128 samples.
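The trade-offs between these two parameters can be put into rough numbers. The sketch below relies on general DFT properties only and assumes that the overlap value acts as the hop between successive DFT frames; the source describes it only as the recomposition interval, so treat that reading as an assumption. All function names are hypothetical.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ...; such sizes are recommended for the DFT length."""
    return n > 0 and (n & (n - 1)) == 0

def bin_spacing_hz(sample_rate, dft_size):
    """Frequency distance between adjacent DFT bins."""
    return sample_rate / dft_size

def frames_per_sample(dft_size, hop):
    """How many frames cover each sample, assuming `hop` is the frame spacing."""
    return dft_size / hop

# Defaults from this page: DFT length 1024, interval 128, 44.1 kHz input
print(is_power_of_two(1024))
print(bin_spacing_hz(44100, 1024))   # about 43 Hz per bin
print(frames_per_sample(1024, 128))
```

Doubling the DFT length halves the bin spacing, which is the finer frequency resolution mentioned above, while each transform becomes more expensive to compute.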

If signal input is from stdin, the default assumption about the data format (other than the aforementioned 16-bit signed little-endian convention) is two-channel audio at 44.1 kHz sample rate. The following options override these:

no surprises: 1 is mono, 2 stereo etc.
sets the sample rate.
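Because raw stdin data carries no header, its duration follows purely from the byte count and the assumed format. A minimal sketch of that arithmetic, with a hypothetical helper name and the defaults stated above:

```python
BYTES_PER_SAMPLE = 2  # signed 16-bit samples

def raw_duration_seconds(n_bytes, channels=2, rate=44100):
    """Duration of a headerless 16-bit PCM stream of n_bytes bytes."""
    bytes_per_frame = channels * BYTES_PER_SAMPLE
    return n_bytes / (bytes_per_frame * rate)

# One second of stereo audio at 44.1 kHz is 176400 bytes.
print(raw_duration_seconds(176400))
# The same bytes interpreted as mono last twice as long.
print(raw_duration_seconds(176400, channels=1))
```

This also shows why overriding the defaults matters: if the channel count or sample rate is wrong, the same bytes are interpreted as audio of a different length and speed.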

sets the output bit depth (only for physical file output). Valid values are 8, 16 or 24 for signed integers and 32 or 64 for floating point. The default is 16 bits. (Internal precision is 32-bit floating point.)
this option causes the utility to clip the output signal to stay within the output quantization or the [-1,+1] range if output is in floating-point format. The use of this flag is highly recommended when output is in integer sample format (--bits less than 32 or stdout output).
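To illustrate what clipping to the output quantization means for 16-bit output, here is a minimal sketch. It is not the tool's actual code; the scaling factor of 32767 and the function name are assumptions made for the example.

```python
INT16_MIN = -32768
INT16_MAX = 32767

def clip_to_int16(x):
    """Scale a float sample (nominal range [-1, +1]) and clamp it to 16 bits."""
    scaled = round(x * INT16_MAX)  # assumed scaling convention
    return max(INT16_MIN, min(INT16_MAX, scaled))

print(clip_to_int16(0.0))   # in range, unchanged
print(clip_to_int16(1.5))   # too loud, clamped to the maximum
print(clip_to_int16(-2.0))  # too loud, clamped to the minimum
```

Values outside the representable range are held at the nearest limit instead of being converted as they are, which is the behaviour the option description refers to.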

prints version information and a list of supported file formats/extensions, then quits.
prints version information, then quits.

A few example invocations of the utility, all assuming source data is in CD format (signed 16-bit samples, stereo, 44.1 kHz):

stretches cdda.wav to exactly 120 %, piping the output to the aplay program for immediate playback.

plays back an ogg stream, accelerated to 80 % duration. If you prefer mp3, an equivalent incantation is:

Lousy command line parsing and error feedback. Lots of combinations of output file formats and quantization untested, as is the operation on big-endian systems. Phase vocoding chews a lot of CPU cycles.

The original phase vocoder implementation was done by Mark Dolson at UCSD as part of the CARL suite; it was skillfully converted to streaming operation by Richard Dobson.

The stretch tool and this manual page were written by Tim Goetze <tim@quitte.de>.

The latest version of this tool can be obtained from http://quitte.de/dsp/pvoc.html .

March 25, 2004