sox/git/sox.1

10 '\" Replacement em-dash for nroff (default is too short).
34 SoX \- Sound eXchange, the Swiss Army knife of audio manipulation
37 \fBsox\fR [\fIglobal-options\fR] [\fIformat-options\fR] \fIinfile1\fR
38 	[[\fIformat-options\fR] \fIinfile2\fR] ... [\fIformat-options\fR] \fIoutfile\fR
39 	[\fIeffect\fR [\fIeffect-options\fR]] ...
41 \fBplay\fR [\fIglobal-options\fR] [\fIformat-options\fR] \fIinfile1\fR
42 	[[\fIformat-options\fR] \fIinfile2\fR] ... [\fIformat-options\fR]
43 	[\fIeffect\fR [\fIeffect-options\fR]] ...
45 \fBrec\fR [\fIglobal-options\fR] [\fIformat-options\fR] \fIoutfile\fR
46 	[\fIeffect\fR [\fIeffect-options\fR]] ...
53 purpose audio player or a multi-track audio recorder. It also has
70 SoX is a command-line audio processing tool, particularly suited to making
100    sox recital.au \-b 16 recital.wav channels 1 rate 16k fade 3 norm
103 (down-mix to one channel, sample rate change, fade-in, normalize),
104 and stores the result at a bit-depth of 16.
106    sox \-r 16k \-e signed \-b 8 \-c 1 voice-memo.raw voice-memo.wav
108 converts `raw' (a.k.a. `headerless') audio to a self-describing file format,
118    sox \-m music.mp3 voice.wav mixed.flac
126    play \-n \-c1 synth sin %\-12 sin %\-9 sin %\-5 sin %\-2 fade h 0.1 1 0.1
128 plays a synthesised `A minor seventh' chord with a pipe-organ sound,
130    rec \-c 2 radio.aiff trim 0 30:00
134    play \-q take1.aiff & rec \-M take1.aiff take1\-dub.aiff
137 records a new track in a multi-track recording.  Finally,
140    rec \-r 44100 \-b 16 \-e signed-integer \-p \\
142 	sox \-p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \\
146 audio files at points with 2 seconds of silence.  Also, it does not start
157 SoX can work with `self-describing' and `raw' audio files.
158 `Self-describing' formats (e.g. WAV, FLAC, MP3) have a header that
176 The number of bits used to store each sample.  Today, 16-bit is
177 commonly used. 8-bit was popular in the early days of computer
178 audio. 24-bit is used in the professional audio arena. Other sizes are
183 encodings have variants with different byte-orderings or bit-orderings.
186 parameters and the number of samples would imply.  Commonly-used
187 encoding types include floating-point, \(*m-law, ADPCM, signed-integer
195 The term `bit-rate' is a measure of the amount of storage occupied by an
197 above and is typically denoted as a number of kilo-bits per second
198 (kbps).  An A-law telephony signal has a bit-rate of 64 kbps. MP3-encoded
199 stereo music typically has a bit-rate of 128\-196 kbps. FLAC-encoded
200 stereo music typically has a bit-rate of 550\-760 kbps.
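As a rough cross-check of the telephony figure above, the bit-rate of audio stored with a fixed number of bits per sample is the product of sample rate, bits per sample and channel count. The sketch below is illustrative only: the function name is hypothetical, and the 8 kHz sample rate and 8 bits per sample assumed for A-law telephony are not stated in this excerpt. Compressed formats such as MP3 and FLAC do not follow this simple product.

```python
def fixed_size_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bit-rate in kilobits per second for audio with a fixed sample size.

    Hypothetical helper for illustration; it is not part of SoX.
    """
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed A-law telephony parameters: 8 kHz, 8 bits per sample, mono.
print(fixed_size_bit_rate_kbps(8000, 8, 1))

# Uncompressed CD-style audio for comparison: 44.1 kHz, 16 bits, stereo.
print(fixed_size_bit_rate_kbps(44100, 16, 2))
```

Under those assumptions the first call gives the 64 kbps quoted for A-law telephony.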
202 Most self-describing formats also allow textual `comments' to be
227 Command-line format options.
236 Command-line format options.
244 if the file type cannot be determined. Command-line format options may
254    play existing-file.wav
258    rec new-file.wav
262    sox existing-file.wav \-d
266    sox \-d new-file.wav
277 Some systems provide more than one type of (SoX-compatible) audio
281 built-in to SoX, and the default selected by SoX when recording or playing
297    sox ... \-t oss
303    sox ... \-t alsa
316    play ... rate \-m
319 .B \-\-play\-rate\-arg
328 To help with setting a suitable recording level, SoX includes a peak-level
331    rec \-n
333 The recording level should be adjusted (using the system-provided mixer
336 See also \fB\-S\fR below.
341 is the case for many formats used in telephony (e.g. A-law, GSM) where
355 is not less than in the source format.  E.g.  converting from an 8-bit
356 PCM format to a 16-bit PCM format is lossless but converting from an
357 8-bit PCM format to (8-bit) A-law isn't.
369 effect, and finally creates the output MP3 file by re-compressing the
382 stored at a particular bit-depth. Any distortion introduced by
389 when the output bit-depth is less than 24 and any
392 bit-depth reduction has been specified explicitly using a command-line
395 the output file format supports only bit-depths lower than that of the
398 an effect has increased effective bit-depth within the internal
405 bit-depth is 16, then SoX's internal representation will utilise 18
411 .B \-V
413 .B \-D
415 dithering manually (e.g. to select a noise-shaping curve), see the
457    sox dull.wav bright.wav gain \-6 treble +6
465 .B \-G
474 following methods: `concatenate', `sequence', `mix', `mix-power',
501 If either the `mix' or `mix-power' combining method is selected then two or
506 file.  A mixed audio file cannot be un-mixed without reference to the
513 input files. Un-merging is possible using multiple
522 corresponding channels (treated as numbers in the interval \-1 to +1).
535 .B \-v
541 The \fB\-V\fR option (below) can be used to show the input file volume
552 volume (amplitude) of each input signal by a factor of \(S1/\s-2n\s+2,
565 With the `mix-power' combine method, the
568 \(S1/\s-2\(srn\s+2 instead of \(S1/\s-2n\s+2.
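The two scaling rules mentioned above (1/n for `mix', 1/sqrt(n) for `mix-power') can be compared with a few lines of arithmetic. This is a minimal sketch, not SoX's implementation; the function name and its `power` parameter are hypothetical.

```python
import math


def mix_scale_factor(n_inputs: int, power: bool = False) -> float:
    """Per-input amplitude scale factor when combining n_inputs signals.

    power=False models 1/n scaling; power=True models 1/sqrt(n) scaling.
    Hypothetical helper for illustration only.
    """
    if n_inputs < 1:
        raise ValueError("at least one input is required")
    if power:
        return 1 / math.sqrt(n_inputs)
    return 1 / n_inputs


# Compare the two rules for a few input counts.
for n in (2, 3, 4):
    print(n, mix_scale_factor(n), mix_scale_factor(n, power=True))
```

With 1/sqrt(n) scaling each input is attenuated less than with 1/n scaling, so the combined result is louder and more likely to clip.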
576 This behaviour can be changed by specifying the pseudo-effect `newfile'
607 keyboard interrupt key which is normally Ctrl-C).  This is a natural requirement
609 that when using SoX to play multiple files, Ctrl-C behaves slightly
625 effect-name will not work since SoX will treat it as an effect
626 specification.  The only work-around to this is to avoid such
628 filenames have a filename `extension', whilst effect-names do not.
633 \fB\-\fR
635 filename `\-' which,
642 when using it for an input file, the file-type (see
643 .B \-t
650 .B \-
656    sox \-M "|genw \-\-imd \-" "|genw \-\-thd \-" out.wav
659 .B \-t
663 \fB\(dq\fIwildcard-filename\fB\(dq\fR
664 Specifies that filename `globbing' (wild-card matching) should be performed
669    play \-\-rate 6k *.vox
673    play \-\-rate 6k file1.vox file2.vox file3.vox
678    play \-\-rate 6k "*.vox"
682 \fB\-p\fR, \fB\-\-sox\-pipe\fR
687    play "|sox \-n \-p synth 2" "|sox \-n \-p synth 2 tremolo 10" stat
691 .B \-p
692 is in fact an alias for `\fB\-t sox \-\fR'.
694 \fB\-d\fR, \fB\-\-default\-device\fR
703 \fB\-n\fR, \fB\-\-null\fR
706 SoX-specific mechanism and is not related to any operating-system
721 file, this can be overridden if desired using command-line format
739    SOX_OPTS="\-\-buffer 20000 \-\-play\-rate\-arg \-hs \-\-temp /mnt/temp"
745 .B \-\-no\-clobber
750 clear SOX_OPTS at the start of the script, but this of course loses
751 the benefit of SOX_OPTS carrying some system-wide default options.  An
755    SOX_OPTS="\-V \-\-no\-clobber"
757    sox \-V2 \-\-clobber $input $output ...
764    export SOX_OPTS="\-V \-\-no\-clobber"
768    setenv SOX_OPTS "\-V \-\-no\-clobber"
770 MS-DOS/MS-Windows:
772    set SOX_OPTS=\-V \-\-no\-clobber
774 MS-Windows GUI: via Control Panel : System : Advanced : Environment
779 \fB\-\-buffer\fR \fBBYTES\fR, \fB\-\-input\-buffer\fR \fBBYTES\fR
781 .B \-\-buffer
783 .B \-\-input\-buffer
785 .B \-\-buffer
789 .B \-\-buffer
793 \fB\-\-clobber\fR
797 \fB\-\-combine concatenate\fR\^|\^\fBmerge\fR\^|\^\fBmix\fR\^|\^\fBmix\-power\fR\^|\^\fBmultiply\fR…
800 .B \-m
802 .B \-M
804 .B \-T
810 \fB\-D\fR, \fB\-\-no\-dither\fR
821 \fB\-\-effects\-file \fIFILENAME\fR
831 \fB\-G\fR, \fB\-\-guard\fR
836    sox \-G infile \-b 16 outfile rate 44100 dither \-s
840    sox infile \-b 16 outfile gain \-h rate 44100 gain \-rh dither \-s
849 \fB\-h\fR, \fB\-\-help\fR
852 \fB\-\-help\-effect \fINAME\fR
856 \fB\-\-help\-format \fINAME\fR
860 \fB\-\-i\fR, \fB\-\-info\fR
866 \fB\-m\fR\^|\^\fB\-M\fR
867 Equivalent to \fB\-\-combine mix\fR and \fB\-\-combine merge\fR, respectively.
869 .B \-\-magic
873 \fB\-\-multi\-threaded\fR | \fB\-\-single\-threaded\fR
875 If the \fB\-\-multi\-threaded\fR option is given however then SoX
876 will process audio channels for most multi-channel
877 effects in parallel on hyper-threading/multi-core architectures. This
880 to gain any benefit from multi-threaded processing
881 (e.g. 131072; see \fB\-\-buffer\fR above).
883 \fB\-\-no\-clobber\fR
902 \fB\-\-norm\fR[\fB=\fIdB-level\fR]
907    sox \-\-norm infile \-b 16 outfile rate 44100 dither \-s
911    sox infile \-b 16 outfile gain \-h rate 44100 gain \-nh dither \-s
916    sox \-\-norm=\-3 infile outfile
926 \fB\-\-play\-rate\-arg ARG\fR
932 \fB\-\-plot gnuplot\fR\^|\^\fBoctave\fR\^|\^\fBoff\fR
936 .B \-\-plot
939 and configuration of many of the transfer-function based effects.
944    sox \-\-plot octave input-file \-n highpass 1320 > highpass.plt
948 \fB\-q\fR, \fB\-\-no\-show\-progress\fR
950 This is the opposite of the \fB\-S\fR option.
952 \fB\-R\fR
954 applicable, SoX will embed a fixed time-stamp in the output file (e.g.
960 \fB\-\-replay\-gain track\fR\^|\^\fBalbum\fR\^|\^\fBoff\fR
961 Select whether or not to apply replay-gain adjustment to input files.
978 \fB\-S\fR, \fB\-\-show\-progress\fR
982 output file.  Also shown is a peak-level meter, and an indication if
983 clipping has occurred.  The peak-level meter shows up to two channels
991 \-25	\-	\-11	====
992 \-23	T{
994 T}	\-9	====\-
995 \-21	=\-	\-7	=====
996 \-19	==	\-5	=====\-
997 \-17	==\-	\-3	======
998 \-15	===	\-1	=====!
999 \-13	===\-
1003 A three-second peak-held value of headroom in dBs will be shown to the right
1009 \fB\-T\fR
1010 Equivalent to \fB\-\-combine multiply\fR.
1012 \fB\-\-temp\fI DIRECTORY\fR
1015 This can be useful if there are permission or free-space problems with the
1016 default location. In this case, using `\fB\-\-temp .\fR' (to use the
1019 \fB\-\-version\fR
1021 .IP \fB\-V\fR[\fIlevel\fR]
1051 warnings). Each occurrence of the \fB\-V\fR option increases the
1056 .B \-V0
1063 \fB\-\-ignore\-length\fR
1068 \fB\-v\fR, \fB\-\-volume\fR \fIFACTOR\fR
1089 \fB\-b\fR \fIBITS\fR, \fB\-\-bits\fR \fIBITS\fR
1090 The number of bits (a.k.a. bit-depth or sometimes word-length) in each
1093 A/\(*m-law, ADPCM.
1099    sox \-r 16k \-e signed \-b 8 input.raw output.wav
1101 converts a particular `raw' file to a self-describing `WAV' file.
1110    sox input.cdda \-b 24 output.wav
1112 converts raw CD digital audio (16-bit, signed-integer) to a
1113 24-bit (signed-integer) `WAV' file.
1115 \fB\-c\fR \fICHANNELS\fR, \fB\-\-channels\fR \fICHANNELS\fR
1126    sox \-r 48k \-e float \-b 32 \-c 2 input.raw output.wav
1128 converts a particular `raw' file to a self-describing `WAV' file.
1130    play \-c 1 music.wav
1145    sox input.wav \-c 1 output.wav bass \-b 24
1146    sox input.wav      output.wav bass \-b 24 channels 1
1151 \fB\-e \fIENCODING\fR, \fB\-\-encoding\fR \fIENCODING\fR
1152 The audio encoding type.  Sometimes needed with file-types that
1157 .IP \fBsigned-integer\fR
1159 with a 16-bit or 24-bit encoding size.
1161 .IP \fBunsigned-integer\fR
1163 with an 8-bit encoding size.  A value of 0 represents maximum signal
1165 .IP \fBfloating-point\fR
1166 PCM data stored as IEEE 754 single precision (32-bit) or double
1167 precision (64-bit) floating-point (`real') numbers.
1169 .IP \fBa-law\fR
1171 sample.  It has a precision equivalent to roughly 13-bit PCM and is
1172 sometimes encoded with reversed bit-ordering (see the
1173 .B \-X
1175 .IP \fBu-law,\ mu-law\fR
1177 sample.  A.k.a. \(*m-law.  It has a precision equivalent to roughly
1178 14-bit PCM and is
1179 sometimes encoded with reversed bit-ordering (see the
1180 .B \-X
1182 .IP \fBoki-adpcm\fR
1183 OKI (a.k.a. VOX, Dialogic, or Intel) 4-bit ADPCM;
1184 it has a precision equivalent to roughly 12-bit PCM.
1187 .IP \fBima-adpcm\fR
1188 IMA (a.k.a. DVI) 4-bit ADPCM;
1189 it has a precision equivalent to roughly 13-bit PCM.
1190 .IP \fBms-adpcm\fR
1191 Microsoft 4-bit ADPCM; it has a precision equivalent to roughly 14-bit
1193 .IP \fBgsm-full-rate\fR
1196 formats with different bit-rates and associated speech quality.
1198 It is usually CPU-intensive to work with GSM audio.
1203 e.g. `unsigned-integer' can be given as `un', but not `u' (ambiguous
1204 with `u-law').
1209 .B \-b
1211 .B \-c
1218    sox input.cdda \-e float output1.wav
1220    sox input.cdda \-b 64 \-e float output2.wav
1222 convert raw CD digital audio (16-bit, signed-integer) to
1223 floating-point `WAV' files (single & double precision respectively).
1229 \fB\-\-no\-glob\fR
1230 Specifies that filename `globbing' (wild-card matching) should not be
1232 directory contains the two files `five-seconds.wav' and `five*.wav', then
1234    play \-\-no\-glob "five*.wav"
1238 \fB\-r\fR, \fB\-\-rate\fR \fIRATE\fR[\fBk\fR]
1244 .B \-b
1246 .B \-c
1251 For example, if audio was recorded with a sample-rate of say 48k from
1254    sox \-r 48720 input.wav output.wav
1269    sox input.wav \-r 48k output.wav bass \-b 24
1270    sox input.wav        output.wav bass \-b 24 rate 48k
1276 \fB\-t\fR, \fB\-\-type\fR \fIFILE-TYPE\fR
1282    another-command | sox \-t mp3 \- output.wav
1284    sox input.wav \-t raw output.bin
1295 \fB\-L\fR, \fB\-\-endian little\fR
1297 \fB\-B\fR, \fB\-\-endian big\fR
1299 \fB\-x\fR, \fB\-\-endian swap\fR
1304 These options specify whether the byte-order of the audio data is,
1307 encoded as floating-point, or as signed or unsigned integers of 16 or
1310 self-describing files.  A given endian-setting option may be ignored
1317 file; so, for example, when the following is run on a little-endian system:
1319    sox \-B audio.s16 trimmed.s16 trim 2
1321 trimmed.s16 will be created as little-endian;
1323    sox \-B audio.s16 \-B trimmed.s16 trim 2
1325 must be used to preserve big-endianness in the output file.
1328 .B \-V
1331 \fB\-N\fR, \fB\-\-reverse\-nibbles\fR
1333 sometimes useful with ADPCM-based formats.
1337 .B \-x
1340 \fB\-X\fR, \fB\-\-reverse\-bits\fR
1346 .B \-x
1352 \fB\-\-add\-comment \fITEXT\fR
1355 \fB\-\-comment \fITEXT\fR
1363 .B "\-\-comment \(dq\(dq" .
1365 \fB\-\-comment\-file \fIFILENAME\fR
1369 \fB\-C\fR, \fB\-\-compression\fR \fIFACTOR\fR
1382 Note that applying multiple effects in real-time (i.e. when playing audio)
1388 the global SoX option \fB\-M\fR can be used to isolate then recombine
1389 tracks from a multi-track recording.
1409 .B \-\-buffer
1416 There are a few pseudo-effects that aid using multiple effects chains.
1419 which will start writing to a new output file before moving to the
1422 which will move back to the first effects chain.  Pseudo-effects
1460 [\fB=\fR\^|\^\fB+\fR\^|\^\fB\-\fR]\fItimespec\fR, where \fItimespec\fR is a
1462 whether the \fItimespec\fR is to be interpreted relative to the start
1463 (\fB=\fR) or end (\fB\-\fR) of audio, or to the previous \fIposition\fR if
1465 must be known for end-relative locations to work; some effects do accept
1466 \fB\-0\fR for end-of-audio, though, even if the length is unknown.  Which of
1467 \fB=\fR, \fB+\fR, \fB\-\fR is the default depends on the effect and is shown
1470 Examples: \fB=2:00\fR (two minutes into the audio stream), \fB\-100s\fR (one
1472 and ten samples after the previous position), \fB\-0.5+1s\fR (one sample less
1476 Used to specify the band-width of a filter.  A number of different
1489 q	Q-factor	See [2]
1512 Time specifications can also be chained with \fB+\fR or \fB\-\fR into a new
1514 left, respectively: `3:00\-200s' means two hundred samples less than three
1518 .B sox \-h
1525 Apply a two-pole all-pass filter with central frequency (in Hz)
1526 \fIfrequency\fR, and filter-width \fIwidth\fR.
1527 An all-pass filter changes the
1531 This effect supports the \fB\-\-plot\fR global option.
1533 \fBband\fR [\fB\-n\fR] \fIcenter\fR[\fBk\fR]\fR [\fIwidth\fR[\fBh\fR\^|\^\fBk\fR\^|\^\fBo\fR\^|\^\f…
1534 Apply a band-pass filter.
1548 \-
1554 The \fB\-n\fR (for noise) option uses the alternate mode
1555 for un-pitched audio (e.g. percussion).
1557 \fB\-n\fR introduces a power-gain of about 11dB in the filter, so beware
1565 This effect supports the \fB\-\-plot\fR global option.
1569 \fBbandpass\fR\^|\^\fBbandreject\fR [\fB\-c\fR] \fIfrequency\fR[\fBk\fR]\fI width\fR[\fBh\fR\^|\^\f…
1570 Apply a two-pole Butterworth band-pass or band-reject filter with
1571 central frequency \fIfrequency\fR, and (3dB-point) band-width
1573 .B \-c
1581 These effects support the \fB\-\-plot\fR global option.
1586 Apply a band-reject filter.
1591 using a two-pole shelving filter with a response similar to that
1592 of a standard hi-fi's tone-controls.  This is also
1597 useful range is about \-20 (for a large cut) to +20 (for a large
1603 If desired, the filter can be fine-tuned using the following
1622 These effects support the \fB\-\-plot\fR global option.
1626 \fBbend\fR [\fB\-f \fIframe-rate\fR(25)] [\fB\-o \fIover-sample\fR(16)] { \fIstart-position(+)\fB,\…
1628 Each given triple: \fIstart-position\fB,\fIcents\fB,\fIend-position\fR
1631 bend the pitch. The other values specify the points in time at which to start
1634 The pitch-bending algorithm utilises the Discrete Fourier Transform (DFT)
1635 at a particular frame rate and over-sampling rate.
1637 .B \-f
1639 .B \-o
1647    play \-n synth 2.5 sin 667 gain 1 \\
1648 	bend .35,180,.25  .15,740,.53  0,\-520,.3
1654 .B gain\ \-5
1666 This effect supports the \fB\-\-plot\fR global option.
1677 effect is invoked automatically if SoX's \fB\-c\fR option specifies a
1680 .B \-c
1685    sox input.wav \-c 1 output.wav bass \-b 24
1686    sox input.wav      output.wav bass \-b 24 channels 1
1695 \fBchorus \fIgain-in gain-out\fR <\fIdelay decay speed depth \fB\-s\fR\^|\^\fB\-t\fR>
1708 Each four-tuple parameter
1710 and the decay (relative to gain-in) with a modulation
1712 The modulation is either sinusoidal (\fB\-s\fR) or triangular
1713 (\fB\-t\fR).  Gain-out is the volume of the output.
1719    play guitar1.wav chorus 0.7 0.9 55 0.4 0.25 2 \-t
1724    play guitar1.wav chorus 0.6 0.9 50 0.4 0.25 2 \-t \\
1725 	 60 0.32 0.4 1.3 \-s
1730    play guitar1.wav chorus 0.5 0.9 50 0.4 0.25 2 \-t \\
1731 	 60 0.32 0.4 2.3 \-t 40 0.3 0.3 1.3 \-s
1735 [\fIsoft-knee-dB\fB:\fR]\fIin-dB1\fR[\fB,\fIout-dB1\fR]{\fB,\fIin-dB2\fB,\fIout-dB2\fR}
1737 [\fIgain\fR [\fIinitial-volume-dB\fR [\fIdelay\fR]]]
1764 .I out-dB1
1766 .IR in-dB1 ;
1768 .I in-dB1
1771 \fB0,\fIout-dBn\fR).
1773 .I soft-knee-dB
1790 .B \-90
1812    sox asz.wav asz-car.wav compand 0.3,1 6:\-70,\-60,\-20 \-5 \-90 0.2
1814 The transfer function (`6:\-70,...') says that very soft sounds (below
1815 \-70dB) will remain unchanged.  This will stop the compander from
1817 However, sounds in the range \-60dB to 0dB (maximum
1819 original music will be compressed 3-to-1 into a 20dB range, which is
1821 road noise.  The `6:' selects 6dB soft-knee companding.
1822 The \-5 (dB) output gain is needed to avoid clipping (the number is
1824 The \-90 (dB) for the initial volume will work fine for a clip that starts
1828 In the next example, compand is being used as a noise-gate for when the
1831    play infile compand .1,.2 \-inf,\-50.1,\-inf,\-50,\-50 0 \-90 .1
1833 Here is another noise-gate, this time for when the
1837    play infile compand .1,.1 \-45.1,\-45,\-inf,0,\-inf 45 \-90 .1
1839 This effect supports the \fB\-\-plot\fR global option (for the transfer function).
1843 for a multiple-band companding effect.
1845 \fBcontrast \fR[\fIenhancement-amount\fR(75)]
1848 .I enhancement-amount
1849 controls the amount of the enhancement and is a number in the range 0\-100.
1851 .I enhancement-amount
1872 of \(+-2 that indicates the amount to shift the audio (which is in the
1873 range of \(+-1).
1892    sox \-n dc.wav synth 5 sin %0 50
1897 Apply Compact Disc (IEC 60908) de-emphasis (a treble attenuation shelving
1900 Pre-emphasis was applied in the mastering of some CDs issued in the early
1902 sought-after issues of albums by The Beatles, Pink Floyd and others.
1903 Pre-emphasis should be removed at playback time by a de-emphasis
1905 this filter, and very few PC CD drives have it; playing pre-emphasised
1906 audio without the correct de-emphasis filter results in audio that sounds harsh
1911 effect, it is possible to apply the necessary de-emphasis to audio that
1912 has been extracted from a pre-emphasised CD, and then either burn the
1913 de-emphasised audio to a new CD (which will then play correctly on any
1914 CD player), or simply play the correctly de-emphasised audio files on the
1917    sox track1.wav track1\-deemph.wav deemph
1919 and then burn track1-deemph.wav to CD, or
1921    play track1\-deemph.wav
1927 The de-emphasis filter is implemented as a biquad and requires the input
1931 This effect supports the \fB\-\-plot\fR global option.
1936 Delay one or more audio channels such that they start at the given
1943 present un-delayed.
1947    play \-n synth \-j 3 sin %3 sin %\-2 sin %\-5 sin %\-9 \\
1948 	sin %\-14 sin %\-21 fade h .01 2 1.5 delay \\
1949 	1.3 1 .76 .54 .27 remix \- fade h 0 2.7 2.5 norm \-1
1954    play \-n synth pl G2 pl B2 pl D3 pl G3 pl D4 pl G4 \\
1955 	delay 0 .05 .1 .15 .2 .25 remix \- fade 0 4 .1 norm \-1
1958 \fBdither\fR [\fB\-S\fR\^|\^\fB\-s\fR\^|\^\fB\-f \fIfilter\fR] [\fB\-a\fR] [\fB\-p \fIprecision\fR]
1963 add triangular (TPDF) white noise.  Noise-shaping (only for certain
1967 .B \-f
1968 option, it is possible to select a particular noise-shaping filter from
1969 the following list: lipshitz, f-weighted, modified-e-weighted,
1970 improved-e-weighted, gesemann, shibata, low-shibata, high-shibata.  Note
1977 noise-shaping curves.
1980 .B \-S
1983 plain TPDF is probably better, and above \(~~ 37k, noise-shaping
1987 .B \-a
1988 option enables a mode where dithering (and noise-shaping if applicable)
1992 dithering is not fool-proof, so the fades should be carefully checked
1993 for any noise modulation; if this occurs, then either re-dither the whole
2000 .B \-p
2004 .B \-R
2005 option is not given, then the pseudo-random number generator used to
2022 For a general resampling effect with anti-aliasing, see \fBrate\fR.  See
2033 \fBecho \fIgain-in gain-out\fR <\fIdelay decay\fR>
2046 and the decay (relative to gain-in) of that echo.
2047 Gain-out is the volume of the output.
2068 \fBechos \fIgain-in gain-out\fR <\fIdelay decay\fR>
2073 and the decay (relative to gain-in) of that echo.
2074 Gain-out is the volume of the output.
2096 Apply a two-pole peaking equalisation (EQ) filter.
2097 With this filter, the signal-level at and around a selected frequency
2098 can be increased or decreased, whilst (unlike band-pass and band-reject
2102 \fIwidth\fR, the band-width,
2114 This effect supports the \fB\-\-plot\fR global option.
2118 \fBfade\fR [\fItype\fR] \fIfade-in-length\fR [\fIstop-position(=)\fR [\fIfade-out-length\fR]]
2127 A fade-in starts from the first sample and ramps the signal level from 0
2128 to full volume over the time given as \fIfade-in-length\fR.  Specify 0 if
2129 no fade-in is wanted.
2131 For fade-outs, the audio will be truncated at
2132 .I stop-position
2134 interval of \fIfade-out-length\fR before the \fIstop-position\fR.  If
2135 .I fade-out-length
2137 \fIfade-in-length\fR.
2138 No fade-out is performed if
2139 .I stop-position
2142 previous effects, then \fB\-0\fR (or, for historical reasons, \fB0\fR) may
2144 .I stop-position
2145 to indicate the usual case of a fade-out that ends at the end of the input
2148 Any time specification may be used for \fIfade-in-length\fR and
2149 \fIfade-out-length\fR.
2155 \fBfir\fR [\fIcoefs-file\fR\^|\^\fIcoefs\fR]
2159 containing the filter coefficients (white-space separated; may contain
2160 `#' comments).  If the given filename is `\-', or if no argument is
2165    sox infile outfile fir 0.0195 \-0.082 0.234 0.891 \-0.145 0.043
2174      1.2311233052619888e\-01
2175     \-4.4777096106211783e\-01
2176      5.1031563346705155e\-01
2177     \-6.6502926320995331e\-02
2181 This effect supports the \fB\-\-plot\fR global option.
2194 delay	0 \- 30	0	Base delay in milliseconds.
2195 depth	0 \- 10	2	Added swept delay in milliseconds.
2196 regen	\-95 \- 95	0	T{
2200 width	0 \- 100	71	T{
2204 speed	0\*d1 \- 10	0\*d5	Sweeps per second (Hz).
2206 phase	0 \- 100	25	T{
2208 Swept wave percentage phase-shift for multi-channel (e.g. stereo) flange;
2213 Digital delay-line interpolation: \fBlinear\fR\^|\^\fBquadratic\fR.
2218 \fBgain \fR[\fB\-e\fR\^|\^\fB\-B\fR\^|\^\fB\-b\fR\^|\^\fB\-r\fR] [\fB\-n\fR] [\fB\-l\fR\^|\^\fB\-h\…
2227 .B \-n
2232 .I gain-dB
2236 .I gain-dB
2240 .B \-e
2241 option, the levels of the audio channels of a multi-channel file are `equalised', i.e.
2249 .B \-B
2255 .B \-B
2260 .B \-B
2263 .B \-b
2265 .B \-B
2270 .B \-B
2272 .B \-b
2276 .B \-r
2280 .B \-h
2284 .B \-n
2286 .I gain-dB
2290    sox infile outfile gain \-n
2294    sox infile outfile gain \-n \-3
2296 normalises to \-3dB.
2299 .B \-l
2302    sox infile outfile gain \-l 6
2312 .B \-h
2313 option is used to apply gain to provide head-room for subsequent
2316    sox infile outfile gain \-h bass +6
2322 \fBgain \-h\fR rather than an explicit attenuation, is that if the
2324 \fBgain \-r\fR, for example:
2326    sox infile outfile gain \-h bass +6 rate 44100 gain \-r
2332 Output formatting (dithering and bit-depth reduction) also requires
2335    sox infile outfile gain \-h bass +6 rate 44100 gain \-rh dither
2343 .B \-G
2344 can be given to automatically invoke \fBgain \-h\fR and \fBgain \-r\fR.
2352 \fBhighpass\fR\^|\^\fBlowpass\fR [\fB\-1\fR|\fB\-2\fR] \fIfrequency\fR[\fBk\fR]\fR [\fRwidth\fR[\fB…
2353 Apply a high-pass or low-pass filter with 3dB point \fIfrequency\fR.
2354 The filter can be either single-pole (with
2356 or double-pole (the default, or with
2359 applies only to double-pole filters;
2362 double-pole filters are described in detail in [1].
2364 These effects support the \fB\-\-plot\fR global option.
2366 See also \fBsinc\fR for filters with a steeper roll-off.
2368 \fBhilbert\fR [\fB\-n \fItaps\fR]
2369 Apply an odd-tap Hilbert transform filter, phase-shifting the signal
2376 An odd-tap Hilbert transform filter has a bandpass characteristic,
2379 \fB\-n\fR.  By default, the number of taps is chosen for a cutoff
2382 This effect supports the \fB\-\-plot\fR global option.
2384 \fBladspa\fR [\fB\-l\fR\^|\^\fB\-r\fR] \fImodule\fR [\fIplugin\fR] [\fIargument\fR ...]
2386 Despite the name, LADSPA is not Linux-specific, and a wide range of
2397 .B \-r
2398 (replicate) option allows cloning a mono plugin to handle multi-channel
2403 .B \-l
2422 A default gain of \-10dB is used if a
2430 \fBlowpass\fR [\fB\-1\fR|\fB\-2\fR] \fIfrequency\fR[\fBk\fR]\fR [\fRwidth\fR[\fBq\fR\^|\^\fBo\fR\^|…
2431 Apply a low-pass filter.
2435 [\fIsoft-knee-dB\fB:\fR]\fIin-dB1\fR[\fB,\fIout-dB1\fR]{\fB,\fIin-dB2\fB,\fIout-dB2\fR}
2437 [\fIgain\fR [\fIinitial-volume-dB\fR [\fIdelay\fR]]]\(dq {\fIcrossover-freq\fR[\fBk\fR] \(dqattack1…
2439 The multi-band compander is similar to the single-band compander but the
2440 audio is first divided into bands using Linkwitz-Riley cross-over filters
2444 frequency for that band is given by \fIcrossover-freq\fR; these can be
2447 For example, the following (one long) command shows how multi-band
2451    play track1.wav gain \-3 sinc \-n 29 \-b 100 8000 mcompand \\
2452 	\(dq0.005,0.1 \-47,\-40,\-34,\-34,\-17,\-33\(dq 100 \\
2453 	\(dq0.003,0.05 \-47,\-40,\-34,\-34,\-17,\-33\(dq 400 \\
2454 	\(dq0.000625,0.0125 \-47,\-40,\-34,\-34,\-15,\-33\(dq 1600 \\
2455 	\(dq0.0001,0.025 \-47,\-40,\-34,\-34,\-31,\-31,\-0,\-30\(dq 6400 \\
2456 	\(dq0,0.025 \-38,\-31,\-28,\-28,\-0,\-25\(dq \\
2457 	gain 15 highpass 22 highpass 22 sinc \-n 255 \-b 16 \-17500 \\
2458 	gain 9 lowpass \-1 17801
2462 Note that the pipeline is set up with US-style 75us pre-emphasis.
2466 for a single-band companding effect.
2468 \fBnoiseprof\fR [\fIprofile-file\fR]
2472 \fBnoisered\fR [\fIprofile-file\fR [\fIamount\fR]]
2479 profile to \fIprofile-file\fR, or to stdout if no \fIprofile-file\fR or
2480 if `\-' is given.  E.g.
2482    sox speech.wav \-n trim 0 1.5 noiseprof speech.noise-profile
2490 .IR profile-file ,
2491 or from stdin if no \fIprofile-file\fR or if `\-' is given.  E.g.
2493    sox speech.wav cleaned.wav noisered speech.noise-profile 0.3
2500 with a noise-reduced version, experiment with different
2509    sox noisy.wav \-n trim 0 1 noiseprof | play noisy.wav noisered
2512 \fBnorm\fR [\fIdB-level\fR]
2515 is just an alias for \fBgain \-n\fR; see the
2521 Mixes stereo to twin-mono where each mono channel contains the
2530 in the over-driven output.
2551 position or specify a zero-length pad at the start.
2556 the audio on a channel-by-channel basis.
2558 \fBphaser \fIgain-in gain-out delay decay speed\fR [\fB\-s\fR\^|\^\fB\-t\fR]
2563 and the decay (relative to gain-in) with a modulation
2565 The modulation is either sinusoidal (\fB\-s\fR) \*mpreferable for multiple
2567 (\fB\-t\fR) \*mgives single instruments a sharper phasing effect.
2569 feedback, and usually no less than 0\*d1.  Gain-out is the volume of the output.
2573    play snare.flac phaser 0.8 0.74 3 0.4 0.5 \-t
2577    play snare.flac phaser 0.9 0.85 4 0.23 1.3 \-s
2581    play snare.flac phaser 0.89 0.85 1 0.24 2 \-t
2585    play snare.flac phaser 0.6 0.66 3 0.6 2 \-t
2588 \fBpitch \fR[\fB\-q\fR] \fIshift\fR [\fIsegment\fR [\fIsearch\fR [\fIoverlap\fR]]]
2602 \fBrate\fR [\fB\-q\fR\^|\^\fB\-l\fR\^|\^\fB\-m\fR\^|\^\fB\-h\fR\^|\^\fB\-v\fR] [override-options] \…
2605 (even non-integer if this is supported by the output file format)
2614 Band-width
2619 \-q	T{
2629 \-l	low	80%	100	T{
2633 \-m	medium	95%	100	T{
2637 \-h	high	95%	125	T{
2639 16-bit mastering (use with dither)
2641 \-v	T{
2644 T}	95%	175	24-bit mastering
2649 .I Band-width
2658 band-limited interpolation.  By default, all algorithms have
2664 effect is invoked automatically if SoX's \fB\-r\fR option specifies a
2667 .B \-r
2672    sox input.wav \-r 48k output.wav bass \-b 24
2673    sox input.wav        output.wav bass \-b 24 rate 48k
2689 Occasionally, however, it may be desirable to fine-tune the resampler's
2697 \-M/\-I/\-L	Phase response = minimum/intermediate/linear
2698 \-s	Steep filter (band-width = 99%)
2699 \-a	Allow aliasing/imaging above the pass-band
2700 \-b\ 74\-99\*d7	Any band-width %
2701 \-p\ 0\-100	T{
2715 (`pre-echo') than if they occur after it (`post-echo').  Note that
2722 `pre' and `post': with minimum phase, there is no pre-echo but the
2723 longest post-echo; with linear phase, pre and post echo are in equal
2726 length (and level) of pre-echo and a medium-length post-echo.
2732 .B \-L
2734 .B \-p
2738 A resampler's band-width setting determines how much of the frequency
2740 up-sampling, or the new sample rate when down-sampling) is preserved
2741 during conversion.  The term `pass-band' is used to refer to all frequencies
2742 up to the band-width point (e.g. for 44\*d1kHz sampling rate, and a
2743 resampling band-width of 95%, the pass-band represents frequencies from
2744 0Hz (D.C.) to circa 21kHz).  Increasing the resampler's band-width
2749 .B \-s
2750 `steep filter' option changes resampling band-width from the default 95%
2752 .B \-b
2753 option allows the band-width to be set to any value in the range
2754 74\-99\*d7 %, but note that band-width values greater than 99% are not
2758 .B \-a
2759 option is given, then aliasing/imaging above the pass-band is allowed.  For
2761 resampling band-width of 95%, this means that frequency content above
2762 21kHz can be distorted; however, since this is above the pass-band (i.e.
2767 the minimum band-width allowable with
2768 .B \-b
2773    sox input.wav \-b 16 output.wav rate \-s \-a 44100 dither \-s
2776 aliasing; to 44\*d1kHz sample rate; noise-shaped dither to 16-bit WAV
2779    sox input.wav \-b 24 output.aiff rate \-v \-I \-b 90 48k
2781 very high quality resampling; overrides: intermediate phase, band-width 90%;
2782 to 48k sample rate; store output to 24-bit AIFF file.
2798 \fBremix\fR [\fB\-a\fR\^|\^\fB\-m\fR\^|\^\fB\-p\fR] <\fIout-spec\fR>
2799 \fIout-spec\fR	= \fIin-spec\fR{\fB,\fIin-spec\fR} | \fB0\fR
2801 \fIin-spec\fR	= [\fIin-chan\fR]\^[\fB\-\fR[\fIin-chan2\fR]]\^[\fIvol-spec\fR]
2803 \fIvol-spec\fR	= \fBp\fR\^|\^\fBi\fR\^|\^\fBv\^\fR[\fIvolume\fR]
2807 channel is specified, in turn, by a given \fIout-spec\fR: a list of
2813 .B \-m
2816 are mix-combined before entering the effects chain).
2819 .I out-spec
2820 contains comma-separated input channel-numbers and hyphen-delimited
2821 channel-number ranges; alternatively,
2831    sox input.wav output.wav remix 1\-3,7 3
2834 is a mix-down of input channels 1, 2, 3, and 7, and the right channel is
2841    sox input.wav output.wav remix \-
2843 performs a mix-down of all input channels to mono.
2846 channels, each input channel will be scaled by a factor of \(S1/\s-2n\s+2.
2848 of input channels with a \fIvol-spec\fR (volume specification).
2864 1 = no change, 0\*d5 \(~= 6dB attenuation, 2 \(~= 6dB gain, \-1 = invert
2870 .I out-spec
2872 .I vol-spec
2873 then, by default, \(S1/\s-2n\s+2 scaling is not applied to any other channels in the
2874 same out-spec (though it may be in other out-specs).
2875 The \-a (automatic)
2883    sox input.wav output.wav remix \-a 1,2 3,4v0.8
2887 The \-m (manual) option disables all automatic volume adjustments, so
2889    sox input.wav output.wav remix \-m 1,2 3,4v0.8
2907 If the \fB\-p\fR option is given, then any automatic \(S1/\s-2n\s+2 scaling
2908 is replaced by \(S1/\s-2\(srn\s+2 (`power') scaling; this gives a louder mix
2926 chans=\`soxi \-c "$1"\`
2927 while [ $chans \-ge 1 ]; do
2929    out=\`echo "$1"|sed "s/\\(.*\\)\\.\\(.*\\)/\\1\-$chans0.\\2/"\`
2931    chans=\`expr $chans \- 1\`
2938 .IR input-01.wav ,
2939 \fIinput-02.wav\fR, ...,
2940 .IR input-06.wav .
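The scaling rules of the \fBremix\fR effect described above can be checked
numerically.  The Python sketch below is illustrative only: the function
names are hypothetical, and the decibel conversion assumes the usual
20\ log10 rule for amplitude ratios, which agrees with the `0\*d5 \(~= 6dB
attenuation' figure given for \fIvol-spec\fR.

```python
import math


def mix_scale(n_channels, power=False):
    """Per-channel factor when n_channels inputs are mixed into one output.

    power=False gives the default 1/n scaling; power=True gives the
    1/sqrt(n) scaling selected by the -p option.
    """
    if n_channels < 1:
        raise ValueError("at least one input channel is required")
    if power:
        return 1.0 / math.sqrt(n_channels)
    return 1.0 / n_channels


def factor_to_db(volume):
    """Convert a linear volume factor to decibels.

    A negative factor only inverts the signal, so its magnitude is used.
    """
    return 20.0 * math.log10(abs(volume))


if __name__ == "__main__":
    print(mix_scale(4))              # 0.25
    print(mix_scale(4, power=True))  # 0.5
```

With four input channels, power scaling therefore leaves each channel
twice as large as the default scaling does, which is why it gives a louder
mix.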
2944 \fBrepeat\fR [\fIcount\fR(1)|\fB\-\fR]
2946 The special value \fB\-\fR requests infinite repetition.
2951 \fBreverb\fR [\fB\-w\fR|\fB\-\-wet-only\fR] [\fIreverberance\fR (50%) [\fIHF-damping\fR (50%)
2952 [\fIroom-scale\fR (100%) [\fIstereo-depth\fR (100%)
2954 [\fIpre-delay\fR (0ms) [\fIwet-gain\fR (0dB)]]]]]]
2967    play dry.wav gain \-3 pad 0 3 reverb
2970 .B \-w
2974    play \-m voice.wav "|sox voice.wav \-p reverse reverb \-w reverse"
2986 This effect supports the \fB\-\-plot\fR global option.
2988 \fBsilence \fR[\fB\-l\fR] \fIabove-periods\fR [\fIduration threshold\fR[\fBd\fR\^|\^\fB%\fR]
2989 [\fIbelow-periods duration threshold\fR[\fBd\fR\^|\^\fB%\fR]]
2994 The \fIabove-periods\fR value is used to indicate if audio should be
2997 non-zero \fIabove-periods\fR, it trims audio up until it finds
2998 non-silence. Normally, when trimming silence from beginning of audio
2999 the \fIabove-periods\fR will be 1 but it can be increased to higher
3000 values to trim all audio up to a specific count of non-silence
3003 an \fIabove-period\fR of 2 to strip out both silence periods and the
3006 When \fIabove-periods\fR is non-zero, you must also specify a
3008 amount of time that non-silence must be detected before it stops
3018 a \fIbelow-periods\fR count.  In this case, \fIbelow-period\fR means
3023 at the end, you could set below-period to a value of 2 to skip over the
3026 For \fIbelow-periods\fR, \fIduration\fR specifies a period of silence
3036 By first reversing the audio, you can use the \fIabove-periods\fR
3041 \fIbelow-periods\fR that is negative.  This value is then
3044 \fIabove-periods\fR, making it suitable for removing periods of
3048 .B \-l
3049 indicates that \fIbelow-periods\fR \fIduration\fR length of audio
3066 The following example shows how this effect can be used to start a recording
3067 that does not contain the delay at the start which usually occurs between
3068 `pressing the record button' and the start of the performance:
3070    rec \fIparameters filename other-effects\fR silence 1 5 2%
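The trimming of leading silence (an \fIabove-periods\fR value of 1, as in
the \fBrec\fR example above) can be approximated as follows.  This Python
sketch is a simplification and not SoX's implementation: it compares the
absolute value of each sample with the threshold, whereas the real effect
may measure the level differently.  The function name, and the use of a
plain list of samples in the range \-1 to +1, are assumptions made for
illustration.

```python
def trim_leading_silence(samples, rate, duration, threshold_percent):
    """Drop audio until non-silence has lasted for `duration` seconds.

    samples           -- list of floats in the range -1.0 to +1.0
    rate              -- sample rate in Hz
    duration          -- how long non-silence must persist, in seconds
    threshold_percent -- level (in percent of full scale) above which a
                         sample counts as non-silence
    """
    threshold = threshold_percent / 100.0
    needed = max(1, int(duration * rate))
    run = 0  # length of the current run of non-silent samples
    for i, x in enumerate(samples):
        if abs(x) > threshold:
            run += 1
            if run >= needed:
                # Keep the run that triggered detection, and all that follows.
                return samples[i - run + 1:]
        else:
            run = 0  # burst was too short: treat it as noise and keep trimming
    return []
```

A short burst that ends before \fIduration\fR has elapsed is discarded along
with the silence around it, which is the reason for choosing a duration
long enough to ignore clicks and other brief noises.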
3074 …-a\fI att\fR\^|\^\fB\-b\fI beta\fR] [\fB\-p\fI phase\fR\^|\^\fB\-M\fR\^|\^\fB\-I\fR\^|\^\fB\-L\fR]…
3076 Apply a sinc kaiser-windowed low-pass, high-pass, band-pass, or band-reject filter
3079 6dB points of a high-pass and low-pass filter that may be invoked
3081 given, then \fIfreqHP\fR less than \fIfreqLP\fR creates a band-pass filter,
3082 \fIfreqHP\fR greater than \fIfreqLP\fR creates a band-reject filter.
3086    sinc \-4k
3087    sinc 3k\-4k
3088    sinc 4k\-3k
3090 create a high-pass, low-pass, band-pass, and band-reject filter
3093 The default stop-band attenuation of 120dB can be overridden with
3094 \fB\-a\fR; alternatively, the kaiser-window `beta' parameter can be
3095 given directly with \fB\-b\fR.
3097 The default transition band-width of 5% of the total band can be
3098 overridden with \fB\-t\fR (and \fItbw\fR in Hertz); alternatively, the
3099 number of filter taps can be given directly with \fB\-n\fR.
3101 If both \fIfreqHP\fR and \fIfreqLP\fR are given, then a \fB\-t\fR or
3102 \fB\-n\fR option given to the left of the frequencies applies to both
3111 .B \-L
3115 This effect supports the \fB\-\-plot\fR global option.
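The way in which the frequency arguments of the \fBsinc\fR effect select the
filter type can be summarised in a short Python sketch.  The parser below is
hypothetical: it handles only plain numbers with an optional `k' suffix,
and it illustrates the rule given above rather than the way SoX parses its
arguments.

```python
def parse_freq(text):
    """Parse a frequency such as '3k' or '250' into Hz."""
    text = text.strip()
    if text.lower().endswith("k"):
        return float(text[:-1]) * 1000.0
    return float(text)


def sinc_filter_type(spec):
    """Classify a 'freqHP', '-freqLP' or 'freqHP-freqLP' argument."""
    hp_text, _, lp_text = spec.partition("-")
    hp = parse_freq(hp_text) if hp_text else None
    lp = parse_freq(lp_text) if lp_text else None
    if hp is not None and lp is None:
        return "high-pass"
    if hp is None and lp is not None:
        return "low-pass"
    if hp is not None and lp is not None and hp != lp:
        # freqHP below freqLP passes the band between them; above rejects it.
        return "band-pass" if hp < lp else "band-reject"
    raise ValueError("cannot classify frequency argument: %r" % spec)


if __name__ == "__main__":
    for spec in ("3k", "-4k", "3k-4k", "4k-3k"):
        print(spec, sinc_filter_type(spec))
```

The case of equal frequencies is rejected here only because the text above
does not cover it.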
3120 \fBsox \-\-help\fR and check the list of supported effects to see if
3124 and shows time in the X-axis, frequency in the Y-axis, and audio
3125 signal magnitude in the Z-axis.  Z-axis values are represented by the
3126 colour (or optionally the intensity) of the pixels in the X-Y plane.
3133    sox my.wav \-n spectrogram
3139    sox my.wav \-n remix 2 trim 20 30 spectrogram
3147    sox my.wav \-n rate 6k spectrogram
3153    sox my.wav \-n trim 0 10 spectrogram \-x 600 \-y 200 \-z 100
3157 by 200 pixels in size and the Z-axis range will be 100 dB).  Note that
3161    sox \-n \-n synth 6 tri 10k:14k spectrogram \-z 100 \-w kaiser
3170    rate 2k spectrogram \-X 200 \-Z \-10 \-w kaiser
3172 Options are also available to control the appearance (colour-set,
3175    sox my.wav \-n spectrogram \-m \-l \-o print.png
3182 .IP \fB\-x\ \fInum\fR
3183 Change the (maximum) width (X-axis) of the spectrogram from its default
3185 See also \fB\-X\fR and \fB\-d\fR.
3186 .IP \fB\-X\ \fInum\fR
3187 X-axis pixels/second; the default is auto-calculated to fit the given
3188 or known audio duration to the X-axis size, or 100 otherwise.  If
3189 given in conjunction with \fB\-d\fR, this option affects the width of
3198 .B \-V
3200 See also \fB\-x\fR and \fB\-d\fR.
3201 .IP \fB\-y\ \fInum\fR
3202 Sets the Y-axis size in pixels (per channel); this is the number of
3206 Y-axis size is chosen automatically (depending on the number of
3208 .B \-Y
3210 .IP \fB\-Y\ \fInum\fR
3218 .B \-y
3220 .IP \fB\-z\ \fInum\fR
3221 Z-axis (colour) range in dB, default 120.  This sets the dynamic-range
3222 of the spectrogram to be \-\fInum\fR\ dBFS to 0\ dBFS.
3224 may range from 20 to 180.  Decreasing dynamic-range effectively
3226 .IP \fB\-Z\ \fInum\fR
3227 Sets the upper limit of the Z-axis in dBFS.
3232 .IP \fB\-n\fR
3234 are shown using the brightest colour in the palette - a kind of
3235 automatic \fB\-Z\fR flag.
3236 .IP \fB\-q\ \fInum\fR
3237 Sets the Z-axis quantisation, i.e. the number of different colours (or
3238 intensities) in which to render Z-axis
3239 values.  A small number (e.g. 4) will give a `poster'-like effect making
3243 colours to use inside the Z-axis range; two colours are reserved to
3244 represent out-of-range values.
3245 .IP \fB\-w\ \fIname\fR
3250 all-round frequency-resolution and dynamic-range properties.  For better
3251 frequency resolution (but lower dynamic-range), select a Hamming window;
3252 for higher dynamic-range (but poorer frequency-resolution), select a
3254 .IP \fB\-W\ \fInum\fR
3258 .IP \fB\-s\fR
3262 .B \-x
3264 .IP \fB\-m\fR
3266 .IP \fB\-h\fR
3267 Selects a high-colour palette\*mless visually pleasing than the default
3272 .IP \fB\-p\ \fInum\fR
3277 .IP \fB\-l\fR
3280 .IP \fB\-a\fR
3283 .IP \fB\-r\fR
3285 .IP \fB\-A\fR
3286 Selects an alternative, fixed colour-set.  This is provided only for
3289 differentiation at the bottom end which results in masking of low-level
3291 .IP \fB\-t\ \fItext\fR
3293 .IP \fB\-c\ \fItext\fR
3296 .IP \fB\-o\ \fIfile\fR
3298 If `-' is given, the spectrogram will be sent to standard output
3310 .IP \fB\-d\ \fIduration\fR
3311 This option sets the X-axis resolution such that audio with the given
3313 (a time specification) fits the selected (or default) X-axis width.  For
3316    sox input.mp3 output.wav spectrogram \-d 1:00 stats
3325 .B \-X
3326 for an alternative way of setting the X-axis resolution.
3327 .IP \fB\-S\ \fIposition(=)\fR
3328 Start the spectrogram at the given point in the audio stream.  For
3331    sox input.aiff output.wav spectrogram \-S 1:00
3338 For the ability to perform off-line processing of spectral data, see the
3363 \fBsplice \fR [\fB\-h\fR\^|\^\fB\-t\fR\^|\^\fB\-q\fR] { \fIposition(=)\fR[\fB,\fIexcess\fR[\fB,\fIl…
3365 simple audio concatenation: a (usually short) cross-fade is applied at
3373 .B \-q
3374 may be given to select the fade envelope as half-cosine wave (the default),
3375 triangular (a.k.a. linear), or quarter-cosine wave respectively.
3411     -----------><--->
3416                 *   : :   * - - *
3421                       <--->   <----->
3427 For example, a long song begins with two verses which start (as
3432 (\fIstart\fR) effect) at times 0:30\*d125 and 1:03\*d432.
3435    sox too-long.wav part1.wav trim 0 30.130
3439    sox too-long.wav part2.wav trim 1:03.422
3443    sox part1.wav part2.wav just-right.wav splice 30.130
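The trim points used in the commands above differ slightly from the verse
times because each clip has to include the audio that the cross-fade
consumes.  The Python sketch below reproduces that arithmetic; it assumes an
\fIexcess\fR and a \fIleeway\fR of 5ms each, values inferred from the
figures in the example, and the function name is hypothetical.

```python
def splice_trim_points(first_end, second_start, excess=0.005, leeway=0.005):
    """Return (end of part 1, start of part 2) in seconds.

    The first clip is extended by the excess; the second clip starts
    earlier by the excess plus the leeway.
    """
    part1_end = first_end + excess
    part2_start = second_start - excess - leeway
    return part1_end, part2_start


if __name__ == "__main__":
    end1, start2 = splice_trim_points(30.125, 63.432)
    print("%.3f %.3f" % (end1, start2))  # 30.130 63.422
```

The first figure, 30\*d130, is also the splice position given to the
\fBsplice\fR effect, since that is the length of the first clip.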
3447    play "|sox \-n \-p synth 1 sin %1" "|sox \-n \-p synth 1 sin %3"
3462 # acpo infile copy-start copy-stop paste-over-start outfile
3464 # (i.e. those that contain +/\-).
3467 sox "$1" piece.wav trim $2\-$e\-$l =$3+$e
3469 sox "$1" part2.wav trim $4+$3\-$2\-$e\-$l
3471    splice $4+$e +$3\-$2+$e+$l+$e
3482 It is also possible to use this effect to perform general cross-fades,
3486 .B \-q
3487 option would typically be given (to select an `equal power' cross-fade), and
3490 .B \-q
3492 to be cross-faded, then
3494    sox f1.wav f2.wav out.wav splice \-q $(soxi \-D f1.wav),3
3496 cross-fades the files where the point of equal loudness is 3 seconds
3497 before the end of f1.wav, i.e. the total length of the cross-fade is
3500 \fBstat\fR [\fB\-s \fIscale\fR] [\fB\-rms\fR] [\fB\-freq\fR] [\fB\-v\fR] [\fB\-d\fR]
3512 .I x\s-2\dk\u\s0
3513 represents the PCM value (in the range \-1 to +1 by default) of each successive
3521 Scaled by	\ 	See \-s below.
3522 Maximum amplitude	max(\fIx\s-2\dk\u\s0\fR)	T{
3525 Minimum amplitude	min(\fIx\s-2\dk\u\s0\fR)	T{
3528 Midline amplitude	\(12\^min(\fIx\s-2\dk\u\s0\fR)\^+\^\(12\^max(\fIx\s-2\dk\u\s0\fR)
3529 Mean norm	\(S1/\s-2n\s+2\^\(*S\^\^\(br\^\fIx\s-2\dk\u\s0\fR\^\(br\^	T{
3532 Mean amplitude	\(S1/\s-2n\s+2\^\(*S\^\fIx\s-2\dk\u\s0\fR	T{
3533 The average of each sample in the audio.  If this figure is non-zero, then it indicates the
3538 RMS amplitude	\(sr(\(S1/\s-2n\s+2\^\(*S\^\fIx\s-2\dk\u\s0\fR\(S2)	T{
3542 Maximum delta	max(\^\(br\^\fIx\s-2\dk\u\s0\fR\^\-\^\fIx\s-2\dk\-1\u\s0\fR\^\(br\^)
3543 Minimum delta	min(\^\(br\^\fIx\s-2\dk\u\s0\fR\^\-\^\fIx\s-2\dk\-1\u\s0\fR\^\(br\^)
3544 Mean delta	\(S1/\s-2n\-1\s+2\^\(*S\^\^\(br\^\fIx\s-2\dk\u\s0\fR\^\-\^\fIx\s-2\dk\-1\u\s0\fR\^\(br\^
3545 RMS delta	\(sr(\(S1/\s-2n\-1\s+2\^\(*S\^(\fIx\s-2\dk\u\s0\fR\^\-\^\fIx\s-2\dk\-1\u\s0\fR)\(S2)
3558 Note that the delta measurements are not applicable for multi-channel audio.
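The formulae in the table above translate directly into code.  The following
Python sketch computes the figures for a single channel of samples in the
range \-1 to +1; the function name and the dictionary keys are chosen for
illustration and do not reproduce the output format of the \fBstat\fR
effect.

```python
import math


def stat_figures(samples):
    """Compute the amplitude and delta figures for one channel."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples for the delta figures")
    # Absolute difference between each sample and its predecessor.
    deltas = [abs(b - a) for a, b in zip(samples, samples[1:])]
    return {
        "maximum amplitude": max(samples),
        "minimum amplitude": min(samples),
        "midline amplitude": 0.5 * min(samples) + 0.5 * max(samples),
        "mean norm": sum(abs(x) for x in samples) / n,
        "mean amplitude": sum(samples) / n,
        "rms amplitude": math.sqrt(sum(x * x for x in samples) / n),
        "maximum delta": max(deltas),
        "minimum delta": min(deltas),
        "mean delta": sum(deltas) / (n - 1),
        "rms delta": math.sqrt(sum(d * d for d in deltas) / (n - 1)),
    }
```

For the alternating samples 0\*d5, \-0\*d5, 0\*d5, \-0\*d5 this gives a mean
amplitude of 0 (no DC offset), a mean norm and an RMS amplitude of 0\*d5, and
a value of 1 for every delta figure.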
3561 .B \-s
3565 is 2147483647 (i.e. the maximum value of a 32-bit signed integer).
3571 .B \-rms
3576 .B \-v
3580 .B \-freq
3586 .B \-d
3588 displays a hex dump of the 32-bit signed PCM data
3591 sometimes occur in cross-platform versions of SoX.
3597 \fBstats\fR [\fB\-b \fIbits\fR\^|\^\fB\-x \fIbits\fR\^|\^\fB\-s \fIscale\fR] [\fB\-w \fIwindow-time…
3603 For example, for a typical well-mastered stereo music file:
3609 DC offset   0.000803 \-0.000391  0.000803
3610 Min level  \-0.750977 \-0.750977 \-0.653412
3612 Pk lev dB      \-2.49     \-2.49     \-3.69
3613 RMS lev dB    \-19.41    \-19.13    \-19.71
3614 RMS Pk dB     \-13.82    \-13.82    \-14.38
3615 RMS Tr dB     \-85.25    \-85.25    \-82.66
3616 Crest factor       \-      6.79      6.32
3619 Bit-depth      16/16     16/16     16/16
3632 are shown, by default, in the range \(+-1.
3634 .B \-b
3636 with the given number of bits; for example, for 16 bits, the scale would be \-32768 to +32767.
3638 .B \-x
3640 .B \-b
3643 .B \-s
3644 option scales the three measurements by a given floating-point number.
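The scale selected by \fB\-b\fR follows from the range of a signed integer
of the given width, as the Python sketch below shows.  It also converts a
linear level to decibels, on the assumption that the dB figures are
20\ log10 of the level relative to full scale; this agrees with the sample
output above, in which a level of \-0\*d750977 corresponds to a
.I Pk lev dB
of \-2\*d49.  Both function names are hypothetical.

```python
import math


def bit_scale(bits):
    """Return the (lowest, highest) value of a signed integer of this width."""
    if bits < 1:
        raise ValueError("bits must be positive")
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def level_db(level):
    """Convert a linear level (full scale = 1) to dB relative to full scale."""
    return 20.0 * math.log10(abs(level))


if __name__ == "__main__":
    print(bit_scale(16))                   # (-32768, 32767)
    print(round(level_db(-0.750977), 2))   # -2.49
```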
3670 The right-hand
3671 .I Bit-depth
3672 figure is the standard definition of bit-depth, i.e. bits less
3673 significant than the given number are fixed at zero.  The left-hand
3675 one for negative numbers) subtracted from the right-hand figure (the
3679 For multi-channel audio, an overall figure for each of the above
3686 .IR Bit-depth :
3701 is equal to the sample-rate multiplied by
3731 it is retained as it can sometimes out-perform
3755 …-j \fIKEY\fR] [\fB\-n\fR] [\fIlen\fR [\fIoff\fR [\fIph\fR [\fIp1\fR [\fIp2\fR [\fIp3\fR]]]]]] {[\f…
3758 with various wave shapes, or to generate wide-band noise of various
3764 Audio for each channel in a multi-channel audio file can be synthesised
3771 file' (with the special name \fB\-n\fR) is often given instead (and the
3776 audio file containing a sine-wave swept from 300 to 3300\ Hz:
3778    sox \-n output.wav synth 3 sine 300\-3300
3782    sox \-r 8000 \-n output.wav synth 3 sine 300\-3300
3789    sox \-n output.wav synth 3 sine 300\-3300 brownnoise
3795    play \-n synth 0.5 sine 200\-500 synth 0.5 sine fmod 700\-100
3802    play \-n synth 4 pluck %\-29
3808 	play \-n synth 4 pluck $n repeat 2; done
3826 effect incorporates the functionality of \fBgain \-h\fR (see the
3830 .B \-n
3849 be used.  The default frequency is 440Hz.  By default, the tuning used
3851 .B \-j
3855 is an integer number of semitones relative to A (so for example, \-9
3864 one of the characters `:', `+', `/', or `\-'.  This character is used to
3870 Square: a second-order function is used to change the tone.
3873 .IP \fB\-\fR
3881 \fIoff\fR is the bias (DC-offset) of the signal in percent; default=0.
3892 or tone-1 (pluck); default=20.
3895 ends; default=60, or tone-2 (pluck); default=90.
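Frequencies given to \fBsynth\fR as a number of semitones relative to A can
be converted to Hertz as sketched below in Python.  The sketch assumes
equal-temperament tuning with A at 440Hz, so it does not apply when
\fB\-j\fR is used to select a different tuning; the function name is
hypothetical.

```python
def semitones_to_hz(semitones, reference_hz=440.0):
    """Frequency of the note `semitones` above (or below, if negative) A.

    Assumption: equal temperament, i.e. each semitone multiplies the
    frequency by the twelfth root of two.
    """
    return reference_hz * 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    # %-9 is nine semitones below A 440.
    print(round(semitones_to_hz(-9), 2))  # 261.63
    print(semitones_to_hz(12))            # 880.0
```

Under the same assumption, the \fBpluck\fR example given earlier,
.BR %\-29 ,
corresponds to a frequency of roughly 82Hz.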
3897 \fBtempo \fR[\fB\-q\fR] [\fB\-m\fR\^|\^\fB\-s\fR\^|\^\fB\-l\fR] \fIfactor\fR [\fIsegment\fR [\fIsea…
3900 shifted in the time domain and overlapped (cross-faded) at points where
3906 .B \-q
3913 .B \-m
3918 .B \-s
3923 .B \-l
3928 If \-m, \-s, or \-l is specified, the default value of segment will be
3942 of 2), 41\ ms may give a better result.  The \-m, \-s, and \-l flags will cause
3944 For example using \-s (for speech) with a tempo of 1.25 will calculate a
3955 quality. The \-m, \-s, and \-l flags will cause
3961 Default value is 12, but \-m, \-s, or \-l flags automatically
3976 Apply a treble tone-control effect.
4000    play infile trim 12:34 =15:00 \-2:00
4004    play infile trim 12:34 2:26 \-2:00
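The two commands above select the same audio.  The Python sketch below
resolves each list of positions to absolute times; it assumes that a
position prefixed with `=' is measured from the start of the audio, that one
prefixed with `\-' is measured back from the end, and that any other
position is relative to the previous one.  The helper names and the
20-minute file length are hypothetical.

```python
def to_seconds(text):
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60.0 + float(part)
    return seconds


def resolve_positions(positions, total_seconds):
    """Turn a list of trim positions into absolute times in seconds."""
    resolved = []
    previous = 0.0
    for position in positions:
        if position.startswith("="):
            current = to_seconds(position[1:])                  # from the start
        elif position.startswith("-"):
            current = total_seconds - to_seconds(position[1:])  # from the end
        else:
            current = previous + to_seconds(position.lstrip("+"))
        resolved.append(current)
        previous = current
    return resolved


if __name__ == "__main__":
    total = 20 * 60.0  # hypothetical 20-minute file
    print(resolve_positions(["12:34", "=15:00", "-2:00"], total))
    print(resolve_positions(["12:34", "2:26", "-2:00"], total))
```

Both calls yield the same three times, because 12:34 plus 2:26 is 15:00.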
4011 Upsample the signal by an integer factor: \fIfactor\fR\-1 zero-value
4018 For a general resampling effect with anti-imaging, see \fBrate\fR.  See
4024 i.e. 16-bit, 44\-48kHz) recordings of speech.  The algorithm currently
4053 .IP \fB\-t\ \fInum\fR\ (7)
4057 .IP \fB\-T\ \fInum\fR\ (0.25)
4060 .IP \fB\-s\ \fInum\fR\ (1)
4063 .IP \fB\-g\ \fInum\fR\ (0.25)
4066 .IP \fB\-p\ \fInum\fR\ (0)
4074 These allow fine tuning of the algorithm's internal parameters.
4076 .IP \fB\-b\ \fInum\fR
4078 order to detect the start of the wanted audio.  This option sets the
4080 .IP \fB\-N\ \fInum\fR
4083 .IP \fB\-n\ \fInum\fR
4086 .IP \fB\-r\ \fInum\fR
4089 .IP \fB\-f\ \fInum\fR
4091 .IP \fB\-m\ \fInum\fR
4094 .IP \fB\-M\ \fInum\fR
4096 .IP \fB\-h\ \fInum\fR
4097 `Brick-wall' frequency of high-pass filter applied at the input to the
4099 .IP \fB\-l\ \fInum\fR
4100 `Brick-wall' frequency of low-pass filter applied at the input to the
4102 .IP \fB\-H\ \fInum\fR
4103 `Brick-wall' frequency of high-pass lifter used in the detector
4105 .IP \fB\-L\ \fInum\fR
4106 `Brick-wall' frequency of low-pass lifter used in the detector
4118 .B \-v
4132 if \fBpower\fR, then a power (i.e. wattage or voltage-squared) ratio,
4178 for a volume-changing effect with different capabilities, and
4180 for a dynamic-range compression/expansion/limiting effect.
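The relationship between the gain types of the \fBvol\fR effect can be
sketched in Python as follows.  The conversions are the standard ones (the
square root for a power ratio, 10 to the power of dB/20 for decibels) and
are an assumption about the effect's behaviour, not taken from its source
code; in particular, keeping the sign of a negative power ratio is a choice
made for this illustration.  The function name is hypothetical.

```python
import math


def vol_to_amplitude(gain, gain_type="amplitude"):
    """Convert a gain of the given type to a linear amplitude factor."""
    if gain_type == "amplitude":
        return gain
    if gain_type == "power":
        # Power is proportional to voltage squared, so take the square
        # root and keep the sign (assumption: a negative value inverts).
        return math.copysign(math.sqrt(abs(gain)), gain)
    if gain_type == "dB":
        return 10.0 ** (gain / 20.0)
    raise ValueError("unknown gain type: %r" % gain_type)


if __name__ == "__main__":
    print(vol_to_amplitude(4.0, "power"))  # 2.0
    print(vol_to_amplitude(0.0, "dB"))     # 1.0
```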
4183 command-line parameters, or 2 if an error occurs during file processing.
4186 (sox-users@lists.sourceforge.net).
4203 R. Bristow-Johnson,
4205 https://webaudio.github.io/Audio-EQ-Cookbook/audio-eq-cookbook.html
4209 .IR "Q-factor" ,
4215 https://web.archive.org/web/20070320114719/http://www.harmony-central.com/Effects/effects-explained…
4237 Copyright 1998\-2013 Chris Bagwell and SoX Contributors.