
I'm trying to write a program in python but I am _very_ confused about how to set audio parameters. I can find documentation but it doesn't make any sense. I can't seem to find an example anywhere that I can follow. In the past I've just used /dev/dsp with the default settings of 8 bit mono, 8k samplerate. This time I want to record 16 bit, two channels, 44k1, splitting the recording into left.wav and right.wav and stripping out silence in realtime. How do I do it?

On Thu, Dec 16, 2004 at 08:33:05PM +1300, zcat wrote:
> I'm trying to write a program in python but I am _very_ confused about how to set audio parameters. I can find documentation but it doesn't make any sense. I can't seem to find an example anywhere that I can follow.
> In the past I've just used /dev/dsp with the default settings of 8 bit mono, 8k samplerate. This time I want to record 16 bit, two channels, 44k1, splitting the recording into left.wav and right.wav and stripping out silence in realtime.
> How do I do it?
Heh, a few years ago I wrote a wxPython program to make recordings and play them back (although only in mono, not stereo). It's a bit ugly, with hardcoded constants from the kernel header files, but it should still work.

I can't really help with splitting the channels, or with any signal processing for detecting silence, but I can help with changing the default settings of the OSS device. (This assumes that if you are using ALSA, you have the old-style OSS emulation.)

Here are the relevant bits from my Python script.

John

##############################################################################
import os
from fcntl import ioctl   # for controlling the soundcard (x86/linux specific)
import struct             # for python -> c type conversions

audio_device = "/dev/dsp"
audio_mixer = "/dev/mixer"
debug = 1

#
# The global volume level, as 2 bytes (left and right). 100 decimal is the
# maximum. E.g. 40 out of 100 (decimal) = 50 octal; '\120' below is 80 decimal.
master_volume = b'\120\120'

# mixer levels we use for playing/recording
working_mic_level = b'\120\120'   # 80 (out of 100) decimal
working_pcm_level = b'\120\120'   # 80

# this saves the speed at which samples are recorded
record_rate = ""

##################
# ioctl magic numbers for linux 2.0/2.2/2.4 kernel on i386 architectures....
# see /usr/include/linux/soundcard.h or "man ioctl_list"
SNDCTL_DSP_SPEED         = 0xC0045002   # pcm write rate
SOUND_PCM_READ_RATE      = 0x80045002
SNDCTL_DSP_SETFMT        = 0xC0045005   # pcm write bits
SOUND_PCM_READ_BITS      = 0x80045005
SOUND_MIXER_READ_PCM     = 0x80044D04
SOUND_MIXER_WRITE_PCM    = 0xC0044D04
SOUND_MIXER_READ_MIC     = 0x80044D07
SOUND_MIXER_WRITE_MIC    = 0xC0044D07
SOUND_MIXER_READ_RECLEV  = 0x80044D0B
SOUND_MIXER_WRITE_RECLEV = 0xC0044D0B
SOUND_MIXER_READ_VOLUME  = 0x80044D00
SOUND_MIXER_WRITE_VOLUME = 0xC0044D00
SOUND_PCM_READ_CHANNELS  = 0x80045006
SOUND_PCM_WRITE_CHANNELS = 0xC0045006

device = os.open(audio_device, os.O_RDONLY)

# set recording/play to mono
# SOUND_PCM_WRITE_CHANNELS is an alias for SNDCTL_DSP_CHANNELS...
ret_string = ioctl(device, SOUND_PCM_WRITE_CHANNELS, b'\1\0\0\0')
byte_array = struct.unpack("1I", ret_string)
if debug:
    print("num channels was set to " + str(byte_array[0]))

# attempt to set the recording frequency and sample format to
# 44100Hz and 16 bits per sample
# should set channels and format first, and then rate, according
# to the oss docs.
samples_per_second = 48000   # or 44100
# 16 corresponds to signed 16 bits samples. See <linux/soundcard.h>
# for the other formats - it's a bit field, not just n => n bits/sample!
dsp_format = 16

c_samples_per_second = struct.pack("1I", samples_per_second)
c_dsp_format = struct.pack("1I", dsp_format)

if debug:
    ret_string = ioctl(device, SOUND_PCM_READ_BITS, b'\0\0\0\0')
    set_dsp_format = struct.unpack("1I", ret_string)[0]
    print("dsp_format is currently " + str(set_dsp_format))

# set the sample format (both recording and playing)
ret_string = ioctl(device, SNDCTL_DSP_SETFMT, c_dsp_format)
set_dsp_format = struct.unpack("1I", ret_string)[0]
if debug:
    print("dsp_format was set to " + str(set_dsp_format))
    if set_dsp_format != dsp_format:
        print(" (attempted to set to " + str(dsp_format) + ")")

if debug:
    ret_string = ioctl(device, SOUND_PCM_READ_RATE, b'\0\0\0\0')
    set_samples_per_second = struct.unpack("1I", ret_string)[0]
    print("rate is currently " + str(set_samples_per_second))

# set the sample rate (for recording and playback)
ret_string = ioctl(device, SNDCTL_DSP_SPEED, c_samples_per_second)
set_samples_per_second = struct.unpack("1I", ret_string)[0]
record_rate = str(set_samples_per_second)
if debug:
    print("rate was set to " + str(set_samples_per_second))
    if set_samples_per_second != samples_per_second:
        print("(attempted to set to " + str(samples_per_second) + ")")


def SetMixerRecord():
    mixer = os.open(audio_mixer, os.O_WRONLY)
    ioctl(mixer, SOUND_MIXER_WRITE_MIC, working_mic_level)
    ioctl(mixer, SOUND_MIXER_WRITE_PCM, b'\0\0')
    ioctl(mixer, SOUND_MIXER_WRITE_VOLUME, b'\0\0')   # mute all playback
    os.close(mixer)


def SetMixerPlay():
    global master_volume
    mixer = os.open(audio_mixer, os.O_WRONLY)
    ioctl(mixer, SOUND_MIXER_WRITE_MIC, b'\0\0')
    ioctl(mixer, SOUND_MIXER_WRITE_PCM, working_pcm_level)
    ioctl(mixer, SOUND_MIXER_WRITE_VOLUME, master_volume)
    os.close(mixer)
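
For anyone puzzled by the magic numbers above: on i386 Linux an ioctl request number packs four fields (direction, argument size, a group letter and a command number), so the constants can be sanity-checked without the header file. The sketch below is not part of the original script. `decode_ioctl` is a hypothetical helper name, and the bit layout is assumed to be the i386 one (some other architectures lay the bits out differently). It also shows how the 4-byte channel argument could be built for stereo rather than the mono value used above.

```python
import struct


def decode_ioctl(number):
    """Break a Linux ioctl request number into its fields (i386 bit layout)."""
    return {
        "nr": number & 0xFF,                  # command number within the group
        "type": chr((number >> 8) & 0xFF),    # group letter: 'P' = dsp, 'M' = mixer
        "size": (number >> 16) & 0x3FFF,      # size of the argument in bytes
        "dir": (number >> 30) & 0x3,          # 1 = write, 2 = read, 3 = both
    }


SNDCTL_DSP_SPEED = 0xC0045002
SOUND_MIXER_READ_MIC = 0x80044D07

speed_fields = decode_ioctl(SNDCTL_DSP_SPEED)
mic_fields = decode_ioctl(SOUND_MIXER_READ_MIC)

# A native unsigned int holding the requested channel count (2 = stereo),
# in the same form as the hand-written mono bytes in the script above.
stereo_arg = struct.pack("I", 2)

if __name__ == "__main__":
    print(speed_fields)
    print(mic_fields)
    print(repr(stereo_arg))
```

Passing `stereo_arg` in place of the mono bytes in the SOUND_PCM_WRITE_CHANNELS call should request two channels, but it is worth unpacking the returned value to see what the driver actually granted.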

> Heh, a few years ago I wrote a wxPython program to make recordings and play them back (although only in mono, not stereo).
> It's a bit ugly, with hardcoded constants from the kernel header files, but it should still work. I can't really help with splitting the channels, or with any signal processing for detecting silence, but I can help with changing the default settings of the oss device. (This assumes that if you are using alsa, you have the old-style oss emulation.)
> Here are the relevant bits from my python script.
Yeow, that's complicated! I managed to dig up some more stuff on Google. I think I have the parameters set correctly now, and I was planning to use aumix to select the input and set levels. I also discovered there's a wave module which should take care of the wave headers for me. This is what I have so far:

--
#!/usr/bin/python -O
import string,sys,re,wave,ossaudiodev
from string import *
from time import *
from commands import getoutput

fmt=ossaudiodev.AFMT_S16_LE
channels=2
rate=44100

dsp=ossaudiodev.open("/dev/dsp",'r')
(fmt, channels, rate) = dsp.setparameters(fmt, channels, rate)

def listen(howmuch):
    chunk = dsp.read(howmuch)
    return chunk
--

Now I'm guessing that if I call dsp.read(176400) I should get one second of signed 16 bit audio data in the form "[left low] [left high] [right low] [right high]". I should then jump into a loop parsing this into 16 bit integers and seeing if they exceed some threshold value. Or is there a cleaner way of doing dsp.read so that it returns some sort of pre-formatted structure?
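
One possible way to do the parsing without a hand-written byte loop is the standard array module, which can reinterpret the raw bytes as signed 16-bit integers in one step. The sketch below assumes Python 3 (`frombytes`/`tobytes`; Python 2 spelled these `fromstring`/`tostring`) and assumes `chunk` is the interleaved S16_LE stereo data that `dsp.read()` returns. The names `split_channels`, `is_silent`, `open_mono_wav` and `append_if_loud`, and the `THRESHOLD` value, are made up for illustration. The silence test is a plain peak check, not a tuned detector.

```python
import array
import sys
import wave

SAMPLE_WIDTH = 2   # bytes per sample for AFMT_S16_LE
THRESHOLD = 500    # hypothetical silence threshold; tune for your input


def split_channels(chunk):
    """Split interleaved 16-bit stereo bytes into (left, right) arrays."""
    samples = array.array("h")     # "h" = signed 16-bit integer
    samples.frombytes(chunk)
    if sys.byteorder == "big":     # the device data is little-endian
        samples.byteswap()
    return samples[0::2], samples[1::2]


def is_silent(samples, threshold=THRESHOLD):
    """True when no sample's magnitude exceeds the threshold."""
    return all(abs(s) <= threshold for s in samples)


def open_mono_wav(path, rate=44100):
    """Open a mono 16-bit WAV file (or file object) for writing."""
    out = wave.open(path, "wb")
    out.setnchannels(1)
    out.setsampwidth(SAMPLE_WIDTH)
    out.setframerate(rate)
    return out


def append_if_loud(out, samples, threshold=THRESHOLD):
    """Write the samples unless they are silent; return True if written."""
    if is_silent(samples, threshold):
        return False
    data = array.array("h", samples)
    if sys.byteorder == "big":     # WAV data is little-endian
        data.byteswap()
    out.writeframes(data.tobytes())
    return True
```

In a recording loop, each chunk from `listen()` would go through `split_channels`, with the two halves passed to `append_if_loud` on writers opened for left.wav and right.wav. Checking whole chunks means the silence cut happens at chunk boundaries, so a smaller read size than one second would give finer trimming.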
participants (2): John R. McPherson, zcat