Audio SDK
AudioSDK is the Python wrapper around the Dev Edition's audio hardware (ES8389 codec on the Allwinner-based Intern audio board, exposed as ALSA card index 1). It handles recording, playback, live streaming, mixer control, and a built-in health check.
The module is in audio/audio_sdk.py in the intern-developer-sdk repo.
Install
On your workstation (cross-platform)
pip install pyaudio
Platform notes:
- macOS: run brew install portaudio first if pip install pyaudio fails
- Linux (Debian/Ubuntu): sudo apt install -y libportaudio2 portaudio19-dev
- Windows: pip install pyaudio ships wheels, no extra deps
On the device
The recommended path is the bundled onboarding script:
python onboarding/onboard_device.py <host> <user> --password <pw>
This is idempotent: it copies the SDK over SFTP, runs apt install python3-pyaudio, adds the system user to the audio group, registers any skills, and runs the diagnostic.
Manual alternative:
ssh system@<host> "mkdir -p ~/sdk/audio"
scp audio/audio_sdk.py system@<host>:~/sdk/audio/
ssh system@<host> "sudo apt-get install -y python3-pyaudio && sudo usermod -a -G audio system"
Quick example
from audio_sdk import AudioSDK
with AudioSDK(card=1, sample_rate=44100, channels=2, auto_tune=True) as audio:
    audio.record("note.wav", duration=5.0)
    audio.play("note.wav")
auto_tune=True applies the codec's tested defaults (PGA gain, capture levels) on construction so first-time recordings aren't silent.
Constructor
AudioSDK(
    card: int = 1,
    sample_rate: int = 44100,
    channels: int = 2,
    auto_tune: bool = True,
)
| Arg | Default | Notes |
|---|---|---|
| card | 1 | ALSA card index; ES8389 is usually card 1, and the Pi's onboard audio is card 0 |
| sample_rate | 44100 | Common values: 16000 (voice), 44100 (CD), 48000 (DAW) |
| channels | 2 | 1 = mono, 2 = stereo |
| auto_tune | True | Apply known-good mixer defaults on init |
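The sample_rate and channels arguments also determine how large a recording will be. The sketch below estimates the PCM payload of a WAV file. It assumes 16-bit samples (sample_width=2), which this page does not state, and wav_payload_bytes is a hypothetical helper, not part of AudioSDK.

```python
def wav_payload_bytes(duration_s: float, sample_rate: int = 44100,
                      channels: int = 2, sample_width: int = 2) -> int:
    """Estimate PCM payload size in bytes, excluding the WAV header.

    sample_width=2 (16-bit) is an assumption; adjust it if the SDK
    records in a different sample format.
    """
    frames = int(duration_s * sample_rate)
    return frames * channels * sample_width


print(wav_payload_bytes(5.0))                                  # 882000
print(wav_payload_bytes(5.0, sample_rate=16000, channels=1))   # 160000
```

Under these assumptions, five seconds of 16 kHz mono voice is less than a fifth the size of the same clip at the 44.1 kHz stereo default.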
Recording and playback
audio.record("clip.wav", duration=5.0) # write file
audio.play("clip.wav") # play file
audio.record_and_play(3.0, "clip.wav") # convenience wrapper — duration first
Note the order on record_and_play: duration is positional, filepath is optional and defaults to /tmp/sdk_test.wav.
All three are synchronous. Use the streaming APIs for non-blocking work.
Streaming
# Inbound: yield PCM chunks for `duration` seconds (None = forever)
for chunk in audio.stream_in(duration=10):
    process(chunk)
# Outbound: feed an iterable of PCM chunks to the speaker
audio.stream_out(my_chunk_generator())
# Passthrough: mic → speaker, live
audio.stream_passthrough(
    duration=30,
    feedback_safe=True,    # apply a gate to suppress acoustic feedback
    gate_threshold=400,    # RMS threshold below which mic output is silenced
)
stream_passthrough is the function most demos use for "speak to the device and hear yourself."
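To make the gate_threshold idea concrete, here is a minimal sketch of an RMS gate over PCM chunks. It is not the SDK's implementation: it assumes chunks are bytes of 16-bit signed samples in native byte order, and rms_int16 and gate are hypothetical helper names.

```python
import array
import math


def rms_int16(chunk: bytes) -> float:
    """RMS level of a chunk of 16-bit signed PCM (assumed format)."""
    samples = array.array("h")
    samples.frombytes(chunk)  # chunk length must be a multiple of 2 bytes
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gate(chunks, threshold: float = 400):
    """Pass chunks at or above the threshold; replace quieter ones with silence."""
    for chunk in chunks:
        if rms_int16(chunk) >= threshold:
            yield chunk
        else:
            yield bytes(len(chunk))  # zero-filled chunk of equal length
```

If stream_in does yield raw 16-bit chunks, something like audio.stream_out(gate(audio.stream_in(duration=30))) would approximate a gated passthrough, though the built-in feedback_safe option is the simpler route.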
Volume and mixer
audio.get_volume() # → int (percent)
audio.set_volume(70) # 0–100
audio.volume_up(step=10) # default step = 10
audio.volume_down(step=10) # default step = 10
audio.list_mixer_controls() # → list of mixer control names
audio.get_mixer("PCM") # → dict describing the control
audio.set_mixer("PCM", 200) # int or str (codec-specific)
For voice-quality capture without manual tuning, use the presets:
audio.apply_capture_defaults() # known-good values for speech
audio.set_high_gain() # boost capture for soft sources
Diagnostics
report = audio.health_check(loopback=True, sample_seconds=1.0)
print(report)
With loopback=True, the check plays a test tone, records it, and checks the recorded RMS, which makes it useful in CI or as a smoke test after deployment. It raises RecordingTooQuietError if the recorded signal is below the threshold, which often means the mic isn't physically connected or the codec isn't initialized.
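The pass/fail logic of a loopback check can be sketched without hardware. The code below generates a test tone, simulates a weak capture by attenuating it, and compares the RMS against a threshold. The tone frequency, amplitude, and the threshold of 400 are illustrative assumptions, not the values health_check uses.

```python
import math


def sine_tone(freq_hz: float = 440.0, seconds: float = 1.0,
              sample_rate: int = 44100, amplitude: int = 8000) -> list:
    """Generate integer samples of a sine wave (all values are illustrative)."""
    n = int(seconds * sample_rate)
    return [int(amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate))
            for i in range(n)]


def rms(samples) -> float:
    """Root-mean-square level of a sequence of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def loud_enough(samples, threshold: float = 400) -> bool:
    """True if the capture's RMS reaches the (assumed) threshold."""
    return rms(samples) >= threshold


tone = sine_tone()
weak_capture = [s // 100 for s in tone]   # simulate a nearly silent recording
print(loud_enough(tone), loud_enough(weak_capture))
```

A capture that fails this kind of comparison is the situation RecordingTooQuietError reports.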
Listing devices
audio.list_devices() # all PCM devices visible to PortAudio
from audio_sdk import list_devices, get_device_index
list_devices() # same, module-level
get_device_index(card=1, device=0) # → PortAudio index for card/device
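On Linux, PortAudio's ALSA backend typically reports hardware devices with names ending in "(hw:CARD,DEVICE)". The sketch below shows one way a card/device pair could be mapped to a PortAudio index from such names. Whether get_device_index works this way is an assumption, and parse_hw, find_index, and the sample device names are hypothetical.

```python
import re

_HW_PATTERN = re.compile(r"\(hw:(\d+),(\d+)\)")


def parse_hw(name: str):
    """Extract (card, device) from a name like 'Foo: - (hw:1,0)', else None."""
    match = _HW_PATTERN.search(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def find_index(devices, card: int, device: int = 0):
    """Return the index of the first (index, name) pair matching card/device."""
    for index, name in devices:
        if parse_hw(name) == (card, device):
            return index
    return None


# Illustrative names only; real output depends on the system.
devices = [(0, "Onboard: - (hw:0,0)"), (1, "ES8389: - (hw:1,0)"), (2, "default")]
print(find_index(devices, card=1, device=0))   # 1
```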
Errors
All exceptions inherit from AudioSDKError:
| Exception | Meaning |
|---|---|
| AudioSDKError | Base class; catch this if you don't care which |
| DeviceNotFoundError | The requested ALSA card/device doesn't exist |
| MixerError | A mixer control name was wrong, or the codec rejected a value |
| RecordingTooQuietError | Loopback health check captured below-threshold audio |
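Because every exception shares one base class, handlers can catch specific errors first and fall back to AudioSDKError. The sketch below defines stand-in classes with the same names so it runs without the SDK. In real code you would import them instead, presumably from audio_sdk, though this page does not say where they are exported. The describe helper is hypothetical.

```python
# Stand-ins mirroring the hierarchy in the table above (not the SDK's classes).
class AudioSDKError(Exception): ...
class DeviceNotFoundError(AudioSDKError): ...
class MixerError(AudioSDKError): ...
class RecordingTooQuietError(AudioSDKError): ...


def describe(action) -> str:
    """Run action() and translate SDK-style errors into a short message."""
    try:
        action()
    except DeviceNotFoundError:
        # Most specific handler first.
        return "check the card index"
    except AudioSDKError as exc:
        # The base class catches every remaining SDK error.
        return f"audio failure: {type(exc).__name__}"
    return "ok"


def bad_mixer():
    raise MixerError("unknown control")


print(describe(bad_mixer))   # audio failure: MixerError
```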
Resource cleanup
AudioSDK is a context manager — prefer with so streams and PortAudio handles always close:
with AudioSDK() as audio:
    audio.record("x.wav", 1)
If you can't use with, call audio.close() explicitly when finished.
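When with is not an option, a try/finally block gives the same guarantee that close() runs even if the work raises. The sketch uses a hypothetical FakeAudio stand-in that exposes only close(), so it runs without hardware. Substitute a real AudioSDK instance in practice.

```python
class FakeAudio:
    """Hypothetical stand-in with only close(); not part of the SDK."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def run_and_close(audio, work):
    """Call work(audio) and always close the handle afterwards."""
    try:
        return work(audio)
    finally:
        audio.close()   # runs on success and on exceptions alike


audio = FakeAudio()
run_and_close(audio, lambda a: "done")
print(audio.closed)   # True
```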