Audio SDK

AudioSDK is the Python wrapper around the Dev Edition's audio hardware (ES8389 codec on the Allwinner-based Intern audio board, exposed as ALSA card index 1). It handles recording, playback, live streaming, mixer control, and a built-in health check.

The module is in audio/audio_sdk.py in the intern-developer-sdk repo.

Install

On your workstation (cross-platform)

pip install pyaudio

Platform notes:

  • macOS: run brew install portaudio first if pip install pyaudio fails
  • Linux (Debian/Ubuntu): sudo apt install -y libportaudio2 portaudio19-dev
  • Windows: pip install pyaudio ships wheels, so no extra dependencies are needed

On the device

The recommended path is the bundled onboarding script:

python onboarding/onboard_device.py <host> <user> --password <pw>

The script is idempotent: it copies the SDK over SFTP, runs apt install python3-pyaudio, adds the system user to the audio group, registers any skills, and runs the diagnostic.

Manual alternative:

ssh system@<host> "mkdir -p ~/sdk/audio"
scp audio/audio_sdk.py system@<host>:~/sdk/audio/
ssh system@<host> "sudo apt-get install -y python3-pyaudio && sudo usermod -a -G audio system"

Quick example

from audio_sdk import AudioSDK

with AudioSDK(card=1, sample_rate=44100, channels=2, auto_tune=True) as audio:
    audio.record("note.wav", duration=5.0)
    audio.play("note.wav")

auto_tune=True applies the codec's tested defaults (PGA gain, capture levels) on construction so first-time recordings aren't silent.

Constructor

AudioSDK(
    card: int = 1,
    sample_rate: int = 44100,
    channels: int = 2,
    auto_tune: bool = True,
)

Arg          Default  Notes
card         1        ALSA card index; ES8389 is usually card 1; the Pi's onboard audio is card 0
sample_rate  44100    Common values: 16000 (voice), 44100 (CD), 48000 (DAW)
channels     2        1 = mono, 2 = stereo
auto_tune    True     Apply known-good mixer defaults on init
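
The sample_rate and channels arguments together determine how much data a recording produces. As a rough worked example, the sketch below computes the raw PCM payload size for a given duration. It assumes 16-bit (2-byte) samples, which this page does not state, and pcm_bytes is a hypothetical helper, not part of the SDK.

```python
def pcm_bytes(duration_s: float, sample_rate: int = 44100,
              channels: int = 2, sample_width: int = 2) -> int:
    """Raw PCM payload size in bytes.

    sample_width=2 assumes 16-bit samples (an assumption, not confirmed
    by the SDK documentation).
    """
    frames = int(round(duration_s * sample_rate))
    return frames * channels * sample_width


# 5 seconds at the constructor defaults (44100 Hz, stereo)
cd_stereo = pcm_bytes(5.0)
# 5 seconds at a typical voice configuration (16000 Hz, mono)
voice_mono = pcm_bytes(5.0, sample_rate=16000, channels=1)

print(cd_stereo, voice_mono)  # prints "882000 160000"
```

Under that assumption, dropping to 16000 Hz mono cuts the data volume to under a fifth of the default, which matters when streaming over a slow link.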

Recording and playback

audio.record("clip.wav", duration=5.0) # write file
audio.play("clip.wav") # play file
audio.record_and_play(3.0, "clip.wav") # convenience wrapper — duration first

Note the argument order of record_and_play: duration comes first and is positional; filepath is optional and defaults to /tmp/sdk_test.wav.

All three are synchronous. Use the streaming APIs for non-blocking work.

Streaming

# Inbound: yield PCM chunks for `duration` seconds (None = forever)
for chunk in audio.stream_in(duration=10):
    process(chunk)

# Outbound: feed an iterable of PCM chunks to the speaker
audio.stream_out(my_chunk_generator())

# Passthrough: mic → speaker, live
audio.stream_passthrough(
    duration=30,
    feedback_safe=True,   # apply a gate to suppress acoustic feedback
    gate_threshold=400,   # RMS threshold below which mic output is silenced
)

stream_passthrough is the function most demos use for "speak to the device and hear yourself."
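
To make the gate_threshold idea concrete, here is a minimal sketch of an RMS noise gate applied to a stream of PCM chunks. It is not the SDK's implementation. It assumes chunks are bytes of 16-bit signed little-endian PCM, which this page does not specify, and rms, gate, and tone_chunks are hypothetical helper names. The tone generator only exists so the example runs without hardware.

```python
import math
import struct


def rms(chunk: bytes) -> float:
    """Root-mean-square level of 16-bit signed little-endian PCM (assumed format)."""
    if not chunk:
        return 0.0
    samples = [s for (s,) in struct.iter_unpack("<h", chunk)]
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gate(chunks, threshold=400):
    """Pass chunks at or above the RMS threshold; replace quieter ones with silence."""
    for chunk in chunks:
        yield chunk if rms(chunk) >= threshold else bytes(len(chunk))


def tone_chunks(amplitude, freq=440.0, sample_rate=44100,
                frames_per_chunk=1024, n_chunks=3):
    """Yield a mono sine tone as PCM chunks (stand-in for microphone input)."""
    for c in range(n_chunks):
        start = c * frames_per_chunk
        yield b"".join(
            struct.pack("<h", int(amplitude * math.sin(
                2 * math.pi * freq * (start + i) / sample_rate)))
            for i in range(frames_per_chunk)
        )


loud = list(tone_chunks(8000))   # RMS well above 400, passes unchanged
quiet = list(tone_chunks(100))   # RMS well below 400, gets silenced

passed = list(gate(loud))
silenced = list(gate(quiet))
```

If the SDK's chunks do match this format, a generator like gate(audio.stream_in(duration=10)) could in principle be handed to audio.stream_out, but verify the chunk format on your device first.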

Volume and mixer

audio.get_volume() # → int (percent)
audio.set_volume(70) # 0–100
audio.volume_up(step=10) # default step = 10
audio.volume_down(step=10) # default step = 10

audio.list_mixer_controls() # → list of mixer control names
audio.get_mixer("PCM") # → dict describing the control
audio.set_mixer("PCM", 200) # int or str (codec-specific)

For voice-quality capture without manual tuning, use the presets:

audio.apply_capture_defaults() # known-good values for speech
audio.set_high_gain() # boost capture for soft sources

Diagnostics

report = audio.health_check(loopback=True, sample_seconds=1.0)
print(report)

With loopback=True, the check plays a test tone, records it, and verifies the recorded RMS level. This is useful in CI or as a smoke test after deployment. It raises RecordingTooQuietError if the recorded signal is below the threshold, which often means the mic isn't physically connected or the codec isn't initialized.
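
For intuition about what an RMS level check involves, the sketch below writes a synthetic tone to a WAV file, reads it back, and fails if the level is too low. It is an illustration under stated assumptions, not the SDK's health_check: TooQuiet is a local stand-in for RecordingTooQuietError, the 400.0 threshold is an arbitrary illustrative value, and write_tone, wav_rms, and check_level are hypothetical helpers. Only the Python standard library is used, so it runs off-device.

```python
import math
import os
import struct
import tempfile
import wave


class TooQuiet(Exception):
    """Local stand-in for RecordingTooQuietError so the sketch runs without the SDK."""


def write_tone(path, amplitude, seconds=1.0, freq=1000.0, rate=16000):
    """Write a mono 16-bit sine tone; stands in for the recorded loopback signal."""
    samples = (
        int(amplitude * math.sin(2 * math.pi * freq * i / rate))
        for i in range(int(seconds * rate))
    )
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))


def wav_rms(path):
    """RMS level of a 16-bit WAV file, in raw sample units."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit WAV files")
        data = wav.readframes(wav.getnframes())
    if not data:
        return 0.0
    samples = [s for (s,) in struct.iter_unpack("<h", data)]
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def check_level(path, min_rms=400.0):
    """Return the RMS level, or raise TooQuiet if it falls below min_rms."""
    level = wav_rms(path)
    if level < min_rms:
        raise TooQuiet(f"RMS {level:.1f} is below {min_rms}")
    return level


with tempfile.TemporaryDirectory() as tmp:
    loud = os.path.join(tmp, "loud.wav")
    quiet = os.path.join(tmp, "quiet.wav")
    write_tone(loud, amplitude=10000)   # RMS of a sine is about amplitude / sqrt(2)
    write_tone(quiet, amplitude=50)
    loud_level = check_level(loud)
    try:
        check_level(quiet)
        quiet_failed = False
    except TooQuiet:
        quiet_failed = True
```

The same wav_rms helper can be pointed at a file produced by audio.record to inspect a real capture when a health check fails.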

Listing devices

audio.list_devices() # all PCM devices visible to PortAudio
from audio_sdk import list_devices, get_device_index
list_devices() # same, module-level
get_device_index(card=1, device=0) # → PortAudio index for card/device
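
As a rough illustration of how an ALSA card/device pair can be mapped to a PortAudio index, the sketch below searches device names for an "(hw:card,device)" suffix, which PortAudio's ALSA backend typically includes. This is not necessarily how get_device_index works internally. find_device_index is a hypothetical helper and the device list is made-up sample data so the example runs without audio hardware.

```python
import re


def find_device_index(devices, card, device=0):
    """Return the 'index' of the first entry whose name contains (hw:card,device)."""
    pattern = re.compile(rf"\(hw:{int(card)},{int(device)}\)")
    for info in devices:
        if pattern.search(info["name"]):
            return info["index"]
    return None


# Made-up sample data shaped like PyAudio device-info dicts.
# On a real system the list could be built with:
#   pa = pyaudio.PyAudio()
#   devices = [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
sample_devices = [
    {"index": 0, "name": "onboard: - (hw:0,0)"},
    {"index": 1, "name": "ES8389: - (hw:1,0)"},
    {"index": 2, "name": "default"},
]

es8389_index = find_device_index(sample_devices, card=1)
missing_index = find_device_index(sample_devices, card=3)
```

Here es8389_index is 1 and missing_index is None. A None result corresponds to the situation in which the SDK would be expected to report that the device was not found.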

Errors

All SDK exceptions derive from AudioSDKError:

Exception               Meaning
AudioSDKError           Base class; catch this if you don't care which one was raised
DeviceNotFoundError     The requested ALSA card/device doesn't exist
MixerError              A mixer control name was wrong, or the codec rejected a value
RecordingTooQuietError  Loopback health check captured below-threshold audio
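
Because every exception shares one base class, handlers can be ordered from most specific to most general. The sketch below shows that pattern. The classes are redefined locally as stand-ins so the example runs without the SDK; on a device you would import them instead (importing them from audio_sdk is an assumption, since this page does not show that import). The describe function and its messages are hypothetical.

```python
# Local stand-ins mirroring the hierarchy in the table above.
class AudioSDKError(Exception):
    pass


class DeviceNotFoundError(AudioSDKError):
    pass


class MixerError(AudioSDKError):
    pass


class RecordingTooQuietError(AudioSDKError):
    pass


def describe(exc: Exception) -> str:
    """Map an SDK error to a suggested next step, most specific handler first."""
    try:
        raise exc
    except DeviceNotFoundError:
        return "check the card index"
    except MixerError:
        return "check the mixer control name and value"
    except RecordingTooQuietError:
        return "check the microphone connection"
    except AudioSDKError:
        # Catch-all for any other SDK error; must come after the subclasses.
        return "unexpected SDK error"


hints = [
    describe(DeviceNotFoundError("card 7")),
    describe(MixerError("PCM")),
    describe(AudioSDKError("other")),
]
```

Placing the AudioSDKError handler last matters: if it came first, it would shadow the more specific handlers.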

Resource cleanup

AudioSDK is a context manager. Prefer a with block so streams and PortAudio handles always close:

with AudioSDK() as audio:
    audio.record("x.wav", 1)

If you can't use with, call audio.close() explicitly when finished.
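
When a with block is not an option, a try/finally pair gives the same guarantee that close() runs even if an operation fails. The sketch below demonstrates this with FakeAudio, a hypothetical stand-in that simulates a failing record call. It is not the real AudioSDK, and the only behavior borrowed from it is the existence of a close() method.

```python
class FakeAudio:
    """Hypothetical stand-in for AudioSDK; only models record() failing and close()."""

    def __init__(self):
        self.closed = False

    def record(self, path, duration):
        raise RuntimeError("simulated recording failure")

    def close(self):
        self.closed = True


def record_once(audio, path="x.wav", duration=1):
    """Run one recording and always release the handle afterwards."""
    try:
        audio.record(path, duration)
    finally:
        audio.close()  # runs whether record() returned or raised


audio = FakeAudio()
try:
    record_once(audio)
    error_seen = False
except RuntimeError:
    error_seen = True

print(audio.closed, error_seen)  # prints "True True"
```

The exception still propagates to the caller, so the failure is not hidden; the finally clause only ensures the handle is released first.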