Audio SDK

AudioSDK is the Python wrapper around the Dev Edition's audio hardware (ES8389 codec on the Allwinner-based Intern audio board, exposed as ALSA card index 1). It handles recording, playback, live streaming, mixer control, and a built-in health check.

The module is in audio/audio_sdk.py in the intern-developer-sdk repo.

Install

On your workstation (cross-platform)

pip install pyaudio

Platform notes:

macOS — brew install portaudio first if pip install pyaudio fails
Linux (Debian/Ubuntu) — sudo apt install -y libportaudio2 portaudio19-dev
Windows — pip install pyaudio ships wheels, no extra deps

On the device

The recommended path is the bundled onboarding script:

python onboarding/onboard_device.py <host> <user> --password <pw>

This is idempotent, so it is safe to re-run: it copies the SDK over SFTP, runs apt install python3-pyaudio, adds the system user to the audio group, registers any skills, and runs the diagnostic.

Manual alternative:

ssh system@<host> "mkdir -p ~/sdk/audio"
scp audio/audio_sdk.py system@<host>:~/sdk/audio/
ssh system@<host> "sudo apt-get install -y python3-pyaudio && sudo usermod -a -G audio system"

Quick example

from audio_sdk import AudioSDK

with AudioSDK(card=1, sample_rate=44100, channels=2, auto_tune=True) as audio:
    audio.record("note.wav", duration=5.0)
    audio.play("note.wav")

auto_tune=True applies the codec's tested defaults (PGA gain, capture levels) on construction so first-time recordings aren't silent.

Constructor

AudioSDK(
    card: int = 1,
    sample_rate: int = 44100,
    channels: int = 2,
    auto_tune: bool = True,
)

Arg	Default	Notes
`card`	`1`	ALSA card index — ES8389 is usually card 1; the Pi's onboard audio is card 0
`sample_rate`	`44100`	Common values: 16000 (voice), 44100 (CD), 48000 (DAW)
`channels`	`2`	`1` = mono, `2` = stereo
`auto_tune`	`True`	Apply known-good mixer defaults on init
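
To make the sample_rate and channels trade-off concrete, here is a small sizing sketch. It assumes 16-bit (2-byte) samples, which this page does not state. The helper name pcm_bytes is made up for illustration and is not part of the SDK.

```python
def pcm_bytes(sample_rate, channels, seconds, sample_width=2):
    """Size in bytes of raw PCM audio; sample_width=2 assumes 16-bit samples."""
    return int(sample_rate * channels * sample_width * seconds)


voice_clip = pcm_bytes(16000, 1, 5.0)   # 5 s of mono voice-rate audio
cd_clip = pcm_bytes(44100, 2, 5.0)      # 5 s of stereo CD-rate audio
print(voice_clip, cd_clip)              # 160000 882000
```

Under that assumption, a stereo 44100 Hz recording is about 5.5 times larger than a mono 16000 Hz one of the same length.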

Recording and playback

audio.record("clip.wav", duration=5.0)             # write file
audio.play("clip.wav")                              # play file
audio.record_and_play(3.0, "clip.wav")              # convenience wrapper — duration first

Note the order on record_and_play: duration is positional, filepath is optional and defaults to /tmp/sdk_test.wav.

All three are synchronous. Use the streaming APIs for non-blocking work.
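
Because record returns only once the file is written, the result can be inspected straight away. The sketch below uses the standard-library wave module to read back channel count, sample rate, and duration. It assumes the SDK writes ordinary PCM WAV files, and it creates a stand-in file so it runs without hardware. describe_wav and demo_path are illustrative names.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return (channels, sample_rate, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), rate, frames / rate


# Stand-in for a file produced by audio.record(): one second of stereo silence.
demo_path = os.path.join(tempfile.gettempdir(), "sdk_doc_demo.wav")
with wave.open(demo_path, "wb") as wav:
    wav.setnchannels(2)
    wav.setsampwidth(2)                       # 16-bit samples (assumed)
    wav.setframerate(44100)
    wav.writeframes(b"\x00\x00" * 2 * 44100)  # 44100 frames x 2 channels

channels, rate, seconds = describe_wav(demo_path)
print(channels, rate, round(seconds, 2))
```

On the device, the same helper could be pointed at the path passed to audio.record.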

Streaming

# Inbound: yield PCM chunks for `duration` seconds (None = forever)
for chunk in audio.stream_in(duration=10):
    process(chunk)

# Outbound: feed an iterable of PCM chunks to the speaker
audio.stream_out(my_chunk_generator())

# Passthrough: mic → speaker, live
audio.stream_passthrough(
    duration=30,
    feedback_safe=True,    # apply a gate to suppress acoustic feedback
    gate_threshold=400,    # RMS threshold below which mic output is silenced
)

stream_passthrough is the function most demos use for "speak to the device and hear yourself."
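
To illustrate what an RMS gate like gate_threshold does, here is a minimal sketch. It is not the SDK's implementation. It assumes chunks are bytes of 16-bit signed PCM in native byte order, and the names rms and gate are hypothetical.

```python
import array
import math


def rms(chunk: bytes) -> float:
    """Root-mean-square level of a chunk of 16-bit signed PCM samples."""
    samples = array.array("h")
    samples.frombytes(chunk)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gate(chunks, threshold=400):
    """Yield each chunk unchanged, or equal-length silence if it is too quiet."""
    for chunk in chunks:
        yield chunk if rms(chunk) >= threshold else bytes(len(chunk))


quiet = array.array("h", [50, -50] * 4).tobytes()      # RMS 50, below threshold
loud = array.array("h", [5000, -5000] * 4).tobytes()   # RMS 5000, above threshold
gated = list(gate([quiet, loud]))
```

Here the quiet chunk is replaced by zeros and the loud chunk passes through untouched. If the chunk format matches the assumption, a generator like this could sit between stream_in and stream_out.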

Volume and mixer

audio.get_volume()              # → int (percent)
audio.set_volume(70)            # 0–100
audio.volume_up(step=10)        # default step = 10
audio.volume_down(step=10)      # default step = 10

audio.list_mixer_controls()     # → list of mixer control names
audio.get_mixer("PCM")          # → dict describing the control
audio.set_mixer("PCM", 200)     # int or str (codec-specific)

For good speech capture without manual tuning, use the presets:

audio.apply_capture_defaults()  # known-good values for speech
audio.set_high_gain()           # boost capture for soft sources

Diagnostics

report = audio.health_check(loopback=True, sample_seconds=1.0)
print(report)

With loopback=True, the check plays a test tone, records it, and measures the RMS of the recording. This is useful in CI or as a smoke test after deployment. It raises RecordingTooQuietError if the recorded signal is below the threshold, which often means the mic isn't physically connected or the codec isn't initialized.
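
The level test behind a loopback check can be sketched offline. The example below generates a sine tone, measures its RMS, and raises when the level is too low. The class TooQuiet is a local stand-in for RecordingTooQuietError, and the threshold of 500 and the helper names are illustrative assumptions. The SDK's actual tone, threshold, and measurement are not documented here.

```python
import math


class TooQuiet(Exception):
    """Local stand-in for the SDK's RecordingTooQuietError."""


def sine_tone(freq_hz, seconds, sample_rate=44100, amplitude=8000):
    """Return a list of integer samples for a sine tone."""
    count = int(seconds * sample_rate)
    return [
        int(amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        for n in range(count)
    ]


def check_level(samples, min_rms):
    """Return the RMS of samples, raising TooQuiet if it is below min_rms."""
    level = math.sqrt(sum(s * s for s in samples) / len(samples))
    if level < min_rms:
        raise TooQuiet(f"RMS {level:.0f} is below {min_rms}")
    return level


tone = sine_tone(440, 1.0)
level = check_level(tone, min_rms=500)      # passes: about amplitude / sqrt(2)

attenuated = [s // 100 for s in tone]       # simulates a barely audible capture
try:
    check_level(attenuated, min_rms=500)
    too_quiet = False
except TooQuiet:
    too_quiet = True
```

A full-scale sine has an RMS of its amplitude divided by the square root of 2, so the clean tone passes while the attenuated copy trips the check.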

Listing devices

from audio_sdk import list_devices, get_device_index

audio.list_devices()                              # all PCM devices visible to PortAudio
list_devices()                                    # same, module-level
get_device_index(card=1, device=0)                # → PortAudio index for card/device
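
For intuition about the card/device lookup, the sketch below matches the "(hw:CARD,DEVICE)" tag that PortAudio's ALSA backend typically appends to device names. This is an assumption about how such a lookup could work, not a description of get_device_index. The name find_index and the sample device names are made up.

```python
import re

# PortAudio's ALSA backend usually labels hardware devices "... (hw:CARD,DEVICE)".
HW_PATTERN = re.compile(r"\(hw:(\d+),(\d+)\)")


def find_index(device_names, card, device=0):
    """Return the position of the first name tagged (hw:card,device), else None."""
    for index, name in enumerate(device_names):
        match = HW_PATTERN.search(name)
        if match and (int(match.group(1)), int(match.group(2))) == (card, device):
            return index
    return None


# Made-up names; on a real system they would come from PyAudio's
# get_device_info_by_index(i)["name"] for each device index i.
names = [
    "onboard: - (hw:0,0)",
    "es8389: - (hw:1,0)",
    "default",
]
found = find_index(names, card=1)
missing = find_index(names, card=2)
```

With these sample names, found is 1 and missing is None. The SDK reports that second case as DeviceNotFoundError.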

Errors

All SDK exceptions derive from AudioSDKError:

Exception	Meaning
`AudioSDKError`	Base class — catch this if you don't care which
`DeviceNotFoundError`	The requested ALSA card/device doesn't exist
`MixerError`	A mixer control name was wrong, or the codec rejected a value
`RecordingTooQuietError`	Loopback health check captured below-threshold audio
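
Because every error shares one base class, handlers can go from most specific to least specific. The sketch below defines local stand-ins that mirror the hierarchy in the table so it runs without the SDK. Real code would import the classes instead, presumably from audio_sdk, though this page does not show that import. The describe helper and its hint strings are illustrative.

```python
class AudioSDKError(Exception):
    """Stand-in for the SDK base class."""


class DeviceNotFoundError(AudioSDKError):
    """Stand-in: requested card/device is missing."""


class MixerError(AudioSDKError):
    """Stand-in: bad control name or rejected value."""


class RecordingTooQuietError(AudioSDKError):
    """Stand-in: loopback capture below threshold."""


def describe(exc):
    """Map an exception to a short operator hint, most specific handler first."""
    try:
        raise exc
    except DeviceNotFoundError:
        return "check the card index"
    except MixerError:
        return "check the control name or value"
    except RecordingTooQuietError:
        return "check the microphone connection"
    except AudioSDKError:
        return "unspecified audio failure"   # base class catches everything else


hints = [describe(MixerError("PCM")), describe(AudioSDKError("other"))]
```

Putting the AudioSDKError handler last matters. If it came first, it would catch the subclasses too and the specific branches would never run.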

Resource cleanup

AudioSDK is a context manager — prefer with so streams and PortAudio handles always close:

with AudioSDK() as audio:
    audio.record("x.wav", 1)

If you can't use with, call audio.close() explicitly when finished.
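
When with is not an option, try/finally gives the same guarantee. The sketch below uses a stand-in class, FakeAudio, whose record call fails on purpose. It is not the SDK, and only the close() contract mirrors the text above.

```python
class FakeAudio:
    """Stand-in object exposing the close() contract described above."""

    def __init__(self):
        self.closed = False

    def record(self, path, duration):
        raise RuntimeError("simulated recording failure")

    def close(self):
        self.closed = True


audio = FakeAudio()
try:
    audio.record("x.wav", 1)
except RuntimeError as exc:
    failure = str(exc)      # handle or log the error
finally:
    audio.close()           # runs whether or not record() raised
```

After this runs, audio.closed is True even though record failed. This is the same guarantee the with form provides.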
