NAME
audio_system - the NetBSD in-kernel audio mixer specification
INTRODUCTION
This document describes all aspects of the in-kernel audio mixer included with
NetBSD 8 and onwards. It reflects the mixer's behavior as of 2018.
VIRTUAL CHANNEL (VCHAN)
This is the most fundamental element of the mixer. The vchan has all of the
properties of the traditional single-open NetBSD audio channel. It consists of
playback and record rings along with audio_info structures.
Upon opening of /dev/audio or /dev/sound, a new vchan and mixerctl structure
are created. In the case of /dev/sound, the audio_info structures are inherited
from the last open of /dev/audio or /dev/sound.
All vchans are up or down sampled into the mix ring (intermediate) format before
being sent to hardware.
This is illustrated in the following diagram:

    VCHAN1---------\
                    \  VCHAN0
    VCHAN2-------------MIX RING ---- HARDWARE
    ...             /
    VCHANn---------/
In the case of sysctl(8) usemixer=0 (see below), there is only one vchan, whose
play and record rings are the hardware play/record rings.
User accessible vchans are numbered starting at one (1). Vchan 0 is used
internally by the mixer for the mix ring and its ring buffers are not user
accessible.
The only limit to the number of open vchans is the speed of the computer and the
number of free file descriptors.
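The combining step in the diagram can be pictured with a minimal sketch. It assumes every vchan block has already been converted to a signed 16 bit mix format and that blocks are combined by saturating addition. Both points are assumptions for illustration, and the names mix_block and clamp16 are hypothetical, not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32 bit accumulator to the signed 16 bit sample range. */
static int16_t
clamp16(int32_t v)
{
    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/*
 * Add one vchan block into the mix ring block, sample by sample.
 * Saturating addition is an assumption of this sketch.
 */
static void
mix_block(int16_t *mix, const int16_t *vchan, size_t nsamples)
{
    assert(mix != NULL && vchan != NULL);
    for (size_t i = 0; i < nsamples; i++)
        mix[i] = clamp16((int32_t)mix[i] + vchan[i]);
}
```

Calling mix_block once per open vchan on a zeroed mix block yields the sum of all vchans, clipped to the 16 bit range.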
BLOCK - SIZE / LATENCY
A block is the basic unit of audio data. Audio applications will not commence
playback until three (3) blocks have been written. This, along with the size of
the audio data block, is the source of latency in the mixer.
For normal uses of audio read/write there will be three blocks of audio data
before playback commences: one in the vchan, one in the mix ring and one in the
hardware ring.
The size of the audio data block depends on the audio format configured by the
application, the latency sysctl(8) and the underlying audio hardware.
Some audio hardware devices only support a static block size; as such, the
overall latency of the mixer for these devices cannot be changed. Other
devices, such as those supported by hdaudio(4), allow the hardware block size to
be changed, allowing the latency of the mixer to range from 4 milliseconds (ms)
to 128 ms with the mixer intermediate format being 16 bit, stereo, 48 kHz.
With regard to mmapped audio, blocks are played back immediately, so the latency
presented to applications is one third of the latency sysctl(8) value.
Latency can be calculated by the following formula:

                    blocksize (bytes) * num blocks * 1000
    Latency (ms) = ----------------------------------------
                    freq (Hz) * bytes per sample * channels
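As a worked example, the formula can be evaluated in a few lines of C. The function name latency_ms is hypothetical, and the integer division truncates any fractional millisecond.

```c
#include <assert.h>
#include <stdint.h>

/* Latency in ms for nblocks blocks of blocksize bytes each. */
static uint32_t
latency_ms(uint32_t blocksize, uint32_t nblocks, uint32_t freq,
    uint32_t bytes_per_sample, uint32_t channels)
{
    assert(freq > 0 && bytes_per_sample > 0 && channels > 0);
    uint64_t num = (uint64_t)blocksize * nblocks * 1000;
    uint64_t den = (uint64_t)freq * bytes_per_sample * channels;
    return (uint32_t)(num / den);
}
```

With a 16 bit, stereo, 48 kHz format, three blocks of 768 bytes give latency_ms(768, 3, 48000, 2, 2) == 12, and three blocks of 9600 bytes give 150.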
Latency in the mixer and latency presented to audio applications are consistent:
the latency is the same regardless of the audio format requested by the audio
application.
The default latency configured at boot time is 150ms and is subject to the above
constraints.
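Inverting the latency formula gives the block size implied by a latency setting. The sketch below assumes the latency value covers all of the blocks and rounds the result down to a whole frame. The name blocksize_for_latency is hypothetical, and real hardware may impose further constraints on the block size, as noted above.

```c
#include <assert.h>
#include <stdint.h>

/* Block size in bytes so that nblocks blocks span latency_ms. */
static uint32_t
blocksize_for_latency(uint32_t latency_ms, uint32_t nblocks, uint32_t freq,
    uint32_t bytes_per_sample, uint32_t channels)
{
    uint32_t frame = bytes_per_sample * channels;

    assert(nblocks > 0 && frame > 0);
    uint64_t total = (uint64_t)latency_ms * freq * frame / 1000;
    uint32_t blk = (uint32_t)(total / nblocks);
    return blk - (blk % frame);    /* keep whole frames only */
}
```

For the default of 150 ms with three blocks at 16 bit, stereo, 48 kHz, this gives 9600 bytes per block.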
ADDED IOCTLS
Two new ioctls have been added to accommodate mixing of multiple vchans:
AUDIO_SETCHAN
    Allows setting the target vchan to operate on for subsequent ioctl(2)
    calls.
AUDIO_GETCHAN
    Returns the current vchan number.
These ioctls were necessary because some audio applications open both an
audio(4) device and an audioctl(4) device in order to check buffer usage,
samples played, etc.
As opening an audioctl(4) device would represent vchan 0 (the mix ring), these
ioctls allow setting the target vchan and audio_info structure to that of an
existing vchan.
MIXERCTL INTERFACE / SOFTWARE VOLUME
Mixerctl structures are allocated when a new vchan is created. The mixer control
structure allows for setting the software volume for playback (vchan.dacN) or
recording (vchan.adcN). These are 8 bit values, and the value is applied during
mixing into the mix ring.
The software volume is applied to all channels (1, 2, 4 etc.) in the vchan and
at present (2018-05-04) there are no balance controls for user accessible
vchans.
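A minimal sketch of applying one 8 bit software volume to every channel of a vchan follows. The linear scaling by vol / 255 is an assumption made for illustration; the scaling actually used by the kernel is not described here. The name apply_volume is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scale interleaved signed 16 bit samples by an 8 bit volume.
 * The same gain is applied to every channel, so there is no balance.
 */
static void
apply_volume(int16_t *samples, size_t nframes, unsigned channels, uint8_t vol)
{
    assert(samples != NULL);
    size_t n = nframes * channels;

    for (size_t i = 0; i < n; i++)
        samples[i] = (int16_t)((int32_t)samples[i] * vol / 255);
}
```

Under this assumed scaling, a volume of 255 leaves the samples unchanged and a volume of 0 silences the vchan.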
The first vchan corresponds to the vchan.dac1/adc1 mixer controls.
Each vchan mixer control affects only its own vchan's volume; writing to the
outputs.master (or equivalent) control is required to change the volume of the
hardware.
Mixer controls are only present whilst the vchan is in use, and numbering starts
at one (1). Mixer control numbers, e.g. dac1/adc1, correspond to their vchan
number.
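The naming scheme can be expressed as a small helper that builds the control name for a given vchan number. The helper vchan_control_name is hypothetical and only formats the string; it does not talk to the mixer device.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format "vchan.dacN" (playback) or "vchan.adcN" (recording) into buf.
 * Returns the length snprintf(3) reports for the full name.
 */
static int
vchan_control_name(char *buf, size_t len, int playback, unsigned vchan)
{
    assert(buf != NULL && vchan >= 1);    /* user vchans start at one */
    return snprintf(buf, len, "vchan.%s%u", playback ? "dac" : "adc", vchan);
}
```

For vchan 1 this produces vchan.dac1 for playback and vchan.adc1 for recording.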
AUDIOCTL / AUDIO_INFO INTERFACE
Audioctl allows access to the audio_info structure of a given device. With the
introduction of the audio mixer, a -p flag was added to allow access to a given
vchan's audio_info structure.
The values for -p are numbered starting at zero (0). Not specifying -p is the
same as specifying -p 0 and will result in working with vchan 0 (the mix ring).
This will display the audio parameters of the mix ring and allow setting the
hardware gain and balance.
This is for compatibility with existing applications and shell scripts that are
unaware of the -p switch.
The parameters for playback and recording (gain, sample rate, channels,
encoding, etc.) only affect the particular vchan being operated on, except for
-p 0 (the mix ring).
ADDED SYSCTLS
With the introduction of the audio mixer the following
sysctl(7)s have been added:
hw.driverN.frequency
hw.driverN.precision
hw.driverN.channels
    Intermediate mixing format (see below).
hw.driverN.latency
    Expressed in milliseconds (see above).
hw.driverN.multiuser
    Off/On (0/1), defaults to off. This sysctl(7) determines if multiple users
    are allowed to access the sound hardware. The root user is always allowed
    access (e.g., for wsbell). The first user to open the audio device has full
    control of the audio device if this sysctl is set to off.
    There is currently an outstanding PR, PR/52627, about this affecting a
    privileged process. Ideally, if root intervenes with the audio device, it
    should do so unaffected.
    If this control is set to on, then all users' audio data are mixed and all
    users have access to the audio hardware.
hw.driverN.usemixer
    Off/On (0/1), defaults to on. This sysctl(7) enables or disables the audio
    mixer. When set to off, the audio device can support only one vchan. This
    vchan's play and record ring buffers are the hardware ring buffers.
    This option was added to aid older/slower systems where the extra overhead
    of the audio mixer might pose a problem.
The initial concept was to handle incoming audio data similarly to that of a
superheterodyne radio receiver:
RF -> IF -> AF
So the corresponding mixing concept is:
vchan -> mixing format -> hardware
The
sysctl(7)s described above
determine the format for mixing. All vchans are up or down sampled to this
format before mixing takes place.
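To illustrate what up or down sampling to the mixing format involves, here is a minimal mono resampler using linear interpolation. Linear interpolation is a technique chosen for this sketch, not necessarily what the kernel does, and the name resample_linear is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Resample mono signed 16 bit audio from in_rate to out_rate.
 * Returns the number of output samples written (at most out_cap).
 */
static size_t
resample_linear(const int16_t *in, size_t in_frames, unsigned in_rate,
    int16_t *out, size_t out_cap, unsigned out_rate)
{
    size_t n;

    assert(in != NULL && out != NULL && in_rate > 0 && out_rate > 0);
    for (n = 0; n < out_cap; n++) {
        /* Input position of output sample n, as index plus fraction. */
        uint64_t pos = (uint64_t)n * in_rate;
        size_t i = (size_t)(pos / out_rate);
        int64_t frac = (int64_t)(pos % out_rate);

        if (i >= in_frames)
            break;
        int32_t a = in[i];
        int32_t b = (i + 1 < in_frames) ? in[i + 1] : a;
        out[n] = (int16_t)(a + (int64_t)(b - a) * frac / out_rate);
    }
    return n;
}
```

Upsampling from 24 kHz to 48 kHz inserts one interpolated sample between each pair of input samples, while downsampling from 48 kHz to 24 kHz keeps every second sample.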
On most systems this defaults to 16 bit, stereo, 48 kHz. The sysctl(7)s
governing the mixing format may only be changed when there are no vchans in use.
On faster systems the precision (8, 16, 32 bits) may be changed along with the
sample rate and number of channels (mono, stereo, 4 etc.).
On older/slower systems utilizing audio mixing, it may be required to lower the
quality of this format to ease the amount of data processing whilst mixing.
All possible audio formats (mulaw, alaw, slinear, ulinear, 8, 16, and 32 bit
precision) are converted for use by the audio mixer.
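As one example of such a conversion, the following sketch expands mulaw bytes to signed 16 bit linear samples using the standard G.711 expansion. It is an illustration of the format conversion only; the kernel has its own conversion routines, and the name mulaw_decode_block is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand n mulaw bytes into signed 16 bit linear samples (G.711). */
static void
mulaw_decode_block(const uint8_t *in, int16_t *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint8_t u = (uint8_t)~in[i];    /* mulaw bytes are stored inverted */
        int sign = u & 0x80;
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0f;
        int sample = (((mantissa << 3) + 0x84) << exponent) - 0x84;

        out[i] = (int16_t)(sign ? -sample : sample);
    }
}
```

The bytes 0xff and 0x7f both decode to silence, while 0x80 and 0x00 decode to the largest positive and negative values of the mulaw range.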
MEMORY MAPPED PLAYBACK
It is possible to use mmap for audio playback, achieving reduced latency.
However, the audio application's selected format must match the
mixing/intermediate format (see above).
It is possible to obtain the audio_info for vchan 0, which contains the
intermediate/mixing format, to ease configuring applications for mmapped audio.
At present, most applications don't use the mix ring's audio_info structure to
obtain the required playback parameters, and some user intervention is required
to set the audio format for the application.
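The check an application would make before using mmapped playback can be sketched as a simple comparison against the mixing format. The struct mix_format below is a hypothetical stand-in for the relevant audio_info fields (it mirrors the frequency, precision and channels sysctls), and a real application would also need to match the encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a playback format. */
struct mix_format {
    unsigned sample_rate;    /* Hz */
    unsigned precision;      /* bits per sample */
    unsigned channels;
};

/* True if the application's format matches the mixing format. */
static bool
can_mmap_play(const struct mix_format *app, const struct mix_format *mix)
{
    assert(app != NULL && mix != NULL);
    return app->sample_rate == mix->sample_rate &&
        app->precision == mix->precision &&
        app->channels == mix->channels;
}
```

An application configured for 16 bit, stereo, 44100 Hz would fail this check against a 16 bit, stereo, 48 kHz mixing format and would need to be reconfigured.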
HARDWARE DRIVER REQUIREMENTS
Audio mixing requires signed linear support in the host's endianness. Driver
authors should support slinear_le and slinear_be formats.
If the audio hardware is intended to be used with the mixer disabled, mulaw,
1 channel, 8000 Hz must also be supported.
This is easily achievable with the auconv framework/filters. All new drivers
should consider the use of auconv where possible.
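Which of the two signed linear formats is the host's native one can be determined with a simple byte-order probe. This is a generic C sketch, not driver code; the function native_slinear_name is hypothetical and only returns the format name used in this document.

```c
#include <assert.h>
#include <stdint.h>

/* Name of the signed linear format matching the host's byte order. */
static const char *
native_slinear_name(void)
{
    const uint16_t probe = 1;
    const unsigned char *p = (const unsigned char *)&probe;

    /* Exactly one of the two bytes holds the value 1. */
    assert(p[0] == 1 || p[1] == 1);
    return (p[0] == 1) ? "slinear_le" : "slinear_be";
}
```

On a little-endian host this returns slinear_le; on a big-endian host it returns slinear_be.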
SEE ALSO
audioctl(1),
mixerctl(1),
audio(4),
audio(9)
AUTHORS
Nathanial Sloss
SPECIAL THANKS
Great appreciation goes to Onno van der Linden, isaki@, maya@, jmcneill@,
pgoyette@, mrg@, riastradh@ and christos@. Without their input, this code would
not be what it is currently.