Ivan Krylov kry|ov@r00t @end|ng |rom gm@||@com
Sat Feb 2 20:11:57 CET 2019

```
Hello Nick Wray,

Let me offer a simplified explanation of what's going on. Sorry if it's
unnecessary.

Sound is waves of pressure in the air. Devices like microphones can
measure the changing pressure by converting it into voltage. Voltage
can then be sampled by an analog-to-digital converter inside a sound
card and stored as numbers in computer memory.

On Fri, 1 Feb 2019 10:20:57 +0000 (GMT)
Nick Wray via R-help <r-help using r-project.org> wrote:

> What I am not sure about, and I can't find any clear explanation, is
> what these elements actually stand for?

Digital sound works by measuring "pressure" a few tens of thousands of
times per second and then recreating the corresponding signal
elsewhere. According to the sampling theorem, sound sampled N times per
second can be reproduced without loss as long as it contains no
frequencies at or above N/2 Hz.

To reiterate, these numbers are just audio samples. Feed them to the
sound card at the original sample rate, and you hear the same sound.

This part is explained well in two 30-minute video lectures here:
https://xiph.org/video/vid1.shtml https://xiph.org/video/vid2.shtml
(I wouldn't normally recommend video lectures, but these are really
good.)

> I would have thought that one needed as a minimum both volume and
> frequency ie a two dimensional vector but as far as I can tell there
> is only one single vector.

You are describing a spectrogram: a surface showing the "volume" of
each individual frequency in the sound recording, over time. How to get
it? If you run a Fourier transform over the original vector, you will
get only one vector showing the magnitudes and phases of all frequencies
through the whole length of the clip.

To get a two-dimensional spectrogram, you should take overlapping parts
of the original vector of samples, multiply them by a special window
function, then take a Fourier transform of each windowed part and
combine the resulting vectors into a matrix, one vector per time step.
Computing a spectrogram involves
choosing a lot of parameters: size of the overlapping window, step
between overlapping windows, the window function itself and its own
parameters.

Problems like these should be described in books about digital signal
processing.

Jeff Newmiller sent more useful links while I was typing this, and I
guess I should stop posting off-topic.

--
Best regards,
Ivan

```
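
The spectrogram recipe described in the message (overlapping parts, a window function, a Fourier transform of each part, results stacked into a matrix) can be sketched roughly as follows. This is only an illustration in plain Python, not code from the thread. The function names, the Hann window, the frame size of 64 samples, the step of 32 samples, and the synthetic 1000 Hz test tone are all assumptions chosen for the example. The naive O(n^2) transform stands in for a proper FFT routine, which is what one would normally use.

```python
import cmath
import math


def hann(n):
    """Hann window of length n (one common choice of window function)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..n/2 of a real-valued frame (naive O(n^2))."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_size=64, step=32):
    """Return a list of rows; each row holds the bin magnitudes of one frame."""
    window = hann(frame_size)
    rows = []
    # Slide over the samples; frames overlap whenever step < frame_size.
    for start in range(0, len(samples) - frame_size + 1, step):
        part = samples[start:start + frame_size]
        windowed = [s * w for s, w in zip(part, window)]
        rows.append(dft_magnitudes(windowed))
    return rows


# Hypothetical input: a pure 1000 Hz tone sampled 8000 times per second.
rate = 8000
frame_size = 64
samples = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(512)]

spec = spectrogram(samples, frame_size=frame_size, step=32)

# Bin k of a frame corresponds to the frequency k * rate / frame_size.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * rate / frame_size
print(len(spec), "frames,", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` is one time step and each column is one frequency bin, so the matrix is the two-dimensional "volume per frequency over time" picture asked about in the quoted question. A larger frame size gives finer frequency resolution at the cost of coarser time resolution, which is one of the parameter trade-offs mentioned in the message.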