A Sinuous Violin

The aim of this short notebook is to show how to use NumPy and SciPy to play with spectral audio signal analysis (and synthesis).

Lots of prior knowledge is assumed, and no signal theory (nor its mathematical details) will be discussed here. The reader interested in a more formal discussion is invited to read, for example, "Spectral Audio Signal Processing" by Julius O. Smith III, which is a precise and deep, yet manageable, introduction to the topic.

For the reader less inclined to formal details (heaven forbid that the others read the following sentence), suffice it to say that any (periodic) signal can be obtained as a superposition of sine waves (with suitable frequencies and amplitudes).

The roadmap of what we'll be doing is:

  • take a real signal (a violin and a flute sample),
  • perform a spectral analysis,
  • determine some of the frequencies having the strongest amplitudes in that spectrum,
  • "reconstruct" a signal using just a few sine waves,
  • play the original and the reconstructed signals.

As you'll see, beyond what theory guarantees, this actually works in practice: very few waves are enough to approximate the timbre of a musical instrument.
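To make the roadmap concrete before touching real recordings, here is a minimal sketch of the same steps on a made-up signal. Everything in it is an assumption chosen for illustration: the three frequencies and amplitudes are invented, the analysis uses NumPy's `np.fft.rfft`, and phases are ignored in the reconstruction (which is harmless here only because the toy signal is built from sines with zero phase).

```python
import numpy as np

RATE = 8000                 # samples per second (assumed for this sketch)
t = np.arange(RATE) / RATE  # time stamps covering one second

# A made-up "instrument": three sine waves with decreasing amplitudes.
signal = (1.0 * np.sin(2 * np.pi * 440 * t)
          + 0.5 * np.sin(2 * np.pi * 880 * t)
          + 0.25 * np.sin(2 * np.pi * 1320 * t))

# Spectral analysis: amplitude associated with each frequency bin.
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / RATE)
amplitudes = 2 * np.abs(spectrum) / len(signal)

# Keep only the two frequencies with the strongest amplitudes.
strongest = np.argsort(amplitudes)[-2:]
top_freqs = sorted(freqs[strongest])

# "Reconstruct" the signal from those few sine waves (phases ignored).
approx = sum(amplitudes[i] * np.sin(2 * np.pi * freqs[i] * t) for i in strongest)

# What is left out is the weakest of the three components.
residual = np.max(np.abs(signal - approx))
```

With a real recording the spectrum is far less tidy than in this toy case, so choosing which peaks to keep takes more care.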

The source notebook is available on GitHub (under GPL v3); feel free to open issues to point out errors, or to fork it to suggest edits.

A special thanks to my friend and colleague Federico Pedersini for tolerating my endless discussions and musings.

The usual notebook setup

Besides the already mentioned NumPy and SciPy, we'll use librosa to read the WAV files containing the samples, and matplotlib because a picture is worth a thousand words. To play the samples we'll use the standard Audio display class of IPython.

In [1]:
%matplotlib inline

from IPython.display import Audio
import librosa
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp

plt.rcParams['figure.figsize'] = 8, 4
plt.style.use('ggplot')

Let's begin

We'll fix the sampling rate once and for all at 8000 Hz, which is sufficient for our audio purposes, yet low enough to keep the number of samples involved in the following computations small.

In [2]:
RATE = 8000
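
As a rough illustration of what this choice implies (a side sketch, not part of the analysis itself): at 8000 Hz only frequencies up to half the sampling rate, that is 4000 Hz, can be represented. The 5000 Hz tone below is a hypothetical example of a frequency above that limit; once sampled, it yields the same values as a 3000 Hz tone, up to a sign flip.

```python
import numpy as np

RATE = 8000          # same sampling rate as fixed above
n = np.arange(RATE)  # sample indices covering one second

nyquist = RATE / 2   # highest representable frequency, in Hz

# A 5000 Hz tone (above the limit) and its alias at RATE - 5000 = 3000 Hz.
too_high = np.sin(2 * np.pi * 5000 * n / RATE)
alias = np.sin(2 * np.pi * 3000 * n / RATE)

# The two sampled tones coincide up to a sign flip.
same_samples = np.allclose(too_high, -alias)
print(nyquist, same_samples)
```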

We define a couple of helper functions: one to load the samples from a WAV file, and one to generate a sine wave of given frequency and duration (using the sampling RATE defined above).

In [3]:
def load_signal_wav(name):
    signal, _ = librosa.load(name + '.wav', sr = RATE, mono = True)
    return signal
In [4]:
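def sine_wave(freq, duration):
    # Build the time axis from an integer sample count, so that rounding of a float step cannot add a stray sample.
    t = np.arange(round(duration * RATE)) / RATE
    return np.sin(2 * np.pi * freq * t)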

Let's check we've done a good job by playing a couple of seconds of a "pure A", that is, a sine wave at 440 Hz.

In [5]:
samples_sine = sine_wave(440, 2)
Audio(samples_sine, rate = RATE)
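
Listening is one check; a numerical one is also possible. The sketch below is an optional addition, assuming NumPy's `np.fft.rfft` as the analysis tool. It uses a self-contained variant of the helper above and looks for the strongest bin in the magnitude spectrum, which should sit at (or very near) 440 Hz.

```python
import numpy as np

RATE = 8000  # repeated here so the sketch runs on its own

def sine_wave(freq, duration):
    # Self-contained variant of the helper defined above.
    t = np.arange(round(duration * RATE)) / RATE
    return np.sin(2 * np.pi * freq * t)

samples = sine_wave(440, 2)

# Magnitude spectrum and the frequency (in Hz) of each bin.
magnitudes = np.abs(np.fft.rfft(samples))
freqs = np.fft.rfftfreq(len(samples), d=1 / RATE)

# Frequency of the strongest bin.
peak_freq = freqs[np.argmax(magnitudes)]
print(peak_freq)
```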