[ 2015 ]

by Yann Ics



Cite this article:
Yann Ics, Melody to tone, 2015
Date retrieved: 14/11/18
Permanent link: http://experimental.mus-ics.net/wiki/doku.php?id=article_2


This article presents some tools from the PWGL environment, collected in the M2T PWGL library, for interpreting a melody as a single tone with its own identity. The idea is to analyze the audio sample(s) and the melody score in order to generate data profiles intended to be used as synthesis parameters. Note that the approach adopted here focuses on melodies with determined pitches.



The first of these tools builds a profile of harmonics from an audio sample. This audio sample has to match either a specific timbre or a part of the melody sampled as a single note.

For instance, the following clarinet sample can be a part of an initial melody or a timbre that we want to work with.

To extract the harmonics profile, the software PRAAT1) does the job through its spectrum analysis2). From this analysis, we get a list of power densities by bin (i.e. along the frequency axis)3).

This procedure is described in the PWGL patch 1-spectrum.pwgl from the tutorial of the M2T library. In this patch, it is possible to hear the sample and to visualize the spectrum analysis (see illustrations 1 and 2).

Illustration 1: To play a sample from its pathname, first evaluate the synth-box (a green spot appears when this is done; to hear another sample, this box has to be re-evaluated), then push <<trig>> on the sample-player box.

Illustration 2: The spectrum analysis of the sample can be drawn in the 2D-Editor box by evaluating the spectrum box. Here we can easily see the first seven harmonics of the clarinet sample.

From this analysis, we extract the first significant peak, on the condition that this peak belongs to the harmonic series of the peak with the maximum power value. Significant means that the value of a candidate peak has to be greater than or equal to half of the maximum value inside a window whose length is n percent of the fundamental frequency (10% by default). This is not the most efficient way to estimate the fundamental frequency, because the window length has to be tuned empirically, but it works with most timbres of traditional instruments with a determined pitch.

Illustration 3: This illustration shows how each harmonic, including the fundamental frequency, is computed.

Now we can extract as many harmonics as we need from the given or computed fundamental frequency. This consists of taking the maximum power value around each harmonic, within a range of n percent of the fundamental frequency.
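The extraction described above can be sketched in Python as follows. This is only an illustration, not the code of the M2T library: the function name `harmonic_profile`, the representation of the spectrum as (frequency, power) pairs, and the choice to centre the search range on each harmonic (plus or minus half of n percent of the fundamental) are all assumptions.

```python
def harmonic_profile(spectrum, f0, n_harmonics, pct=10.0):
    """Return the maximum power found around each of the first harmonics of f0.

    spectrum    -- list of (frequency_hz, power) pairs, one per bin (assumed format)
    f0          -- given or computed fundamental frequency, in Hz
    n_harmonics -- number of harmonics to extract (the fundamental counts as 1)
    pct         -- width of the search range, as a percentage of f0
    """
    # Assumption: the range is centred on the harmonic, so it extends
    # half of (pct percent of f0) on each side.
    half_width = f0 * pct / 100.0 / 2.0
    profile = []
    for k in range(1, n_harmonics + 1):
        target = k * f0
        in_range = [power for freq, power in spectrum
                    if abs(freq - target) <= half_width]
        # No bin inside the range: report 0.0 for that harmonic.
        profile.append(max(in_range) if in_range else 0.0)
    return profile


# Toy spectrum (made-up values, not taken from the clarinet sample).
toy_spectrum = [(100, 1.0), (198, 0.2), (200, 0.5), (203, 0.4),
                (250, 9.0), (300, 0.25)]
profile = harmonic_profile(toy_spectrum, f0=100, n_harmonics=3)
```

In this toy case the strong bin at 250 Hz is ignored because it lies outside every harmonic range, which is the point of searching only near integer multiples of the fundamental.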

Illustration 4: In the 2D-Editor of the patch 2-harm-profile.pwgl we can see the harmonics profile of the clarinet sample.


This tool ranks the notes of a given melody, expressed in midi or hertz, according to their respective harmonics profiles, with an approximation set to the nearest division of the whole tone. This is done independently of the melody itself and depends rather on the number of occurrences of the notes.

The result is a list of notes ordered by their importance, expressed as a weight, in terms of the resonance generated by the interaction of their respective harmonics profiles under the chosen approximation.

Illustration 5: In this example (from the patch 3-sort-melody.pwgl), the aim is to give a weight to each note (even if there are duplicates) of the melody in the Score-Editor at the top left. The harm-weight-list parameter of the sort-melody box is set with a list of harmonics profiles, each paired with its respective note of the chord, using an approximation of a 12th of a tone. The result sorts the notes according to their respective weights, as you can see in the Score-Editor at the bottom right.

The result has to be interpreted. For instance, we can create a new profile from the melody by associating each note with its weight value, itself weighted by its duration (in the context of PWGL, and more specifically with the GRhythm library4), the value 1 means a quarter note, 1/2 an eighth note, 1/4 a sixteenth note, and so on).


There is also a powerful tool called energy-prof-morph-analysis, from the library Morphologie developed at IRCAM since 1997 by Frédéric Voisin and Jacopo Baboni Schilingi5).

This algorithm is described in the article by Paolo Aralla (2002)6).
However, to summarize and for the record, I will illustrate this algorithmic process with the example of the midi melody of the patches 3-sort-melody.pwgl and 4-energy-prof-morph-analysis.pwgl.
For instance, we will consider the « sorting melody » weights, weighted by the duration of each note and rounded to one decimal place. Then we « transpose » this result into a symbolic list to simplify the demonstration; that is to say, the list (5.8 2.9 2.4 2.9 3.5 4.7 17.5) becomes (a b c b d e f).
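As a small illustration of this « transposition », the following Python sketch gives each distinct value a letter in order of first appearance, so that repeated values share the same symbol. The function name `symbolize` is hypothetical, and the sketch assumes there are no more than 26 distinct values.

```python
import string


def symbolize(values):
    """Replace each distinct value by a letter, in order of first appearance."""
    symbols = {}
    result = []
    for value in values:
        if value not in symbols:
            # Assumption: at most 26 distinct values, one lowercase letter each.
            symbols[value] = string.ascii_lowercase[len(symbols)]
        result.append(symbols[value])
    return result


weighted = [5.8, 2.9, 2.4, 2.9, 3.5, 4.7, 17.5]
symbolic = symbolize(weighted)  # the two 2.9 values share the symbol 'b'
```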

The first step is to carry out the analysis of contrasts described in Paolo Aralla's article (op. cit.).

A theoretical problem we have faced is the relation between the object we have analyzed and the previous and following events. Any events chain perceived as belonging to a whole and complete organism stays anyway in relation with the previous and following sequential chain.
In case of performance of a music piece, the silence acts as a frame of the structure, and, being a frame, it becomes an organic element of the structure analyzed.
It is worth to underline that even in case of extrapolation, like in the here quoted examples (a thematic fragment, a subject of a fugue, etc.), the object is perceived as a unit, and therefore the silence places it in a well defined mental space.

(Extract from the documentation of new-old-analysis)

*start* = symbol-silence-start
*stop* = symbol-silence-end

*start*  a  b  c  b  d  e  f  *stop*    Σ
      1  2  3  4  3  5  6  7       8   39
         1  2  3  2  4  5  6       7   30
            1  2  1  3  4  5       6   22
               1  2  3  4  5       6   21
                  1  2  3  4       5   15
                     1  2  3       4   10
                        1  2       3    6
                           1       2    3

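Then we compute, row by row, the differences between consecutive values (dx). Each row of differences will be multiplied by the sum Σ of the corresponding row above, repeated here in the last column.

   1   1   1  -1   2   1   1   1    39
       1   1  -1   2   1   1   1    30
           1  -1   2   1   1   1    22
               1   1   1   1   1    21
                   1   1   1   1    15
                       1   1   1    10
                           1   1     6
                               1     3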

After this multiplication, we sum each column; the last row below gives the totals.

   39   39   39  -39   78   39   39   39
        30   30  -30   60   30   30   30
             22  -22   44   22   22   22
                  21   21   21   21   21
                       15   15   15   15
                            10   10   10
                                  6    6
                                       3
----------------------------------------
   39   69   91  -70  218  137  143  146

This « temporary » result (39 69 91 -70 218 137 143 146) corresponds to the analysis of contrasts called new-old-analysis in the Morphologie library.
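The three steps of the worked example above can be retraced with the following Python sketch. It is a reconstruction from the tables only, not the source of new-old-analysis in the Morphologie library; the function names and the silence markers `*start*` and `*stop*` as plain strings are assumptions made for the illustration.

```python
def first_appearance_ranks(sequence):
    """Number each symbol by its order of first appearance (1, 2, 3, ...)."""
    ranks = {}
    numbered = []
    for symbol in sequence:
        if symbol not in ranks:
            ranks[symbol] = len(ranks) + 1
        numbered.append(ranks[symbol])
    return numbered


def new_old_sketch(symbols, start="*start*", stop="*stop*"):
    """Retrace the contrast analysis of the worked example, column by column."""
    framed = [start] + list(symbols) + [stop]  # silence frames the sequence
    totals = [0] * (len(framed) - 1)           # one total per transition
    for offset in range(len(framed) - 1):
        # Step 1: renumber the sequence starting from this position.
        ranks = first_appearance_ranks(framed[offset:])
        row_sum = sum(ranks)
        # Steps 2 and 3: differences (dx) times the row sum, added per column.
        for k in range(len(ranks) - 1):
            totals[offset + k] += (ranks[k + 1] - ranks[k]) * row_sum
    return totals


contrasts = new_old_sketch("abcbdef")
```

Each pass of the outer loop corresponds to one row of the tables, and `offset + k` keeps that row aligned under the right column before summing.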

The step that allows transforming the new-old-analysis function into a model able to simulate the psychic response of the perceptive act to the morphologic structure occurs using three functions.
These three functions are then applied to the previous result to define the energy profile.
1. In the first passage, the transformation into absolute value (abs) contains all the relations with reference to the first element of the chain. At this point, the data do not represent the ageing degree of the events anymore, but they are mere distance (it does not matter if they are old or new, they are to be intended nearly as physical distance between the various data stored in space/memory) related to a virtual point zero (a kind of possible present)
2. In the second passage, the use of the local derivative, implemented in OpenMusic under the name of x→dx, the contiguous relations are again pointed out, and the distance identified in the first passage is assimilated to the energy needed to cover the contiguous distances in space/memory
3. Finally, the transformation into absolute value (abs), because of the transformation of the distances into energy, brings all the data back to positive values.
(Documentation of energy-prof-morph-analysis)
        0   39   69   91  -70  218  137  143
abs     0   39   69   91   70  218  137  143
x→dx        39   30   22  -21  148  -81    6
abs         39   30   22   21  148   81    6
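The rows above can be reproduced with a few lines of Python. This is a sketch read off the worked example, not the Morphologie implementation: in particular, the first row starts at 0 and stops at 143, so the sketch assumes that a leading 0 is added and that the last contrast value (146) is dropped. The function name is hypothetical.

```python
def energy_profile_sketch(contrasts):
    """Apply abs, x->dx and abs to a list of contrast values."""
    # Assumption taken from the worked example: start from a virtual
    # point zero and leave out the last value of the contrast analysis.
    shifted = [0] + list(contrasts[:-1])
    distances = [abs(value) for value in shifted]                 # first abs
    deltas = [b - a for a, b in zip(distances, distances[1:])]    # x->dx
    return [abs(delta) for delta in deltas]                       # second abs


energy = energy_profile_sketch([39, 69, 91, -70, 218, 137, 143, 146])
```

The result has one value per symbol of the original list (a b c b d e f), which is what allows it to be associated later with each step of a melody.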

Note that this kind of analysis is strictly symbolic, focusing on the structure of contrasts, and all symbols of the list initially refer only to themselves. In our case, symbols can refer to pitches, intervals, durations, among others.

All these tools can be combined to interact with each other, and weighted with the values of sort-melody, durations, dynamics or others.

Opus 11

For instance, we will consider the opening phrase of Arnold Schoenberg's Opus 11 (Drei Klavierstücke, for solo piano). Note that the choice of this melody is motivated by its interesting complexity within a real simplicity of writing.


The first step consists of segmenting the audio file in order to get as many audio files as there are axiomatic events (notes, chords, or others) forming the melody. In practice, several algorithms exist for this, but each one segments with a specific purpose in mind, so their relevance is relative. Each melody will therefore be segmented empirically, with its own tools.
For instance, the script mkSoneme.praat performs a segmentation based on a significant difference in intensity. This implies that the attack of each event can be discriminated. This script is therefore efficient for percussive sequences, such as the present piano sample of opus 11.


Then collect all the segments of the melody inside one folder so that they can easily be located for analysis. For convenience, and to avoid compatibility issues, these audio files are in WAV format.

Illustration 6: Patch opus11.pwgl in opus11.zip.

Now we can get a harmonic profile for each note of the melody through the patch opus11.pwgl with the abstraction harmonic-profile [see the 2D-Editor on the left in illustration 6]. First, all the notes of the melody have to be expressed as a list of midi notes: (71 68 67 69 65 65 64 64 67 59 61 60 58 59 53 57 50 54 57 58 59 42 46 42) in order to set the fundamental frequency after conversion. Note that the tuning has to be adjusted if it is off. In the context of the energy profile, we have to write the 'midi melody' in time as: ((71) (68) (67) (59 53 42) (69) (65) (65) (61 57 46) (64) (60 64) (58) (67 59 50 42) (54) (57) (58) (59)).

The next step is the 'sorting melody', which gives the following result: ((53 65) (59 71) (57 69) (64) (46 58) (60) (67) (61) (50) (68) (42 54)) [see the Score-Editor in illustration 6].

Obtaining the energy profile is trickier. For now, I propose to consider, for each step of the melody, the mean value of the weight(s) from the 'sorting melody', weighted by the duration value in time. We then get an energy profile with the function energy-prof-morph-analysis: (144 126 109 93 78 64 614 735 49 39 30 22 21 3105 1668 605).

To use these results as synthesis parameters, we associate with each pitch its weight from the 'sorting melody' and the average of its energy over the entire melody [see abstraction pitch-level-energy in the illustration 6]: ((53 492.66476 93.0) (65 492.66476 339.0) (59 405.09332 240.0) (71 405.09332 144.0) (57 363.93274 1920.0) (69 363.93274 78.0) (64 346.34695 44.0) (46 327.92123 735.0) (58 327.92123 849.0) (60 238.34644 39.0) (67 210.47075 65.5) (61 201.96611 735.0) (50 127.17165 22.0) (68 117.45143 126.0) (42 64.20631 57.5) (54 64.20631 21.0)). Of course, all these weights will be scaled, as we will see in the SuperCollider patch.
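The energy column of this list can be checked with a short Python sketch that averages, for each pitch, the energy of every step of the 'midi melody' in time in which that pitch occurs. The function and variable names are hypothetical and this is not the pitch-level-energy abstraction itself; the data are the two lists given earlier in this section.

```python
def mean_energy_by_pitch(steps, energies):
    """Average the energy of all the steps in which each pitch occurs."""
    collected = {}
    for chord, energy in zip(steps, energies):
        for pitch in chord:
            collected.setdefault(pitch, []).append(energy)
    return {pitch: sum(vals) / len(vals) for pitch, vals in collected.items()}


# 'midi melody' in time and its energy profile, as listed in the text.
midi_melody = [[71], [68], [67], [59, 53, 42], [69], [65], [65], [61, 57, 46],
               [64], [60, 64], [58], [67, 59, 50, 42], [54], [57], [58], [59]]
energy_profile = [144, 126, 109, 93, 78, 64, 614, 735,
                  49, 39, 30, 22, 21, 3105, 1668, 605]

pitch_energy = mean_energy_by_pitch(midi_melody, energy_profile)
```

For example, pitch 65 occurs in two steps with energies 64 and 614, which gives the value 339.0 found in the list above.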


The synthesizer I chose reproduces the 'sustain' of a piano. I scaled the level from 0.01 to 0.1 and the energy (which I transposed in terms of 'grain', that is to say a kind of 'texturing') from 300 to 3000 Hz. The synthesis is realized with the software SuperCollider7).

// inspired by the synthetic piano patch (James McCartney) – originally for SC2, 1998 – and freely adapted by Yann Ics ...
SynthDef(\op11M2T, { | bus=0, pitch=60, amp=0.1, grain=3000 |
	var detune, delayTime, noise, out;
	out = Mix.ar(Array.fill(3, { arg i;
		// detune strings, calculate delay time :
		detune = #[-0.05, 0, 0.04].at(i);
		delayTime = 1 / (pitch + detune).midicps;
		// each string gets own exciter :
		noise = LFNoise2.ar(grain, 0.1); // grain = 3000 Hz is the reference
		CombL.ar(noise,		// used as a string resonator
			delayTime, 	// max delay time
			delayTime,	// actual delay time
			6) 		// decay time of string
	}));
	// output the mixed strings through a fixed envelope; the synth frees itself when done
	Out.ar(bus, Splay.ar(out*amp)*EnvGen.ar(Env.linen(0.5,10,3.0),doneAction:2))
}).add;
// data [pitch, level, grain] from patch opus11.pwgl
~data = [ [ 53, 492.66476, 93.0 ], [ 65, 492.66476, 339.0 ], [ 59, 405.09332, 240.0 ], [ 71, 405.09332, 144.0 ], [ 57, 363.93274, 1920.0 ], [ 69, 363.93274, 78.0 ], [ 64, 346.34695, 44.0 ], [ 46, 327.92123, 735.0 ], [ 58, 327.92123, 849.0 ], [ 60, 238.34644, 39.0 ], [ 67, 210.47075, 65.5 ], [ 61, 201.96611, 735.0 ], [ 50, 127.17165, 22.0 ], [ 68, 117.45143, 126.0 ], [ 42, 64.20631, 57.5 ], [ 54, 64.20631, 21.0 ] ];
(instrument: \op11M2T, pitch: ~data.flop[0], amp: ~data.flop[1].normalize(0.01, 0.1), grain: ~data.flop[2].normalize(300, 3000)).play;
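For readers who do not use SuperCollider, the scaling applied to the level and grain columns can be sketched in Python as a linear rescaling in which the smallest value maps to the lower bound and the largest to the upper bound. The function name `rescale` is hypothetical, and the handling of a list whose values are all equal is a choice made for this sketch only.

```python
def rescale(values, low, high):
    """Linearly map values so that min(values) -> low and max(values) -> high."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        # Choice made for this sketch: a flat list maps to the lower bound.
        return [float(low) for _ in values]
    span = largest - smallest
    return [low + (value - smallest) * (high - low) / span for value in values]


# Three of the energy values from the data above, mapped to 300..3000 Hz.
grains = rescale([21.0, 1920.0, 970.5], 300, 3000)
```

With the full data set, the weakest level (64.20631) would map to 0.01 and the strongest (492.66476) to 0.1, and the energies 21.0 and 1920.0 to 300 Hz and 3000 Hz respectively.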


Of course, this is only an example, and the possibilities of interpretation are infinite, but these tools make it possible to manage this kind of transposition (that is to say, the 'transformation' of a melody into a unique tone) with relevance, and they open a substantial field of creativity.


;; The function FSC converts a Lisp list into a SuperCollider array and writes it to an scd file.
(defun FSC (lst &key (name "foo") (dir "./"))
  (with-open-file (file-stream (make-pathname :directory (pathname-directory dir)
					      :name name
					      :type "scd")
			       :direction :output
			       :if-exists :supersede
			       :if-does-not-exist :create)
    (labels ((convert-list-to-array (lst)
	       (with-output-to-string (stream)
		 (uiop:run-program (format nil "echo '~S' | sed -e 's/\ /\, /g;s/(/[ /g;s/)/ ]/g;s/$/;/g' | awk '{print tolower($0)}'" lst) :output stream))))
      (format file-stream "~~~a = ~a" name (convert-list-to-array lst)))))
1) PRAAT is free software for the analysis of speech in phonetics. It was designed, and continues to be developed, by Paul Boersma and David Weenink of the University of Amsterdam.
4) GRhythm was originally developed by Magnus Lindberg during the 1980s. GRhythm makes it possible to manage 'gestural rhythms' with its own format convention.
5) Jacopo Baboni Schilingi, Frédéric Voisin, Morphologie – Fonctions d'analyse, de reconnaissance, de classification et de reconstitution de séquences symboliques et numériques, IRCAM, troisième édition, Paris 1999.
6) Paolo Aralla, Morphological Analysis, in PRISMA 01, Euresis Edizioni, Milan 2002.
7) SuperCollider is a programming language for real time audio synthesis and algorithmic composition.