
[ 2018 ]

by Yann Ics



Cite this article:
Yann Ics, Outline of a dynamic score from a sound file, 2018
Date retrieved: 17/10/18
Permanent link: http://experimental.mus-ics.net/wiki/doku.php?id=article_5


This article describes how to draw a dynamic score from a sound file, based on the analysis produced by the command-line tool enkode. The drawing is done by the executable script dekode, which uses the ImageMagick software suite for graphical rendering. The result is a PNG image.


To split the analysis data into bins, enkode offers two possibilities. The first is event segmentation. The second, available from version 4.3 of enkode, is a constant bin step defined by its duration in seconds (this requires setting the option -D, --duration with a floating-point number).


The data generated by enkode has to be interpreted before it can be used in a graphical context, with the possibility of setting the relevance of each data set in terms of contrast.

Analysis type | Description | Interpretation
Relative pitch | Classification* of a frequency bandwidth according to the centroid. | y-axis position, relative to the image height, the brightness height, and the 'bass' height.
Brightness | Classification* of the centroid divided by f0, taken as the first partial of the spectrum profile. | Decreasing bandwidth length of high frequencies.
Loudness | Classification* of a dynamic level according to human aural perception. | The maximum loudness is black; lower levels fade proportionally through gray towards white.
Bass presence | Classification* of the loudness of the event or bin of the filtered data (see bass loudness). | Decreasing bandwidth length of low frequencies.
Segmentation | One unit is defined by Δx = pts × n (see duration). | Length of a bin in pixels along the x-axis, multiplied by the bin factor if one is set.

* See article 1, discrimination in class.


The command-line tool dekode draws the dynamic score from the data generated by enkode. In short, each bin step is drawn as shown in illustration 1, according to the value of each parameter, with the loudness rendered as a gray value from black to white. The loudness range can be adjusted with the loudness contrast value, expressed as a percentage.

Illustration 1: Description of a bin step – dt is the duration of a bin or an event.


  1. For each bin (step), create an image:
    > convert -size <bs×binfactor>x<x1> xc:none -size <bs×binfactor>x<x2> gradient:none-gray<valueGray> -size <bs×binfactor>x<x3> gradient:gray<valueGray>-none -size <bs×binfactor>x<x4> xc:none -append <stepNumber>.png
    • x1 = (max(C) - C) × (H - Hbr - Hba) / (max(C) - 1)
    • x2 = br × Hbr / max(br)
    • x3 = ba × Hba / max(ba)
    • x4 = H - (x1 + x2 + x3)

      with bs = bin step, C = centroid as relative pitch, H = image height, Hbr = maximum brightness height, Hba = maximum bass level height, br = brightness value, ba = bass value;
      and x1 = the margin from the top, x2 = the top gradient, x3 = the bottom gradient, x4 = the remaining height.

  2. Append this image to the previous one:
    > convert <out>.png <stepNumber>.png +append <out>.png
  3. Add background color:
    > convert <out>.png -background <backgroundColor> -flatten <imgName>.png
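
The four heights of step 1 can be checked by hand before any image is rendered. The following is a minimal sketch, not the code of dekode itself: the function name bin_heights, the argument order and the rounding to the nearest pixel are all assumptions made for illustration. It also assumes max(C) > 1, since the formula for x1 divides by max(C) - 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dekode): compute x1..x4 for one bin.
# Usage: bin_heights C maxC br maxBr ba maxBa H Hbr Hba
# Assumption: heights are rounded to the nearest pixel; x4 takes the remainder.
bin_heights() {
    awk -v C="$1" -v maxC="$2" -v br="$3" -v maxBr="$4" \
        -v ba="$5" -v maxBa="$6" -v H="$7" -v Hbr="$8" -v Hba="$9" '
    BEGIN {
        x1 = int((maxC - C) * (H - Hbr - Hba) / (maxC - 1) + 0.5)  # margin from top
        x2 = int(br * Hbr / maxBr + 0.5)                           # top gradient
        x3 = int(ba * Hba / maxBa + 0.5)                           # bottom gradient
        x4 = H - (x1 + x2 + x3)                                    # remaining height
        print x1, x2, x3, x4
    }'
}

# Example: class 3 of 5, brightness 2 of 4, bass 1 of 4, 300 px image
bin_heights 3 5 2 4 1 4 300 100 100   # prints: 50 50 25 175
```

Because x4 is defined as the remainder, the four heights always add up to H, whatever rounding is used for the first three.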


List of the available options of dekode:

Option | Argument type [unit]
--img-height | integer [pixel]
--brightness-height | integer [pixel]
--bass-height | integer [pixel]
--loudness-contrast | integer [percentage]
--bin-factor | integer
--background-color | hex|rgb|rgba|name|etc.
--img-name | name
-v, --version | (none)
-h, --help | (none)
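
As an illustration of how the height options relate to each other, here is a small sketch of a wrapper that prints a dekode command line after a basic consistency check. The wrapper name dekode_cmd and the numeric values are hypothetical. The check itself is an assumption derived from the formula for x1, where the band H - Hbr - Hba is the room left for the relative pitch. The option syntax (option followed by a space and its value) is taken from the usage shown in this article.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of dekode): print a dekode command line.
# Usage: dekode_cmd <data-file> <img-height> <brightness-height> <bass-height>
dekode_cmd() {
    local data=$1 H=$2 Hbr=$3 Hba=$4
    # Assumption: the pitch band H - Hbr - Hba should stay positive,
    # otherwise the margin x1 has no room to vary.
    if (( H - Hbr - Hba <= 0 )); then
        echo "error: img-height must exceed brightness-height + bass-height" >&2
        return 1
    fi
    echo "dekode $data --img-height $H --brightness-height $Hbr --bass-height $Hba"
}

# Only prints the command; pipe the output to sh to actually run it.
dekode_cmd artkc.dat 300 100 100
```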


March 04, 2018: dekode alpha 0.1 released.

Test version.
Note that for now the processing time to build the image can be very long.
  • Command line version alpha 0.1 dekode


The electronic composition Artikulation by György Ligeti, written in 1958, serves here to illustrate the application of the dynamic score.

Constant split

  1. Write data file:
    > enkode --duration=0.1 -n -b 3 -d 20 +info artikulation.mp3 > artkc.dat
  2. Draw dynamic score:
    > dekode artkc.dat --background-color white

  3. Optionally, add a phi scale* (see code annex: phi scale):
    > ./phi.sh artkc.png 10

    * The phi scale marks up the score according to the golden section, in both the increasing and the decreasing direction.

  4. Optionally, crop the image according to an array of timings in seconds (see code annex: crop image):
    > ./crop.sh artkc.png 227 '(33.11 86.7 140.29 193.88)'

    These values crop the final score according to the golden section, estimated at 53.58 seconds for this score.
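
The crop timings used in step 4 appear to follow from dividing the total duration (227 seconds) repeatedly by the golden ratio; 193.88 is then the mirror of the smallest value (227 - 33.12). This reading is an assumption drawn from the numbers, and the helper name golden_sections is hypothetical. The sketch below rounds to two decimals, whereas the values quoted in this article seem to be truncated (33.11, 86.7, 53.58).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print d / phi^k for k = 1..n, one value per line.
# Usage: golden_sections <total-duration> <n>
golden_sections() {
    awk -v d="$1" -v n="$2" 'BEGIN {
        phi = (1 + sqrt(5)) / 2        # golden ratio
        for (k = 1; k <= n; k++) {
            d /= phi                   # next golden section of the duration
            printf "%.2f\n", d
        }
    }'
}

golden_sections 227 4
# prints:
# 140.29
# 86.71
# 53.59
# 33.12
```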

Segmental split

  1. Write data file:
    > enkode --duration=200 -n -b 3 -d 20 +info artikulation.mp3 > artks.dat

    The duration is set to 200 so that all segments of up to 10 seconds fit, which avoids clipping values above the default of 1.55 seconds.

  2. Draw dynamic score:
    > dekode artks.dat --background-color white


How 5D data is best represented in a 2D image depends on what should be highlighted along the y-axis; the x-axis is naturally the time axis. The proposal of this article is therefore one possible way to build a dynamic score, here through the analysis of enkode, and it is intended to be improved and to evolve.

This will be part of further development of dekode.

Analysis type | Interpretation | Note
Relative pitch | shade of gray from black | between loudness and bass presence curves
Brightness | gradient | middle = 1, top > 1, 0 < bottom < 1
Loudness | y-axis towards the top | with an option in order to smooth the curve according to a deliberate factor
Bass presence | y-axis towards the bottom | as absolute value

« Tout est écrit dans une partition, sauf l'Essentiel. » ("Everything is written in a score, except the Essential.")


Phi scale

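#!/bin/bash
# Usage: ./phi.sh <score.png> <discrete-integer>
# Lines marked "assumed" are missing from the published listing and are reconstructed.
iw=$(identify -ping -format %w "$1")
ih=$(identify -ping -format %h "$1")
name=$(basename "$1" | cut -d. -f1)
size="${iw}x${ih}"              # assumed: the overlay has the size of the score
or=1.6180339887                 # assumed: the golden ratio
# phi <n> <width>: fill the array re with the x-positions width / or^k
phi() {                         # assumed: function wrapper
    ar=(1)                      # assumed: starting value
    for ((i=$1; i>=1; i--)); do
        res=$(echo "scale=10;${ar[0]}/$or" | bc)
        ar=($res "${ar[@]}")
    done
    for i in "${ar[@]}"; do
        co=$(echo "scale=10;$i*$2" | bc)
        co=$(printf "%.0f\n" "$co")
        re=($co "${re[@]}")
    done
}
phi "$2" "$iw"
convert -size "$size" xc:none .out.png
# Draw each position and its mirror image as a red vertical line
for i in "${re[@]}"; do
    j=$((iw - i))
    convert .out.png -stroke red -draw "path 'M $i,$ih L $i,0'" .out.png
    convert .out.png -stroke red -draw "path 'M $j,$ih L $j,0'" .out.png
done
# Note: the result is written to <name>.png in the current directory
composite .out.png "$1" "$name.png"
rm .out.png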

Crop image

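#!/bin/bash
# Usage: ./crop.sh <score.png> <total-duration-in-seconds> '(<t1> <t2> ...)'
# Thanks to http://www.imagemagick.org/discourse-server/viewtopic.php?t=28730#p127841
# Lines marked "assumed" are missing from the published listing and are reconstructed.
file=$1                                          # assumed
totalduration=$2                                 # assumed
name=$(basename "$file" | cut -d. -f1)
# Get the duration in seconds with sox
# > sox --info -D soundfile
# Get the width of the image
imageWidth=$(identify -ping -format %w "$file")
# Get the duration of 1 pixel (should be equal to pts divided by binfactor)
pixel=$(echo "scale=6;$totalduration/$imageWidth" | bc)
# Initiate array containing x-coordinates and image width
x_coords=(0)                                     # assumed: first stripe starts at 0
# Fill x_coords with x-coordinates and image width
declare -a ar=$3
for i in "${ar[@]}"; do
    aaa=$(echo "$i/$pixel" | bc)
    x_coords+=("$aaa")                           # assumed
done
x_coords+=("$imageWidth")                        # assumed
# Find the number of elements in the x_coords array
x_coordsArrayLength=${#x_coords[@]}              # assumed
# Loop to calculate width of each stripe and append it to stripeWidthArray
for (( i=0; i < x_coordsArrayLength-1; i++ )); do
    stripeWidthArray+=($(( x_coords[i+1] - x_coords[i] )))   # assumed
done
# Find the number of elements in the stripeWidthArray
stripeWidthArrayLength=${#stripeWidthArray[@]}   # assumed
# Loop to crop original image
for (( i=0; i < stripeWidthArrayLength; i++ )); do
    n=$((i + 1))                                 # assumed: stripes numbered from 1
    convert "$file" -crop "${stripeWidthArray[i]}x0+${x_coords[i]}+0" "$name$n.png"
done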
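
The arithmetic of the crop script can be tried without any image. The sketch below is a dry run that only prints the offset and width of each stripe; the function name stripes and the image width of 1000 pixels are hypothetical. It converts seconds to pixels in one step (t × width / duration, truncated), which is an assumption and may differ by one pixel from the two-step bc computation of the script.

```shell
#!/usr/bin/env bash
# Hypothetical dry run: print "offset width" for each stripe, no image needed.
# Usage: stripes <image-width-px> <total-duration-s> <t1> <t2> ...
stripes() {
    local width=$1 duration=$2
    shift 2
    local -a xs=(0)            # the first stripe starts at the left edge
    local t i
    for t in "$@"; do
        # seconds -> pixels, truncated to an integer
        xs+=("$(awk -v t="$t" -v w="$width" -v d="$duration" \
            'BEGIN { print int(t * w / d) }')")
    done
    xs+=("$width")             # the last stripe ends at the right edge
    for (( i = 0; i < ${#xs[@]} - 1; i++ )); do
        echo "${xs[i]} $(( xs[i+1] - xs[i] ))"
    done
}

stripes 1000 227 33.11 86.7 140.29 193.88
# prints:
# 0 145
# 145 236
# 381 237
# 618 236
# 854 146
```

Each printed pair corresponds to one -crop geometry of the form <width>x0+<offset>+0, and the widths add up to the image width.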

article_5.txt · Last modified: 2018/10/14 11:15 (external edit)