# spectral-clustering-music
This project automatically extracts the musical structure of an audio file.

### Prerequisites
This project requires the following Python 3 packages:
```
# argparse and warnings are part of the Python standard library and need no installation
pip install librosa scikit-learn scipy numpy matplotlib
```

### Usage
```
python3 spectral_clustering_audio.py config.yaml
```

### Parameters

All modifiable parameters are located in the config file, grouped into five sections: audio, segmentation, features, long-segmentation and cluster.
The name of the audio file to analyze is also given in this config file.
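
As a rough illustration of this grouping, the sketch below checks that a loaded configuration contains the five sections listed above. The exact key spelling, the helper name `missing_sections` and every key inside the sections are assumptions made for the example; the real config file may differ. The dictionary stands in for whatever a YAML loader would return.

```python
# Section names are taken from the README; their exact spelling as
# top-level keys in the config file is an assumption.
EXPECTED_SECTIONS = ("audio", "segmentation", "features", "long-segmentation", "cluster")


def missing_sections(config):
    """Return the expected top-level sections that are absent from `config`."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


# Hypothetical configuration content: every key inside the sections is
# invented for illustration only.
example_config = {
    "audio": {"filename": "audio/filename.wav"},
    "segmentation": {},
    "features": {},
    "cluster": {"method": "silhouette"},
}

print(missing_sections(example_config))  # prints ['long-segmentation']
```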

### Select number of clusters
The key issue with this approach is finding the correct number of musical sections (or clusters). We provide a compilation of methods from state-of-the-art studies:
* fixed: the user knows a priori the number of sections they want
* evals: the number of clusters is optimized using the Laplacian eigenvalues (needs the additional script cluster_rotate.py)
* max: finds the number of clusters in the range [1, cluster_max] that maximizes the quality of the alignment with the eigenvectors
* silhouette: maximizes the silhouette index
* calinski-harabaz: maximizes the Calinski-Harabasz index
* davies-bouldin: minimizes the Davies-Bouldin index (lower values indicate better-separated clusters)
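
The three index-based methods can be sketched with scikit-learn as follows. This is a minimal illustration, not the code of spectral_clustering_audio.py: it clusters toy 2-D points with k-means instead of Laplacian eigenvectors, and the function name `select_n_clusters` and its `cluster_max` argument are assumptions chosen to mirror the list above. Note that the Davies-Bouldin index is the only one of the three where lower is better.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Map each method name to (score function, True if higher is better).
INDEXES = {
    "silhouette": (silhouette_score, True),
    "calinski-harabaz": (calinski_harabasz_score, True),
    "davies-bouldin": (davies_bouldin_score, False),
}


def select_n_clusters(features, method, cluster_max):
    """Return the k in [2, cluster_max] with the best value of the chosen index."""
    score_fn, higher_is_better = INDEXES[method]
    best_k, best_score = None, None
    # These indexes are undefined for a single cluster, so start at k = 2.
    for k in range(2, cluster_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
        score = score_fn(features, labels)
        is_better = (
            best_score is None
            or (score > best_score if higher_is_better else score < best_score)
        )
        if is_better:
            best_k, best_score = k, score
    return best_k


# Toy data (assumption): three well-separated groups of 30 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
toy_features = np.vstack([c + 0.5 * rng.standard_normal((30, 2)) for c in centers])

for method in INDEXES:
    print(method, select_n_clusters(toy_features, method, cluster_max=6))
```

On data this cleanly separated the three indexes should agree; on real music features they often disagree, which is why several methods are offered.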

### Differential transform
The differential transform computes the spectral difference between consecutive frames, then converts the result back to a time-domain waveform.
Its modifiable parameters are located in the same params.py as for spectral clustering.

The key point is that frames can be beat-synchronous or onset-synchronous, so they do not all have the same duration.
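
The idea of differencing frames of unequal duration can be sketched with NumPy alone. This is an illustrative assumption, not the code of TF_differential.py: the function name, the use of a per-segment mean and the toy values are all made up for the example, and the conversion back to a waveform is left out. In practice the boundaries would come from a beat or onset detector.

```python
import numpy as np


def synchronous_spectral_difference(spectrogram, boundaries):
    """Average a spectrogram over variable-length segments, then difference them.

    spectrogram: array of shape (n_bins, n_frames), e.g. STFT magnitudes.
    boundaries: increasing frame indices delimiting the segments,
                starting at 0 and ending at n_frames.
    """
    # One averaged spectrum per beat- or onset-synchronous segment.
    segments = [
        spectrogram[:, start:end].mean(axis=1)
        for start, end in zip(boundaries[:-1], boundaries[1:])
    ]
    synced = np.stack(segments, axis=1)  # shape (n_bins, n_segments)
    durations = np.diff(boundaries)      # number of frames in each segment
    # Difference between each segment and the one before it.
    return np.diff(synced, axis=1), durations


# Toy spectrogram (assumption): 2 frequency bins, 6 frames,
# split into segments of 2, 3 and 1 frames.
toy = np.array([[1.0, 1.0, 3.0, 3.0, 3.0, 6.0],
                [0.0, 2.0, 2.0, 2.0, 2.0, 2.0]])
diff, durations = synchronous_spectral_difference(toy, [0, 2, 5, 6])
print(diff)       # [[2. 3.] [1. 0.]]
print(durations)  # [2 3 1]
```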

Usage:
```
python3 TF_differential.py audio/filename.wav
```

Based on an original idea by Jean-Marc Chouvel: [L'analyse musicale différentielle](http://www.ems-network.org/IMG/JMChouvelEMS07/index.html).


## Example on Free Jazz (2 sec)

The following results were obtained using the config-freejazz.yaml file.

Similarity matrices

![alt text](./freejazz_simi.png)

Extracted structure with 6 and 8 clusters.

![alt text](./freejazz_struct.png)


### Bibliography
* M. Tahon, Z. Belghith, J.-M. Chouvel and P. Michel (2019). Segmentation automatique de catégories musicales : étude exploratoire sur le free jazz. In Journées d'Informatique Musicale, May 2019, Bayonne, France. [PDF](https://hal.archives-ouvertes.fr/hal-02138608)
* B. McFee and D. P. W. Ellis (2014). Analyzing song structure with spectral clustering. In Proc. of the International Society for Music Information Retrieval Conference (ISMIR).
* L. Zelnik-Manor and P. Perona (2004). Self-Tuning Spectral Clustering. In Advances in Neural Information Processing Systems (NIPS), pp. 1601–1608.
* D. Deutsch (2007). Music perception. In Frontiers in Bioscience, vol. 12, pp. 4473–4482.
* J.-C. Lamirel, N. Dugué and P. Cuxac (2016). New efficient clustering quality indexes. In Proc. of the International Joint Conference on Neural Networks (IJCNN 2016), Vancouver, Canada.

### Authors
* Marie Tahon: adaptation of initial code and addition of new functionalities
* Brian McFee: initial code [Laplacian segmentation](https://librosa.github.io/librosa/auto_examples/plot_segmentation.html)