*Sidekit for Diarization* requires the following software to be installed on your platform:
1. [Python](http://www.python.org)
2. [NumPy](http://www.numpy.org/)
3. [SciPy](http://www.scipy.org/)
4. [Pandas](https://pandas.pydata.org/)
5. [GLPK](https://www.gnu.org/software/glpk/)
6. [Sphinx 1.1.0 or newer](http://sphinx-doc.org/), only if you want to build the documentation
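A quick way to confirm that the Python packages listed above are visible to your interpreter is to probe for them without importing them. The sketch below uses only the standard library; the helper name `missing_packages` and the list of import names are illustrative assumptions, not part of s4d. GLPK is a native library and is not covered by this check.

```python
import importlib.util

# Import names of the Python dependencies listed above (assumed names).
REQUIRED = ("numpy", "scipy", "pandas")


def missing_packages(names=REQUIRED):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required Python packages were found.")
```

Running this inside the virtual environment you intend to use for s4d shows which packages, if any, still need to be installed.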
INSTALLATION
============
We recommend using a virtual environment (e.g. [Miniconda](https://conda.io/miniconda.html) or [Virtualenv](https://virtualenv.readthedocs.io/en/latest/)).
After downloading the project, install the requirements with:
```
pip install -r requirements.txt
```
Then proceed to install s4d:
```
./install.sh
```
TUTORIALS
=========
Once your installation is complete, you can take a look at the [tutorials](https://git-lium.univ-lemans.fr/Meignier/s4d/tree/master/tutorials).
%% Cell type:markdown id: tags:
Diarization for ASR
===================
This script performs a BIC diarization (usually for ASR decoding).
The proposed diarization system was inspired by the
system [1] which won the RT'04 fall evaluation
and the ESTER1 evaluation. It was developed during the ESTER2
evaluation campaign for transcription, with the goal of minimizing
word error rate.
Automatic transcription requires accurate segment boundaries. Segment
boundaries have to be set within non-informative zones such as filler
words.
Speaker diarization needs to produce homogeneous speech segments;
however, purity and coverage of the speaker clusters are the main
objectives here. Errors such as having two distinct clusters (i.e.,
detected speakers) corresponding to the same real speaker, or
conversely, merging segments of two real speakers into only one cluster,
incur a heavier penalty in the NIST time-based diarization metric than
misplaced boundaries.
The system is composed of acoustic BIC segmentation followed by BIC
hierarchical clustering. Viterbi decoding is then performed to adjust the
segment boundaries.
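To make the BIC criterion behind the segmentation and clustering steps concrete, here is a minimal NumPy sketch of the ΔBIC test between two adjacent segments, each modelled by a single full-covariance Gaussian. It illustrates the general criterion and is not the s4d implementation; the function name `delta_bic`, the penalty weight `lam`, and the synthetic features are assumptions made for this example.

```python
import numpy as np


def _logdet_cov(x):
    """Log-determinant of the maximum-likelihood covariance of the rows of x."""
    cov = np.cov(x, rowvar=False, bias=True)
    sign, logdet = np.linalg.slogdet(np.atleast_2d(cov))
    if sign <= 0:
        raise ValueError("covariance is not positive definite; need more frames")
    return logdet


def delta_bic(x, y, lam=1.0):
    """Delta-BIC between feature matrices x and y (frames x dimensions).

    Positive values favour two separate Gaussians (a speaker change);
    negative values favour a single shared Gaussian (same speaker).
    """
    n_x, n_y = len(x), len(y)
    n, dim = n_x + n_y, x.shape[1]
    # Extra free parameters of the two-model hypothesis: mean + covariance.
    penalty = 0.5 * (dim + 0.5 * dim * (dim + 1)) * np.log(n)
    gain = 0.5 * (n * _logdet_cov(np.vstack([x, y]))
                  - n_x * _logdet_cov(x)
                  - n_y * _logdet_cov(y))
    return gain - lam * penalty


# Synthetic 4-dimensional "features" standing in for MFCC frames.
rng = np.random.default_rng(0)
same_a = rng.normal(0.0, 1.0, size=(200, 4))
same_b = rng.normal(0.0, 1.0, size=(200, 4))
other = rng.normal(5.0, 1.0, size=(200, 4))

print(delta_bic(same_a, same_b) < 0)  # same distribution: no change expected
print(delta_bic(same_a, other) > 0)   # shifted distribution: change expected
```

In a segmentation pass, a positive value at a candidate boundary suggests a speaker change. In hierarchical clustering, the usual scheme is to merge the pair of clusters with the lowest value for as long as that value stays below zero.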
Music and jingle regions are not removed, but a speech activity
diarization can be loaded before segmenting and clustering the show.
Optionally, long segments are cut to be shorter than 20 seconds.
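The optional cutting of long segments can be illustrated with plain start and stop times. The sketch below splits every segment that reaches the maximum duration into equal-length pieces; it does not look for acoustically good cut points. The `(start, stop)` tuple layout in seconds and the function name `split_long_segments` are assumptions made for this example rather than the s4d data model.

```python
def split_long_segments(segments, max_duration=20.0):
    """Split (start, stop) segments so every piece is shorter than max_duration."""
    result = []
    for start, stop in segments:
        duration = stop - start
        if duration < max_duration:
            result.append((start, stop))
            continue
        # Smallest number of equal pieces that are each shorter than the limit.
        pieces = int(duration // max_duration) + 1
        step = duration / pieces
        for i in range(pieces):
            # Reuse the original stop for the last piece to avoid rounding drift.
            piece_stop = stop if i == pieces - 1 else start + (i + 1) * step
            result.append((start + i * step, piece_stop))
    return result


print(split_long_segments([(0.0, 12.5), (12.5, 57.5)]))
```

Here the 45-second segment is cut into three 15-second pieces, while the 12.5-second segment is left untouched.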
[1] C. Barras, X. Zhu, S. Meignier, and J. L. Gauvain, “Multistage speaker diarization of broadcast news,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1505–1512, Sep. 2006.
%% Cell type:markdown id: tags:
BIC diarization
===============
Arguments, variables and logger
-------------------------------
Set the logger
%% Cell type:code id: tags:
``` python
# Log at INFO level: progress messages are shown, debug output is hidden.
loglevel = logging.INFO
init_logging(level=loglevel)
```
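`init_logging` is assumed to come from imports made earlier in the notebook (not shown in this excerpt). For readers without those imports at hand, a roughly equivalent setup with the standard library alone might look like the sketch below. The function name `init_basic_logging` and the message format are assumptions, and the toolkit helper may configure more than this.

```python
import logging


def init_basic_logging(level=logging.INFO):
    """Configure the root logger to print timestamped messages at `level`."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s - %(levelname)s - %(message)s",
        force=True,  # replace handlers installed by earlier calls (Python 3.8+)
    )


init_basic_logging(logging.INFO)
logging.info("logger ready")
```

Switching the argument to `logging.DEBUG` would also surface debug messages emitted through the standard `logging` module.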
%% Cell type:markdown id: tags:
Set the input audio or MFCC file and the speech activity detection file (optional).