*Sidekit for Diarization* requires the following software to be installed on your platform:
1. [Python](http://www.python.org)
2. [NumPy](http://www.numpy.org/)
3. [SciPy](http://www.scipy.org/)
4. [Pandas](http://www.pandas.org/)
5. [GLPK](https://www.gnu.org/software/glpk/)
6. [Sphinx 1.1.0 or newer](http://sphinx-doc.org/) to build the documentation
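As a quick sanity check before installing, the requirements listed above can be probed from Python. This is only a sketch: the function name `missing_requirements` is hypothetical, the import names are assumed to match the package names, and looking for GLPK's `glpsol` command on the `PATH` is a heuristic that may miss library-only installs.

```python
import importlib.util
import shutil

# Import names assumed to correspond to the packages listed above.
REQUIRED_MODULES = ("numpy", "scipy", "pandas")


def missing_requirements(modules=REQUIRED_MODULES, executables=("glpsol",)):
    """Return the names of modules and executables that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # glpsol is GLPK's command-line solver; its absence from PATH is only a hint.
    missing += [exe for exe in executables if shutil.which(exe) is None]
    return missing


if __name__ == "__main__":
    print("Missing requirements:", missing_requirements() or "none")
```

An empty list suggests that the Python dependencies are importable and that a GLPK command-line tool is reachable.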
INSTALLATION
============
We recommend the use of a virtual environment (e.g. [Miniconda](https://conda.io/miniconda.html) or [Virtualenv](https://virtualenv.readthedocs.io/en/latest/)).
After downloading the project, install the requirements with:
```
pip install -r requirements.txt
```
Then proceed to install s4d:
```
./install.sh
```
TUTORIALS
=========
Once your installation is complete, you can take a look at the [tutorials](https://git-lium.univ-lemans.fr/Meignier/s4d/tree/master/tutorials).
"/Users/Sulfyderz/Desktop/Doctorat/Tools/Environments/miniconda/Python3/lib/python3.6/site-packages/sidekit/bosaris/detplot.py:39: UserWarning: matplotlib.pyplot as already been imported, this call will have no effect.\n",
" matplotlib.use('PDF')\n",
"WARNING:root:WARNNG: libsvm is not installed, please refer to the documentation if you intend to use SVM classifiers\n"
]
}
],
"outputs": [],
"source": [
"%matplotlib inline\n",
"\n",
...
...
@@ -96,7 +86,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -113,7 +103,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
...
...
@@ -133,7 +123,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
...
...
%% Cell type:markdown id: tags:
Diarization for ASR
===================
This script performs BIC diarization (usually for ASR decoding).
The proposed diarization system was inspired by the
system [1], which won the RT'04 fall evaluation
and the ESTER1 evaluation. It was developed during the ESTER2
evaluation campaign for transcription, with the goal of minimizing
word error rate.
Automatic transcription requires accurate segment boundaries. Segment
boundaries have to be set within non-informative zones such as filler
words.
Speaker diarization needs to produce homogeneous speech segments;
however, purity and coverage of the speaker clusters are the main
objectives here. Errors such as having two distinct clusters (i.e.,
detected speakers) corresponding to the same real speaker, or
conversely, merging segments of two real speakers into only one cluster,
incur a heavier penalty in the NIST time-based diarization metric than
misplaced boundaries.
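To make the two error types concrete, here is a minimal sketch of cluster purity and coverage computed from per-frame labels. The frame-level definitions and the function name `purity_and_coverage` are assumptions made for illustration; this is not the NIST scoring tool and not part of s4d.

```python
from collections import Counter


def purity_and_coverage(reference, hypothesis):
    """Frame-level purity and coverage.

    reference:  list of true speaker labels, one per frame
    hypothesis: list of detected cluster labels, one per frame
    """
    if len(reference) != len(hypothesis) or not reference:
        raise ValueError("label sequences must be non-empty and of equal length")
    overlap = Counter(zip(hypothesis, reference))
    best_per_cluster = {}
    best_per_speaker = {}
    for (cluster, speaker), count in overlap.items():
        # Purity: frames of each cluster belonging to its dominant speaker.
        best_per_cluster[cluster] = max(best_per_cluster.get(cluster, 0), count)
        # Coverage: frames of each speaker falling in their dominant cluster.
        best_per_speaker[speaker] = max(best_per_speaker.get(speaker, 0), count)
    total = len(reference)
    return sum(best_per_cluster.values()) / total, sum(best_per_speaker.values()) / total


reference = ["A"] * 4 + ["B"] * 4
split = ["c1", "c1", "c2", "c2", "c3", "c3", "c3", "c3"]  # speaker A split in two
merged = ["c1"] * 8                                        # both speakers merged
print(purity_and_coverage(reference, split))   # (1.0, 0.75)
print(purity_and_coverage(reference, merged))  # (0.5, 1.0)
```

Splitting one speaker into two clusters lowers coverage, while merging two speakers into one cluster lowers purity.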
The system is composed of acoustic BIC segmentation followed by BIC
hierarchical clustering. Viterbi decoding is performed to adjust the
segment boundaries.
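The BIC criterion used for both segmentation and clustering can be illustrated with a small ΔBIC computation. This is a simplified sketch and not the s4d implementation: it assumes one diagonal-covariance Gaussian per segment (an assumption made to keep the example dependency-free), and the names `delta_bic`, `log_det_diag`, `make_frames` and `penalty_weight` are hypothetical.

```python
import math
import random


def log_det_diag(frames):
    """Log-determinant of the diagonal ML covariance of a list of feature vectors."""
    n = len(frames)
    total = 0.0
    for d in range(len(frames[0])):
        column = [frame[d] for frame in frames]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        total += math.log(max(variance, 1e-10))  # floor to avoid log(0)
    return total


def delta_bic(left, right, penalty_weight=1.0):
    """Positive values favour two separate models (change point / no merge)."""
    n1, n2 = len(left), len(right)
    n = n1 + n2
    dim = len(left[0])
    gain = 0.5 * (n * log_det_diag(left + right)
                  - n1 * log_det_diag(left)
                  - n2 * log_det_diag(right))
    n_params = 2 * dim  # one mean and one variance per dimension
    penalty = 0.5 * penalty_weight * n_params * math.log(n)
    return gain - penalty


def make_frames(rng, n_frames, mean, dim=2):
    """Synthetic unit-variance Gaussian frames, used only for the demonstration."""
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n_frames)]


rng = random.Random(0)
speaker_a = make_frames(rng, 200, 0.0)
speaker_b = make_frames(rng, 200, 5.0)
print(delta_bic(speaker_a, speaker_b) > 0)  # different sources: keep them apart
```

In segmentation, a positive ΔBIC between two adjacent windows suggests a boundary; in hierarchical clustering, the pair of clusters with the lowest (negative) ΔBIC is the natural candidate to merge.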
Music and jingle regions are not removed, but a speech activity
diarization can be loaded before segmenting and clustering the show.
Optionally, long segments are cut to be shorter than 20 seconds.
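The optional cutting of long segments can be pictured with the sketch below. It splits a segment uniformly into pieces no longer than a maximum duration; uniform splitting is an assumption made for illustration, and `split_segment` is a hypothetical helper, not an s4d function (the actual system may choose its cut points differently).

```python
import math


def split_segment(start, stop, max_duration=20.0):
    """Split (start, stop), in seconds, into equal pieces of at most max_duration."""
    if stop < start:
        raise ValueError("stop must not precede start")
    duration = stop - start
    if duration <= max_duration:
        return [(start, stop)]
    n_pieces = math.ceil(duration / max_duration)
    step = duration / n_pieces
    bounds = [start + i * step for i in range(n_pieces)] + [stop]
    return list(zip(bounds[:-1], bounds[1:]))


for piece in split_segment(0.0, 50.0):
    print(piece)  # three pieces of roughly 16.7 seconds each
```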
[1] C. Barras, X. Zhu, S. Meignier, and J. L. Gauvain, “Multistage speaker diarization of broadcast news,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1505–1512, Sep. 2006.
/Users/Sulfyderz/Desktop/Doctorat/Tools/Environments/miniconda/Python3/lib/python3.6/site-packages/sidekit/bosaris/detplot.py:39: UserWarning: matplotlib.pyplot as already been imported, this call will have no effect.
matplotlib.use('PDF')
WARNING:root:WARNNG: libsvm is not installed, please refer to the documentation if you intend to use SVM classifiers
%% Cell type:markdown id: tags:
BIC diarization
===============
Arguments, variables and logger
-------------------------------
Set the logger
%% Cell type:code id: tags:
``` python
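# Log at INFO level; `logging` and `init_logging` are assumed to be
# imported in an earlier cell of the notebook.
loglevel = logging.INFO
init_logging(level=loglevel)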
```
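The body of `init_logging` is not shown in this notebook. For readers who want comparable behaviour outside the toolkit, the following is a hypothetical stand-in built only on the standard `logging` module; the name `init_logging_sketch` and the message format are assumptions, and the real function may do more.

```python
import logging


def init_logging_sketch(level=logging.INFO):
    """Hypothetical stand-in: set the root logger level and attach one handler."""
    root = logging.getLogger()
    root.setLevel(level)
    if not root.handlers:  # avoid stacking handlers when the cell is re-run
        handler = logging.StreamHandler()
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        root.addHandler(handler)
    return root


logger = init_logging_sketch(logging.INFO)
logger.info("logger initialised")
```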
%% Cell type:markdown id: tags:
Set the input audio or MFCC file and, optionally, the speech activity detection file.