Commit 0996493f authored by Anthony Larcher
See **ALLIES_evaluation_plan_V0.pdf** in this repository.
![The ALLIES lifelong learning framework](./allies_baseline.png)
#### Human assisted learning
#### Diarization across time
The *Diarization across time* task simulates the evaluation of an automatic
system across time. The system is allowed to update its models using any
incoming audio data, for instance to improve a speech model or to update
model clusters, and to generate a new version of them to handle the next show.
See the evaluation plan for more details.
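The show-by-show protocol described above can be sketched as follows. This is an illustrative outline, not the official harness: the class and method names (`DiarizationSystem`, `diarize`, `adapt`, `run_across_time`) are assumptions made for this example.

```python
# Illustrative sketch of the "diarization across time" protocol:
# shows are processed in chronological order, and the system may
# update its models after each show to better handle the next one.

class DiarizationSystem:
    """Hypothetical system interface; not the official API."""

    def __init__(self):
        self.model_version = 0

    def diarize(self, show_audio):
        # Produce a diarization hypothesis for one show (stubbed here).
        return {"show": show_audio, "model_version": self.model_version}

    def adapt(self, show_audio):
        # Unsupervised update of the internal models with the incoming audio.
        self.model_version += 1


def run_across_time(system, shows):
    hypotheses = []
    for show in shows:            # shows arrive in chronological order
        hypotheses.append(system.diarize(show))
        system.adapt(show)        # the updated model handles the next show
    return hypotheses
```

The key constraint is that each show is diarized with the model state produced by all previous shows, never with information from future ones.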
#### Lifelong learning speaker diarization
The protocol is similar to the one in *Diarization across time*, except that
the system is allowed to perform *active* and *interactive learning* during
its adaptation.
The task is performed through a set of Python scripts that simulate
the user for the human-assisted learning. These scripts require
access to the test references; participants are advised to make sure
their systems do not use these references by mistake.
The user simulation used to evaluate human-assisted learning
allows two types of actions: active or interactive learning. In
active learning, the system asks questions and the user answers them (up to
a point). In interactive learning, the user spontaneously provides
questions and the associated answers, choosing the ones with potentially the
biggest impact on the DER.
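A minimal sketch of how such a user simulation might answer active-learning questions is given below. The reference format (a mapping from segment id to speaker label), the "same speaker?" question type, and the `budget` parameter are all assumptions for this example; the official simulation scripts define the real interaction protocol.

```python
# Hypothetical simulated user for active learning: the system asks
# "are these two segments from the same speaker?" and the user answers
# from the test reference, up to a fixed question budget ("up to a point").

def make_simulated_user(reference, budget):
    state = {"asked": 0}

    def answer(seg_a, seg_b):
        if state["asked"] >= budget:
            return None  # the user stops answering once the budget is spent
        state["asked"] += 1
        # Answer truthfully from the reference annotation.
        return reference[seg_a] == reference[seg_b]

    return answer
```

Note that the reference is only visible to the simulated user, never to the system being evaluated, which is why the scripts need access to the test references.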
#### Lifelong learning and incremental processing
### The metrics
#### DER Across time
Evaluation-wise, validated speakers are associated with the same-name
speakers in the hypothesis. New speakers are associated
optimally with the not-yet-validated speakers of the reference so as to
minimize the DER. The final DER is the time-weighted mean of the
DERs computed for each show.
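The final aggregation step can be written as a weighted mean. Weighting each show's DER by its duration is an assumption about what "time-weighted" means here; the evaluation plan gives the exact rule.

```python
# A minimal sketch of the across-time metric: the mean of the per-show
# DERs, each show weighted by its duration (assumed interpretation of
# the time-weighted mean).

def der_across_time(ders, durations):
    total = sum(durations)
    return sum(d * w for d, w in zip(ders, durations)) / total
```

For example, a 10-minute show at 10% DER and a 30-minute show at 30% DER yield a final DER of 25%, not the unweighted 20%.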
#### Penalized DER for human assisted learning
The performance of the system is a sequence of DER values, one per document.
The DER is computed on the final version of the hypothesis for each document
and penalized by the cost of interacting with the *user in the loop*.
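The penalty can be sketched as an additive interaction-cost term. The linear cost model and the `cost_per_interaction` parameter are assumptions for illustration; the actual penalty function is defined in the evaluation plan.

```python
# Hedged sketch of the penalized DER: the per-document DER plus an
# assumed linear cost for each interaction with the user in the loop.

def penalized_der(der, n_interactions, cost_per_interaction):
    return der + n_interactions * cost_per_interaction
```

The intent is that asking the user many questions must pay for itself: the interactions are only worthwhile if they reduce the DER by more than their cumulative cost.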
For more information about the metrics, please refer to the evaluation plan or to:
* Prokopalo, Yevhenii, Sylvain Meignier, Olivier Galibert, Loïc Barrault, and Anthony Larcher. "Evaluation of Lifelong Learning Systems." In International Conference on Language Resources and Evaluation. 2020.