evalmate.alignment

This module contains functionality for aligning labels of a ground truth with the labels of a system output.

Base classes

All aligners are based on either EventAligner or SegmentAligner. The base classes differ mainly in the type of alignment they return: the EventAligner returns a mapping between complete labels, whereas the SegmentAligner returns segments that can span parts of labels.

class evalmate.alignment.EventAligner[source]

Abstract class for aligner classes that return a mapping between labels (events).

An alignment is a mapping between labels from the ground truth (ref) and the system output (hyp). If there is no matching label in the system output for a label in the ground truth, it has to be aligned to None and vice versa. A single label can be aligned to multiple other labels.

align(ref_labels, hyp_labels)[source]

Return an alignment between the labels of the two label-lists.

Parameters:
  • ref_labels (list) – The list containing labels of the ground truth.
  • hyp_labels (list) – The list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label from the ground truth and one from the system output that are aligned. Either of them can also be None.

Return type:

list

class evalmate.alignment.SegmentAligner[source]

Abstract class for aligner classes that align labels in segments.

An alignment is represented as a list of Segments, each with a start/end-time and the labels from the ground truth and the system output that lie within this segment.

align(ref_labels, hyp_labels)[source]

Return an alignment of segments.

Parameters:
  • ref_labels (list) – The list containing labels of the ground truth.
  • hyp_labels (list) – The list containing labels of the system output.
Returns:

A list of evalmate.utils.structure.Segment. Every segment has start/end-time and two lists of labels that are contained in the segment (one for the ground truth and one for the system output).

Return type:

list

Time-Based

Align labels using a distance metric based on their start/end-times.

class evalmate.alignment.BipartiteMatchingAligner(candidate_finder=None, non_overlap_penalty_weight=1, substitution_penalty=2, insertion_penalty=10, deletion_penalty=10)[source]

Create event-based alignment, based on bipartite matching.

1. First, for every possible label-pair between ref and hyp, it is decided whether the pair can be mapped at all. A CandidateFinder is used for this.

2. Using the penalty and weight parameters, a penalty for aligning the pair is computed for every candidate pair.

3. From all the pairs and the computed penalties, the best alignment is computed using bipartite matching, so that every label occurs only once in the final alignment.

Parameters:
  • candidate_finder (CandidateFinder) – CandidateFinder to use for finding potential labels for alignment.
  • non_overlap_penalty_weight (float) – Weight-factor of penalty for the non-overlapping ratio between two labels.
  • substitution_penalty (float) – Penalty for aligning two labels with different values.
  • deletion_penalty (float) – Penalty for aligning a reference-label with no hypothesis-label.
  • insertion_penalty (float) – Penalty for aligning a hypothesis-label with no reference-label.
align(ref_labels, hyp_labels)[source]

Return an alignment between the events of the given label-lists.

Parameters:
  • ref_labels (list) – The list containing labels of the ground truth.
  • hyp_labels (list) – The list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label (event) from the ground truth and one from the system output that are aligned. Either of them can also be None.

Return type:

list
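The three steps described for BipartiteMatchingAligner can be sketched without the library. This is only an illustration under stated assumptions, not evalmate's implementation: labels are plain (value, start, end) tuples, any overlapping pair counts as a candidate, the penalty formula in pair_penalty is a guess at how the parameters might combine, and the optimal assignment is found by brute force rather than by a real bipartite matching solver.

```python
from itertools import permutations


def overlap(a, b):
    """Overlap in seconds between two (value, start, end) tuples."""
    return max(0.0, min(a[2], b[2]) - max(a[1], b[1]))


def pair_penalty(ref, hyp, non_overlap_penalty_weight=1, substitution_penalty=2):
    """Assumed formula: weighted non-overlapping ratio plus a substitution term."""
    span = max(ref[2], hyp[2]) - min(ref[1], hyp[1])
    penalty = non_overlap_penalty_weight * (1.0 - overlap(ref, hyp) / span)
    if ref[0] != hyp[0]:
        penalty += substitution_penalty
    return penalty


def align(refs, hyps, insertion_penalty=10, deletion_penalty=10):
    """Brute-force search for the cheapest one-to-one assignment (small inputs only)."""
    # Each ref is assigned either a distinct hyp index or None (no counterpart).
    slots = list(range(len(hyps))) + [None] * len(refs)
    best_cost, best_pairs = float("inf"), []
    for assignment in permutations(slots, len(refs)):
        cost, pairs = 0.0, []
        for ref, j in zip(refs, assignment):
            if j is None:
                cost += deletion_penalty
                pairs.append((ref, None))
            elif overlap(ref, hyps[j]) > 0:  # step 1: is the pair a candidate?
                cost += pair_penalty(ref, hyps[j])  # step 2: penalty of the pair
                pairs.append((ref, hyps[j]))
            else:
                cost = float("inf")  # not a candidate, discard this assignment
                break
        used = {j for j in assignment if j is not None}
        unused = [h for j, h in enumerate(hyps) if j not in used]
        cost += insertion_penalty * len(unused)
        pairs += [(None, h) for h in unused]
        if cost < best_cost:  # step 3: keep the cheapest complete assignment
            best_cost, best_pairs = cost, pairs
    return best_pairs


refs = [("a", 0, 3), ("b", 3, 6)]
hyps = [("a", 0.5, 3), ("c", 8, 10)]
result = align(refs, hyps)
```

Here the two "a" labels are paired, "b" has no overlapping hypothesis and is paired with None (a deletion), and "c" is paired with None on the reference side (an insertion). For realistic input sizes, a proper assignment solver would replace the permutation loop.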

class evalmate.alignment.FullMatchingAligner(min_overlap=0)[source]

Event-based alignment in which all possible matches are returned. A single label can therefore occur multiple times, each time with a different counterpart.

Parameters: min_overlap (float) – Minimum duration in seconds of the overlap required to align two labels. If 0, any overlap is accepted.
align(ref_labels, hyp_labels)[source]

Return an alignment between the labels of the two label-lists.

Parameters:
  • ref_labels (list) – The list containing labels of the ground truth.
  • hyp_labels (list) – The list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label (event) from the ground truth and one from the system output that are aligned. Either of them can also be None.

Return type:

list
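The idea behind FullMatchingAligner can be illustrated with a small stand-alone sketch. It assumes labels are plain (value, start, end) tuples and that labels without any counterpart are paired with None. The function name full_matching and the exact comparison against min_overlap are assumptions made for illustration, not the library's code.

```python
def full_matching(refs, hyps, min_overlap=0):
    """Return every overlapping (ref, hyp) pair, plus unmatched labels paired with None."""
    pairs = []
    matched_refs, matched_hyps = set(), set()
    for i, ref in enumerate(refs):
        for j, hyp in enumerate(hyps):
            overlap = min(ref[2], hyp[2]) - max(ref[1], hyp[1])
            # Assumption: the overlap must be positive and at least min_overlap seconds.
            if overlap > 0 and overlap >= min_overlap:
                pairs.append((ref, hyp))
                matched_refs.add(i)
                matched_hyps.add(j)
    # Labels that never matched are reported with None as counterpart.
    pairs += [(r, None) for i, r in enumerate(refs) if i not in matched_refs]
    pairs += [(None, h) for j, h in enumerate(hyps) if j not in matched_hyps]
    return pairs


refs = [("a", 0, 5)]
hyps = [("x", 0, 2), ("y", 2, 5), ("z", 7, 9)]
result = full_matching(refs, hyps)
```

The reference label "a" appears twice in the result, once with "x" and once with "y", while "z" overlaps nothing and is paired with None.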

Sequence-Based

Align labels only considering the ordering of the sequence.

class evalmate.alignment.LevenshteinAligner(deletion_cost=3, insertion_cost=3, substitution_cost=4, custom_substitution_cost_function=None)[source]

Alignment of labels of two label-lists based on the Levenshtein distance (https://en.wikipedia.org/wiki/Levenshtein_distance).

This only takes the order of the labels into account, not the start and end-times.

Parameters:
  • deletion_cost (float) – Cost for a deletion in the alignment.
  • insertion_cost (float) – Cost for an insertion in the alignment.
  • substitution_cost (float) – Cost for a substitution in the alignment.
  • custom_substitution_cost_function (func) – Function to calculate the substitution cost depending on the elements. The function has to take two parameters (ref-label, hyp-label).
align(ref_labels, hyp_labels)[source]

Return an alignment between the labels of the given label-lists.

Parameters:
  • ref_labels (list) – The list containing labels of the ground truth.
  • hyp_labels (list) – The list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label from the ground truth and one from the system output that are aligned. Either of them can also be None.

Return type:

list

Example

>>> from audiomate.corpus import assets
>>> from evalmate.alignment import LevenshteinAligner
>>>
>>> reference = [
>>>     assets.Label('a'),
>>>     assets.Label('b'),
>>>     assets.Label('c')
>>> ]
>>> hypothesis = [
>>>     assets.Label('a'),
>>>     assets.Label('c')
>>> ]
>>>
>>> LevenshteinAligner().align(reference, hypothesis)
[
    LabelPair(Label('a'), Label('a')),
    LabelPair(Label('b'), None),
    LabelPair(Label('c'), Label('c'))
]
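How such an alignment can be derived is sketched below with a plain dynamic-programming table and a backtrace. The sketch works on label values (strings) instead of Label objects, the function name levenshtein_align is hypothetical, and the tie-breaking order in the backtrace is an assumption, so it is not necessarily how LevenshteinAligner resolves ambiguous cases.

```python
def levenshtein_align(ref, hyp, deletion_cost=3, insertion_cost=3, substitution_cost=4):
    """Align two sequences by minimal edit cost and return (ref, hyp) pairs."""
    n, m = len(ref), len(hyp)
    # dist[i][j] is the minimal cost of aligning ref[:i] with hyp[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * deletion_cost
    for j in range(1, m + 1):
        dist[0][j] = j * insertion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else substitution_cost
            dist[i][j] = min(
                dist[i - 1][j - 1] + sub,          # match or substitution
                dist[i - 1][j] + deletion_cost,    # ref element without counterpart
                dist[i][j - 1] + insertion_cost,   # hyp element without counterpart
            )

    # Walk back from the bottom-right corner to recover the pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        sub = substitution_cost
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1]:
            sub = 0
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + sub:
            pairs.append((ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + deletion_cost:
            pairs.append((ref[i - 1], None))
            i -= 1
        else:
            pairs.append((None, hyp[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


print(levenshtein_align(["a", "b", "c"], ["a", "c"]))
# [('a', 'a'), ('b', None), ('c', 'c')]
```

With the default costs, a substitution (4) is cheaper than a deletion followed by an insertion (6), so differing labels at the same position are paired rather than split into two pairs with None.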

Segment-Based

Align labels based on segments defined by start/end-time.

class evalmate.alignment.InvariantSegmentAligner[source]

Create a segment-based alignment so that the set of active labels stays the same within every segment. For example, suppose the reference is the following label-list:

>>> [   A   ]     [     B     ]    [      A     ]
>>>                               [      E     ]

The output of some system (hypothesis) may be as follows:

>>> [   Ax  ]     [  Ex ]                           [  Ax ]

The returned segments are then created so that every segment represents a time range in which the active labels do not change.

>>>         S1      S2   S3    S4    S5       S6      S7   S8
>>>
>>> REF  |   A   |     |  B  |  B  |    |      A     |   |     |
>>> REF  |       |     |     |     |    |      E     |   |     |
>>> HYP  |   Ax  |     |  Ex |     |    |            |   |  Ax |
align(ref_labels, hyp_labels)[source]

Create a segment-based alignment.

Parameters:
  • ref_labels (list) – The list with reference labels.
  • hyp_labels (list) – The list with hypothesis labels.
Returns:

A list of Segments.

Return type:

list

Example

>>> from audiomate.corpus import assets
>>> from evalmate.alignment import InvariantSegmentAligner
>>>
>>> ref = [
>>>     assets.Label('a', 0, 3),
>>>     assets.Label('b', 3, 6),
>>>     assets.Label('c', 7, 10)
>>> ]
>>>
>>> hyp = [
>>>     assets.Label('a', 0, 3),
>>>     assets.Label('b', 4, 8),
>>>     assets.Label('c', 8, 10)
>>> ]
>>>
>>> InvariantSegmentAligner().align(ref, hyp)
[
    0 - 3 REF: [Label(a, 0, 3)] HYP: [Label(a, 0, 3)]
    3 - 4 REF: [Label(b, 3, 6)] HYP: []
    4 - 6 REF: [Label(b, 3, 6)] HYP: [Label(b, 4, 8)]
    6 - 7 REF: [] HYP: [Label(b, 4, 8)]
    7 - 8 REF: [Label(c, 7, 10)] HYP: [Label(b, 4, 8)]
    8 - 10 REF: [Label(c, 7, 10)] HYP: [Label(c, 8, 10)]
]
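The segmentation in this example can be reproduced with a few lines of plain Python. The sketch below is a simplified illustration, not the library's algorithm: labels are (value, start, end) tuples, every start or end time becomes a segment boundary, and no time threshold is applied. The function name invariant_segments is hypothetical.

```python
def invariant_segments(refs, hyps):
    """Split the timeline at every label boundary and list the labels active in each part."""
    # Every start or end time of any label is a potential segment boundary.
    boundaries = sorted({t for _, start, end in refs + hyps for t in (start, end)})
    segments = []
    for seg_start, seg_end in zip(boundaries, boundaries[1:]):
        # A label is active if it covers the whole segment.
        active_ref = [l for l in refs if l[1] <= seg_start and l[2] >= seg_end]
        active_hyp = [l for l in hyps if l[1] <= seg_start and l[2] >= seg_end]
        segments.append((seg_start, seg_end, active_ref, active_hyp))
    return segments


ref = [("a", 0, 3), ("b", 3, 6), ("c", 7, 10)]
hyp = [("a", 0, 3), ("b", 4, 8), ("c", 8, 10)]

for start, end, active_ref, active_hyp in invariant_segments(ref, hyp):
    print(start, "-", end, "REF:", active_ref, "HYP:", active_hyp)
```

For the labels above, this yields the same six time ranges (0 - 3, 3 - 4, 4 - 6, 6 - 7, 7 - 8, 8 - 10) with the same active labels as the example output.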
static create_event_list(ref_labels, hyp_labels, time_threshold=0.01)[source]

Create an event list of all labels.

Parameters:
  • ref_labels (list) – Reference labels.
  • hyp_labels (list) – Hypothesis labels.
  • time_threshold (float) – If two event times are closer than this threshold, the time of the earlier event is used for both events.
Returns:

List of list of tuples. Every tuple contains a time, a type (start or end), an ll_index (ref/hyp) and the label that is responsible for the event. The list is sorted by time in ascending order.

Return type:

list
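The effect of time_threshold can be illustrated with a stand-alone sketch. Several details here are assumptions rather than documented behaviour: labels are (value, start, end) tuples, the event type is encoded as "S" or "E", ll_index is 0 for ref and 1 for hyp, and events that end up with the same time are grouped into one inner list (one possible reading of "list of list of tuples").

```python
def create_event_list(refs, hyps, time_threshold=0.01):
    """Collect start/end events of all labels and merge events that are close in time."""
    events = []
    for ll_index, labels in enumerate((refs, hyps)):  # 0 = ref, 1 = hyp (assumed encoding)
        for label in labels:
            events.append((label[1], "S", ll_index, label))
            events.append((label[2], "E", ll_index, label))
    events.sort(key=lambda event: event[0])

    groups = []
    for time, kind, ll_index, label in events:
        if groups and time - groups[-1][0][0] < time_threshold:
            # Closer than the threshold: reuse the time of the earlier event.
            groups[-1].append((groups[-1][0][0], kind, ll_index, label))
        else:
            groups.append([(time, kind, ll_index, label)])
    return groups


refs = [("a", 0, 3)]
hyps = [("a", 0.005, 3)]
event_groups = create_event_list(refs, hyps)
```

In this example the hypothesis start at 0.005 is closer than 0.01 seconds to the reference start at 0, so both start events carry the time 0 and fall into the same group.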

static set_absolute_end_of_labels(labels)[source]

If there are any labels whose end is defined as -1 (end of utterance), replace it with the concrete end time.

Parameters: labels (list) – The list of labels to process.

Candidates

Classes to find possible pairs of labels for alignment.

class evalmate.alignment.CandidateFinder[source]

Class to find possible pairs of labels for further alignment. This is used for preprocessing and finding pairs of labels that may be aligned together. A label can be a candidate in multiple pairs.

find(ref_labels, hyp_labels)[source]

Return candidates as pairs of labels, as well as labels that have no possible counterparts.

Parameters:
  • ref_labels (list) – List with reference labels (ground truth).
  • hyp_labels (list) – List with hypothesis labels (system output).
Returns:

A tuple (candidates, single-ref, single-hyp) containing the candidate pairs, the ref-labels that have no possible counterpart, and the hyp-labels that have no possible counterpart.

Return type:

tuple

class evalmate.alignment.StartEndCandidateFinder(start_delta_threshold, end_delta_threshold=-1)[source]

Finds candidate pairs based on the difference between the start (and optionally end) times of two labels.

Parameters:
  • start_delta_threshold (float) – Temporal tolerance of the start time in seconds. If the delta between the starts of the two labels is greater, the labels are not a matching pair.
  • end_delta_threshold (float) – Temporal tolerance of the end time in seconds. If the delta between the ends of the two labels is greater, the labels are not a matching pair. If < 0, the end time is not checked at all.
find(ref_labels, hyp_labels)[source]

Return candidates as pairs of labels, as well as labels that have no possible counterparts.

Parameters:
  • ref_labels (list) – List with reference labels (ground truth).
  • hyp_labels (list) – List with hypothesis labels (system output).
Returns:

A tuple (candidates, single-ref, single-hyp) containing the candidate pairs, the ref-labels that have no possible counterpart, and the hyp-labels that have no possible counterpart.

Return type:

tuple
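The threshold logic of StartEndCandidateFinder can be sketched as follows. The sketch assumes labels are (value, start, end) tuples and that a delta exactly equal to the threshold still counts as a match. The function name find_candidates is hypothetical and the code is not taken from the library.

```python
def find_candidates(refs, hyps, start_delta_threshold, end_delta_threshold=-1):
    """Return (candidates, single_ref, single_hyp) based on start/end time deltas."""
    candidates = []
    matched_refs, matched_hyps = set(), set()
    for i, ref in enumerate(refs):
        for j, hyp in enumerate(hyps):
            start_ok = abs(ref[1] - hyp[1]) <= start_delta_threshold
            # A negative end threshold disables the end-time check.
            end_ok = end_delta_threshold < 0 or abs(ref[2] - hyp[2]) <= end_delta_threshold
            if start_ok and end_ok:
                candidates.append((ref, hyp))
                matched_refs.add(i)
                matched_hyps.add(j)
    single_ref = [r for i, r in enumerate(refs) if i not in matched_refs]
    single_hyp = [h for j, h in enumerate(hyps) if j not in matched_hyps]
    return candidates, single_ref, single_hyp


refs = [("a", 0, 3), ("b", 5, 8)]
hyps = [("a", 0.2, 4), ("b", 6, 8)]

# Only the start times are compared: the two "a" labels are candidates.
start_only = find_candidates(refs, hyps, start_delta_threshold=0.5)

# With an end tolerance of 0.5 seconds the "a" labels no longer match (ends 3 vs. 4).
start_and_end = find_candidates(refs, hyps, start_delta_threshold=0.5, end_delta_threshold=0.5)
```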

class evalmate.alignment.OverlapCandidateFinder(min_overlap=0.05)[source]

Finds candidates based on the amount of overlap between two labels.

Parameters: min_overlap (float) – Minimum duration in seconds of the overlap required to include the combination of labels (default 0.05 seconds).
find(ref_labels, hyp_labels)[source]

Return candidates as pairs of labels, as well as labels that have no possible counterparts.

Parameters:
  • ref_labels (list) – List with reference labels (ground truth).
  • hyp_labels (list) – List with hypothesis labels (system output).
Returns:

A tuple (candidates, single-ref, single-hyp) containing the candidate pairs, the ref-labels that have no possible counterpart, and the hyp-labels that have no possible counterpart.

Return type:

tuple

Utils

class evalmate.alignment.Segment(start, end, ref=None, hyp=None)[source]

A class representing a segment within an alignment.

Parameters:
  • start (float) – The start time in seconds.
  • end (float) – The end time in seconds.
Variables:
  • ref (Label, list) – A single reference label or a list of reference labels in the segment.
  • hyp (Label, list) – A single hypothesis label or a list of hypothesis labels in the segment.
class evalmate.alignment.LabelPair(ref, hyp)[source]

Class to hold a pair of labels.

Variables:
  • ref (Label) – Reference label.
  • hyp (Label) – Hypothesis label.
max_length()[source]

Return the length of the longer value from ref and hyp.

padded_hyp_value()[source]

Return the hypothesis value as a string, padded to the length of the longer value of ref and hyp.

padded_ref_value()[source]

Return the reference value as a string, padded to the length of the longer value of ref and hyp.
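The padding helpers are useful for printing aligned pairs in columns. The following sketch shows the idea on plain strings. It assumes padding with spaces on the right, which the documentation does not specify, and the function name padded_values is hypothetical.

```python
def padded_values(ref_value, hyp_value):
    """Pad both values with trailing spaces to the length of the longer one."""
    max_length = max(len(ref_value), len(hyp_value))
    return ref_value.ljust(max_length), hyp_value.ljust(max_length)


ref_padded, hyp_padded = padded_values("hello", "hi")
print("REF: |" + ref_padded + "|")
print("HYP: |" + hyp_padded + "|")
```

Both printed values have the same width, so consecutive pairs line up when printed side by side.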