evalmate.alignment

This module contains functionality for aligning labels of a ground truth with the labels of a system output.

Base classes

All aligners are based on either EventAligner or SegmentAligner. The base classes are mainly distinguished by the type of alignment they return: the EventAligner returns a mapping between complete labels, while the SegmentAligner returns segments that can span parts of labels.

class evalmate.alignment.EventAligner[source]

Abstract class for aligner classes that return a mapping between labels (events).

An alignment is a mapping between labels from the ground truth (ref) and the system output (hyp). If there is no matching label in the system output for a label in the ground truth, that label is aligned to None, and vice versa. A single label can be aligned to multiple other labels.

align(ref, hyp)[source]

Return an alignment between the labels of the two label-lists.

Parameters:
  • ref (audiomate.corpus.assets.LabelList) – The label-list containing labels of the ground truth.
  • hyp (audiomate.corpus.assets.LabelList) – The label-list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label from the ground truth and one from the system output that are aligned to each other. Either of them can also be None.

Return type:

list
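
Example

Custom event aligners can be implemented by subclassing EventAligner and overriding align. The following minimal sketch is not part of evalmate: IndexAligner is a hypothetical name, and it simply pairs labels by their position in the two lists, assuming the label-lists expose their labels via the labels attribute (as used in the examples further down). It only illustrates the expected return value, a list of evalmate.alignment.LabelPair.

>>> from evalmate import alignment
>>>
>>> class IndexAligner(alignment.EventAligner):
>>>     """Hypothetical aligner that pairs labels by their index in the lists."""
>>>
>>>     def align(self, ref, hyp):
>>>         pairs = []
>>>
>>>         for index in range(max(len(ref.labels), len(hyp.labels))):
>>>             ref_label = ref.labels[index] if index < len(ref.labels) else None
>>>             hyp_label = hyp.labels[index] if index < len(hyp.labels) else None
>>>             pairs.append(alignment.LabelPair(ref_label, hyp_label))
>>>
>>>         return pairs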

class evalmate.alignment.SegmentAligner[source]

Abstract class for aligner classes that align labels in segments.

An alignment is represented as a list of Segments, each with a start/end-time and the labels from the ground truth and the system output that fall within the segment.

align(ref, hyp)[source]

Return an alignment of segments.

Parameters:
  • ref (audiomate.corpus.assets.LabelList) – The label-list containing labels of the ground truth.
  • hyp (audiomate.corpus.assets.LabelList) – The label-list containing labels of the system output.
Returns:

A list of evalmate.utils.structure.Segment. Every segment has a start/end-time and two lists of the labels contained in the segment (one for the ground truth and one for the system output).

Return type:

list
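
Example

A sketch of post-processing the returned segments, independent of the concrete segment aligner. matching_duration is a hypothetical helper (not part of evalmate) that sums up the duration of all segments in which the ground truth and the system output contain a label with the same value. It assumes every segment exposes start, end, ref and hyp as described for evalmate.alignment.Segment, and that labels have a value attribute.

>>> def matching_duration(segments):
>>>     """Total duration of segments where ref and hyp share a label value."""
>>>     total = 0.0
>>>
>>>     for segment in segments:
>>>         ref_values = {label.value for label in segment.ref}
>>>         hyp_values = {label.value for label in segment.hyp}
>>>
>>>         if ref_values & hyp_values:
>>>             total += segment.end - segment.start
>>>
>>>     return total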

Time-Based

Align labels based on a distance metric computed from their start/end-times.

class evalmate.alignment.BipartiteMatchingAligner(start_delta_threshold=0.5, end_delta_threshold=-1, non_overlap_penalty_weight=1, substitution_penalty=2, insertion_penalty=10, deletion_penalty=10)[source]

Create an event-based alignment based on bipartite matching.

1. In a first step, for every possible label-pair between ref and hyp it is decided whether a mapping of the pair is possible at all. This decision is based on the start_delta_threshold and end_delta_threshold.

2. Using the penalty and weight parameters, a penalty for aligning the pair is computed for every possible pair.

3. From all the pairs and the computed penalties, the best alignment is computed using bipartite matching, so that every label occurs at most once in the final alignment.

Parameters:
  • start_delta_threshold (float) – Temporal tolerance of the start time in seconds. If the delta between the start times of the two labels is greater, the pair is not considered a possible match.
  • end_delta_threshold (float) – Temporal tolerance of the end time in seconds. If the delta between the end times of the two labels is greater, the pair is not considered a possible match. If < 0, the end time is not checked at all.
  • non_overlap_penalty_weight (float) – Weight-factor of penalty for the non-overlapping ratio between two labels.
  • substitution_penalty (float) – Penalty for aligning two labels with different values.
  • deletion_penalty (float) – Penalty for aligning a reference-label with no hypothesis-label.
  • insertion_penalty (float) – Penalty for aligning a hypothesis-label with no reference-label.
align(ll_ref, ll_hyp)[source]

Return an alignment between the events of the given label-lists.

Parameters:
  • ll_ref (audiomate.corpus.assets.LabelList) – The label-list containing labels (events) of the ground truth.
  • ll_hyp (audiomate.corpus.assets.LabelList) – The label-list containing labels (events) of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label (event) from the ground truth and one from the system output that are aligned to each other. Either of them can also be None.

Return type:

list
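
Example

A usage sketch with hand-picked values; the pairing shown is only illustrative, since the actual result depends on the thresholds and penalties. With the default start_delta_threshold of 0.5 the two 'a' labels form a pair, while 'b' and 'c' have no possible counterpart and are paired with None.

>>> from audiomate.corpus import assets
>>> from evalmate import alignment
>>>
>>> ref = assets.LabelList(labels=[
>>>     assets.Label('a', 0.0, 3.0),
>>>     assets.Label('b', 3.0, 6.0)
>>> ])
>>> hyp = assets.LabelList(labels=[
>>>     assets.Label('a', 0.2, 3.1),
>>>     assets.Label('c', 7.0, 8.0)
>>> ])
>>>
>>> alignment.BipartiteMatchingAligner().align(ref, hyp)
[
    LabelPair(Label('a', 0.0, 3.0), Label('a', 0.2, 3.1)),
    LabelPair(Label('b', 3.0, 6.0), None),
    LabelPair(None, Label('c', 7.0, 8.0))
]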

class evalmate.alignment.FullMatchingAligner(min_overlap=0)[source]

Event-based alignment, where all possible matches are returned. A single label can therefore occur multiple times, each time with a different counterpart.

Parameters:min_overlap (float) – Minimum overlap in seconds that is required to align two labels. If 0, any overlap is accepted.
align(ref, hyp)[source]

Return an alignment between the events of the given label-lists.

Parameters:
  • ref (audiomate.corpus.assets.LabelList) – The label-list containing labels (events) of the ground truth.
  • hyp (audiomate.corpus.assets.LabelList) – The label-list containing labels (events) of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label (event) from the ground truth and one from the system output that are aligned to each other. Either of them can also be None.

Return type:

list
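
Example

A usage sketch showing that all overlapping matches are returned: the single reference label overlaps both hypothesis labels and therefore appears in two pairs. The output shown is illustrative.

>>> from audiomate.corpus import assets
>>> from evalmate import alignment
>>>
>>> ref = assets.LabelList(labels=[
>>>     assets.Label('a', 0.0, 4.0)
>>> ])
>>> hyp = assets.LabelList(labels=[
>>>     assets.Label('a', 0.0, 2.0),
>>>     assets.Label('b', 2.0, 4.0)
>>> ])
>>>
>>> alignment.FullMatchingAligner(min_overlap=0).align(ref, hyp)
[
    LabelPair(Label('a', 0.0, 4.0), Label('a', 0.0, 2.0)),
    LabelPair(Label('a', 0.0, 4.0), Label('b', 2.0, 4.0))
]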

Sequence-Based

Align labels by considering only the order of the sequence.

class evalmate.alignment.LevenshteinAligner(deletion_cost=3, insertion_cost=3, substitution_cost=4, custom_substitution_cost_function=None)[source]

Alignment of labels of two label-lists based on the Levenshtein distance (https://en.wikipedia.org/wiki/Levenshtein_distance).

This only takes the order of the labels into account, not the start and end-times.

Parameters:
  • deletion_cost (float) – Cost for a deletion in the alignment.
  • insertion_cost (float) – Cost for an insertion in the alignment.
  • substitution_cost (float) – Cost for a substitution in the alignment.
  • custom_substitution_cost_function (func) – Function to calculate the substitution cost depending on the elements. The function has to take two parameters (ref-label, hyp-label).
align(reference, hypothesis)[source]

Return an alignment between the labels of the given label-lists.

Parameters:
  • reference (audiomate.corpus.assets.LabelList) – The label-list containing labels of the ground truth.
  • hypothesis (audiomate.corpus.assets.LabelList) – The label-list containing labels of the system output.
Returns:

A list of evalmate.alignment.LabelPair. Every pair contains one label from the ground truth and one from the system output that are aligned to each other. Either of them can also be None.

Return type:

list

Example

>>> from audiomate.corpus import assets
>>>
>>> reference = assets.LabelList(labels=[
>>>     assets.Label('a'),
>>>     assets.Label('b'),
>>>     assets.Label('c')
>>> ])
>>> hypothesis = assets.LabelList(labels=[
>>>     assets.Label('a'),
>>>     assets.Label('c')
>>> ])
>>>
>>> LevenshteinAligner().align(reference, hypothesis)
[
    LabelPair(Label('a'), Label('a')),
    LabelPair(Label('b'), None),
    LabelPair(Label('c'), Label('c'))
]
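
The substitution cost can also be computed per pair via custom_substitution_cost_function. The following sketch (illustrative, reusing reference and hypothesis from above) makes substitutions between labels with equal values free and keeps the default cost of 4 otherwise.

>>> def substitution_cost(ref_label, hyp_label):
>>>     """No cost if the values match, otherwise the default substitution cost."""
>>>     if ref_label.value == hyp_label.value:
>>>         return 0
>>>     return 4
>>>
>>> aligner = LevenshteinAligner(custom_substitution_cost_function=substitution_cost)
>>> aligner.align(reference, hypothesis)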

Segment-Based

Align labels based on segments defined by start/end-time.

class evalmate.alignment.InvariantSegmentAligner[source]

Create a segment-based alignment, so that within every segment the same labels are active. For example, assume the reference is the following label-list.

>>> [   A   ]     [     B     ]    [      A     ]
>>>                               [      E     ]

The output of some system (hypothesis) may be as follows:

>>> [   Ax  ]     [  Ex ]                           [  Ax ]

The segments are then created so that every segment represents a time range in which the active labels do not change.

>>>         S1      S2   S3    S4    S5       S6      S7   S8
>>>
>>> REF  |   A   |     |  B  |  B  |    |      A     |   |     |
>>> REF  |       |     |     |     |    |      E     |   |     |
>>> HYP  |   Ax  |     |  Ex |     |    |            |   |  Ax |
align(ll_ref, ll_hyp)[source]

Create a segment-based alignment.

Parameters:
  • ll_ref (audiomate.corpus.assets.LabelList) – The label-list with reference labels.
  • ll_hyp (audiomate.corpus.assets.LabelList) – The label-list with hypothesis labels.
Returns:

A list of Segments.

Return type:

list

Example

>>> from audiomate.corpus import assets
>>>
>>> ref = assets.LabelList(labels=[
>>>     assets.Label('a', 0, 3),
>>>     assets.Label('b', 3, 6),
>>>     assets.Label('c', 7, 10)
>>> ])
>>>
>>> hyp = assets.LabelList(labels=[
>>>     assets.Label('a', 0, 3),
>>>     assets.Label('b', 4, 8),
>>>     assets.Label('c', 8, 10)
>>> ])
>>>
>>> InvariantSegmentAligner().align(ref, hyp)
[
    0 - 3 REF: [Label(a, 0, 3)] HYP: [Label(a, 0, 3)]
    3 - 4 REF: [Label(b, 3, 6)] HYP: []
    4 - 6 REF: [Label(b, 3, 6)] HYP: [Label(b, 4, 8)]
    6 - 7 REF: [] HYP: [Label(b, 4, 8)]
    7 - 8 REF: [Label(c, 7, 10)] HYP: [Label(b, 4, 8)]
    8 - 10 REF: [Label(c, 7, 10)] HYP: [Label(c, 8, 10)]
]
static create_event_list(ll_ref, ll_hyp, time_threshold=0.01)[source]

Create an event list of all labels.

Parameters:
  • ll_ref (LabelList) – Reference labels.
  • ll_hyp (LabelList) – Hypothesis labels.
  • time_threshold (float) – If two event times are closer than this threshold, the time of the earlier event is used for both events.
Returns:

List of list of tuples. Every tuple contains a time, a type (start or end), the ll_index (ref/hyp) and the label that is responsible for the event. It is sorted ascending by time.

Return type:

list

static set_absolute_end_of_labels(label_list)[source]

If there are any labels whose end is defined as -1 (end of the utterance), set it to the concrete end time.

Parameters:label_list (LabelList) – The label-list to process.

Utils

class evalmate.alignment.Segment(start, end, ref=None, hyp=None)[source]

A class representing a segment within an alignment.

Parameters:
  • start (float) – The start time in seconds.
  • end (float) – The end time in seconds.
Variables:
  • ref (Label, list) – A single reference label or a list of reference labels in the segment.
  • hyp (Label, list) – A single hypothesis label or a list of hypothesis labels in the segment.
class evalmate.alignment.LabelPair(ref, hyp)[source]

Class to hold a pair of labels.

Variables:
  • ref (Label) – Reference label.
  • hyp (Label) – Hypothesis label.
max_length()[source]

Return the length of the longer of the ref and hyp values.

padded_hyp_value()[source]

Return the hypothesis value as a string, padded to the length of the longer of the ref and hyp values.

padded_ref_value()[source]

Return the reference value as a string, padded to the length of the longer of the ref and hyp values.
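
Example

A short usage sketch for the helper classes. The value 5 for max_length follows from the longer value 'hello'; the padded variants return the values padded to that length, which is mainly useful for printing aligned pairs in columns.

>>> from audiomate.corpus import assets
>>> from evalmate import alignment
>>>
>>> pair = alignment.LabelPair(assets.Label('hello', 0, 2), assets.Label('hi', 0, 2))
>>> pair.max_length()
5
>>>
>>> segment = alignment.Segment(0, 2, ref=[pair.ref], hyp=[pair.hyp])
>>> segment.end - segment.start
2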