evalmate.confusion

This module contains classes for computing confusion statistics.

Confusion

class evalmate.confusion.Confusion

Base class that provides methods for computing common metrics.

accuracy

Accuracy = correct / (total + insertions)
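
As a quick check of the formula, the sketch below computes accuracy from raw counts. It is a standalone illustration, not evalmate's implementation: the function name and the sample counts are assumptions, and total is taken to be correct + deletions + substitutions, as stated in the note on total.

```python
def accuracy(correct, deletions, substitutions, insertions):
    # total is measured on the reference side: everything the reference contains.
    total = correct + deletions + substitutions
    # Insertions enlarge the denominator, so spurious output lowers accuracy.
    return correct / (total + insertions)


# Hypothetical counts, chosen only for illustration.
acc = accuracy(correct=8, deletions=1, substitutions=1, insertions=2)
print(round(acc, 3))  # 0.667
```

With these counts the reference total is 10, and the two insertions bring the denominator to 12.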

correct

Amount that is correct.

Example

>>> ref = 'xxx'
>>> hyp = 'xxx'
deletions

Amount that is deleted.

Example

>>> ref = 'xxx'
>>> hyp = None
error_rate

ErrorRate = (substitutions + deletions + insertions) / total
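
The error rate divides by the reference total only, so a hypothesis with many insertions can score above 1. A minimal sketch under the same assumptions (hypothetical function name and counts, total = correct + deletions + substitutions):

```python
def error_rate(correct, deletions, substitutions, insertions):
    # The denominator is the reference total; insertions are not part of it.
    total = correct + deletions + substitutions
    return (substitutions + deletions + insertions) / total


# Hypothetical counts, chosen only for illustration.
er = error_rate(correct=8, deletions=1, substitutions=1, insertions=2)
# More errors than reference items: the rate exceeds 1.
er_noisy = error_rate(correct=1, deletions=0, substitutions=1, insertions=3)
print(er, er_noisy)  # 0.4 2.0
```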

f_measure(beta=1)

F-measure for the given beta. See https://en.wikipedia.org/wiki/Precision_and_recall
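
For reference, the general F-beta formula from the linked article can be sketched as follows. This is a standalone function taking precision and recall as arguments, not the method's actual code; returning 0 when both inputs are 0 is an assumption made here to avoid a division by zero.

```python
def f_measure(precision, recall, beta=1):
    # F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
    if precision == 0 and recall == 0:
        return 0.0  # assumption: define the score as 0 when both inputs are 0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


f1 = f_measure(0.5, 1.0)          # harmonic mean of precision and recall
f2 = f_measure(0.5, 1.0, beta=2)  # beta > 1 weights recall more heavily
print(round(f1, 3), round(f2, 3))  # 0.667 0.833
```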

false_negatives

Amount of false negatives (no indication of presence when it should be present).

Note

Equal to ‘self.total - self.correct’

false_positives

Amount of false positives (Indications of presence, when it is not present).

Note

Equal to self.insertions + self.substitutions_out

insertions

Amount that is inserted.

Example

>>> ref = None
>>> hyp = 'xxx'
precision

Precision = tp / (fp + tp)

recall

Recall = tp / (fn + tp)
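
Combining the notes on true_positives, false_positives and false_negatives, precision and recall can be derived directly from the confusion counts. The sketch below is illustrative only; the function name and the sample counts are assumptions.

```python
def precision_recall(correct, deletions, substitutions, insertions, substitutions_out):
    total = correct + deletions + substitutions
    tp = correct                         # true positives
    fn = total - correct                 # false negatives: deletions + substitutions
    fp = insertions + substitutions_out  # false positives
    precision = tp / (fp + tp)
    recall = tp / (fn + tp)              # reduces to correct / total
    return precision, recall


# Hypothetical counts, chosen only for illustration.
p, r = precision_recall(correct=8, deletions=1, substitutions=1,
                        insertions=2, substitutions_out=1)
print(round(p, 3), round(r, 3))  # 0.727 0.8
```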

substitutions

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions is the amount where that instance was substituted with some other instance/event. Otherwise, it is not necessary to designate which event/instance substitutes which.

Example

>>> ref = 'xxx'
>>> hyp = 'yyy'
substitutions_out

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions_out is the amount where that instance was output when some other event/instance was expected (in the reference). Otherwise, it is equal to substitutions.

Example

>>> ref = 'yyy'
>>> hyp = 'xxx'
total

Return the total amount based on the reference system.

Note

Equal to ‘self.correct + self.deletions + self.substitutions’

true_positives

Amount of true positives (Correct indications).

Note

Equal to self.correct

SegmentConfusion

class evalmate.confusion.SegmentConfusion(value)

Class to represent confusions of a specific instance (e.g. some class) based on segments. The insertions, deletions and so on represent the time in seconds the instance was confused (or not).

Argument:
value (str): The value of the instance (e.g. the class “speech”)
Variables:
  • correct_segments (list) – (List of Segment) Segments that are correct (ref == hyp).
  • insertion_segments (list) – (List of Segment) Segments that are insertions (ref = None, hyp = ‘value’).
  • deletion_segments (list) – (List of Segment) Segments that are deletions (ref = ‘value’, hyp = None).
  • substitution_segments (Dict) – Segments that are substitutions with other values (ref = ‘value’, hyp = ‘other-value’). Dict holding a list for every other-value.
  • substitution_out_segments (Dict) – Segments that are substitutions of other values (ref = ‘other-value’, hyp = ‘value’). Dict holding a list for every other-value.
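
To illustrate how segment lists translate into amounts in seconds, here is a minimal sketch. The Segment type below is a hypothetical stand-in with only start and end times; evalmate's own Segment class and the way it computes durations may differ.

```python
from collections import namedtuple

# Hypothetical stand-in for a segment: start and end time in seconds.
Segment = namedtuple("Segment", ["start", "end"])


def duration(segments):
    # Total time covered by a list of segments.
    return sum(seg.end - seg.start for seg in segments)


def duration_by_value(segments_by_value):
    # Substitutions are stored as a dict: other-value -> list of segments.
    return sum(duration(segs) for segs in segments_by_value.values())


correct_segments = [Segment(0.0, 2.5), Segment(4.0, 5.0)]
substitution_segments = {
    "music": [Segment(2.5, 3.0)],
    "noise": [Segment(3.0, 4.0)],
}

correct = duration(correct_segments)                      # 3.5 seconds
substitutions = duration_by_value(substitution_segments)  # 1.5 seconds
```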
correct

Amount that is correct.

Example

>>> ref = 'xxx'
>>> hyp = 'xxx'
deletions

Amount that is deleted.

Example

>>> ref = 'xxx'
>>> hyp = None
insertions

Amount that is inserted.

Example

>>> ref = None
>>> hyp = 'xxx'
substitutions

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions is the amount where that instance was substituted with some other instance/event. Otherwise, it is not necessary to designate which event/instance substitutes which.

Example

>>> ref = 'xxx'
>>> hyp = 'yyy'
substitutions_out

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions_out is the amount where that instance was output when some other event/instance was expected (in the reference). Otherwise, it is equal to substitutions.

Example

>>> ref = 'yyy'
>>> hyp = 'xxx'

EventConfusion

class evalmate.confusion.EventConfusion(value)

Class to represent confusions of a specific instance (e.g. some class) based on label-to-label alignment. The insertions, deletions and so on represent the number of times a label was confused (or not).

Argument:
value (str): The value of the instance (e.g. the class “speech”)
Variables:
  • correct_pairs (list) – (List of LabelPair) Correct matches.
  • insertion_pairs (list) – (List of LabelPair) Insertions (ref = None, hyp = value)
  • deletion_pairs (list) – (List of LabelPair) Deletions (ref = value, hyp = None)
  • substitution_pairs (Dict) – Substitutions with other values (ref = value, hyp = other-value). Dict holding a list for every other-value.
  • substitution_out_pairs (Dict) – Substitutions from other values (ref = other-value, hyp = value). Dict holding a list for every other-value.
correct

Amount that is correct.

Example

>>> ref = 'xxx'
>>> hyp = 'xxx'
deletions

Amount that is deleted.

Example

>>> ref = 'xxx'
>>> hyp = None
insertions

Amount that is inserted.

Example

>>> ref = None
>>> hyp = 'xxx'
substitutions

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions is the amount where that instance was substituted with some other instance/event. Otherwise, it is not necessary to designate which event/instance substitutes which.

Example

>>> ref = 'xxx'
>>> hyp = 'yyy'
substitutions_by_count()

Return a list of tuples (Substituted-value, Number-of-substitutions) ordered by number of substitutions descending.

Returns: List of tuples.
Return type: list
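
The ordering described above can be sketched with a plain dict of the same shape as substitution_pairs (other-value mapped to a list of pairs). This is a standalone function written for illustration, not the method's source; placeholder strings stand in for LabelPair objects.

```python
def substitutions_by_count(substitution_pairs):
    # One (value, count) tuple per substituting value.
    counts = [(value, len(pairs)) for value, pairs in substitution_pairs.items()]
    # Most frequent substitution first.
    return sorted(counts, key=lambda item: item[1], reverse=True)


# Placeholder strings stand in for LabelPair objects.
substitution_pairs = {
    "music": ["p1"],
    "noise": ["p2", "p3", "p4"],
    "silence": ["p5", "p6"],
}

ranked = substitutions_by_count(substitution_pairs)
print(ranked)  # [('noise', 3), ('silence', 2), ('music', 1)]
```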
substitutions_out

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions_out is the amount where that instance was output when some other event/instance was expected (in the reference). Otherwise, it is equal to substitutions.

Example

>>> ref = 'yyy'
>>> hyp = 'xxx'

AggregatedConfusion

class evalmate.confusion.AggregatedConfusion

Class to aggregate multiple confusions.

Variables: instances (dict) – Dictionary containing the aggregated confusions.
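
One plausible reading of aggregation is that each amount is summed over the per-instance confusions held in instances. The sketch below assumes exactly that; the Counts dataclass and the aggregate function are hypothetical helpers, not part of evalmate, and the library's actual behaviour may differ.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    # Hypothetical container for the amounts of one instance.
    correct: float = 0
    insertions: float = 0
    deletions: float = 0
    substitutions: float = 0


def aggregate(instances):
    # Assumption: aggregated amounts are plain sums over all instances.
    result = Counts()
    for counts in instances.values():
        result.correct += counts.correct
        result.insertions += counts.insertions
        result.deletions += counts.deletions
        result.substitutions += counts.substitutions
    return result


instances = {
    "speech": Counts(correct=8, insertions=2, deletions=1, substitutions=1),
    "music": Counts(correct=5, insertions=0, deletions=2, substitutions=1),
}

agg = aggregate(instances)
print(agg)  # Counts(correct=13, insertions=2, deletions=3, substitutions=2)
```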
correct

Amount that is correct.

Example

>>> ref = 'xxx'
>>> hyp = 'xxx'
deletions

Amount that is deleted.

Example

>>> ref = 'xxx'
>>> hyp = None
insertions

Amount that is inserted.

Example

>>> ref = None
>>> hyp = 'xxx'
substitutions

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions is the amount where that instance was substituted with some other instance/event. Otherwise, it is not necessary to designate which event/instance substitutes which.

Example

>>> ref = 'xxx'
>>> hyp = 'yyy'
substitutions_out

Amount that is substituted.

If these stats describe a specific instance (e.g. occurrences of the word ‘hello’), substitutions_out is the amount where that instance was output when some other event/instance was expected (in the reference). Otherwise, it is equal to substitutions.

Example

>>> ref = 'yyy'
>>> hyp = 'xxx'