Evaluation API

Evaluation processors for assessing correction quality.

Utilities

ReferenceDataMixin

class facet.evaluation.metrics.ReferenceDataMixin[source]

Bases: object

Mixin for extracting reference data (outside acquisition).

get_eeg_channels(raw: BaseRaw) → ndarray[source]

Get EEG channel indices.

Parameters: raw (mne.io.BaseRaw) – Raw data object.
Returns: Array of EEG channel indices.
Return type: numpy.ndarray

get_reference_data(raw: BaseRaw, triggers: ndarray, artifact_length: int, time_buffer: float = 0.1, context: ProcessingContext | None = None) → ndarray[source]

Extract reference data (outside acquisition window).

Parameters:

raw (mne.io.BaseRaw) – Raw data object.
triggers (numpy.ndarray) – Trigger indices.
artifact_length (int) – Length of one artifact in samples.
time_buffer (float, optional) – Buffer in seconds to stay away from acquisition (default: 0.1).
context (facet.core.ProcessingContext, optional) – Current processing context. If it contains a user-selected reference interval (set by ReferenceIntervalSelector), that interval is used before falling back to automatic extraction.

Returns:

Array of shape (n_channels, n_times) containing concatenated reference data.

Return type:

numpy.ndarray
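The automatic fallback described above can be pictured with a small NumPy sketch. This is not the library's implementation: the helper name `reference_outside_acquisition`, the explicit `sfreq` argument, and the assumption that acquisition spans from the first trigger to the last trigger plus one artifact length are all hypothetical.

```python
import numpy as np


def reference_outside_acquisition(data, sfreq, triggers, artifact_length, time_buffer=0.1):
    """Concatenate samples before and after the acquisition window.

    Hypothetical helper. Assumes acquisition runs from the first trigger
    to the last trigger plus one artifact length.
    """
    buffer_samples = int(round(time_buffer * sfreq))
    acq_start = int(triggers[0])
    acq_end = int(triggers[-1]) + artifact_length
    # Stay `time_buffer` seconds away from both edges of the acquisition.
    before_stop = max(acq_start - buffer_samples, 0)
    after_start = min(acq_end + buffer_samples, data.shape[1])
    return np.concatenate([data[:, :before_stop], data[:, after_start:]], axis=1)


# Toy recording: 2 channels, 1000 samples at 100 Hz.
data = np.arange(2000, dtype=float).reshape(2, 1000)
triggers = np.array([300, 400, 500])
ref = reference_outside_acquisition(
    data, sfreq=100.0, triggers=triggers, artifact_length=100
)
print(ref.shape)
```

With a 0.1 s buffer at 100 Hz, ten samples on each side of the acquisition are excluded from the reference.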

get_acquisition_data(raw: BaseRaw, triggers: ndarray, artifact_length: int, context: ProcessingContext | None = None) → ndarray[source]

Extract data within the acquisition window.

Parameters:

raw (mne.io.BaseRaw) – Raw data object.
triggers (numpy.ndarray) – Trigger indices.
artifact_length (int) – Length of one artifact in samples.
context (facet.core.ProcessingContext, optional) – Current processing context. If it contains a user-selected evaluation interval (set by SignalIntervalSelector), that interval is used before falling back to automatic extraction.

Returns:

Array of shape (n_channels, n_times) from the acquisition window.

Return type:

numpy.ndarray
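A minimal sketch of the "user interval first, automatic window otherwise" behaviour, under stated assumptions: the function `acquisition_window`, the `(tmin, tmax)` tuple format of the interval, and the trigger-derived window definition are hypothetical and only illustrate the precedence rule.

```python
import numpy as np


def acquisition_window(data, sfreq, triggers, artifact_length, interval=None):
    """Return the evaluated samples as an (n_channels, n_times) array.

    Hypothetical helper: `interval` is assumed to be (tmin, tmax) in seconds.
    """
    if interval is not None:
        # A user-selected interval takes precedence over the triggers.
        tmin, tmax = interval
        start, stop = int(round(tmin * sfreq)), int(round(tmax * sfreq))
    else:
        # Assumed automatic window: first trigger to end of last artifact.
        start = int(triggers[0])
        stop = int(triggers[-1]) + artifact_length
    start, stop = max(start, 0), min(stop, data.shape[1])
    return data[:, start:stop]


data = np.arange(2000, dtype=float).reshape(2, 1000)
triggers = np.array([300, 400, 500])
auto = acquisition_window(data, 100.0, triggers, artifact_length=100)
manual = acquisition_window(data, 100.0, triggers, 100, interval=(1.0, 2.5))
print(auto.shape, manual.shape)
```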

Interactive Selectors

ReferenceIntervalSelector

class facet.evaluation.ReferenceIntervalSelector(channel: str | int | None = None, min_duration: float = 0.5, tmin: float | None = None, tmax: float | None = None)[source]

Bases: Processor, ReferenceDataMixin

Interactively select a clean reference interval from a signal plot.

Opens a Matplotlib GUI window for one EEG channel and lets the user drag a time span. The selected region is highlighted in green and, after confirmation, stored in context.metadata.custom['reference_interval']. Downstream metrics processors can use this interval as explicit reference data.

Parameters:

channel (str or int, optional) – Channel to display. If None (default), the first EEG channel is used.
min_duration (float, optional) – Minimum selectable interval length in seconds (default: 0.5).
tmin (float, optional) – Reference interval start in seconds. If provided, with or without tmax, the selector GUI is skipped.
tmax (float, optional) – Reference interval end in seconds. If provided, with or without tmin, the selector GUI is skipped.

name: str = 'reference_interval_selector'

description: str = 'Interactively select clean reference interval for metrics'

version: str = '1.0.0'

requires_triggers: bool = False

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(channel: str | int | None = None, min_duration: float = 0.5, tmin: float | None = None, tmax: float | None = None) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

SignalIntervalSelector

class facet.evaluation.SignalIntervalSelector(channel: str | int | None = None, min_duration: float = 0.5, tmin: float | None = None, tmax: float | None = None)[source]

Bases: Processor, ReferenceDataMixin

Interactively select the evaluated signal interval from a signal plot.

Opens a Matplotlib GUI window for one EEG channel and lets the user drag a time span that should be used as evaluated signal (typically acquisition). The selected region is highlighted in blue and, after confirmation, stored in context.metadata.custom['evaluation_interval']. Downstream metrics processors can use this interval as explicit acquisition/evaluation data.

The trigger-derived acquisition window is also shown as a faint orange background to help orient manual selection when the boundaries are not obvious after correction.

Parameters:

channel (str or int, optional) – Channel to display. If None (default), the first EEG channel is used.
min_duration (float, optional) – Minimum selectable interval length in seconds (default: 0.5).
tmin (float, optional) – Evaluated interval start in seconds. If provided, with or without tmax, the selector GUI is skipped.
tmax (float, optional) – Evaluated interval end in seconds. If provided, with or without tmin, the selector GUI is skipped.

name: str = 'signal_interval_selector'

description: str = 'Interactively select evaluated signal interval for metrics'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(channel: str | int | None = None, min_duration: float = 0.5, tmin: float | None = None, tmax: float | None = None) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

Metrics Calculators

SNRCalculator

class facet.evaluation.SNRCalculator(time_buffer: float = 0.1, verbose: bool = False)[source]

Bases: Processor, ReferenceDataMixin

Calculate Signal-to-Noise Ratio (SNR).

Compares corrected data to a clean reference (data outside acquisition window). Higher SNR indicates better correction.

SNR = variance(reference) / variance(residual)

Parameters: time_buffer (float, optional) – Time buffer around acquisition window in seconds (default: 0.1).
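The variance ratio in the formula can be sketched in NumPy as follows. How the residual is obtained is not specified here, so this sketch simply takes it as an input array; the function name `snr_per_channel` and the per-channel output are assumptions, not the processor's actual behaviour.

```python
import numpy as np


def snr_per_channel(reference, residual):
    """Variance ratio per channel for (n_channels, n_times) arrays.

    Hypothetical helper: the residual is supplied by the caller.
    """
    var_reference = np.var(reference, axis=1)
    var_residual = np.var(residual, axis=1)
    return var_reference / var_residual


# Toy data with known variances (1 and 4 for reference, 0.25 for residual).
reference = np.array([[0.0, 2.0, 0.0, 2.0], [0.0, 4.0, 0.0, 4.0]])
residual = np.array([[0.0, 1.0, 0.0, 1.0], [0.0, 1.0, 0.0, 1.0]])
snr = snr_per_channel(reference, residual)
snr_db = 10.0 * np.log10(snr)  # optional conversion to decibels
print(snr, snr_db)
```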

Examples

snr = SNRCalculator()
context = snr.execute(context)
print(context.metadata.custom['snr'])

name: str = 'snr_calculator'

description: str = 'Calculate Signal-to-Noise Ratio'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(time_buffer: float = 0.1, verbose: bool = False) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

LegacySNRCalculator

class facet.evaluation.LegacySNRCalculator(verbose: bool = False)[source]

Bases: Processor

Calculate legacy-style Signal-to-Noise Ratio (SNR).

Mirrors the original FACET implementation by comparing the variance of the corrected data to the variance of the uncorrected reference recording.

name: str = 'legacy_snr_calculator'

description: str = 'Legacy-style SNR using original raw as reference'

version: str = '1.0.0'

requires_triggers: bool = False

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(verbose: bool = False) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

RMSCalculator

class facet.evaluation.RMSCalculator(verbose: bool = False)[source]

Bases: Processor

Calculate Root Mean Square (RMS) improvement ratio.

Compares RMS of corrected data to uncorrected data. A higher ratio indicates better correction (more artifact removed).

RMS_ratio = RMS(uncorrected) / RMS(corrected)
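As a worked illustration of the ratio, here is a small NumPy sketch. The helper names `rms` and `rms_improvement_ratio` are hypothetical, and averaging the per-channel ratios into one number is an assumption about aggregation.

```python
import numpy as np


def rms(data):
    """Root mean square along the time axis (last axis)."""
    return np.sqrt(np.mean(np.square(data), axis=-1))


def rms_improvement_ratio(uncorrected, corrected):
    """Mean over channels of RMS(uncorrected) / RMS(corrected).

    Hypothetical helper: the channel averaging is an assumption.
    """
    return float(np.mean(rms(uncorrected) / rms(corrected)))


# Toy single-channel data: RMS 3 before correction, RMS 1 after.
uncorrected = np.array([[3.0, -3.0, 3.0, -3.0]])
corrected = np.array([[1.0, -1.0, 1.0, -1.0]])
ratio = rms_improvement_ratio(uncorrected, corrected)
print(ratio)
```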

Examples

rms = RMSCalculator()
context = rms.execute(context)
print(context.metadata.custom['rms_ratio'])

name: str = 'rms_calculator'

description: str = 'Calculate RMS improvement ratio'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(verbose: bool = False) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

RMSResidualCalculator

class facet.evaluation.RMSResidualCalculator(time_buffer: float = 0.1, verbose: bool = False)[source]

Bases: Processor, ReferenceDataMixin

Calculate RMS Residual Ratio (corrected vs. reference).

Compares the RMS of the corrected signal (during acquisition) to the RMS of the clean reference signal (outside acquisition).

Ratio = RMS(corrected) / RMS(reference)

A ratio of 1.0 is the target. Values below 1.0 suggest over-correction; values above 1.0 indicate residual artifacts. Corresponds to rms_residual in FACET MATLAB Edition.

Parameters: time_buffer (float, optional) – Time buffer around acquisition window in seconds (default: 0.1).
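The reading of the ratio described above (1.0 as target, below for over-correction, above for residual artifacts) might be sketched as follows. The helper names and the `tolerance` band around 1.0 are hypothetical and chosen for illustration only.

```python
import numpy as np


def rms(data):
    """Root mean square along the time axis (last axis)."""
    return np.sqrt(np.mean(np.square(data), axis=-1))


def rms_residual_ratio(corrected, reference):
    """Mean over channels of RMS(corrected) / RMS(reference).

    Hypothetical helper: the channel averaging is an assumption.
    """
    return float(np.mean(rms(corrected) / rms(reference)))


def interpret(ratio, tolerance=0.1):
    """Map a ratio to a verdict; `tolerance` is an illustrative choice."""
    if ratio < 1.0 - tolerance:
        return "possible over-correction"
    if ratio > 1.0 + tolerance:
        return "residual artifact"
    return "on target"


reference = np.array([[4.0, -4.0, 4.0, -4.0]])
over = np.array([[2.0, -2.0, 2.0, -2.0]])   # half the reference RMS
under = np.array([[6.0, -6.0, 6.0, -6.0]])  # 1.5 times the reference RMS
for corrected in (over, reference, under):
    ratio = rms_residual_ratio(corrected, reference)
    print(ratio, interpret(ratio))
```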

name: str = 'rms_residual_calculator'

description: str = 'Calculate RMS ratio (corrected vs reference)'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(time_buffer: float = 0.1, verbose: bool = False) → None[source]: Initialize processor.

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

MedianArtifactCalculator

class facet.evaluation.MedianArtifactCalculator(verbose: bool = False)[source]

Bases: Processor, ReferenceDataMixin

Calculate median peak-to-peak artifact amplitude.

Measures the median artifact amplitude across all epochs. Lower values indicate smaller artifacts (better correction).

Also calculates the ratio to the median amplitude of the reference signal (outside acquisition), which should ideally be close to 1.0.
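A minimal sketch of a median peak-to-peak computation over trigger-aligned epochs. The function `median_peak_to_peak` and the choice to take a single median over all epochs and channels are assumptions; the processor may aggregate differently.

```python
import numpy as np


def median_peak_to_peak(data, triggers, artifact_length):
    """Median peak-to-peak amplitude over trigger-aligned epochs.

    Hypothetical helper for (n_channels, n_times) data. Epochs that would
    run past the end of the recording are skipped.
    """
    epochs = np.stack([
        data[:, start:start + artifact_length]
        for start in triggers
        if start + artifact_length <= data.shape[1]
    ])  # shape: (n_epochs, n_channels, artifact_length)
    peak_to_peak = np.ptp(epochs, axis=-1)  # max - min within each epoch
    return float(np.median(peak_to_peak))


# One channel with a single spike of known height in each of three epochs.
data = np.zeros((1, 30))
data[0, 5], data[0, 15], data[0, 25] = 2.0, 4.0, 10.0
triggers = np.array([0, 10, 20])
median_p2p = median_peak_to_peak(data, triggers, artifact_length=10)
print(median_p2p)
```

The median makes the summary robust to a single outlier epoch, here the one with amplitude 10.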

Examples

median = MedianArtifactCalculator()
context = median.execute(context)
print(context.metadata.custom['median_artifact'])

name: str = 'median_artifact_calculator'

description: str = 'Calculate median artifact amplitude'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(verbose: bool = False) → None[source]: Initialize processor.

validate(context: ProcessingContext) → None[source]

Validate that prerequisites are met.

Override this method to add custom validation logic.

Parameters: context – Processing context
Raises: ProcessorValidationError – If validation fails

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

FFTAllenCalculator

class facet.evaluation.FFTAllenCalculator(verbose: bool = False)[source]

Bases: Processor, ReferenceDataMixin

Calculate FFT Allen metric.

Compares spectral power in specific frequency bands between corrected data and clean reference data. The metric is the median absolute percent difference per band.

Bands: 0.8–4 Hz (delta), 4–8 Hz (theta), 8–12 Hz (alpha), 12–24 Hz (beta).

Formula:

metric = median(|Power_corr - Power_ref| / Power_ref) * 100
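The formula can be illustrated with a plain FFT periodogram. This is a sketch under assumptions, not the processor's code: the helpers `band_powers` and `allen_metric` are hypothetical, band power is taken as the mean squared FFT magnitude in each band, and the median is taken across channels.

```python
import numpy as np

BANDS = [(0.8, 4, "Delta"), (4, 8, "Theta"), (8, 12, "Alpha"), (12, 24, "Beta")]


def band_powers(data, sfreq):
    """Mean spectral power per band, shape (n_channels, n_bands)."""
    freqs = np.fft.rfftfreq(data.shape[-1], d=1.0 / sfreq)
    power = np.abs(np.fft.rfft(data, axis=-1)) ** 2
    return np.stack(
        [power[..., (freqs >= lo) & (freqs < hi)].mean(axis=-1) for lo, hi, _ in BANDS],
        axis=-1,
    )


def allen_metric(corrected, reference, sfreq):
    """Median (over channels) absolute percent power difference per band."""
    p_corr = band_powers(corrected, sfreq)
    p_ref = band_powers(reference, sfreq)
    return np.median(np.abs(p_corr - p_ref) / p_ref * 100.0, axis=0)


# Toy data: one sinusoid per band; the "corrected" signal has a doubled
# 10 Hz amplitude, which is four times the alpha-band power.
sfreq = 100.0
t = np.arange(1000) / sfreq
base = sum(np.sin(2 * np.pi * f * t) for f in (2.0, 6.0, 10.0, 18.0))
extra_alpha = np.sin(2 * np.pi * 10.0 * t)
reference = np.stack([base, 0.5 * base])
corrected = np.stack([base + extra_alpha, 0.5 * (base + extra_alpha)])
metric = allen_metric(corrected, reference, sfreq)
print(dict(zip([name for _, _, name in BANDS], np.round(metric, 3))))
```

In this toy case only the alpha band deviates from the reference, so the other three bands score close to 0 %.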

name: str = 'fft_allen_calculator'

description: str = 'Calculate spectral power difference (Allen)'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

BANDS = [(0.8, 4, 'Delta'), (4, 8, 'Theta'), (8, 12, 'Alpha'), (12, 24, 'Beta')]

__init__(verbose: bool = False) → None[source]: Initialize processor.

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

FFTNiazyCalculator

class facet.evaluation.FFTNiazyCalculator(verbose: bool = False)[source]

Bases: Processor, ReferenceDataMixin

Calculate FFT Niazy metric.

Analyzes residual artifacts at slice and volume frequencies by computing the power ratio (uncorrected / corrected) at these frequencies and their harmonics. Values are reported in dB.
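A rough sketch of a power ratio in dB at a base frequency and its harmonics. The function `harmonic_power_ratio_db`, its `base_freq` and `n_harmonics` parameters, and the nearest-bin lookup are hypothetical; how the processor derives slice and volume frequencies is not shown here.

```python
import numpy as np


def harmonic_power_ratio_db(uncorrected, corrected, sfreq, base_freq, n_harmonics=3):
    """Power ratio (uncorrected / corrected) in dB at each harmonic.

    Hypothetical helper for 1-D signals of equal length.
    """
    freqs = np.fft.rfftfreq(uncorrected.size, d=1.0 / sfreq)
    power_unc = np.abs(np.fft.rfft(uncorrected)) ** 2
    power_cor = np.abs(np.fft.rfft(corrected)) ** 2
    ratios = []
    for k in range(1, n_harmonics + 1):
        # Use the FFT bin closest to the k-th harmonic.
        idx = int(np.argmin(np.abs(freqs - k * base_freq)))
        ratios.append(10.0 * np.log10(power_unc[idx] / power_cor[idx]))
    return np.array(ratios)


# Toy data: correction reduces the artifact amplitude tenfold at 10, 20, 30 Hz.
sfreq = 200.0
t = np.arange(2000) / sfreq
harmonics = sum(np.sin(2 * np.pi * f * t) for f in (10.0, 20.0, 30.0))
uncorrected = 10.0 * harmonics
corrected = 1.0 * harmonics
ratios_db = harmonic_power_ratio_db(uncorrected, corrected, sfreq, base_freq=10.0)
print(np.round(ratios_db, 2))
```

A tenfold amplitude reduction is a hundredfold power reduction, which corresponds to 20 dB at each harmonic.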

name: str = 'fft_niazy_calculator'

description: str = 'Calculate spectral power ratio at slice/volume frequencies'

version: str = '1.0.0'

requires_triggers: bool = True

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(verbose: bool = False) → None[source]: Initialize processor.

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

Reports

MetricsReport

class facet.evaluation.MetricsReport(name: str | None = None, store: dict | None = None)[source]

Bases: Processor

Generate a summary report of all calculated metrics.

Collects all metrics from context and logs a formatted summary. Can also store results in a shared dictionary for comparison and plotting.

Parameters:

name (str, optional) – Name of the result set (e.g., "Pipeline A"). If None, a default name is generated during processing.
store (dict, optional) – Dictionary to accumulate results. Structure: {name: {metric: value}}.

Examples

# Basic usage
report = MetricsReport()
context = report.execute(context)

# Advanced usage (collecting results for comparison)
results = {}
report = MetricsReport(name="Pipeline A", store=results)
context = report.execute(context)

# Plot comparison
MetricsReport.plot(results)

# Plot specific metrics
MetricsReport.plot(results, metrics=['snr', 'rms_ratio'])

name: str = 'metrics_report'

description: str = 'Generate metrics summary report'

version: str = '1.0.0'

requires_triggers: bool = False

requires_raw: bool = False

modifies_raw: bool = False

parallel_safe: bool = False

__init__(name: str | None = None, store: dict | None = None) → None[source]: Initialize processor.

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).

static compare(results: list | dict, labels: list[str] | None = None, title: str = 'Metrics Comparison', save_path: str | None = None, show: bool = True, metrics: list[str] | None = None) → None[source]

Compare metrics from a list of PipelineResult objects or a plain dict.

Accepts either:

- A list of PipelineResult instances (optionally named via labels).
- The legacy {name: {metric: value}} dict format used by MetricsReport.plot.

Parameters:

results (list or dict) – List of PipelineResult objects or legacy dict.
labels (list of str, optional) – Names for each result when passing a list. Defaults to ["Result 1", "Result 2", …].
title (str, optional) – Plot title (default: 'Metrics Comparison').
save_path (str, optional) – If set, save the figure to this path.
show (bool, optional) – Whether to display the figure interactively (default: True).
metrics (list of str, optional) – Subset of metric keys to show. If None, all metrics are plotted.

Examples

result_a = pipeline_aas.run()
result_b = pipeline_aas_pca.run()

MetricsReport.compare(
    [result_a, result_b],
    labels=["AAS only", "AAS + PCA"],
)

static plot(results: dict[str, dict[str, float]], title: str = 'Metrics Comparison', save_path: str | None = None, show: bool = True, metrics: list[str] | None = None) → None[source]

Plot comparison of metrics using Matplotlib.

Parameters:

results (dict) – Dictionary of results {name: {metric: value}}.
title (str, optional) – Plot title (default: 'Metrics Comparison').
save_path (str, optional) – Path to save the figure.
show (bool, optional) – Whether to show the plot (default: True).
metrics (list of str, optional) – Subset of metric keys to plot. If None, all are shown.
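To make the {name: {metric: value}} structure and the effect of a metrics subset concrete, here is a plain-Python sketch. The helper `select_metrics` is hypothetical and the numbers are toy values, not measured results; it only mimics the filtering that the metrics argument implies.

```python
def select_metrics(results, metrics=None):
    """Return a copy of `results` restricted to the requested metric keys.

    Hypothetical helper: keys missing from a result set are skipped.
    """
    if metrics is None:
        return {name: dict(values) for name, values in results.items()}
    return {
        name: {key: values[key] for key in metrics if key in values}
        for name, values in results.items()
    }


# Toy numbers for illustration only.
results = {
    "Pipeline A": {"snr": 4.0, "rms_ratio": 12.0, "median_artifact": 35.0},
    "Pipeline B": {"snr": 6.5, "rms_ratio": 15.0},
}
subset = select_metrics(results, metrics=["snr", "rms_ratio"])
print(subset)
```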

RawPlotter

class facet.evaluation.RawPlotter(mode: str = 'matplotlib', channel: str | int | None = None, start: float = 0.0, duration: float = 10.0, overlay_original: bool = True, scale: float = 1000000.0, save_path: str | Path | None = None, show: bool = False, auto_close: bool = True, figure_kwargs: dict[str, Any] | None = None, mne_kwargs: dict[str, Any] | None = None, picks: Sequence[int | str] | None = None, title: str | None = None, source: str = 'raw')[source]

Bases: Processor

Plot raw EEG data snippets during the pipeline.

Supports both Matplotlib-based summary figures and native MNE-Python interactive plots. By default, a Matplotlib plot is generated and saved to the configured path, overlaying the current corrected signal with the original recording for quick visual inspection.

Parameters:

mode (str, optional) – Plotting backend: 'matplotlib' (default) or 'mne'.
channel (str or int, optional) – Single channel to visualise (name or index). Defaults to the first EEG channel, falling back to the first channel overall.
start (float, optional) – Start time in seconds of the snippet to plot (default: 0.0).
duration (float, optional) – Duration in seconds of the snippet to plot (default: 10.0).
overlay_original (bool, optional) –
Overlay original recording when available (default: True). Semantics depend on source:
- source='raw' – overlays the original recording on the current (corrected) signal, giving the classical before/after view.
- source='prediction' – overlays the original noisy recording on the predicted artifact, which is useful for residual diagnostics. Where the two curves diverge, the model’s prediction misses part of the artifact, and that part remains as residual artifact in the corrected signal.
scale (float, optional) – Multiplier applied to amplitude values (default: 1e6 for V → µV).
save_path (str or pathlib.Path, optional) – File path to save the generated plot.
show (bool, optional) – Whether to display the plot interactively (default: False).
auto_close (bool, optional) – Close the figure after saving when running headless (default: True).
figure_kwargs (dict, optional) – Additional keyword arguments forwarded to plt.subplots().
mne_kwargs (dict, optional) – Additional keyword arguments forwarded to mne.io.Raw.plot().
picks (collections.abc.Sequence of int or str, optional) – Explicit channel picks for MNE plotting mode.
title (str, optional) – Custom figure title for Matplotlib mode.
source (str, optional) –
Which signal to plot (default: 'raw'):
- 'raw' — current context.get_raw() data (corrected signal).
- 'prediction' — context.get_estimated_noise() (the predicted artifact written by a DeepLearningCorrection or compatible processor). Skipped with a warning if no prediction is present.
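How start, duration and scale interact can be sketched with NumPy alone. The helper `extract_snippet` and its explicit `sfreq` argument are hypothetical; the plotter itself reads these values from the processing context.

```python
import numpy as np


def extract_snippet(data, sfreq, start=0.0, duration=10.0, scale=1e6):
    """Return (times, scaled_values) for a time window of a 1-D signal.

    Hypothetical helper: `scale` converts volts to microvolts by default.
    """
    first = int(round(start * sfreq))
    last = min(first + int(round(duration * sfreq)), data.shape[-1])
    times = np.arange(first, last) / sfreq
    return times, data[..., first:last] * scale


# Toy channel: constant 5 microvolts stored in volts, 10 s at 100 Hz.
data = np.full(1000, 5e-6)
times, snippet = extract_snippet(data, sfreq=100.0, start=2.0, duration=3.0)
print(times[0], snippet.shape)
```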

name: str = 'raw_plotter'

description: str = 'Plot raw data snippets for visual inspection.'

version: str = '1.0.0'

requires_triggers: bool = False

requires_raw: bool = True

modifies_raw: bool = False

parallel_safe: bool = False

__init__(mode: str = 'matplotlib', channel: str | int | None = None, start: float = 0.0, duration: float = 10.0, overlay_original: bool = True, scale: float = 1000000.0, save_path: str | Path | None = None, show: bool = False, auto_close: bool = True, figure_kwargs: dict[str, Any] | None = None, mne_kwargs: dict[str, Any] | None = None, picks: Sequence[int | str] | None = None, title: str | None = None, source: str = 'raw') → None[source]: Initialize processor.

process(context: ProcessingContext) → ProcessingContext[source]

Process the context.

This is the main method to implement in subclasses.

Parameters: context – Input processing context
Returns: Output processing context. If None is returned, the input context is used (no-op behavior).