Adjusting Performance Thresholds for Legacy CNC Machines

Aging Fanuc, Siemens Sinumerik, and Heidenhain controllers were never designed to feed an OEE pipeline. They expose no OPC UA endpoint, so engineers fall back on raw PLC register polling, analog 4-20 mA current transducers, and discrete I/O state mapping — signal sources riddled with quantization noise and mechanical resonance. On this hardware a single static cycle-time setpoint misclassifies transient spindle chatter and feed-rate overrides as microstops, artificially depressing the Performance factor and corrupting capacity planning. This page is the legacy-machine case of threshold tuning for microstops: how to set, adapt, and validate the boundaries that turn noisy register reads from old iron into deterministic, auditable performance loss inside Downtime Classification & OEE Calculation.

The four configuration options below are the levers you actually tune per machine family. Each subsequent section covers one of them with a tight code or config snippet.

1. Signal conditioning before any threshold is evaluated

Threshold logic is only as trustworthy as the series it reads. Legacy controllers output quantized encoder pulses and unfiltered VFD current that carry harmonic distortion and resonance spikes, so a deterministic conditioning stage must run first — the same upstream-hygiene principle behind PLC tag standardization. The contract: poll discrete I/O and analog registers at 50-100 Hz over Modbus TCP or a serial gateway, suppress everything above the machine’s mechanical bandwidth, and capture a per-program baseline under stable tooling.

import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

def condition_feed_signal(
    df: pd.DataFrame,
    signal_col: str = "feed_rate_mm_min",
    ema_span: int = 8,
    savgol_window: int = 11,
    savgol_poly: int = 3,
) -> pd.DataFrame:
    """Suppress quantization noise on a legacy CNC feed-rate stream.

    EMA removes high-frequency VFD ripple; Savitzky-Golay preserves the
    acceleration peaks that distinguish a real cut from a feed hold.
    """
    if len(df) < savgol_window:
        # Too few samples to filter safely — return raw, flag for review.
        df = df.assign(feed_filtered=df[signal_col], conditioning_ok=False)
        return df
    ema = df[signal_col].ewm(span=ema_span, adjust=False).mean()
    smoothed = savgol_filter(ema.to_numpy(), savgol_window, savgol_poly)
    return df.assign(feed_filtered=smoothed, conditioning_ok=True)

Record nominal cycle times per part_program_id as baseline_cycle_time once this stage is clean. Skip conditioning and every threshold downstream evaluates corrupted data, guaranteeing false-positive downtime.
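The baseline capture described above could be sketched as follows. This is a minimal illustration, not the pipeline's actual implementation: it assumes one row per completed cycle, a per-cycle `conditioning_ok` flag, and the median as the "nominal" statistic. The `capture_baselines` name and the `min_cycles` cutoff are hypothetical.

```python
import pandas as pd


def capture_baselines(
    cycles: pd.DataFrame,
    program_col: str = "part_program_id",
    cycle_col: str = "cycle_time_sec",
    min_cycles: int = 5,  # hypothetical cutoff; tune per machine family
) -> pd.DataFrame:
    """Median cycle time per program, using only cleanly conditioned cycles."""
    clean = cycles[cycles["conditioning_ok"]]
    stats = clean.groupby(program_col)[cycle_col].agg(
        baseline_cycle_time="median", n_cycles="count"
    )
    # Programs with too few clean cycles are not trusted as baselines yet.
    return stats[stats["n_cycles"] >= min_cycles].reset_index()


# Illustrative data only: six clean cycles for O1001 (one slow outlier),
# and too few clean cycles for O2002.
cycles = pd.DataFrame({
    "part_program_id": ["O1001"] * 6 + ["O2002"] * 3,
    "cycle_time_sec": [42.0, 41.5, 42.5, 42.0, 90.0, 41.8, 60.0, 61.0, 59.0],
    "conditioning_ok": [True] * 8 + [False],
})
baselines = capture_baselines(cycles)
```

With this sample, only O1001 receives a baseline, and the single 90-second cycle does not move its median.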

2. Rolling-median adaptive thresholds instead of fixed setpoints

Static thresholds fail on legacy machines because legitimate variance — thermal expansion drift, operator feed-rate overrides, manual jogs — temporarily slows axis velocity without a fault. Replace the fixed setpoint with a rolling-median baseline and a deviation multiplier (a tolerance_factor of roughly 1.15-1.30). The median, not the mean, prevents a single outlier from inflating the band; production code should lean on optimized windowing such as pandas.DataFrame.rolling for deterministic latency. Where the noise is bursty rather than Gaussian, pair the band with the techniques in outlier detection methods before it ever reaches the threshold comparator.

def compute_dynamic_threshold(
    df: pd.DataFrame,
    cycle_col: str = "cycle_time_sec",
    window: int = 10,
    tolerance_factor: float = 1.20,
) -> pd.DataFrame:
    """Rolling-median baseline with a deviation multiplier as the ceiling."""
    df = df.copy()
    df["rolling_median_cycle"] = (
        df[cycle_col].rolling(window=window, min_periods=1, center=False).median()
    )
    df["upper_threshold"] = df["rolling_median_cycle"] * tolerance_factor
    return df
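To illustrate why the median is preferred over the mean, the sketch below builds both ceilings over the same short series with one extreme cycle. The cycle times and the window of 5 are made-up values chosen for readability.

```python
import pandas as pd

# Illustrative cycle times in seconds; the 120 s entry is a single outlier.
cycle = pd.Series([40.0, 41.0, 40.0, 39.0, 40.0, 120.0, 40.0, 41.0])
window = 5
tolerance_factor = 1.20

mean_ceiling = cycle.rolling(window, min_periods=1).mean() * tolerance_factor
median_ceiling = cycle.rolling(window, min_periods=1).median() * tolerance_factor

comparison = pd.DataFrame({
    "cycle_time_sec": cycle,
    "mean_ceiling": mean_ceiling,
    "median_ceiling": median_ceiling,
})
```

In the cycles after the outlier, the mean-based ceiling climbs to roughly 67 s while the median-based ceiling stays near 48 s. A genuinely slow 55-second cycle in that stretch would slip under the mean band but still trip the median band.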

3. Debounce and M/T-code masking to reject expected slowdowns

The boundary alone over-counts. Tool changes (Txx), coolant activation (M08), and pallet swaps (M60) all produce predictable velocity drops that are not microstops, and momentary feed holds bounce across the threshold for a fraction of a second. Two guards fix this: a debounce timer that only escalates a breach once it persists past debounce_sec, and a mask that cross-references discrete register states to exclude planned non-cutting intervals. This is the highest-resolution instance of event-to-downtime mapping — every classified interval must route to exactly one reason code such as MICROSTOP_FEED_HOLD or TOOL_CHANGE.

def classify_microstops(
    df: pd.DataFrame,
    cycle_col: str = "cycle_time_sec",
    debounce_sec: float = 15.0,
) -> pd.DataFrame:
    """Flag breaches that persist past the debounce window and are not masked."""
    df = df.copy()
    df["breach"] = df[cycle_col] > df["upper_threshold"]
    # Start a new run whenever the breach state flips, so each breach run
    # begins at its own first breaching row rather than at the preceding
    # non-breach row (which would overstate the duration).
    run_id = (df["breach"] != df["breach"].shift()).cumsum()
    df["breach_duration"] = df.groupby(run_id)["timestamp"].transform(
        lambda x: (x - x.iloc[0]).dt.total_seconds()
    )
    df["is_microstop"] = df["breach"] & (df["breach_duration"] > debounce_sec)
    # Mask expected non-cutting intervals decoded from M/T-code registers.
    # Cast to bool so that ~ is a logical NOT even if the flag arrives as 0/1.
    df["is_microstop"] &= ~df["is_tool_change"].astype(bool)
    return df
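The mask and the one-reason-code rule can be sketched together. Everything about the register here is an assumption: the `aux_status_word` column, the bit positions, and the `breach_persisted` flag (standing in for a debounced breach) are hypothetical, and the real layout comes from the machine builder's PLC documentation. Only `TOOL_CHANGE` and `MICROSTOP_FEED_HOLD` are named in the text; the other codes are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical bit layout of one discrete status word.
PLANNED_BITS = {
    "is_tool_change": 0b001,    # tool change in progress
    "is_coolant_start": 0b010,  # M08 coolant activation
    "is_pallet_swap": 0b100,    # M60 pallet swap
}


def decode_planned_flags(
    df: pd.DataFrame, register_col: str = "aux_status_word"
) -> pd.DataFrame:
    """Expand the status word into one boolean column per planned event."""
    df = df.copy()
    word = df[register_col].astype("int64")
    for flag, bit in PLANNED_BITS.items():
        df[flag] = (word & bit) != 0
    return df


def assign_reason_code(df: pd.DataFrame) -> pd.DataFrame:
    """Route each row to exactly one code; np.select takes the first match."""
    df = df.copy()
    conditions = [
        df["is_tool_change"],
        df["is_pallet_swap"],
        df["is_coolant_start"],
        df["breach_persisted"],  # only reached when no planned event is active
    ]
    codes = ["TOOL_CHANGE", "PALLET_SWAP", "COOLANT_START", "MICROSTOP_FEED_HOLD"]
    df["reason_code"] = np.select(conditions, codes, default="RUNNING")
    return df


samples = pd.DataFrame({
    "aux_status_word": [0, 1, 2, 4, 5, 0],
    "breach_persisted": [False, True, True, True, True, True],
})
labeled = assign_reason_code(decode_planned_flags(samples))
```

Because `np.select` returns the first matching condition, the order of `conditions` is the priority order. A row with both the tool-change and pallet-swap bits set is reported as `TOOL_CHANGE` here, which is a design choice to revisit per machine.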

4. Shift-boundary alignment and OEE sanity checks

Legacy controllers suffer CMOS battery degradation and real-time-clock drift, so their timestamps cannot anchor a shift report. Ingest with UTC epoch tags, convert to facility-local time against an NTP-synchronized reference, and apply the same clock drift correction you would on any edge gateway before assigning a shift. A cycle_continuity flag attributes a cycle that straddles a shift change to its originating period — the detail that shift boundary logic exists to enforce — preventing phantom availability drops at midnight.

def assign_shift_id(df: pd.DataFrame, ts_col: str = "timestamp") -> pd.DataFrame:
    """Three-shift pattern on NTP-corrected local time, not controller clocks.

    A 06:00-14:00, B 14:00-22:00, C 22:00-06:00. Adjust per facility calendar.
    """
    df = df.copy()
    hour = df[ts_col].dt.hour
    conditions = [(hour >= 6) & (hour < 14), (hour >= 14) & (hour < 22)]
    df["shift_id"] = np.select(conditions, ["Shift_A", "Shift_B"], default="Shift_C")
    return df
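A possible sketch of the `cycle_continuity` flag follows, assuming each cycle carries UTC epoch-second start and end tags that have already been drift-corrected. The column names `start_epoch_s` and `end_epoch_s`, the `attribute_cycles` helper, and the `America/Chicago` zone are assumptions for illustration. The shift pattern repeats the three-shift example above.

```python
import numpy as np
import pandas as pd

FACILITY_TZ = "America/Chicago"  # assumption: replace with the plant's IANA zone


def shift_label(local_ts: pd.Series) -> np.ndarray:
    """Same A/B/C pattern as the three-shift example: 06-14, 14-22, 22-06."""
    hour = local_ts.dt.hour
    conditions = [(hour >= 6) & (hour < 14), (hour >= 14) & (hour < 22)]
    return np.select(conditions, ["Shift_A", "Shift_B"], default="Shift_C")


def attribute_cycles(cycles: pd.DataFrame, tz: str = FACILITY_TZ) -> pd.DataFrame:
    """Assign each cycle to the shift it started in and flag straddlers."""
    df = cycles.copy()
    start = pd.to_datetime(df["start_epoch_s"], unit="s", utc=True).dt.tz_convert(tz)
    end = pd.to_datetime(df["end_epoch_s"], unit="s", utc=True).dt.tz_convert(tz)
    df["shift_id"] = shift_label(start)  # the originating shift owns the cycle
    df["cycle_continuity"] = df["shift_id"] != shift_label(end)
    return df


def epoch(local_str: str, tz: str = FACILITY_TZ) -> float:
    """Helper for the demo data: local wall-clock string to UTC epoch seconds."""
    return pd.Timestamp(local_str, tz=tz).timestamp()


cycles = pd.DataFrame({
    "start_epoch_s": [
        epoch("2024-03-05 13:58:30"),
        epoch("2024-03-05 14:05:00"),
        epoch("2024-03-05 21:59:00"),
    ],
    "end_epoch_s": [
        epoch("2024-03-05 14:01:10"),
        epoch("2024-03-05 14:08:00"),
        epoch("2024-03-05 22:02:00"),
    ],
})
attributed = attribute_cycles(cycles)
```

The first and third cycles cross a shift boundary, so they stay with the shift they started in and carry `cycle_continuity = True`. Comparing start and end labels assumes no cycle runs longer than a full day.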

Tuned thresholds still need a final gate before metrics publish. OEE is the product of three ratios, and it is acutely sensitive to a stale ideal_cycle_time:

$\text{OEE} = \text{Availability} \times \text{Performance} \times \text{Quality}$

Performance above roughly 1.05 almost always signals an outdated baseline or an unfiltered telemetry spike, not a machine outrunning physics. Enforce the assertions described under OEE formula validation — bound every factor to [0.0, 1.0] (Performance may carry a small over-ideal tolerance), reconcile scrap against total parts, and route any batch that fails to a dead-letter queue rather than to the MES dashboard. Standardizing on ISO 22400 keeps the resulting numbers comparable across facilities and audit-ready. Persist tuned parameters in a version-controlled registry (Consul, AWS Parameter Store) so each machine family can be A/B tuned without redeployment, and monitor false_positive_rate and missed_microstop_count to converge on the right tolerance_factor and debounce_sec.
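The publish gate could look roughly like this. It is a sketch under stated assumptions: the `OeeBatch` fields, the reconciliation tolerance, and the in-memory lists standing in for the MES publisher and the dead-letter queue are all hypothetical. The 1.05 ceiling on Performance is the tolerance mentioned above.

```python
from dataclasses import dataclass
from typing import List, Tuple

PERFORMANCE_CEILING = 1.05  # small over-ideal tolerance from the text


@dataclass
class OeeBatch:
    availability: float
    performance: float
    quality: float
    total_parts: int
    scrap_parts: int

    @property
    def oee(self) -> float:
        return self.availability * self.performance * self.quality


def validate_batch(batch: OeeBatch) -> List[str]:
    """Return a list of violations; an empty list means the batch may publish."""
    errors = []
    if not 0.0 <= batch.availability <= 1.0:
        errors.append("availability outside [0, 1]")
    if not 0.0 <= batch.performance <= PERFORMANCE_CEILING:
        errors.append("performance outside [0, 1.05]: check baseline and filtering")
    if not 0.0 <= batch.quality <= 1.0:
        errors.append("quality outside [0, 1]")
    if batch.scrap_parts < 0 or batch.scrap_parts > batch.total_parts:
        errors.append("scrap does not reconcile with total parts")
    elif batch.total_parts > 0:
        expected = (batch.total_parts - batch.scrap_parts) / batch.total_parts
        if abs(batch.quality - expected) > 1e-6:
            errors.append("quality does not match scrap/total ratio")
    return errors


def route(batches: List[OeeBatch]) -> Tuple[List[OeeBatch], List[Tuple[OeeBatch, List[str]]]]:
    """Split batches into publishable ones and dead-lettered ones with reasons."""
    publish, dead_letter = [], []
    for batch in batches:
        errors = validate_batch(batch)
        if errors:
            dead_letter.append((batch, errors))
        else:
            publish.append(batch)
    return publish, dead_letter


good = OeeBatch(0.90, 0.95, 0.98, total_parts=100, scrap_parts=2)
stale = OeeBatch(0.92, 1.31, 1.00, total_parts=80, scrap_parts=0)
published, dead_lettered = route([good, stale])
```

Keeping the violation messages alongside each dead-lettered batch makes the queue auditable: a reviewer can see whether the failure points to a stale `ideal_cycle_time` or to a counting mismatch.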

Gotchas & anti-patterns

Reusing one threshold across a machine family. A 1995 vertical mill and a 2008 horizontal machining center on the same line have different resonance signatures; share a tolerance_factor between them and one will over-count while the other goes blind.
Trusting the controller clock. CMOS-drifted timestamps silently misattribute microstops to the wrong shift. Always re-anchor to an NTP/PTP reference before classification.
Filtering away the signal. Over-aggressive EMA spans or wide Savitzky-Golay windows flatten the genuine acceleration peaks that separate a cut from a feed hold, turning real microstops invisible.
Forgetting M/T-code masking. Without decoding tool-change and pallet-swap registers, every planned non-cutting interval inflates Performance loss and pollutes the reason-code distribution.
Tuning the multiplier without a debounce. Loosening tolerance_factor to kill false positives also raises the floor for real stops; the debounce timer, not a looser band, is the correct knob for contact-bounce flicker.

Quick-reference: which lever to turn

| Symptom on a legacy CNC | Root cause | Parameter to adjust | Typical move |
| --- | --- | --- | --- |
| Microstops fire on every tool change | No auxiliary-code masking | M/T-code mask | Decode `Txx`/`M06`/`M60` registers, exclude |
| Sub-second flicker counted as stops | Contact bounce / scan-cycle jitter | `debounce_sec` | Raise to 10-20 s |
| Performance > 1.05 in reports | Stale baseline or spikes | `ideal_cycle_time` / conditioning | Re-baseline; tighten EMA span |
| Band trips on operator feed overrides | Static or too-tight ceiling | `tolerance_factor` | Raise toward 1.30 |
| Slow to catch a degrading spindle | Window too long | rolling `window` | Shorten to 6-8 cycles |
| Stops mis-assigned at midnight | RTC drift / shift split | clock sync + `cycle_continuity` | NTP-anchor; flag straddling cycles |
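One way to keep these levers separate per machine family is a small parameter record that serializes to JSON for a version-controlled store. This is a sketch only. The defaults mirror the snippets on this page, but the family keys (`vmc_1995`, `hmc_2008`), the tuned values, and the helper names are hypothetical, and the registry backend is left out.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass(frozen=True)
class ThresholdParams:
    ema_span: int = 8
    savgol_window: int = 11
    window: int = 10
    tolerance_factor: float = 1.20
    debounce_sec: float = 15.0

    def __post_init__(self) -> None:
        if self.tolerance_factor <= 1.0:
            raise ValueError("tolerance_factor must exceed 1.0 to act as a ceiling")
        if self.debounce_sec < 0:
            raise ValueError("debounce_sec cannot be negative")
        if min(self.ema_span, self.savgol_window, self.window) < 1:
            raise ValueError("window sizes must be positive")


# Hypothetical families and values; each family is tuned on its own.
REGISTRY: Dict[str, ThresholdParams] = {
    "vmc_1995": ThresholdParams(tolerance_factor=1.30, debounce_sec=20.0),
    "hmc_2008": ThresholdParams(window=8, tolerance_factor=1.15, debounce_sec=10.0),
}


def params_for(family: str) -> ThresholdParams:
    """No silent fallback: an unknown family raises KeyError instead of
    inheriting another family's thresholds."""
    return REGISTRY[family]


def dump_registry(registry: Dict[str, ThresholdParams]) -> str:
    """Stable JSON so diffs in version control show only real changes."""
    payload = {name: asdict(p) for name, p in registry.items()}
    return json.dumps(payload, indent=2, sort_keys=True)


def load_registry(text: str) -> Dict[str, ThresholdParams]:
    return {name: ThresholdParams(**fields) for name, fields in json.loads(text).items()}


serialized = dump_registry(REGISTRY)
restored = load_registry(serialized)
```

Raising on an unknown family is deliberate: it prevents a new machine from quietly borrowing thresholds tuned for different resonance behavior.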

Related

Threshold Tuning for Microstops: the parent subsystem and its shared parameter contract
Calculating OEE with Overlapping Maintenance Windows: a sibling availability-edge-case recipe
Shift Boundary Logic: aligning telemetry to production schedules
OEE Formula Validation: sanity assertions before metrics publish
Clock Drift Correction: fixing the legacy RTC problem upstream
