Check for Nans and Infs in TensorboardMetric (#2628)
Summary:
Pull Request resolved: #2628

Raises a ValueError in bulk_fetch_trial_data if a NaN or an Inf is found. The error gets wrapped in a MetricFetchE and handled in the Scheduler (e.g. logged at INFO if we intend to try fetching again, logged at WARN if it comes from a tracking metric, or the trial is marked ABANDONED if the metric is needed for the optimization; see https://fburl.com/code/eq37gghi).

Differential Revision: D60670356
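
The three outcomes described in the summary can be pictured with a small decision function. This is only an illustrative sketch: `FetchErrorAction`, `action_for_fetch_error`, and both boolean parameters are hypothetical names that do not come from Ax, and the order of the checks is an assumption rather than the Scheduler's actual logic.

```python
from enum import Enum


class FetchErrorAction(Enum):
    """Hypothetical labels for the three outcomes named in the summary."""

    LOG_INFO = "info"
    LOG_WARNING = "warning"
    ABANDON_TRIAL = "abandon"


def action_for_fetch_error(
    will_retry: bool, is_tracking_metric: bool
) -> FetchErrorAction:
    """Pick an outcome for a failed metric fetch (assumed ordering)."""
    if will_retry:
        # The fetch will be attempted again, so an INFO log is enough.
        return FetchErrorAction.LOG_INFO
    if is_tracking_metric:
        # The metric does not drive the optimization, so only warn.
        return FetchErrorAction.LOG_WARNING
    # The optimization needs this metric, so the trial is abandoned.
    return FetchErrorAction.ABANDON_TRIAL
```

Under these assumptions, a NaN in an objective metric that will not be refetched maps to `ABANDON_TRIAL`, while the same NaN in a tracking metric only produces a warning.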
mpolson64 authored and facebook-github-bot committed Aug 2, 2024
1 parent f25b4ce commit 9297abe
Showing 1 changed file with 8 additions and 0 deletions.
8 changes: 8 additions & 0 deletions ax/metrics/tensorboard.py
@@ -13,6 +13,8 @@
from logging import Logger
from typing import Any, Dict, List, Optional

import numpy as np

import pandas as pd
from ax.core.base_trial import BaseTrial
from ax.core.map_data import MapData, MapKeyInfo
@@ -166,6 +168,12 @@ def bulk_fetch_trial_data(
.reset_index()
)

# If there are any NaNs or Infs in the data, raise an Exception
if np.any(~np.isfinite(df["mean"])):
    raise ValueError(
        f"Found NaNs or Infs in data for {metric.name}."
    )

# Apply per-metric post-processing
# Apply cumulative "best" (min if lower_is_better)
if metric.cumulative_best:
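
The added check can be tried in isolation. The sketch below assumes a DataFrame with a "mean" column, as in the diff; the helper name `check_finite` and the sample data are hypothetical, since the commit inlines the check inside bulk_fetch_trial_data rather than defining a function.

```python
import numpy as np
import pandas as pd


def check_finite(df: pd.DataFrame, metric_name: str) -> None:
    """Raise ValueError if the "mean" column holds any NaN or Inf.

    Hypothetical helper; the commit performs this check inline.
    """
    # np.isfinite is False for NaN, +Inf, and -Inf, so its negation
    # flags exactly the values the commit wants to reject.
    if np.any(~np.isfinite(df["mean"])):
        raise ValueError(f"Found NaNs or Infs in data for {metric_name}.")


# Finite data passes silently.
clean = pd.DataFrame({"mean": [0.1, 0.2, 0.3]})
check_finite(clean, "loss")

# A NaN or an Inf anywhere in the column triggers the error.
bad = pd.DataFrame({"mean": [0.1, np.nan, np.inf]})
try:
    check_finite(bad, "loss")
    message = None
except ValueError as err:
    message = str(err)
```

Using `np.isfinite` covers NaN and both infinities in one pass, which is why a single condition is enough here.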