Manuscript Audit Brief

Audit any manuscript or slide narrative against the earnings-event-vol research contract.

Valid Claim Shape

The manuscript may claim, if supported by the reported evidence:

We test whether models improve the ranking of option-implied earnings event variance mispricing in risk-defined earnings option strategies.

For the current local results, the claim must stay narrower:

In a no-NBBO proxy sample, model features show preliminary cross-sectional ranking signal for earnings event-variance mispricing. Paper-grade tradability requires bid/ask or NBBO execution data.

It must not claim:

  • generic IV forecasting superiority;
  • paper-grade executable backtest results from second-aggregate trade bars;
  • that lower RMSE alone implies economic value;
  • that the official mamba-ssm implementation, or any sequence model, is itself the contribution independent of baselines, ablations, and costs;
  • that calendar spreads isolate pure event variance.

Required Manuscript Elements

Research object:

  • Earnings announcements are scheduled jump-risk events.
  • Options embed an ex ante market-implied event variance.
  • Models forecast C2O, C2C, and O2C realized-variance targets.
  • The primary scientific target is RVAR_event_jump_c2o.
  • V1 tradable proxy mispricing uses C2C: RVAR_event_day_c2c - IVAR_event.
  • Trading entry is evaluated in USD premium space, not raw variance space.

Data:

  • The options source and entitlement window are stated explicitly.
  • Current local proxy results use Massive option second aggregates and option day aggregates from the observed 2022-onward entitlement window.
  • Market-data inputs are stated precisely: options day aggregates for universe, contract, IV proxy, fallback exit diagnostics, and sequence construction; underlying day aggregates for vendor OHLC opens, C2O/C2C/O2C targets, and exit spot; targeted option one-second trade aggregates for entry pricing, primary C2C exit pricing, and post-open C2O/O2C open-anchor pricing.
  • Entry second aggregates are restricted to the pre-cutoff buffer, default 60 minutes before event cutoff, and the selected entry price is the true per-leg volume-weighted option VWAP over the final 900 seconds before cutoff.
  • The option open anchor is unified as a trade-aggregate 5-15 minute post-open VWAP. It is the primary C2O comparison mark and the O2C diagnostic entry mark; 0-5 minute VWAP is only an opening microstructure stress test.
  • C2C exits use exit-date preclose 15-minute option VWAP as the primary proxy; option day-aggregate close is not used as a strategy-exit fallback.
  • O2C proxy PnL is a realized decomposition diagnostic, not a model-driven strategy headline without a post-open residual-IV baseline.
  • The 2013-2025 sample is described as the target paper range unless historical option data for that range has actually been acquired and processed.
  • Earnings events come from SEC EDGAR submissions plus SEC primary filing document validation.
  • Massive 8-K text is auxiliary fallback only.
  • BMO/AMC rules are explicit; DMH and unknown events are excluded in v1.
  • Universe construction filters non-single-name symbols before selecting the monthly top 50.
  • Any second-aggregate or trade-price proxy result is labeled no_nbbo_trade_proxy and separated from bid/ask executable backtests.
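
The entry and open-anchor marks above are windowed volume-weighted averages of one-second trade bars, computed per leg. A minimal sketch, assuming each bar is an `(epoch_second, bar_vwap, volume)` tuple; that layout, the function names, and the half-open window (a bar stamped exactly at the cutoff is excluded) are illustrative assumptions, not the vendor schema or the project's code:

```python
from typing import Iterable, Optional, Tuple

# Assumed bar layout: (epoch_second, bar_vwap, volume). Hypothetical, not the vendor schema.
Bar = Tuple[int, float, float]


def window_vwap(bars: Iterable[Bar], start: int, end: int) -> Optional[float]:
    """Volume-weighted average price over the half-open window [start, end).

    Returns None when no volume traded in the window, so the caller can record
    a missing mark instead of silently falling back to another price.
    """
    notional = 0.0
    volume = 0.0
    for ts, price, vol in bars:
        if start <= ts < end and vol > 0:
            notional += price * vol
            volume += vol
    return notional / volume if volume > 0 else None


def entry_vwap(bars: Iterable[Bar], cutoff: int, window_seconds: int = 900) -> Optional[float]:
    """Entry mark: VWAP over the final `window_seconds` before the event cutoff."""
    return window_vwap(bars, cutoff - window_seconds, cutoff)


def open_anchor_vwap(bars: Iterable[Bar], market_open: int) -> Optional[float]:
    """Open anchor: VWAP over minutes 5 to 15 after the open."""
    return window_vwap(bars, market_open + 5 * 60, market_open + 15 * 60)
```

Returning `None` for an empty window keeps a missing mark visible in coverage diagnostics. These marks are built from trade prices only, so any result derived from them still carries the no_nbbo_trade_proxy label.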

Variables:

  • RVAR_event_jump_c2o = (log(open_after / close_before))^2, the squared close-to-open log return.
  • RVAR_event_day_c2c = (log(close_after / close_before))^2, the squared close-to-close log return.
  • rvar_event is a backward-compatible alias for RVAR_event_day_c2c.
  • IVAR_event is extracted from two-expiry total implied variance.
  • IVAR extraction failures are reported by reason, including missing event-covering expiries, nonmonotone total variance, and negative extracted IVAR.
  • Feature as-of timestamps are before or at event entry.
  • iv_butterfly_25d or proxy curve-shape measures are defined before use.
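
The variable definitions above can be sketched as follows. The brief does not state the IVAR extraction formula, so this sketch assumes a common two-expiry decomposition: both expiries cover the event and share one constant diffusive variance rate v, giving total variance w_i = v * t_i + E for each expiry and E = (w_near * t_far - w_far * t_near) / (t_far - t_near). Function names and failure-reason strings are hypothetical.

```python
import math
from typing import Optional, Tuple


def rvar_c2o(close_before: float, open_after: float) -> float:
    """Squared close-to-open log return across the event."""
    return math.log(open_after / close_before) ** 2


def rvar_c2c(close_before: float, close_after: float) -> float:
    """Squared close-to-close log return across the event."""
    return math.log(close_after / close_before) ** 2


def extract_ivar_event(
    iv_near: float, t_near: float, iv_far: float, t_far: float
) -> Tuple[Optional[float], Optional[str]]:
    """Return (ivar_event, failure_reason); exactly one of the two is None.

    Assumed model (not from the brief): w_i = v * t_i + E for both expiries,
    with annualized IVs and times to expiry in years.
    """
    # Stand-in tenor check; real coverage logic needs expiry and event dates.
    if not (0 < t_near < t_far):
        return None, "missing_event_covering_expiries"
    w_near = iv_near ** 2 * t_near
    w_far = iv_far ** 2 * t_far
    if w_far < w_near:
        return None, "nonmonotone_total_variance"
    ivar = (w_near * t_far - w_far * t_near) / (t_far - t_near)
    if ivar < 0:
        return None, "negative_extracted_ivar"
    return ivar, None


def proxy_mispricing_c2c(close_before: float, close_after: float, ivar_event: float) -> float:
    """V1 proxy mispricing: RVAR_event_day_c2c minus IVAR_event (variance space)."""
    return rvar_c2c(close_before, close_after) - ivar_event
```

Returning the failure reason alongside the value makes it straightforward to tabulate extraction failures by reason rather than dropping them silently.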

Models:

  • Market-implied IVAR is the primary benchmark.
  • Historical event baselines and Goyal-Saretto-style RV-IV spread are included.
  • Elastic Net and LightGBM/XGBoost are included before deep-model claims.
  • FT-Transformer and sequence diagnostics are positioned after strong tabular baselines.
  • Sequence results include coverage, drop-rate diagnostics, mask-only controls, and deterministic time-shuffle controls.
  • If LightGBM/XGBoost beat the sequence suite, the conclusion is that tabular nonlinear interactions currently dominate the proxy sequence route.
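
The two sequence controls named above could look like the sketch below. It assumes a sequence is a list of time steps, each a list of feature values with `None` marking a missing observation; that representation and both function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence

# One time step of features; None marks a missing value (assumed representation).
Step = Sequence[Optional[float]]


def time_shuffle(seq: Sequence[Step], seed: int) -> List[Step]:
    """Permute time steps with a fixed seed so the control is reproducible."""
    order = list(range(len(seq)))
    random.Random(seed).shuffle(order)
    return [seq[i] for i in order]


def mask_only(seq: Sequence[Step]) -> List[List[float]]:
    """Replace every value with its observed (1.0) or missing (0.0) indicator."""
    return [[0.0 if x is None else 1.0 for x in step] for step in seq]
```

One reading of these controls: if a model scores about the same on time-shuffled inputs, it is not using temporal order, and if it scores about the same on mask-only inputs, the coverage pattern rather than the values carries the signal.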

Evaluation:

  • Forecast metrics include MAE, RMSE, QLIKE, and OOS R2 versus market-implied IVAR.
  • Ranking metrics include AUC, Brier score, calibration, and top-decile precision.
  • Strategy metrics include net proxy PnL, return on premium or capital, Sharpe, Sortino, drawdown, hit rate, tail loss, turnover, and cost sensitivity.
  • Inference does not rely on naive t-stats alone; event-date clustering, ticker clustering, two-way clustering, block bootstrap, or model-comparison corrections are used once the paper moves beyond proxy screening.
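
A few of the metrics above, sketched in plain Python. The brief does not fix the exact definitions, so these are assumptions: QLIKE uses the common form log(f) + y / f averaged over events (forecasts must be strictly positive), OOS R2 is one minus the ratio of model to benchmark squared error, and top-decile precision is the share of positive labels among the highest-scored 10% of events.

```python
import math
from typing import Sequence


def qlike(realized: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean of log(f) + y / f; lower is better. Forecasts must be > 0."""
    return sum(math.log(f) + y / f for y, f in zip(realized, forecast)) / len(realized)


def oos_r2_vs_benchmark(
    realized: Sequence[float], model: Sequence[float], benchmark: Sequence[float]
) -> float:
    """1 - SSE(model) / SSE(benchmark); positive means the model beats the benchmark."""
    sse_model = sum((y - m) ** 2 for y, m in zip(realized, model))
    sse_bench = sum((y - b) ** 2 for y, b in zip(realized, benchmark))
    return 1.0 - sse_model / sse_bench


def top_decile_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Fraction of positive labels among the top 10% of events ranked by score."""
    n_top = max(1, len(scores) // 10)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_top]) / n_top
```

Passing market-implied IVAR as `benchmark` gives the OOS R2 comparison the brief asks for; a value near zero means the model adds nothing beyond the option market's own forecast.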

Backtests:

  • A long ATM straddle tests predictions that event volatility is cheap.
  • A short iron fly tests predictions that event volatility is rich.
  • Current proxy PnL is explicitly non-NBBO and non-paper-grade.
  • Paper-grade execution claims require historical bid/ask or NBBO, realistic spread crossing, and leg-level cost accounting.
  • Mid or haircut results are labeled as sensitivity or proxy cases, not the main tradability evidence.
  • Multi-leg fills disclose simultaneous-fill assumptions and unmodeled legging risk.
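
Leg-level cost accounting for a multi-leg structure can be sketched as below. The `Leg` record, the per-crossing `half_spread` cost, and the contract multiplier of 100 are illustrative assumptions; the sketch also assumes all legs fill simultaneously at their marks, which is exactly the assumption the brief asks manuscripts to disclose.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Leg:
    side: int           # +1 for a long leg, -1 for a short leg
    entry: float        # per-share option price at entry (proxy mark)
    exit: float         # per-share option price at exit (proxy mark)
    half_spread: float  # assumed per-share cost paid at each spread crossing


def structure_pnl_usd(legs: Sequence[Leg], multiplier: int = 100, crossings: int = 2) -> float:
    """Net USD PnL for one structure: gross mark-to-mark PnL minus per-leg costs.

    Assumes simultaneous fills on all legs; legging risk is not modeled.
    """
    gross = sum(leg.side * (leg.exit - leg.entry) for leg in legs)
    cost = sum(crossings * leg.half_spread for leg in legs)
    return (gross - cost) * multiplier


# Long ATM straddle: long call plus long put at the same strike.
straddle = [Leg(+1, 2.0, 3.5, 0.05), Leg(+1, 2.0, 0.3, 0.05)]

# Short iron fly: short ATM call and put, long out-of-the-money wings.
iron_fly = [
    Leg(-1, 2.0, 0.5, 0.0),
    Leg(-1, 2.0, 1.0, 0.0),
    Leg(+1, 0.5, 0.1, 0.0),
    Leg(+1, 0.5, 0.2, 0.0),
]
```

Here the straddle loses money after costs even though one leg gained, and the four-leg iron fly pays a cost on every leg once `half_spread` is nonzero, which is why costs are charged per leg rather than per structure.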

Red Flags

  • "Mamba predicts IV better" as the headline.
  • Random train/test split.
  • Missing BMO/AMC alignment.
  • Trades entered after the event cutoff.
  • Second-aggregate trade bars described as quotes, mid, bid/ask, or NBBO.
  • Full-spread results omitted while making executable strategy claims.
  • Deep models compared only against weak neural baselines and not LightGBM or XGBoost.
  • Calendar returns interpreted as pure event-variance returns.
  • Variance-space edge compared directly to dollar transaction costs.
  • Proxy results from 2022-2025 presented as full 2013-2025 paper evidence.
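
As a contrast to the random train/test split flagged above, a chronological split keeps every training event strictly earlier than every test event. A minimal sketch, assuming events are `(event_date, ticker)` tuples; the record layout and function name are hypothetical:

```python
from datetime import date
from typing import List, Sequence, Tuple

# Assumed minimal event record: (event_date, ticker).
Event = Tuple[date, str]


def chronological_split(
    events: Sequence[Event], train_end: date
) -> Tuple[List[Event], List[Event]]:
    """Split events by date: train on or before `train_end`, test strictly after."""
    ordered = sorted(events, key=lambda e: e[0])
    train = [e for e in ordered if e[0] <= train_end]
    test = [e for e in ordered if e[0] > train_end]
    return train, test
```

A walk-forward evaluation repeats this split with successively later `train_end` dates, so no model is ever scored on events that precede its training data.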