Manuscript Audit Brief
Audit any manuscript or slide narrative against the earnings-event-vol
research contract.
Valid Claim Shape
The manuscript may claim, if supported by the reported evidence:
We test whether models improve the ranking of option-implied earnings event variance mispricing in risk-defined earnings option strategies.
For the current local results, the claim must stay narrower:
In a no-NBBO proxy sample, model features show preliminary cross-sectional ranking signal for earnings event-variance mispricing. Paper-grade tradability requires bid/ask or NBBO execution data.
It must not claim:
- generic IV forecasting superiority;
- paper-grade executable backtest results from second-aggregate trade bars;
- that lower RMSE alone implies economic value;
- that official mamba-ssm or any sequence model is the contribution independent of baselines, ablations, and costs;
- that calendar spreads isolate pure event variance.
Required Manuscript Elements
Research object:
- Earnings announcements are scheduled jump-risk events.
- Options embed an ex ante market-implied event variance.
- Models forecast C2O, C2C, and O2C realized-variance targets.
- The primary scientific target is RVAR_event_jump_c2o.
- V1 tradable proxy mispricing uses C2C: RVAR_event_day_c2c - IVAR_event.
- Trading entry is evaluated in USD premium space, not raw variance space.
Data:
- The options source and entitlement window are stated explicitly.
- Current local proxy results use Massive option second aggregates and option day aggregates from the observed 2022-onward entitlement window.
- Market-data inputs are stated precisely: options day aggregates for universe, contract, IV proxy, fallback exit diagnostics, and sequence construction; underlying day aggregates for vendor OHLC opens, C2O/C2C/O2C targets, and exit spot; targeted option one-second trade aggregates for entry pricing, primary C2C exit pricing, and post-open C2O/O2C open-anchor pricing.
- Entry second aggregates are restricted to the pre-cutoff buffer, default 60 minutes before event cutoff, and the selected entry price is the true per-leg volume-weighted option VWAP over the final 900 seconds before cutoff.
- The option open anchor is unified as a trade-aggregate 5-15 minute post-open VWAP. It is the primary C2O comparison mark and the O2C diagnostic entry mark; 0-5 minute VWAP is only an opening microstructure stress test.
- C2C exits use exit-date preclose 15-minute option VWAP as the primary proxy; option day-aggregate close is not used as a strategy-exit fallback.
- O2C proxy PnL is a realized decomposition diagnostic, not a model-driven strategy headline without a post-open residual-IV baseline.
- The 2013-2025 sample is described as the target paper range unless historical option data for that range has actually been acquired and processed.
- Earnings events come from SEC EDGAR submissions plus SEC primary filing document validation.
- Massive 8-K text is auxiliary fallback only.
- BMO/AMC rules are explicit; DMH and unknown events are excluded in v1.
- Universe construction filters non-single-name symbols before selecting the monthly top 50.
- Any second-aggregate or trade-price proxy result is labeled no_nbbo_trade_proxy and separated from bid/ask executable backtests.
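The two windowed marks above (entry VWAP over the final 900 seconds before cutoff, and the 5-15 minute post-open anchor) can be sketched as one helper. This is a minimal illustration, not the project's implementation: the bar keys `ts`, `vw`, and `v` are hypothetical, and the half-open window `[start, end)` is an assumption.

```python
from datetime import datetime, timedelta

def window_vwap(bars, start, end):
    """Volume-weighted price over bars with start <= ts < end.

    Each bar is a dict with hypothetical keys: 'ts' (datetime), 'vw' (the
    bar's own volume-weighted trade price), and 'v' (traded volume).
    Returns None when the window contains no traded volume.
    """
    notional = 0.0
    volume = 0.0
    for bar in bars:
        if start <= bar["ts"] < end and bar["v"] > 0:
            notional += bar["vw"] * bar["v"]
            volume += bar["v"]
    return notional / volume if volume > 0 else None

def entry_vwap(bars, cutoff, lookback_seconds=900):
    """Per-leg entry mark: VWAP over the final seconds before the event cutoff."""
    return window_vwap(bars, cutoff - timedelta(seconds=lookback_seconds), cutoff)

def open_anchor_vwap(bars, market_open, start_min=5, end_min=15):
    """Post-open anchor mark: VWAP over the 5-15 minute window after the open."""
    return window_vwap(
        bars,
        market_open + timedelta(minutes=start_min),
        market_open + timedelta(minutes=end_min),
    )

# Illustrative bars for one option leg (made-up numbers).
cutoff = datetime(2024, 1, 25, 16, 0, 0)
leg_bars = [
    {"ts": datetime(2024, 1, 25, 15, 40, 0), "vw": 9.0, "v": 100},  # before the window
    {"ts": datetime(2024, 1, 25, 15, 50, 0), "vw": 2.0, "v": 10},
    {"ts": datetime(2024, 1, 25, 15, 59, 30), "vw": 3.0, "v": 30},
    {"ts": datetime(2024, 1, 25, 16, 0, 0), "vw": 10.0, "v": 5},    # at cutoff, excluded
]
print(entry_vwap(leg_bars, cutoff))  # 2.75
```

Because the inputs are trade aggregates, a mark computed this way remains a no_nbbo_trade_proxy value, not a quote or mid.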
Variables:
- RVAR_event_jump_c2o = log(open_after / close_before)^2.
- RVAR_event_day_c2c = log(close_after / close_before)^2.
- rvar_event is the C2C backward-compatible alias.
- IVAR_event is extracted from two-expiry total implied variance.
- IVAR extraction failures are reported by reason, including missing event-covering expiries, nonmonotone total variance, and negative extracted IVAR.
- Feature as-of timestamps are before or at event entry.
- iv_butterfly_25d or proxy curve-shape measures are defined before use.
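A minimal sketch of the variable definitions above. The two realized-variance targets follow the formulas as stated. The IVAR extraction is one common term-structure simplification and is an assumption here, not the confirmed project method: both expiries are taken to cover the event, diffusive variance is assumed constant per year between them, and times to expiry are in years. The failure-reason strings are hypothetical labels.

```python
import math

def rvar_jump_c2o(close_before, open_after):
    """RVAR_event_jump_c2o = log(open_after / close_before)^2."""
    return math.log(open_after / close_before) ** 2

def rvar_day_c2c(close_before, close_after):
    """RVAR_event_day_c2c = log(close_after / close_before)^2."""
    return math.log(close_after / close_before) ** 2

def extract_ivar_event(iv1, t1, iv2, t2):
    """Return (ivar_event, failure_reason) from two event-covering expiries.

    Assumed model: total variance w_i = iv_i^2 * t_i = diffusive * t_i + ivar_event,
    so the two expiries give two equations in two unknowns.
    """
    if iv1 is None or iv2 is None or not (0 < t1 < t2):
        return None, "missing_event_covering_expiries"
    w1 = iv1 ** 2 * t1
    w2 = iv2 ** 2 * t2
    if w2 < w1:
        return None, "nonmonotone_total_variance"
    diffusive = (w2 - w1) / (t2 - t1)   # annualized non-event variance
    ivar_event = w1 - diffusive * t1    # variance left over for the event
    if ivar_event < 0:
        return None, "negative_extracted_ivar"
    return ivar_event, None

# Illustrative inputs (made-up): front expiry 7 days out at 80% IV,
# back expiry 35 days out at 50% IV.
ivar, reason = extract_ivar_event(0.80, 7 / 365, 0.50, 35 / 365)
if reason is None:
    # V1 proxy mispricing in variance space: RVAR_event_day_c2c - IVAR_event.
    mispricing = rvar_day_c2c(100.0, 108.0) - ivar
    print(ivar, mispricing)
```

Returning the reason alongside the value makes it straightforward to tabulate extraction failures by category, as the brief requires.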
Models:
- Market-implied IVAR is the primary benchmark.
- Historical event baselines and Goyal-Saretto-style RV-IV spread are included.
- Elastic Net and LightGBM/XGBoost are included before deep-model claims.
- FT-Transformer and sequence diagnostics are positioned after strong tabular baselines.
- Sequence results include coverage, drop-rate diagnostics, mask-only controls, and deterministic time-shuffle controls.
- If LightGBM/XGBoost beat the sequence suite, the conclusion is that tabular nonlinear interactions currently dominate the proxy sequence route.
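One possible reading of the sequence controls named above, sketched under assumptions: a time-shuffle control permutes the time steps of each sequence with a seed derived reproducibly from a hypothetical event key, and a mask-only control keeps the observation mask while zeroing feature values. The function names and the input layout (a list of per-step feature lists) are illustrative, not taken from the project.

```python
import random
import zlib

def time_shuffle(sequence, event_key, base_seed=0):
    """Permute time steps deterministically for one event.

    zlib.crc32 is used instead of the built-in hash() because string hashing
    is randomized per Python process, which would break reproducibility.
    """
    rng = random.Random(base_seed ^ zlib.crc32(event_key.encode("utf-8")))
    shuffled = list(sequence)
    rng.shuffle(shuffled)
    return shuffled

def mask_only(sequence, mask):
    """Zero every feature value and append the observation mask as the only signal."""
    return [[0.0] * len(step) + [float(m)] for step, m in zip(sequence, mask)]

# Illustrative 4-step sequence with 2 features per step (made-up numbers).
seq = [[0.1, 1.0], [0.2, 1.1], [0.3, 1.2], [0.4, 1.3]]
obs_mask = [1, 1, 0, 1]
control_a = time_shuffle(seq, "AAPL_2024-01-25")
control_b = mask_only(seq, obs_mask)
```

If a sequence model scores about the same on the shuffled or mask-only inputs as on the real ones, its apparent edge is unlikely to come from temporal structure.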
Evaluation:
- Forecast metrics include MAE, RMSE, QLIKE, and OOS R2 versus market-implied IVAR.
- Ranking metrics include AUC, Brier score, calibration, and top-decile precision.
- Strategy metrics include net proxy PnL, return on premium or capital, Sharpe, Sortino, drawdown, hit rate, tail loss, turnover, and cost sensitivity.
- Inference does not rely on naive t-stats alone; clustering by event date, by ticker, or two-way, block bootstrap, or model-comparison corrections are used when the paper moves beyond proxy screening.
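Two of the forecast metrics above can be sketched as follows. QLIKE is written in its normalized form, y/f - log(y/f) - 1, which is zero for a perfect forecast. OOS R2 is measured against the market-implied IVAR benchmark rather than against a historical mean. The `floor` guard is an assumption added because squared log returns can be exactly zero.

```python
import math

def qlike(realized, forecast, floor=1e-8):
    """Mean normalized QLIKE loss; lower is better, 0 for a perfect forecast."""
    total = 0.0
    for y, f in zip(realized, forecast):
        ratio = max(y, floor) / max(f, floor)  # floor avoids log(0) and division by 0
        total += ratio - math.log(ratio) - 1.0
    return total / len(realized)

def oos_r2_vs_benchmark(realized, model, benchmark):
    """1 - SSE(model) / SSE(benchmark); positive means the model beats the benchmark."""
    sse_model = sum((y - m) ** 2 for y, m in zip(realized, model))
    sse_bench = sum((y - b) ** 2 for y, b in zip(realized, benchmark))
    return 1.0 - sse_model / sse_bench

# Illustrative event variances (made-up numbers).
realized = [0.01, 0.04]
model_fc = [0.01, 0.03]
ivar_fc = [0.02, 0.02]   # market-implied IVAR as the benchmark forecast
print(qlike(realized, model_fc), oos_r2_vs_benchmark(realized, model_fc, ivar_fc))
```

A better score on either metric is forecast evidence only; it does not by itself establish economic value.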
Backtests:
- Long ATM straddle tests predicted cheap event volatility.
- Short iron fly tests predicted rich event volatility.
- Current proxy PnL is explicitly non-NBBO and non-paper-grade.
- Paper-grade execution claims require historical bid/ask or NBBO, realistic spread crossing, and leg-level cost accounting.
- Mid or haircut results are labeled as sensitivity or proxy cases, not the main tradability evidence.
- Multi-leg fills disclose simultaneous-fill assumptions and unmodeled legging risk.
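The two structures above can be marked in USD premium space with leg-level costs, as in this sketch. It assumes simultaneous fills on all legs (so legging risk is unmodeled), a contract multiplier of 100, symmetric wings for the iron fly, and a hypothetical per-share round-trip cost per leg. All prices are made-up proxy marks, not quotes.

```python
from dataclasses import dataclass

CONTRACT_MULTIPLIER = 100  # assumption: standard US equity option multiplier

@dataclass
class Leg:
    side: int          # +1 for a long leg, -1 for a short leg
    entry: float       # entry mark per share
    exit: float        # exit mark per share
    cost: float = 0.0  # hypothetical round-trip cost per share for this leg

def structure_pnl_usd(legs):
    """Net PnL per one-lot structure: signed price change minus per-leg costs."""
    gross = sum(leg.side * (leg.exit - leg.entry) for leg in legs)
    costs = sum(leg.cost for leg in legs)
    return (gross - costs) * CONTRACT_MULTIPLIER

def iron_fly_max_loss_usd(legs, wing_width):
    """Defined risk of a short iron fly: wing width minus net entry credit."""
    credit = -sum(leg.side * leg.entry for leg in legs)
    return (wing_width - credit) * CONTRACT_MULTIPLIER

# Long ATM straddle: long call and long put at the same strike.
straddle = [Leg(+1, 4.0, 6.5, 0.05), Leg(+1, 3.8, 0.9, 0.05)]

# Short iron fly: short the ATM call and put, long wings 10 points away.
iron_fly = [
    Leg(-1, 4.0, 2.0, 0.05), Leg(-1, 3.8, 1.5, 0.05),
    Leg(+1, 1.0, 0.3, 0.05), Leg(+1, 0.9, 0.2, 0.05),
]

straddle_pnl = structure_pnl_usd(straddle)
fly_pnl = structure_pnl_usd(iron_fly)
fly_risk = iron_fly_max_loss_usd(iron_fly, wing_width=10.0)
print(straddle_pnl, fly_pnl, fly_pnl / fly_risk)  # last value: return on capital at risk
```

Rerunning the same calculation with different `cost` values gives a simple cost-sensitivity table. Paper-grade execution would replace the assumed costs with observed bid/ask spread crossing.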
Red Flags
- "Mamba predicts IV better" as the headline.
- Random train/test split.
- Missing BMO/AMC alignment.
- Trades entered after the event cutoff.
- Second-aggregate trade bars described as quotes, mid, bid/ask, or NBBO.
- Full-spread results omitted while making executable strategy claims.
- Deep models compared only against weak neural baselines and not LightGBM or XGBoost.
- Calendar returns interpreted as pure event-variance returns.
- Variance-space edge compared directly to dollar transaction costs.
- Proxy results from 2022-2025 presented as full 2013-2025 paper evidence.
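Two of these red flags lend themselves to mechanical checks. The sketch below assumes hypothetical record keys (`event_date`, `entry_ts`, `cutoff_ts`). It illustrates a chronological split with a leakage check, and a scan for entries at or after the event cutoff.

```python
from datetime import date, datetime

def chronological_split(events, train_end, test_start):
    """Split by event date instead of at random; dates in the gap are dropped."""
    train = [e for e in events if e["event_date"] <= train_end]
    test = [e for e in events if e["event_date"] >= test_start]
    return train, test

def has_temporal_leakage(train, test):
    """True if any training event is dated on or after the earliest test event."""
    if not train or not test:
        return False
    latest_train = max(e["event_date"] for e in train)
    earliest_test = min(e["event_date"] for e in test)
    return latest_train >= earliest_test

def entries_after_cutoff(trades):
    """Return trades whose entry timestamp is not strictly before the event cutoff."""
    return [t for t in trades if t["entry_ts"] >= t["cutoff_ts"]]

# Illustrative records (made-up).
events = [
    {"ticker": "AAA", "event_date": date(2022, 5, 3)},
    {"ticker": "BBB", "event_date": date(2023, 2, 1)},
    {"ticker": "CCC", "event_date": date(2024, 7, 30)},
]
train, test = chronological_split(events, date(2023, 12, 31), date(2024, 1, 1))
trades = [
    {"entry_ts": datetime(2024, 7, 30, 15, 55), "cutoff_ts": datetime(2024, 7, 30, 16, 0)},
    {"entry_ts": datetime(2024, 7, 30, 16, 5), "cutoff_ts": datetime(2024, 7, 30, 16, 0)},
]
print(has_temporal_leakage(train, test), len(entries_after_cutoff(trades)))
```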