Data Health

Live monitoring of the BTC/USDT 4H dataset, ML predictions, and feature coverage.

All Systems Healthy

  • Total OHLCV Bars: 14,122 (6.4 years of 4H data)
  • ML Predictions: 6,306 (35 months of coverage)
  • Features Engineered: 74 (13,719 aligned samples)
  • Data Gaps: 0 (no gaps > 1 day)

Dataset Overview

  • Source: Binance Spot BTC/USDT 4H
  • Date Range: 2020-01-01 → 2026-06-11
  • Duration: 2,353 days (6.4 years)
  • Missing Values: 0 (all columns complete)
  • Duplicate Timestamps: 0
  • Monthly Density: 65 – 186 bars/month (avg 181)
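
The duplicate and gap figures above can be reproduced with a check along the following lines. This is a minimal sketch, not the dashboard's actual code: it assumes bar open times are available as `datetime` objects, and the function name `check_timestamps` and its return keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone

BAR = timedelta(hours=4)  # bar interval of the 4H dataset

def check_timestamps(timestamps, max_gap=timedelta(days=1)):
    """Count duplicate bar times, gaps above max_gap, and missing 4H bars."""
    unique = sorted(set(timestamps))
    duplicates = len(timestamps) - len(unique)
    pairs = list(zip(unique, unique[1:]))
    # A gap is reported only when it exceeds max_gap (1 day, as on the dashboard).
    gaps = [(prev, cur) for prev, cur in pairs if cur - prev > max_gap]
    # Every step longer than one bar interval hides at least one missing bar.
    missing_bars = sum((cur - prev) // BAR - 1 for prev, cur in pairs)
    return {"duplicates": duplicates, "gaps": gaps, "missing_bars": missing_bars}

# Usage with synthetic data: 12 bars, one removed, one duplicated, one late bar.
start = datetime(2020, 1, 1, tzinfo=timezone.utc)
ts = [start + i * BAR for i in range(12)]
del ts[5]                               # a single missing bar (8h step)
ts.append(ts[0])                        # a duplicate timestamp
ts.append(start + timedelta(days=4))    # a step of more than 1 day
report = check_timestamps(ts)
print(report["duplicates"], len(report["gaps"]), report["missing_bars"])
```

Note that a dataset can report zero gaps over 1 day and still be short a few individual bars, which is why the sketch counts missing bars separately.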

Yearly Breakdown

Year  Bars   Start  End     Avg Close  Coverage
2020  2,195  Jan 1  Dec 31  $11,076    100%
2021  2,190  Jan 1  Dec 31  $47,355    100%
2022  2,190  Jan 1  Dec 31  $28,219    100%
2023  2,190  Jan 1  Dec 31  $28,807    100%
2024  2,196  Jan 1  Dec 31  $65,904    100%
2025  2,190  Jan 1  Dec 31  $101,629   100%
2026  971    Jan 1  Jun 11  $75,521    41%
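
The bar counts for the full years can be sanity-checked against the number of 4H bars a calendar year should contain (six per day). The sketch below is illustrative only: `expected_bars` is a hypothetical helper, the observed counts are copied from the table, and the partial year 2026 is left out because the dashboard does not state how its coverage figure is defined.

```python
import calendar

BARS_PER_DAY = 24 // 4  # six 4H bars per day

def expected_bars(year):
    """Number of 4H bars in a complete calendar year."""
    days = 366 if calendar.isleap(year) else 365
    return days * BARS_PER_DAY

# Observed bar counts for the complete years, taken from the table above.
observed = {2020: 2195, 2021: 2190, 2022: 2190, 2023: 2190, 2024: 2196, 2025: 2190}

# Shortfall = expected minus observed bars for each year.
shortfall = {year: expected_bars(year) - bars for year, bars in observed.items()}
for year, missing in shortfall.items():
    print(year, expected_bars(year), missing)
```

Under this count, 2020 is one bar short of the 2,196 a leap year would hold, which still rounds to 100% coverage; the other full years match exactly.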

ML Pipeline Status

LightGBM Predictions

  • Coverage: 6,306 bars (2023-02 → 2025-12)
  • Missing values: 0
  • Regimes: chop 58.3%, crash 16.5%, bear 13.8%, bull 9.9%
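
A regime breakdown like the one above is a frequency count over per-bar regime labels. The following is a minimal sketch assuming the labels are available as a list of strings; `regime_shares` is a hypothetical name. The four listed shares add up to 98.5%, so the sketch also computes the remainder, which the dashboard does not explain.

```python
from collections import Counter

def regime_shares(regimes):
    """Percentage share of each regime label, most frequent first."""
    counts = Counter(regimes)
    total = len(regimes)
    return {name: 100.0 * n / total for name, n in counts.most_common()}

# Shares as listed on the dashboard.
listed = {"chop": 58.3, "crash": 16.5, "bear": 13.8, "bull": 9.9}
# Portion of bars not covered by the four listed regimes.
unaccounted = round(100.0 - sum(listed.values()), 1)

# Usage on a toy label sequence.
print(regime_shares(["chop", "chop", "bull", "bear"]))
print(unaccounted)
```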

Triple-Barrier Labels

  • Total labels: 14,117
  • Positive (profitable): 5,301 (37.6%)
  • Exit: stop-loss 57.5%, take-profit 27.1%, time 15.4%
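
For readers unfamiliar with triple-barrier labeling: each bar is assigned an exit type by whichever of three barriers is hit first (take-profit, stop-loss, or a time limit), and the label is positive when the return at exit is above zero. The sketch below illustrates the idea on close prices only. The barrier widths and the horizon are hypothetical values, not the pipeline's settings, and real implementations often test the barriers against intrabar highs and lows instead.

```python
def triple_barrier(closes, i, take_profit=0.02, stop_loss=0.01, horizon=5):
    """Label bar i. Returns (exit_type, return), or None without enough lookahead."""
    if i + horizon >= len(closes):
        return None  # the time barrier would fall past the end of the data
    entry = closes[i]
    ret = 0.0
    for j in range(i + 1, i + horizon + 1):
        ret = closes[j] / entry - 1.0
        if ret >= take_profit:
            return ("take-profit", ret)
        if ret <= -stop_loss:
            return ("stop-loss", ret)
    # Neither price barrier was touched: exit at the time barrier.
    return ("time", ret)

# Usage on three toy price paths.
up = triple_barrier([100, 101, 103, 104, 105, 106], 0)
down = triple_barrier([100, 99.5, 98, 100, 100, 100], 0)
flat = triple_barrier([100, 100.5, 100.2, 100.4, 100.1, 100.3], 0)
for exit_type, ret in (up, down, flat):
    print(exit_type, ret > 0)
```

A time exit can still be profitable, which is one way the positive share (37.6%) can exceed the take-profit share (27.1%). With the assumed 5-bar horizon the last 5 bars receive no label, which would be consistent with 14,117 labels for 14,122 bars, although the dashboard does not state the horizon.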

Feature Matrix

  • Shape: 13,719 × 74
  • NaN count: 0
  • Inf count: 0
  • Base features: 97 → 74 after collinearity audit
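
A collinearity audit of the kind that reduces 97 base features to 74 is commonly done by dropping any feature that is highly correlated with one already kept. The sketch below shows that greedy approach together with the NaN/Inf counts. It is an assumption about the method, not the pipeline's code: the 0.95 threshold, the function names, and the feature names in the usage example are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def prune_collinear(features, threshold=0.95):
    """Greedily keep features whose |correlation| with every kept one is below threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def count_non_finite(features):
    """Return (NaN count, Inf count) over all feature values."""
    values = [v for column in features.values() for v in column]
    return sum(math.isnan(v) for v in values), sum(math.isinf(v) for v in values)

# Usage: the second feature is an exact multiple of the first and gets dropped.
features = {
    "ret_1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ret_1_scaled": [2.0, 4.0, 6.0, 8.0, 10.0],
    "vol": [5.0, 1.0, 4.0, 2.0, 3.0],
}
kept = prune_collinear(features)
print(kept, count_non_finite(features))
```

The result of a greedy pass depends on feature order, so a real audit would usually fix an ordering (for example by feature importance) before pruning.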

Alignment

  • OHLCV ∩ Predictions: 6,306 bars
  • OHLCV ∩ Labels: 14,117 bars
  • Predictions ∩ Labels: 6,306 bars
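
The alignment counts above are set intersections over bar timestamps. A minimal sketch, assuming each dataset exposes its timestamps as an iterable of hashable values; `alignment_report` and its keys are hypothetical names:

```python
def alignment_report(ohlcv_ts, prediction_ts, label_ts):
    """Sizes of the pairwise timestamp intersections between the three datasets."""
    ohlcv, preds, labels = set(ohlcv_ts), set(prediction_ts), set(label_ts)
    return {
        "ohlcv_and_predictions": len(ohlcv & preds),
        "ohlcv_and_labels": len(ohlcv & labels),
        "predictions_and_labels": len(preds & labels),
        # Predictions with no matching OHLCV bar would indicate a misaligned index.
        "predictions_outside_ohlcv": len(preds - ohlcv),
    }

# Usage with integers standing in for timestamps.
report = alignment_report(range(10), range(4, 8), range(0, 9))
print(report)
```

In the figures above, Predictions ∩ Labels equals the full prediction count (6,306), meaning every predicted bar also has a label.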

Data Integrity Checks

  • No missing values
  • No duplicate timestamps
  • No gaps > 1 day
  • No NaN in features
  • No Inf in features
  • Full 6.4-year coverage