Methodology
How the Data Is Created
Our pipeline from raw sources to structured trade intelligence.
Source Collection
We ingest from GDELT's global monitoring network, which scans thousands of news wires, government publications, and media sources across 65+ languages and 150+ countries in near real time. New content is processed every 15 minutes.
Event Detection
AI models screen incoming content against trade-specific criteria to identify events that directly disrupt, constrain, or restrict the movement of physical goods — from port closures and sanctions enforcement to strikes, weather events, and military conflicts.
Classification
Each event is classified across multiple dimensions: disruption type, affected commodity or sector, geographic location (country, region, port, waterway), impacted trade routes, HS commodity chapter, and temporal status.
Severity Scoring
Events are scored from -1 (minor disruption) to -4 (critical disruption), so a more negative score means a more severe event. Scores blend AI contextual assessment with keyword-based calibration. Confirmed events are weighted higher than forecasts. Within a tracked multi-day event, severity can only escalate; it never decreases.
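The escalate-only rule can be illustrated with a minimal sketch. The function name `ratchet_severity` and the sample scores are hypothetical, and the sketch covers only the ratchet, not the AI and keyword scoring that produces each daily assessment.

```python
def ratchet_severity(previous, observed):
    """Return the severity to store for a tracked event.

    Scores run from -1 (minor) to -4 (critical), so "more severe" means
    "more negative" and the ratchet keeps the minimum of the two values.
    """
    for score in (previous, observed):
        if score not in (-1, -2, -3, -4):
            raise ValueError(f"severity out of range: {score}")
    return min(previous, observed)


# Daily assessments for one hypothetical multi-day event.
daily_scores = [-2, -3, -1, -4, -2]

stored = []
current = daily_scores[0]
for score in daily_scores:
    current = ratchet_severity(current, score)
    stored.append(current)

print(stored)  # [-2, -3, -3, -4, -4]
```

On day three the fresh assessment is milder (-1), but the stored value stays at -3 because the event had already reached that level.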
Trade Risk Index
A daily composite score (0–100) aggregates severity and event volume into a single risk metric. Levels: Moderate (0–29), Elevated (30–49), High (50–74), Extreme (75–100). The index includes a 7-day rolling average and day-over-day delta for trend detection.
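As an illustration, the level bands and the two trend figures can be sketched as follows. The function names and sample history are hypothetical, and the sketch assumes that non-integer scores fall into the band of the integer range below them (for example, 29.5 counts as Moderate). How severity and event volume are combined into the daily score is not shown here.

```python
def risk_level(score):
    """Map a 0-100 index value to its published risk level."""
    if not 0 <= score <= 100:
        raise ValueError(f"index out of range: {score}")
    if score >= 75:
        return "Extreme"
    if score >= 50:
        return "High"
    if score >= 30:
        return "Elevated"
    return "Moderate"


def trend(scores, window=7):
    """Return (rolling_average, day_over_day_delta) for the latest day."""
    recent = scores[-window:]
    average = sum(recent) / len(recent)
    # With only one day of history there is nothing to compare against.
    delta = scores[-1] - scores[-2] if len(scores) > 1 else 0
    return average, delta


# Hypothetical index values, oldest first.
index_history = [22, 25, 31, 28, 40, 52, 47, 61]

average, delta = trend(index_history)
print(risk_level(index_history[-1]), round(average, 1), delta)  # High 40.6 14
```

A latest value well above the rolling average, together with a large positive delta, is the pattern the trend fields are meant to surface.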
Enrichment
Records are enriched with trade route inference, HS chapter codes, port and waterway identification, weather event classification, and detection timing metadata. Headline language is neutralised to remove editorial bias.
Validation
An automated QA gate checks every daily output before publication — verifying schema integrity, detecting duplicate events, flagging headline errors, and validating severity consistency. Issues are auto-fixed and re-validated before release.
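A minimal sketch of what such a gate might check is shown below. It is not the production implementation: the function name `qa_issues` is hypothetical, the field names are taken from the Output Format section, and duplicate detection is reduced to a repeated `event_id` as a stand-in for semantic matching. Headline checks and the auto-fix step are omitted.

```python
EXPECTED_FIELDS = {
    "event_id", "event_headline", "event_status", "impact_description",
    "severity_score", "event_type", "commodity_sector", "hs_chapter",
    "country", "region", "city", "port", "waterway", "weather_event",
    "trade_routes", "event_start", "event_end", "days_active",
    "detected_at", "hours_to_detect",
}
VALID_SEVERITIES = {-1, -2, -3, -4}


def qa_issues(records):
    """Return a list of (event_id, problem) tuples for one daily batch."""
    issues = []
    seen_ids = set()
    for record in records:
        event_id = record.get("event_id")
        # Schema integrity: every expected field must be present.
        missing = EXPECTED_FIELDS - set(record)
        if missing:
            issues.append((event_id, f"missing fields: {sorted(missing)}"))
        # Duplicate detection (simplified to an exact ID match).
        if event_id in seen_ids:
            issues.append((event_id, "duplicate event_id"))
        seen_ids.add(event_id)
        # Severity consistency: only -1 through -4 are valid scores.
        if record.get("severity_score") not in VALID_SEVERITIES:
            issues.append((event_id, "severity_score outside -1..-4"))
    return issues


# Hypothetical records: one clean, one repeating the ID with a bad score.
good = dict.fromkeys(EXPECTED_FIELDS)
good.update(event_id="evt-001", severity_score=-2)
bad = dict(good, severity_score=0)

for issue in qa_issues([good, bad]):
    print(issue)
```

An empty result would let the batch through; any reported issue would send it back for fixing and re-validation.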
Delivery
Final records are delivered daily as Apache Parquet files, one self-contained file per day with all active events. Each file typically contains 15–75 events covering global trade disruptions detected in the previous 24 hours.
Severity Scale
Our disruption severity score captures the magnitude of impact on international trade flows.
Trade Risk Index
A daily composite score that aggregates individual event severity into a single market-wide risk metric.
The index includes a 7-day rolling average for trend smoothing and a day-over-day delta to flag rapid escalation. Historical index values are available in the daily data delivery.
Event Classification
Every event is tagged with a primary type reflecting the nature of the disruption.
Commodity Coverage
We track disruptions across 30+ commodity and sector categories spanning energy, agriculture, metals, manufacturing, and logistics.
Events without a specific commodity are tagged with descriptive sector categories: Port Operations, Maritime & Shipping, Aviation & Airports, Road & Rail Freight, Finance & Trade Policy, or Conflict & Security.
Data Quality
We maintain data integrity through multiple automated layers:
- AI-powered event detection with keyword cross-validation
- Semantic deduplication: the same event is tracked under a single ID across days and sources
- Temporal consistency: event status only moves forward (forecast, then developing, then confirmed)
- Severity ratchet: within a tracked event, severity never decreases
- Pre-publish QA gate with auto-fix on every daily output
- Headline neutralisation to remove editorial and sensational language
- Daily dataset refresh with continuous monitoring
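The temporal consistency rule in the list above can be sketched as a simple ordering check. The lowercase status strings and the function name `advance_status` are assumptions for illustration; the actual values stored in `event_status` may be spelled differently.

```python
# Position of each status in the forward-only progression.
STATUS_ORDER = {"forecast": 0, "developing": 1, "confirmed": 2}


def advance_status(current, proposed):
    """Keep whichever status is further along; never move backwards."""
    if STATUS_ORDER[proposed] > STATUS_ORDER[current]:
        return proposed
    return current


print(advance_status("forecast", "confirmed"))    # confirmed
print(advance_status("developing", "forecast"))   # developing
```

A later source that still describes a confirmed event as a forecast therefore leaves the stored status unchanged.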
Output Format
Each daily record contains 20 structured fields: event_id, event_headline, event_status, impact_description, severity_score, event_type, commodity_sector, hs_chapter, country, region, city, port, waterway, weather_event, trade_routes, event_start, event_end, days_active, detected_at, hours_to_detect.
Data is delivered as daily Apache Parquet files. Each file is self-contained: all events active on that date appear in a single file, sorted by severity. The Trade Risk Index is delivered alongside in both JSON and Parquet formats, with daily scores, 7-day rolling averages, and risk levels. See the data section on our homepage for a sample record and delivery structure.