Cefic Comext ETL status
Eurostat Comext → PostgreSQL → Parquet export.
Imports the chemistry-relevant CN codes (chapters 20–39) every Monday morning,
then republishes 3 Parquet files (a fact table plus partner and product dimensions) consumed by the EU27 chemicals trade radar.
Declarants are restricted to EU27: GB/UK is excluded from the declarant side and the column is flattened to the constant 'EU'.
Latest run
Most recent execution of the weekly cron: what it tried, what it imported, what it skipped.
No summary available.
Published Parquet outputs
Three artifacts produced by the weekly export: the fact table filtered on chemistry CN codes, plus the partner-country and product (NC) dimensions rebuilt from Eurostat's SDMX codelists. All three are regenerated in the same export_parquet.py run.
Fact table columns: period, declarant, partner, product_nc, cpa2015, chapter_cn, flow, flow_label, value_in_euros, quantity_in_kg.
Filtered on ~1369 CN2025 codes from SubstanceId.csv. Declarants restricted to EU27 (GB/UK excluded) and flattened to 'EU'; partner-side EU27 countries aggregated to 'EU27'.
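The declarant and partner rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flatten_rows`, the dict-based row format, the hard-coded ISO alpha-2 EU27 set and the summing of measures after relabelling are all hypothetical and are not taken from export_parquet.py.

```python
from collections import defaultdict

# Assumption: ISO 3166 alpha-2 codes; the real pipeline may use another code list.
EU27 = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR", "HU", "IE",
    "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK", "SI", "ES", "SE",
})


def flatten_rows(rows):
    """Keep EU27 declarants, relabel them 'EU', fold EU27 partners into 'EU27'.

    Rows that end up with the same key are summed (an assumption of this sketch).
    Returns {(period, declarant, partner, product_nc, flow): (euros, kg)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for row in rows:
        if row["declarant"] not in EU27:
            continue  # e.g. GB is excluded from the declarant side
        partner = "EU27" if row["partner"] in EU27 else row["partner"]
        key = (row["period"], "EU", partner, row["product_nc"], row["flow"])
        totals[key][0] += row["value_in_euros"]
        totals[key][1] += row["quantity_in_kg"]
    return {key: tuple(measures) for key, measures in totals.items()}
```

Note that GB can still appear on the partner side: only the declarant side is restricted.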
Partner dimension columns: partner_code, label_en, label_fr, label_de, is_standard_code, source, note, ec_corporate_code, ec_corporate_uri.
Joined in the trade radar on partner_code to surface multilingual labels.
Source: Eurostat SDMX 3.0 – ESTAT/PARTNER.
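A minimal sketch of the label join described above, assuming fact rows and dimension rows are plain dicts. The helper name `label_partners`, the `partner_label` output key and the fallback to the raw code when no label exists are assumptions for illustration, not the trade radar's actual code.

```python
def label_partners(fact_rows, partner_dim, lang="en"):
    """Attach a human-readable partner label to each fact row.

    The fact table's `partner` column is matched against `partner_code`
    in the dimension; `lang` picks label_en, label_fr or label_de.
    """
    labels = {p["partner_code"]: p[f"label_{lang}"] for p in partner_dim}
    return [
        # Fall back to the raw code when the dimension has no entry (assumption).
        {**row, "partner_label": labels.get(row["partner"], row["partner"])}
        for row in fact_rows
    ]
```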
Product dimension columns: product_code, label_en, label_fr, label_de, level, parent_code, chapter, is_not_elsewhere_specified.
Full Combined Nomenclature hierarchy (chapters → headings → HS subheadings → 8-digit CN codes). Joined on product_code in the trade radar.
Source: Eurostat SDMX 3.0 – ESTAT/CXT_NC.
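To illustrate how the hierarchy columns relate to each other, the sketch below derives `level`, `parent_code` and `chapter` from the length of a product code (2, 4, 6 or 8 digits). It assumes codes are plain digit strings and that a parent is simply the code with its last two digits removed. The level names and the helper `describe_code` are hypothetical; the real dimension is rebuilt from the SDMX codelist and may encode these differently.

```python
# Hypothetical level names keyed by code length.
LEVELS = {2: "chapter", 4: "heading", 6: "hs_subheading", 8: "cn8"}


def describe_code(product_code):
    """Return level, parent_code and chapter for a 2/4/6/8-digit code."""
    is_digits = product_code.isascii() and product_code.isdigit()
    if not is_digits or len(product_code) not in LEVELS:
        raise ValueError(f"unsupported product code: {product_code!r}")
    n = len(product_code)
    return {
        "product_code": product_code,
        "level": LEVELS[n],
        # Chapters sit at the top of the hierarchy and have no parent.
        "parent_code": product_code[: n - 2] if n > 2 else None,
        "chapter": product_code[:2],
    }
```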
Eurostat revisions sweep
Eurostat republishes monthly archives as Member States submit figures late or correct them
(typically 1–3 republications per period over the year following first release).
Each Monday the pipeline compares the per-MS upload calendar with our
etl_log and re-imports any period that was republished after we loaded it
(rolling 24-month window). The reload step runs DELETE WHERE period=X
followed by INSERT, so a reload never leaves duplicate rows.
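The sweep and reload logic described above can be sketched as follows. This is a simplified illustration, not the pipeline's code: it uses the standard-library sqlite3 module as a stand-in for PostgreSQL, omits the 24-month window, and the table and column names (`trade`, `etl_log.finished_at`, `etl_log.rows_loaded`) as well as both function names are assumptions. Wrapping the DELETE and INSERT in one transaction is also an assumption of the sketch.

```python
import sqlite3


def find_stale_periods(conn, eurostat_uploads):
    """Return periods whose Eurostat upload is newer than our recorded load.

    `eurostat_uploads` maps period -> latest upload timestamp. Timestamps are
    ISO-8601 strings in one format, so string comparison orders them correctly.
    """
    loaded = dict(conn.execute("SELECT period, finished_at FROM etl_log"))
    return sorted(
        period
        for period, uploaded_at in eurostat_uploads.items()
        if period in loaded and uploaded_at > loaded[period]
    )


def reload_period(conn, period, rows, finished_at):
    """Replace one period: DELETE, INSERT, then update the log row in place."""
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute("DELETE FROM trade WHERE period = ?", (period,))
        conn.executemany(
            "INSERT INTO trade (period, product_nc, value_in_euros) VALUES (?, ?, ?)",
            [(period, product_nc, value) for product_nc, value in rows],
        )
        conn.execute(
            "UPDATE etl_log SET rows_loaded = ?, finished_at = ? WHERE period = ?",
            (len(rows), finished_at, period),
        )
```

Because the delete is scoped to a single period, running `reload_period` twice with the same input leaves the table in the same state, which is the property the page relies on.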
| Period | Eurostat re-uploaded | Our previous load | Reloaded at | Rows replaced |
|---|---|---|---|---|
| – | | | | |
Last revision check found no outdated periods. Eurostat’s republished dates match (or are older than) our most recent load timestamps.
Recent imports
Last successful monthly imports recorded in etl_log. Includes both first loads and revision reloads (reloads update the existing row in place), so the "Finished at" timestamp reflects the most recent run for that period.
| Period | Filename | Rows loaded | Finished at | Status |
|---|---|---|---|---|
| – | | | | |
Loaded rows over time
Total rows inserted into PostgreSQL per monthly file, from the oldest period to the most recent.
Coverage matrix
Year × month grid. Filled cells = period present in etl_log (darker = more rows). Dashed cells = missing.
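As a rough sketch of how such a grid could be derived from etl_log, the helpers below build one 12-cell list per year and list the gaps. The YYYYMM period format, the `rows_by_period` mapping and both function names are assumptions made for illustration.

```python
def coverage_matrix(rows_by_period, years):
    """Map each year to 12 cells: the row count if loaded, else None (a dashed cell)."""
    return {
        year: [rows_by_period.get(f"{year}{month:02d}") for month in range(1, 13)]
        for year in years
    }


def missing_periods(matrix):
    """List the YYYYMM periods that have no entry, in chronological order."""
    return [
        f"{year}{month:02d}"
        for year, cells in sorted(matrix.items())
        for month, rows in enumerate(cells, start=1)
        if rows is None
    ]
```

A front end could then shade each non-empty cell in proportion to its row count.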
Source freshness by Member State
When each EU Member State last uploaded a slice to Eurostat's Comext database. Rows = MS, columns = reference period (most recent on the right). Darker = fresher upload date; dashed = not yet published. Toggle between Intra-EU (MS-declared) and Extra-EU (customs-declared) flows.
Latest log
Tail of the most recent cron log: INFO lines in the default colour, WARNING in orange, ERROR in red.