Weekly refresh · Updated –

Cefic Comext ETL status

Eurostat Comext → PostgreSQL → Parquet export. Imports the chemistry-relevant CN codes (chapters 20–39) every Monday morning, then republishes 3 Parquet files (a fact table plus partner and product dimensions) consumed by the EU27 chemicals trade radar. Declarants are restricted to EU27: GB/UK is excluded from the declarant side and the column is flattened to the constant 'EU'.
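
The chapter restriction described above can be illustrated with a small filter. This is only a sketch: the helper names `cn_chapter` and `in_scope` are hypothetical, and it assumes CN codes arrive as digit strings (possibly with spaces) whose first two digits are the chapter.

```python
def cn_chapter(cn_code: str) -> int:
    """Return the two-digit CN chapter of a code such as '29051100'."""
    digits = cn_code.replace(" ", "")
    return int(digits[:2])


def in_scope(cn_code: str, first: int = 20, last: int = 39) -> bool:
    """True when the code falls in the chapter range stated on this page."""
    return first <= cn_chapter(cn_code) <= last


# Made-up sample codes, two inside the range and two outside it.
codes = ["29051100", "3901 10 10", "72083900", "19059080"]
kept = [c for c in codes if in_scope(c)]
```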

Source Eurostat Comext
Cadence Mon 06:00 CET
Coverage CN 20–39
Declarants EU27
Rows in DB
Months loaded
Years covered
Parquet rows
Parquet refreshed

Last week's run

Most recent execution of the weekly cron: what it tried, what it imported, what it skipped.

Started
Finished
Duration
Files processed
Rows added
Errors
Skipped
Parquet rows

No summary available.

Outputs

Published Parquet outputs

Parquet schema →

Three artifacts produced by the weekly export: the fact table filtered on chemistry CN codes, plus the partner-country and product (CN) dimensions rebuilt from Eurostat's SDMX codelists. All three are regenerated in the same export_parquet.py run.

Fact table
Rows
Size
Refreshed
Columns: period, declarant, partner, product_nc, cpa2015, chapter_cn, flow, flow_label, value_in_euros, quantity_in_kg.
Filtered on ~1,369 CN2025 codes from SubstanceId.csv. Declarants are restricted to EU27 (GB/UK excluded) and flattened to 'EU'; partner-side EU27 countries are aggregated to 'EU27'.
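
A minimal sketch of the declarant and partner flattening, using plain dictionaries rather than the real export code. The `flatten` function and the sample rows are hypothetical. Two assumptions are made here: countries are identified by the two-letter codes Comext uses (Greece as 'GR'), and measures are summed once rows have been relabelled.

```python
from collections import defaultdict

# Assumption: two-letter country codes as used by Comext (Greece is 'GR').
EU27 = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE",
})


def flatten(rows):
    """Relabel declarants to 'EU' and EU27 partners to 'EU27', then sum.

    Rows whose declarant is outside EU27 (for example GB) are dropped.
    Summing after relabelling is an assumption of this sketch.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for r in rows:
        if r["declarant"] not in EU27:
            continue
        partner = "EU27" if r["partner"] in EU27 else r["partner"]
        key = (r["period"], "EU", partner, r["product_nc"], r["flow"])
        totals[key][0] += r["value_in_euros"]
        totals[key][1] += r["quantity_in_kg"]
    return {key: tuple(measures) for key, measures in totals.items()}


def row(declarant, partner, value, kg):
    """Build one made-up fact row for a single period, product and flow."""
    return {
        "period": "202501", "declarant": declarant, "partner": partner,
        "product_nc": "29051100", "flow": 2,
        "value_in_euros": value, "quantity_in_kg": kg,
    }


sample = [
    row("DE", "FR", 100, 10),
    row("FR", "IT", 50, 5),
    row("DE", "US", 30, 3),
    row("GB", "US", 999, 99),  # non-EU27 declarant, dropped
]
result = flatten(sample)
```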

etl_log timeline

Recent imports

Last successful monthly imports recorded in etl_log. Includes both first loads and revision reloads (reloads update the existing row in place), so the Finished at stamp reflects the most recent run.
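
The "update the existing row in place" behaviour can be sketched as an upsert. The example below uses Python's built-in sqlite3 module so that it runs anywhere. PostgreSQL accepts the same `INSERT ... ON CONFLICT ... DO UPDATE` form, but the table layout, the `record_import` helper, and the file name are assumptions made for illustration, not the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout mirroring the columns listed in the table on this page.
conn.execute(
    """
    CREATE TABLE etl_log (
        period      TEXT PRIMARY KEY,
        filename    TEXT NOT NULL,
        rows_loaded INTEGER NOT NULL,
        finished_at TEXT NOT NULL,
        status      TEXT NOT NULL
    )
    """
)

UPSERT = """
INSERT INTO etl_log (period, filename, rows_loaded, finished_at, status)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (period) DO UPDATE SET
    filename    = excluded.filename,
    rows_loaded = excluded.rows_loaded,
    finished_at = excluded.finished_at,
    status      = excluded.status
"""


def record_import(conn, period, filename, rows_loaded, finished_at, status="success"):
    """Insert a first load, or overwrite the row when the period is reloaded."""
    with conn:  # commits on success, rolls back on error
        conn.execute(UPSERT, (period, filename, rows_loaded, finished_at, status))


# First load, then a revision reload of the same period (made-up values).
record_import(conn, "202501", "comext_202501.dat", 1200, "2025-03-03T06:14:00")
record_import(conn, "202501", "comext_202501.dat", 1250, "2025-04-07T06:12:00")
rows = conn.execute(
    "SELECT period, rows_loaded, finished_at FROM etl_log"
).fetchall()
```

After the second call the table still holds a single row for the period, carrying the later row count and timestamp.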

Period | Filename | Rows loaded | Finished at | Status

Throughput

Loaded rows over time

Total rows inserted into PostgreSQL per monthly file, from the oldest period to the most recent.

First period
Last period
Average / month
Largest month
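
The four figures above could be derived from per-period row counts roughly as follows. This is a sketch under assumptions: `throughput_summary` is a hypothetical helper, periods are taken to be 'YYYYMM' strings (so sorting them as text is chronological), and the counts are made up.

```python
def throughput_summary(rows_by_period):
    """Summarise loaded rows per monthly file.

    Expects a non-empty mapping of 'YYYYMM' period -> rows inserted.
    """
    periods = sorted(rows_by_period)
    total = sum(rows_by_period.values())
    largest = max(periods, key=lambda p: rows_by_period[p])
    return {
        "first_period": periods[0],
        "last_period": periods[-1],
        "average_per_month": total / len(periods),
        "largest_month": (largest, rows_by_period[largest]),
    }


summary = throughput_summary({"202411": 900, "202412": 1500, "202501": 1200})
```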

Coverage

Coverage matrix

Year × month grid. Filled cells = period present in etl_log (darker = more rows). Dashed cells = missing.
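
One way to build such a grid from etl_log periods is sketched below. The function names and the 'YYYYMM' period format are assumptions. The text rendering stands in for the shaded and dashed cells of the real page.

```python
def coverage_matrix(rows_by_period):
    """Map each year to 12 cells: a row count, or None for a missing month.

    Expects a non-empty mapping of 'YYYYMM' period -> rows loaded.
    """
    years = sorted({int(p[:4]) for p in rows_by_period})
    grid = {}
    for year in range(years[0], years[-1] + 1):
        grid[year] = [
            rows_by_period.get(f"{year}{month:02d}") for month in range(1, 13)
        ]
    return grid


def render(grid):
    """Text rendering: '#' for a loaded month, '-' for a missing one."""
    return [
        f"{year} " + "".join("-" if cell is None else "#" for cell in cells)
        for year, cells in grid.items()
    ]


grid = coverage_matrix({"202411": 900, "202412": 1500, "202502": 1100})
lines = render(grid)
```

Note that this sketch marks every month without a log entry as missing, including months of the current year that have not been published yet.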

Fewer rows
More rows
Dashed = missing month

Logs

Latest log

Tail of the most recent cron log: INFO lines in the default color, WARNING in orange, ERROR in red.
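
A rough sketch of how log lines might be tailed and mapped to those colors. It assumes each line carries its level as a standalone word (as Python's logging module commonly writes it) and that the first level word on the line is the real one. `line_color`, `tail`, and the sample lines are hypothetical.

```python
import re
from collections import deque

LEVEL_RE = re.compile(r"\b(INFO|WARNING|ERROR)\b")
COLORS = {"WARNING": "orange", "ERROR": "red"}


def line_color(line):
    """Return the display color for one log line; INFO and unknown lines use the default."""
    match = LEVEL_RE.search(line)
    level = match.group(1) if match else "INFO"
    return COLORS.get(level, "default")


def tail(lines, n=200):
    """Keep only the last n lines of an iterable, such as an open file."""
    return list(deque(lines, maxlen=n))


sample = [
    "2025-01-06 06:00:01 INFO starting weekly import",
    "2025-01-06 06:03:12 WARNING period 202412 already loaded, reloading",
    "2025-01-06 06:05:40 ERROR download failed",
    "2025-01-06 06:05:41 INFO retrying after ERROR 503",
]
colored = [(line_color(line), line) for line in tail(sample, n=3)]
```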

Log file