Downloads
Pulls DBC files straight from the DATASUS FTP. Selects by subsystem, date range, and state — nothing more, nothing less.
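To make the selection concrete, here is a minimal sketch of how a subsystem, a set of states, and a date range could expand into a list of monthly DBC file names. It assumes the `<prefix><UF><YY><MM>.dbc` naming pattern commonly seen for monthly DATASUS files (for example `RD` for hospital admissions). The helper names `month_range` and `candidate_files` are hypothetical and are not part of the datasus-etl API.

```python
from datetime import date


def month_range(start: date, end: date):
    """Yield (year, month) pairs from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def candidate_files(prefix: str, states: list[str], start: date, end: date) -> list[str]:
    """Build names as <prefix><UF><YY><MM>.dbc (an assumed naming pattern)."""
    return [
        f"{prefix}{uf}{year % 100:02d}{month:02d}.dbc"
        for uf in states
        for year, month in month_range(start, end)
    ]


# Two states over December 2023 and January 2024 give four candidate files.
print(candidate_files("RD", ["SP", "RJ"], date(2023, 12, 1), date(2024, 1, 31)))
```

A real downloader would still need to check which of these names exist on the server, since not every subsystem publishes every month for every state.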
An open-source ETL for DATASUS. It downloads from the FTP, converts DBC to DBF, loads the records into DuckDB, enriches them with IBGE and CID-10 references, and writes partitioned parquet, all in one command. Built for researchers who need accurate data with minimal effort.
pip install datasus-etl
DBC → DBF → DuckDB → Parquet, all in-process with no CSV intermediates. Streaming inserts keep memory predictable on multi-GB reads.
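The idea behind streaming inserts can be sketched as follows: rows are read lazily and written in fixed-size batches, so peak memory depends on the batch size rather than on the file size. This is only an illustration under stated assumptions. It uses the standard-library `sqlite3` module as a stand-in for DuckDB so that it runs without extra dependencies, and the table layout and the helper names `batched` and `stream_insert` are hypothetical.

```python
import sqlite3
from itertools import islice


def batched(rows, size):
    """Yield lists of at most `size` rows without materializing the whole input."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def stream_insert(conn, rows, batch_size=1000):
    """Insert rows batch by batch and return the number of rows written."""
    total = 0
    for chunk in batched(rows, batch_size):
        conn.executemany("INSERT INTO records VALUES (?, ?)", chunk)
        total += len(chunk)
    conn.commit()
    return total


# sqlite3 stands in for DuckDB here; the table schema is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, uf TEXT)")

# A generator stands in for a record-by-record DBF reader.
rows = ((i, "SP") for i in range(2500))
written = stream_insert(conn, rows, batch_size=1000)
print(written)
```

Only one batch is held in memory at a time, which is what keeps usage flat as the input grows.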
Joins IBGE municipal codes (5,571 municipalities), CID-10 validation, and categorical mappings automatically. Output ships with clean schemas.
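As a rough illustration of the municipality join, the sketch below matches a record's municipality code against a tiny reference table. It assumes that records may carry the 6-digit form of the code (the 7-digit IBGE code without its final check digit), so the lookup is keyed on the first six digits. The field name `munic_res`, the output columns, and the `enrich` helper are hypothetical, and the two-row table is only a sample, not the project's reference data.

```python
# Tiny illustrative reference table keyed by the 7-digit IBGE code.
IBGE = {
    "3550308": {"name": "São Paulo", "uf": "SP"},
    "3304557": {"name": "Rio de Janeiro", "uf": "RJ"},
}

# Index by the first six digits so 6-digit codes can be matched too.
IBGE_BY_6 = {code[:6]: {"ibge7": code, **info} for code, info in IBGE.items()}


def enrich(record: dict) -> dict:
    """Return a copy of the record with municipality name and UF attached.

    Unmatched codes get None values instead of raising an error.
    """
    ref = IBGE_BY_6.get(str(record["munic_res"])[:6])
    out = dict(record)
    out["munic_name"] = ref["name"] if ref else None
    out["munic_uf"] = ref["uf"] if ref else None
    return out


print(enrich({"munic_res": "355030", "age": 42}))
```

Returning `None` for unmatched codes keeps invalid or outdated codes visible in the output rather than silently dropping rows.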
Browse through the local web UI, query with DuckDB SQL, or read the partitioned parquet from anywhere — polars, pandas, R, Arrow.
Install the app, click the desktop shortcut, pick a folder. Choose a subsystem and a date range. The app downloads and processes everything locally. Query through a dropdown-driven web UI — no SQL needed.
Read the tutorial →
The same installer exposes the full datasus CLI. Scriptable pipelines, a Python API, and DuckDB as the query surface. The output is Hive-partitioned parquet; pipe it into your existing stack.
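For readers unfamiliar with the layout, Hive partitioning encodes column values in directory names as `key=value` segments, which is how downstream tools can skip whole directories when filtering. The sketch below builds and parses such paths with the standard library only. The partition keys `uf` and `year` and the root `out/sihsus` are assumptions chosen for illustration; the actual keys used by the tool may differ.

```python
from pathlib import PurePosixPath


def partition_path(root: str, **keys) -> str:
    """Build a Hive-style directory path: root/key1=value1/key2=value2."""
    parts = [f"{k}={v}" for k, v in keys.items()]
    return str(PurePosixPath(root, *parts))


def parse_partitions(path: str) -> dict:
    """Recover partition keys and values from the key=value path segments."""
    return dict(
        part.split("=", 1) for part in PurePosixPath(path).parts if "=" in part
    )


path = partition_path("out/sihsus", uf="SP", year=2023)
print(path)
print(parse_partitions(path + "/part-0.parquet"))
```

Note that values parsed back from a path are strings; engines that read Hive-partitioned data typically infer or cast the types themselves.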
Developed by Nycholas Maia in technical collaboration with Paulo Alves Maia (FUNDACENTRO) within the CNPq research group "Mudanças Climáticas e Segurança e Saúde no Trabalho" (Climate Change and Occupational Safety and Health).
CNPq research group ↗
The button detects your OS. The full platform table, checksums, and install notes are on the download page. Each release is cut from the VERSION file in the repo; the same number appears in the app footer and in the output of datasus version.