A collection of hands-on examples, helper utilities, Jupyter notebooks, and data workflows showcasing how to work with the OKDP Platform. This repository is meant to help you explore OKDP capabilities around compute, object storage, data catalog, SQL engines, Spark, and analytics.
Jupyter notebooks that query Trino:
- Querying data using Trino (Python/SQLAlchemy)
- Querying data using Trino (SQL engine)
An index.ipynb notebook is also provided as an entry point.
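As a rough illustration of the Python/SQLAlchemy approach, the sketch below builds a Trino connection URL and runs a query. It is not taken from the notebooks themselves. It assumes the `sqlalchemy` and `trino` packages are installed (for example via `pip install "trino[sqlalchemy]"`), and the helper names `build_trino_url` and `run_query`, as well as any host, user, catalog, or schema you pass in, are hypothetical placeholders to adapt to your own deployment.

```python
from urllib.parse import quote


def build_trino_url(host, port, user, catalog, schema=None):
    """Build a SQLAlchemy URL of the form trino://user@host:port/catalog[/schema].

    All argument values are deployment-specific; nothing here is an OKDP default.
    """
    # Percent-encode the user name so characters such as '@' do not break the URL.
    url = f"trino://{quote(user, safe='')}@{host}:{port}/{catalog}"
    if schema:
        url += f"/{schema}"
    return url


def run_query(url, sql):
    """Run one SQL statement against Trino and return the rows as tuples."""
    # Imported lazily so build_trino_url works even without SQLAlchemy installed.
    from sqlalchemy import create_engine, text

    engine = create_engine(url)
    with engine.connect() as conn:
        return [tuple(row) for row in conn.execute(text(sql))]
```

With a reachable Trino endpoint, something like `run_query(build_trino_url("trino.example.local", 8080, "okdp-user", "hive", "default"), "SHOW TABLES")` would list the tables in that schema. The host, user, and schema in this call are made up for the example.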
Use Apache Superset (SQL Lab) to query Trino and build visualizations/dashboards on top of the same datasets.
Using okdp-ui, deploy the following components:
- Storage: MinIO
- Data Catalog: Hive Metastore
- Interactive Query: Trino
- Notebooks: Jupyter
- DataViz: Apache Superset
- Applications: okdp-examples
The Helm chart downloads public datasets at runtime, uploads them to object storage, and creates the corresponding Trino external tables.
ℹ️ NOTE
The datasets are not bundled in this repository or baked into container images.
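To make the "external table" step more concrete, here is a minimal sketch of how a `CREATE TABLE` statement pointing at files in object storage might be rendered for Trino's Hive connector. This is not the Helm chart's actual logic: the function name `render_external_table_ddl`, the table and column names, the bucket path, and the Parquet format are all assumptions chosen for illustration.

```python
def render_external_table_ddl(catalog, schema, table, columns, location,
                              file_format="PARQUET"):
    """Render a Trino (Hive connector) DDL statement for an external table.

    columns  : list of (name, sql_type) pairs, e.g. [("trip_id", "bigint")]
    location : object storage prefix holding the data files (hypothetical here)

    Identifiers are inserted as-is, so only pass trusted values.
    """
    cols = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {catalog}.{schema}.{table} (\n"
        f"{cols}\n"
        f") WITH (\n"
        f"    external_location = '{location}',\n"
        f"    format = '{file_format}'\n"
        f")"
    )
```

For example, `render_external_table_ddl("hive", "demo", "trips", [("trip_id", "bigint"), ("fare", "double")], "s3a://demo-bucket/trips/")` produces a statement that could be submitted through any Trino client. The table is only a pointer to the files, so dropping it leaves the data in object storage untouched.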