Built on top of the popular open source project, Polars Cloud enables you to write DataFrame code once and run it anywhere. The distributed engine available with Polars Cloud allows you to scale your Polars queries beyond a single machine.
- Unified DataFrame Experience: Run a Polars query seamlessly on your local machine or at scale with our new distributed engine, all from the same API (see the sketch after this list).
- Serverless Compute: Effortlessly start compute resources without managing infrastructure, with options to run queries on both CPU and GPU.
- Any Environment: Start a remote query from a notebook on your machine, Airflow DAG, AWS Lambda, or any server. Get the flexibility to embed Polars Cloud in any environment.
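To illustrate the unified API, here is a minimal sketch of running the same lazy query locally and remotely. The `ComputeContext` arguments mirror the example further down; the `.collect()` call on the remote query is an assumption, so check the Polars Cloud docs for the exact method:

```python
import polars as pl
import polars_cloud as pc

# A regular lazy Polars query; nothing cloud-specific yet
query = pl.scan_parquet("s3://my-dataset/").select(pl.mean("price"))

# Run locally on your own machine
local_result = query.collect()

# Run the very same query remotely on Polars Cloud
# (assumes the remote query exposes a .collect() as well)
ctx = pc.ComputeContext(cpus=16, memory=64)
remote_result = query.remote(ctx).collect()
```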
To use Polars Cloud, simply add it to your existing project:

```shell
pip install polars_cloud
```

Then call `.remote()` on your DataFrame and provide a compute context.
```python
import polars as pl
import polars_cloud as pc

# Define the compute resources to run the query on
ctx = pc.ComputeContext(cpus=16, memory=64)

# Write a regular lazy Polars query
query = (
    pl.scan_parquet("s3://my-dataset/")
    .group_by("returned", "status")
    .agg(
        avg_price=pl.mean("price"),
        avg_disc=pl.mean("discount"),
        count_order=pl.len(),
    )
)

# Execute the query remotely on the distributed engine and
# stream the result to the destination bucket
(
    query.remote(ctx)
    .distributed()
    .sink_parquet("s3://my-destination/")
)
```

Hit run and your query will be executed in the cloud. You can follow its progress on the dashboard. Once your first query is done, it's time to increase your dataset size and up the core count.
Sign up here to run Polars Cloud.