Use DuckDB for querying and analyzing Parquet forecast data (suggestion) #323

@yuvraajnarula

Detailed Description

Currently, Quartz Solar Forecast stores forecast outputs and historical weather data in Parquet and CSV files. While Parquet is efficient for storage, users still need to load entire files into memory or write custom aggregation scripts to analyze the data.

Adding DuckDB as a query layer would allow:

  • Direct SQL queries on Parquet files without loading all data into memory.
  • Fast aggregation, filtering, and joining across multiple Parquet files (e.g., multi-site forecasts or weather data).
  • Easier ad hoc analysis and evaluation of model performance.
  • Integration with Python via Pandas DataFrames while still leveraging SQL efficiency.

Example use cases:

  • Compute hourly average predicted power across all sites.
  • Filter forecasts for specific time ranges or locations.
  • Join forecast data with historical weather datasets for feature engineering.

Context

This change would benefit:

  • Developers and analysts working with large forecast datasets.
  • Users who want to perform quick queries without writing custom scripts for every analysis.
  • Situations where multiple sites or long time periods generate large Parquet datasets.

DuckDB also brings:

  • Fast querying and analytics on local or in-memory datasets (especially PV site data, weather predictions, or time-series CSVs).
  • Effortless integration with Pandas and Polars DataFrames.
  • SQL queries inside Python/Jupyter notebooks, improving ETL and aggregation operations compared to pure Pandas on larger datasets.
  • Easy CSV/Parquet/Arrow file ingestion without complex setup.
  • No server or infrastructure overhead, making setup simple.

Possible Implementation

import duckdb
import pandas as pd

# Load a CSV into a DataFrame (file and column names are illustrative)
df = pd.read_csv('your_data.csv')
con = duckdb.connect()

# Register the DataFrame as a DuckDB view and query it with SQL,
# using a parameterized query for the timestamp filter
con.register('my_df', df)
result = con.execute(
    'SELECT avg(power_output) FROM my_df WHERE timestamp > ?',
    ['2024-01-01'],
).fetchdf()
