@jihoonson jihoonson commented Mar 10, 2025

This PR proposes to add support for two new optional arguments, warmup_iterations and iterations.

  • warmup_iterations specifies the number of warmup iterations for each query to run before its performance is measured. The warmup timings are not reported. This is set to 0 by default.
  • iterations specifies the number of iterations for each query to run and measure its performance. This is set to 1 by default.

An example snippet of the report with warmup_iterations=3 and iterations=5:

...
local-1741649529813,CreateTempView catalog_sales,2245
local-1741649529813,CreateTempView store_sales,1697
local-1741649529813,query1,874
local-1741649529813,query1,952
local-1741649529813,query1,763
local-1741649529813,query1,860
local-1741649529813,query1,752
local-1741649529813,Power Start Time,1741649555
local-1741649529813,Power End Time,1741649565
local-1741649529813,Power Test Time,10000
local-1741649529813,Total Time,38532
...

The report lists only the 5 measured timings for query1, but query1 actually ran 8 times as intended (3 warmups, 5 measured runs).
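With several timings per query in the report, it can help to aggregate them. The following is a minimal sketch, not part of this PR: it assumes each report line has the form `app_id,name,value`, that query rows are the ones whose name starts with `query`, and that the value is an elapsed time in milliseconds. The `summarize` helper is a hypothetical name used for illustration only.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Rows copied from the example report above.
REPORT = """\
local-1741649529813,CreateTempView store_sales,1697
local-1741649529813,query1,874
local-1741649529813,query1,952
local-1741649529813,query1,763
local-1741649529813,query1,860
local-1741649529813,query1,752
local-1741649529813,Total Time,38532
"""


def summarize(report_text):
    """Group measured timings by query name and compute simple statistics."""
    timings = defaultdict(list)
    for row in csv.reader(io.StringIO(report_text)):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        _app_id, name, value = row
        # Assumption: only rows named "query..." are per-query timings.
        if name.startswith("query"):
            timings[name].append(int(value))
    return {
        name: {
            "runs": len(values),
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
        }
        for name, values in timings.items()
    }


summary = summarize(REPORT)
print(summary["query1"])
```

For the five query1 rows above this reports 5 runs, with a minimum of 752 and a maximum of 952.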


These new arguments are optional, and thus they do not change existing behaviors when they are not set. When they are set, the power run script runs each query warmup_iterations times first, and then iterations times to measure timings. Only the timings after the warmup are reported.
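The behavior described above can be sketched roughly as follows. This is an illustrative outline rather than the actual code in the power run script: `time_query` and `run_once` are hypothetical names, and only the argument names `warmup_iterations` and `iterations` and their defaults come from this PR.

```python
import time
from typing import Callable, List


def time_query(run_once: Callable[[], None],
               warmup_iterations: int = 0,
               iterations: int = 1) -> List[int]:
    """Run a query with optional warmups; return measured timings in ms."""
    # Warmup runs execute the query but their timings are discarded.
    for _ in range(warmup_iterations):
        run_once()

    # Measured runs: one timing per iteration is kept for the report.
    timings_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        timings_ms.append(elapsed_ms)
    return timings_ms


# Stand-in for a real query: count how many times it was executed.
calls = []
timings = time_query(lambda: calls.append(1),
                     warmup_iterations=3, iterations=5)
print(len(calls), len(timings))
```

With `warmup_iterations=3` and `iterations=5` the stand-in query runs 8 times while only 5 timings are returned, which matches the example report. With the defaults (0 and 1) the query runs once and one timing is returned, so existing behavior is unchanged.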

Unlike the existing runs argument, which specifies the number of power runs, these new arguments specify per-query iterations. In Spark terms, the runs argument can submit multiple Spark applications, each performing one power run, while the new arguments specify how many times each query runs within the same Spark application. This is especially useful when you want to keep some state, such as the file cache, between multiple runs of a query in Spark to get more consistent results.

@abellina abellina left a comment

LGTM

@jihoonson jihoonson merged commit 0db754f into NVIDIA:dev Mar 12, 2025
2 checks passed
jihoonson added a commit to jihoonson/spark-rapids-benchmarks that referenced this pull request Mar 14, 2025
pxLi pushed a commit that referenced this pull request Mar 14, 2025