Add support for warmups and iterations for each query #202
This PR proposes to add support for two new optional arguments, `warmup_iterations` and `iterations`.

- `warmup_iterations` specifies the number of warmup iterations to run for each query before its performance is measured. The warmup timings are not reported. This is set to 0 by default.
- `iterations` specifies the number of iterations to run and measure for each query. This is set to 1 by default.

An example snippet of the report with `warmup_iterations=3` and `iterations=5`: But `query1` ran 8 times as intended (3 warmups, 5 actual runs).

These new arguments are optional, so they do not change existing behavior when they are not set. When they are set, the power run script first runs each query `warmup_iterations` times, then `iterations` more times to measure timings. Only the timings after the warmup are reported.

Unlike the existing `runs` argument, which specifies the number of power runs, these new arguments specify per-query iterations. In Spark terms, the `runs` argument can submit multiple Spark applications, each performing one power run, while the new arguments specify the number of times each query runs within the same Spark application. This can be especially useful when you want to keep state between query runs in Spark, such as the file cache, to get more consistent results.
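The per-query loop described above can be sketched as follows. This is a minimal illustration, not the actual implementation: `run_query` is a hypothetical callable standing in for the script's real query-execution step, which runs Spark SQL inside the application.

```python
import time

def run_query_with_iterations(run_query, query_name,
                              warmup_iterations=0, iterations=1):
    """Run one query with optional warmups; return only measured timings."""
    # Warmup runs: executed, but their timings are discarded (not reported).
    for _ in range(warmup_iterations):
        run_query(query_name)

    # Measured runs: only these timings appear in the report.
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query(query_name)
        timings.append(time.perf_counter() - start)
    return timings

# With warmup_iterations=3 and iterations=5, the query executes
# 8 times in total, but only 5 timings are reported.
timings = run_query_with_iterations(lambda q: None, "query1",
                                    warmup_iterations=3, iterations=5)
```

Because every iteration runs inside the same Spark application, warm state such as the file cache carries over from the warmup runs into the measured runs, which is what makes the reported timings more consistent.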