Commit 6d42108

Basic implementation of excluding outliers
Seems to all work fine. But do add a test - although I figure that one will be hard to do "properly".
1 parent 2cf5dc5 commit 6d42108

4 files changed, +67 -13 lines changed

README.md

Lines changed: 18 additions & 0 deletions
@@ -136,6 +136,8 @@ In addition, you can optionally output an extended set of statistics:
 * **sample size** - the number of measurements taken
 * **mode** - the measured values that occur the most. Often one value, but can be multiple values if they occur exactly as often. If no value occurs at least twice, this value will be `nil`.
 
+Benchee can also [remove outliers](#remove-outliers).
+
 ## Installation
 
 Add `:benchee` to your list of dependencies in `mix.exs`:
@@ -303,6 +305,7 @@ So, what happens if a function executes too fast for Benchee to measure? If Benc
 * essentially every single measurement is now an average across 10 runs making lots of statistics less meaningful
 
 Benchee will print a big warning when this happens.
+
 #### Measuring Memory Consumption
 
 Starting with version 0.13, users can now get measurements of how much memory their benchmarked scenarios use. The measurement is **limited to the process that Benchee executes your provided code in** - i.e. other processes (like worker pools)/the whole BEAM isn't taken into account.
@@ -542,6 +545,21 @@ Enum."-map/2-lists^map/1-0-"/2 10001 26.38 2282 0.23
 
 **Note about after_each hooks:** `after_each` hooks currently don't work when profiling a function, as they are not passed the return value of the function after the profiling run. It's already fixed on the elixir side and is waiting for release, likely in 1.14. It should then just work.
 
+### Remove Outliers
+
+Benchee can remove outliers from the gathered samples.
+That is, as determined by percentiles/quantiles (we follow [this approach](https://en.wikipedia.org/wiki/Interquartile_range#Outliers)).
+
+You can simply pass `exclude_outliers: true` to Benchee to trigger the removal of outliers.
+
+```elixir
+Benchee.run(jobs, exclude_outliers: true)
+```
+
+The outliers themselves (aka the samples that have been determined to be outliers)
+as well as the lower/upper bound after which samples are considered outliers are accessible
+in the `Benchee.Statistics` struct.
+
 ### Saving, loading and comparing previous runs
 
 Benchee can store the results of previous runs in a file and then load them again to compare them. For example this is useful to compare what was recorded on the main branch against a branch with performance improvements. You may also use this to benchmark across different exlixir/erlang versions.
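
Note on the README's "Remove Outliers" section: the linked approach marks a sample as an outlier when it lies below Q1 - 1.5 * IQR or above Q3 + 1.5 * IQR, where IQR = Q3 - Q1. The following is a minimal sketch of that rule in plain Elixir. The module `OutlierSketch` and its functions are hypothetical and not part of Benchee or Statistex, and the percentile interpolation used here is an assumption, so the bounds a real run reports may differ slightly.

```elixir
defmodule OutlierSketch do
  # Hypothetical helper module for illustration only; not Benchee/Statistex code.

  # Percentile of an already sorted, non-empty list using linear interpolation
  # between the two nearest ranks. Assumption: this is just one common
  # convention; the library may compute quartiles differently.
  def percentile(sorted, p) when p >= 0 and p <= 100 do
    last_index = length(sorted) - 1
    rank = p / 100 * last_index
    lower_index = trunc(rank)
    upper_index = min(lower_index + 1, last_index)
    fraction = rank - lower_index
    lower = Enum.at(sorted, lower_index)
    upper = Enum.at(sorted, upper_index)
    lower + fraction * (upper - lower)
  end

  # Lower and upper bounds: Q1 - 1.5 * IQR and Q3 + 1.5 * IQR.
  def bounds(samples) do
    sorted = Enum.sort(samples)
    q1 = percentile(sorted, 25)
    q3 = percentile(sorted, 75)
    iqr = q3 - q1
    {q1 - 1.5 * iqr, q3 + 1.5 * iqr}
  end

  # Splits samples into {kept, outliers}; values on a bound are kept.
  def split(samples) do
    {lower, upper} = bounds(samples)
    Enum.split_with(samples, fn sample -> sample >= lower and sample <= upper end)
  end
end

samples = [100, 102, 101, 99, 103, 100, 98, 500]
{kept, outliers} = OutlierSketch.split(samples)

IO.inspect(OutlierSketch.bounds(samples), label: "bounds")
IO.inspect(kept, label: "kept")
IO.inspect(outliers, label: "outliers")
```

With these made-up samples only the single large value falls outside the bounds, which is the kind of sample `exclude_outliers: true` is meant to drop.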

lib/benchee/configuration.ex

Lines changed: 9 additions & 2 deletions
@@ -48,7 +48,8 @@ defmodule Benchee.Configuration do
             # It also generates less than 1GB in data (some of which is garbage collected/
             # not necessarily all in RAM at the same time) - which seems reasonable enough.
             # see `samples/statistics_performance.exs` and also maybe run it yourself.
-            max_sample_size: 1_000_000
+            max_sample_size: 1_000_000,
+            exclude_outliers: false
 
   @typedoc """
   The configuration supplied by the user as either a map or a keyword list
@@ -152,6 +153,11 @@ defmodule Benchee.Configuration do
       This is used to limit memory consumption and unnecessary processing - 1 Million samples is plenty.
       This limit also applies to number of iterations done during warmup.
       You can set your own number or set it to `nil` if you don't want any limit.
+    * `exclude_outliers` - whether or not statistical outliers should be removed for the calculated statistics.
+      Defaults to `false`.
+      This means that values that are far outside the usual range (as determined by the percentiles/quantiles) will
+      be removed from the gathered samples and the calculated statistics. You might want to enable this if you
+      don't want things like the garbage collection triggering to influence your results as much.
   """
   @type user_configuration :: map | keyword
 
@@ -183,7 +189,8 @@ defmodule Benchee.Configuration do
           measure_function_call_overhead: boolean,
           title: String.t() | nil,
           profile_after: boolean | atom | {atom, keyword},
-          max_sample_size: pos_integer()
+          max_sample_size: pos_integer(),
+          exclude_outliers: boolean()
         }
 
   @time_keys [:time, :warmup, :memory_time, :reduction_time]
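
Note on the `exclude_outliers` documentation above: it names garbage collection as a reason to enable the option. The sketch below uses invented numbers and a hypothetical `SpikeSketch` module (neither comes from Benchee) to show how a single slow sample can pull the average away from the typical run time.

```elixir
defmodule SpikeSketch do
  # Hypothetical helper for illustration only; not part of Benchee.
  def mean(samples), do: Enum.sum(samples) / length(samples)
end

# Invented run times: nine steady measurements ...
steady = [100, 101, 99, 100, 102, 98, 100, 100, 101]
# ... plus one slow run, e.g. a measurement that coincided with a GC pause.
with_spike = steady ++ [2_000]

IO.inspect(SpikeSketch.mean(steady), label: "mean without spike")
IO.inspect(SpikeSketch.mean(with_spike), label: "mean with spike")
```

One slow sample out of ten nearly triples the average here, which is the kind of distortion the option is intended to reduce.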

lib/benchee/statistics.ex

Lines changed: 19 additions & 11 deletions
@@ -121,7 +121,7 @@ defmodule Benchee.Statistics do
       ...>     input: "Input"
       ...>   }
       ...> ]
-      ...>
+      ...>
       ...> suite = %Benchee.Suite{scenarios: scenarios}
       ...> statistics(suite, Benchee.Test.FakeProgressPrinter)
       %Benchee.Suite{
@@ -179,15 +179,17 @@ defmodule Benchee.Statistics do
     printer.calculating_statistics(suite.configuration)
 
     percentiles = suite.configuration.percentiles
+    exclude_outliers? = suite.configuration.exclude_outliers
 
     update_in(suite.scenarios, fn scenarios ->
-      scenario_statistics = compute_statistics_in_parallel(scenarios, percentiles)
+      scenario_statistics =
+        compute_statistics_in_parallel(scenarios, percentiles, exclude_outliers?)
 
       update_scenarios_with_statistics(scenarios, scenario_statistics)
     end)
   end
 
-  defp compute_statistics_in_parallel(scenarios, percentiles) do
+  defp compute_statistics_in_parallel(scenarios, percentiles, exclude_outliers?) do
     scenarios
     |> Enum.map(fn scenario ->
       # we filter down the data here to avoid sending the input and benchmarking function to
@@ -200,7 +202,7 @@ defmodule Benchee.Statistics do
     # async_stream as we might run a ton of scenarios depending on the benchmark
     |> Task.async_stream(
       fn scenario_collection_data ->
-        calculate_scenario_statistics(scenario_collection_data, percentiles)
+        calculate_scenario_statistics(scenario_collection_data, percentiles, exclude_outliers?)
       end,
       timeout: :infinity,
       ordered: true
@@ -235,27 +237,33 @@ defmodule Benchee.Statistics do
     end)
   end
 
-  defp calculate_scenario_statistics({run_time_data, memory_data, reductions_data}, percentiles) do
+  defp calculate_scenario_statistics(
+         {run_time_data, memory_data, reductions_data},
+         percentiles,
+         exclude_outliers?
+       ) do
     run_time_stats =
       run_time_data.samples
-      |> calculate_statistics(percentiles)
+      |> calculate_statistics(percentiles, exclude_outliers?)
       |> add_ips
 
-    memory_stats = calculate_statistics(memory_data.samples, percentiles)
-    reductions_stats = calculate_statistics(reductions_data.samples, percentiles)
+    memory_stats = calculate_statistics(memory_data.samples, percentiles, exclude_outliers?)
+
+    reductions_stats =
+      calculate_statistics(reductions_data.samples, percentiles, exclude_outliers?)
 
     {run_time_stats, memory_stats, reductions_stats}
   end
 
-  defp calculate_statistics([], _) do
+  defp calculate_statistics([], _, _) do
     %__MODULE__{
       sample_size: 0
     }
   end
 
-  defp calculate_statistics(samples, percentiles) do
+  defp calculate_statistics(samples, percentiles, exclude_outliers?) do
     samples
-    |> Statistex.statistics(percentiles: percentiles)
+    |> Statistex.statistics(percentiles: percentiles, exclude_outliers: exclude_outliers?)
     |> convert_from_statistex
   end

samples/outlier_removal.exs

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+list = Enum.to_list(1..10_000)
+map_fun = fn i -> [i, i * i] end
+
+suite =
+  Benchee.run(
+    %{
+      "flat_map" => fn -> Enum.flat_map(list, map_fun) end,
+      "map.flatten" => fn -> list |> Enum.map(map_fun) |> List.flatten() end
+    },
+    formatters: [{Benchee.Formatters.Console, extended_statistics: true}],
+    exclude_outliers: true
+  )
+
+suite.scenarios
+|> Enum.map(fn scenario ->
+  statistics = scenario.run_time_data.statistics
+
+  {scenario.name, length(statistics.outliers), statistics.outliers,
+   statistics.lower_outlier_bound, statistics.upper_outlier_bound}
+end)
+|> IO.inspect()
