I played with this library this afternoon and noticed that there's a bias towards edge values like 0, `?::MAX`, and `?::MIN`. I get that they work wonders in fuzzing, but I ran into the problem that the bias ended up producing overly simple test cases.
I was generating a sequence of `push(value)`/`pop` operations on a heap. My approach can be simplified to:
```rust
#[derive(Arbitrary)]
struct OperationBatch {
    seed: u64, // Fixes the push/pop sequence. (I biased towards pushing small batches.)
    numbers_to_push: Vec<u16>,
}
```

The bias resulted in my operation sequences using mostly the same numbers on the heap, which doesn't stress the heap too much.
I ended up implementing `OperationBatch::from_seed(u64)` and customising the number distribution myself, but I would appreciate documentation on the default distributions, plus a mention of the helpers for tailoring the distribution of values when the defaults are a bad fit.
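For the record, here's a sketch of the kind of workaround I mean. This is my own code, not the library's API: `from_seed` and all the constants are my choices, and I use SplitMix64 (plain std, no extra crates) to expand the single `u64` seed into a uniform-ish stream of `u16`s so the pushed values aren't dominated by edge cases.

```rust
// Hypothetical stand-in for the derived struct above.
struct OperationBatch {
    seed: u64,
    numbers_to_push: Vec<u16>,
}

// One SplitMix64 step: a small, well-known mixing function that turns
// a counter-like state into uniformly distributed 64-bit outputs.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

impl OperationBatch {
    // Expand one u64 of entropy into a batch with a distribution I
    // control, instead of relying on the derived impl's defaults.
    fn from_seed(seed: u64) -> Self {
        let mut state = seed;
        // Small batches on purpose: 16..64 pushes per batch.
        let len = 16 + (splitmix64(&mut state) % 48) as usize;
        let numbers_to_push = (0..len)
            .map(|_| splitmix64(&mut state) as u16)
            .collect();
        OperationBatch { seed, numbers_to_push }
    }
}
```

The point is that a deterministic expansion keeps the fuzzer's input small (one `u64`) while the values on the heap stay varied.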
At least from the docs around output distributions, it wasn't clear to me that this bias exists, nor how sharp it is. I feel I'd have had an easier time if I had run into the `arbitrary_len` docs, but from the README I initially thought that sprinkling a few attributes would be all I needed.
Maybe I just ran out of entropy because of a poor `size_hint`, but I'd expect errors instead of silently generating bad samples (looking around, this might be related to #219 (comment)). Also, it seems there's a missing set of attributes for specifying collection sizes that uses `arbitrary_len` underneath.
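To make the entropy concern concrete, here's the back-of-envelope budget I had in mind. The per-field byte counts are my assumption about roughly what the derived impl consumes (~8 bytes for the `u64` seed, ~2 bytes per `u16` element), not numbers from the library:

```rust
// Rough entropy (input bytes) needed for a batch of `pushes` elements,
// under my assumed costs: 8 bytes for the seed + 2 bytes per u16.
fn rough_entropy_needed(pushes: usize) -> usize {
    8 + 2 * pushes
}

// Conversely, how many pushes a fuzz input of `input_len` bytes can
// cover before the data runs dry.
fn pushes_covered(input_len: usize) -> usize {
    input_len.saturating_sub(8) / 2
}
```

Under those assumptions a 64-byte fuzz input only covers about 28 pushes, and past that point the remaining values would degrade silently rather than error, which matches what I was seeing.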