Commit 9185bc9

Merge pull request #47 from Evian-Zhang/docs
Add docs
2 parents d5ff693 + 173f085 commit 9185bc9

5 files changed

+522
-3
lines changed

Readme.md

Lines changed: 15 additions & 3 deletions
@@ -11,15 +11,25 @@ Starting from v3.0.0, unicornafl is fully rewritten with `libafl_targets` in Rus
1111
To use `unicornafl` as a library, just add this to your `Cargo.toml`
1212

1313
```toml
14-
unicornafl = {git = "https://github.com/AFLplusplus/unicornafl", branch = "main"}
14+
unicornafl = { git = "https://github.com/AFLplusplus/unicornafl", branch = "main" }
1515
```
1616

1717
`main` is used here because `unicorn` is not released yet. We will make it ready shortly.
1818

19+
For more details, please refer to [Rust usage](./docs/rust-usage.md).
20+
1921
### Python
2022

2123
At this moment, manual building is required (see below) but we will soon release wheels.
2224

25+
For more details, please refer to [Python usage](./docs/python-usage.md).
26+
27+
### C/C++
28+
29+
After building this repo, you can link against the generated static archive or shared library and include the C/C++ header file [include/unicornafl.h](./include/unicornafl.h).
30+
31+
For more details, please refer to [C/C++ usage](./docs/c-usage.md).
32+
2333
## Build
2434

2535
Simply do:
@@ -33,7 +43,7 @@ cargo build --release
3343
For python bindings, we have:
3444

3545
```bash
36-
maturin build
46+
maturin build --release
3747
```
3848

3949
## Example && Minimal Tutorial
@@ -62,6 +72,8 @@ afl-fuzz -i ./input -o ./output-8 -b 1 -g 8 -G 8 -V 60 -c 0 -- ./target/release/
6272

6373
This shall find the crash instantly, thanks to the `cmplog` integration.
6474

75+
For more details, please refer to [Fuzzing using UnicornAFL](./docs/fuzzing.md).
76+
6577
## Migration
6678

67-
There should be nothing special migrating from unicornafl v2.x to unicornafl v3.x, execpt the way integrating with `AFL++`. If your harness builds and statically links against unicornafl directly, there is no longer needed for the unicorn mode with `AFL++`. However, for Python users with `libunicornafl.so` dynamically linked, unicorn mode is still needed for `AFL++` command line.
79+
Migrating from unicornafl v2.x to unicornafl v3.x should require nothing special, except for the way it integrates with AFL++. If your harness builds and statically links against unicornafl directly, unicorn mode is no longer needed with AFL++. However, if you are using Python, or C/C++ with `libunicornafl.so` dynamically linked, unicorn mode (the `-U` option) is still needed on the `afl-fuzz` command line.

docs/c-usage.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
1+
# C/C++ Usage for UnicornAFL
2+
3+
To use UnicornAFL with C/C++, you should clone this repository and build it yourself:
4+
5+
```shell
6+
git clone --depth 1 https://github.com/AFLplusplus/unicornafl && cd unicornafl
7+
cargo build --release
8+
```
9+
10+
Before building this repo, make sure that you have installed the dependencies needed to build [Unicorn](https://github.com/unicorn-engine/unicorn) and a stable Rust compiler, version 1.87.0 or later.
11+
12+
After the build, `./target/release/` contains `libunicornafl.a` and `libunicornafl.so`. To use UnicornAFL, link against either one and include the header file at `./include/unicornafl.h`.
13+
14+
## API usage
15+
16+
The UnicornAFL API is small but powerful. It consists of two functions: `uc_afl_fuzz` and `uc_afl_fuzz_custom`.
17+
18+
### Simplified API
19+
20+
`uc_afl_fuzz`
21+
22+
```c
23+
uc_afl_ret uc_afl_fuzz(uc_engine* uc, char* input_file,
24+
uc_afl_cb_place_input_t place_input_callback,
25+
uint64_t* exits, size_t exit_count,
26+
uc_afl_cb_validate_crash_t validate_crash_callback,
27+
bool always_validate, uint32_t persistent_iters,
28+
void* data);
29+
```
30+
31+
`uc` is a Unicorn instance created in advance. See [Creating Unicorn Instance](#creating-unicorn-instance) below for more details.
32+
33+
`input_file` is the path to the input file. In fuzzing mode, pass `NULL` for this argument; the input seed directory is passed to `afl-fuzz` instead. In standalone mode, UnicornAFL reads its input from this file.
34+
35+
`place_input_callback` is the callback UnicornAFL uses to place the received input into Unicorn's memory space. It takes five arguments: a pointer to the Unicorn instance (which you can use to read and write the emulated CPU and memory inside the callback), a pointer to the input buffer, the input buffer length, the persistent round (how many times this harness has executed without exiting and forking another child process), and the custom data. The callback returns a bool indicating whether this input is acceptable.
36+
37+
`exits` and `exit_count` specify the exit points for Unicorn. When the Unicorn instance reaches one of the given exit addresses, UnicornAFL moves on to the next round.
38+
39+
`validate_crash_callback` is the callback UnicornAFL invokes when an error is encountered while executing the harness. It takes six arguments: a pointer to the Unicorn instance, a value indicating the Unicorn error raised while executing the harness, a pointer to the input buffer, the input buffer length, the persistent round, and the custom data. The callback returns a bool. If it returns `false`, the AFL++ main executable will not treat this round as a crash. This can be used to eliminate false positives during fuzzing.
40+
41+
`always_validate` controls whether `validate_crash_callback` is invoked even if Unicorn does not encounter an error during execution.
42+
43+
`persistent_iters` specifies how many times this harness is executed persistently before the parent forks another child. For simplicity, you can pass `1` here, which means exiting and forking every time the harness finishes a round. If you want a more efficient harness, consider running persistently. Passing `0` means running persistently without ever exiting or forking, unless the process crashes.
44+
45+
`data` is a pointer to custom data. It is passed as an argument to each callback listed above, so you can use it to maintain shared data across executions.
46+
47+
This function returns a `uc_afl_ret`. A value other than `UC_AFL_RET_OK` means that something unexpected happened during fuzzing and needs your attention.
48+
49+
### Advanced API
50+
51+
`uc_afl_fuzz_custom`
52+
53+
```c
54+
uc_afl_ret uc_afl_fuzz_custom(uc_engine* uc, char* input_file,
55+
uc_afl_cb_place_input_t place_input_callback,
56+
uc_afl_fuzz_cb_t fuzz_callbck,
57+
uc_afl_cb_validate_crash_t validate_crash_callback,
58+
bool always_validate, uint32_t persistent_iters,
59+
void* data);
60+
```
61+
62+
Most of the arguments are the same as in the simplified API. The only difference is the `fuzz_callbck` argument, which takes the place of `exits` and `exit_count`. UnicornAFL calls this function to start one execution round, and when the function returns, UnicornAFL knows that the round has ended. By default, UnicornAFL just uses `uc_emu_start()`.
63+
64+
### Creating Unicorn Instance
65+
66+
Before using the fuzzing APIs, you need to create the Unicorn instance yourself. Note that UnicornAFL does not need to know the actual target being fuzzed. Instead, you set up the target in the Unicorn instance manually (for example, by mapping the code into Unicorn's memory space).
67+
68+
## Tips
69+
70+
### Linking
71+
72+
Note that `libunicornafl.a` and `libunicornafl.so` already bundle Unicorn. As a result, you no longer need to link Unicorn manually.
73+
74+
### Use a different version of Unicorn
75+
76+
Note that the internals of UnicornAFL depend heavily on some of the newest Unicorn APIs. As a result, older versions of Unicorn may not work. If you still want to use your own version of Unicorn, modify the `Cargo.toml` in this repo.
77+
78+
First, find the following line:
79+
80+
```toml
81+
unicorn-engine = { git = "https://github.com/unicorn-engine/unicorn", branch = "dev" }
82+
```
83+
84+
If you want to use a Unicorn checkout on the local filesystem, change this line to
85+
86+
```toml
87+
unicorn-engine = { path = "/path/to/unicorn/bindings/rust" }
88+
```
89+
90+
Note that the `bindings/rust` suffix is necessary.
91+
92+
If you want to use a forked Unicorn or a Unicorn hosted on a remote Git server, change this line to
93+
94+
```toml
95+
unicorn-engine = { git = "http://my/own/unicorn/fork" }
96+
```
97+
98+
### Debugging
99+
100+
UnicornAFL contains many log statements that can be used for debugging. To enable logging, compile this repo using
101+
102+
```shell
103+
cargo build --release --features env_logger
104+
```
105+
106+
Then, when running, pass `RUST_LOG=trace` in the environment. (`AFL_DEBUG=1` is also needed if you are using `afl-fuzz` to run the harness.)

docs/fuzzing.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
# Fuzzing using UnicornAFL
2+
3+
UnicornAFL is a bridge between AFL++ and Unicorn.
4+
5+
## Running Mode
6+
7+
A harness built with UnicornAFL supports two running modes: standalone mode and fuzzing mode.
8+
9+
### Standalone Mode
10+
11+
This mode is not intended for fuzzing. Instead, use it to check that your harness is written correctly. It is also helpful for analyzing the crashes found by AFL++.
12+
13+
To run the harness in standalone mode, execute the harness executable that uses UnicornAFL directly, without `afl-fuzz`. The command-line options of the harness are defined by you, and it is your job to pass the correct values on to the parameters of the UnicornAFL API, especially the `input_file` argument. Typically, the harness executable takes a path to a file on the command line. When that path is passed as `input_file`, UnicornAFL uses the file as the input for executing the target under test in the Unicorn engine.
14+
15+
Before any fuzzing, create a normal input seed that is not expected to crash the harness, and run the harness in standalone mode to check that it executes normally. If anything unexpected happens in standalone mode, the harness is most likely written incorrectly.
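
As an illustration of the command-line handling described above, here is a minimal sketch of a harness front end in Python. The positional argument name and the function `parse_input_file` are hypothetical; the only thing taken from this document is that standalone mode passes a file path as `input_file`, while fuzzing mode passes no path at all.

```python
import argparse
from typing import List, Optional


def parse_input_file(argv: List[str]) -> Optional[str]:
    """Return the input path for standalone mode, or None for fuzzing mode."""
    parser = argparse.ArgumentParser(description="Example UnicornAFL harness")
    # Hypothetical convention: the path is optional, and it is omitted
    # when the harness is started by afl-fuzz.
    parser.add_argument(
        "input_file",
        nargs="?",
        default=None,
        help="input file to replay in standalone mode",
    )
    args = parser.parse_args(argv)
    return args.input_file
```

The returned value can be forwarded unchanged as the `input_file` argument of the UnicornAFL API, for example with `parse_input_file(sys.argv[1:])`.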
16+
17+
### Fuzzing mode
18+
19+
After verifying the correctness of the harness, you can fuzz it using `afl-fuzz`. How you invoke `afl-fuzz` with UnicornAFL depends on how you built the harness.
20+
21+
If you are using Rust, or C/C++ statically linked against `libunicornafl.a`, the minimal working example is
22+
23+
```shell
24+
afl-fuzz \
25+
-i input \
26+
-o output \
27+
-- \
28+
./your-harness --and-your-own-harness-options
29+
```
30+
31+
If you are using Python, or C/C++ dynamically linked against `libunicornafl.so`, the minimal working example is
32+
33+
```shell
34+
afl-fuzz \
35+
-U \
36+
-i input \
37+
-o output \
38+
-- \
39+
./your-harness --and-your-own-harness-options
40+
```
41+
42+
The `-U` option specifies that this is the legacy Unicorn mode.
43+
44+
Note that you don't need to use `@@` to specify the input file, because the input is delivered through shared memory.
45+
46+
## Persistent Fuzzing
47+
48+
UnicornAFL supports persistent fuzzing. Instead of forking at the beginning of each execution round, persistent fuzzing executes the target repeatedly in a `for` loop. The overall steps are:
49+
50+
1. You invoke `afl-fuzz` and pass it the path to your UnicornAFL harness.
51+
2. `afl-fuzz` spawns a harness process (which we call the harness parent).
52+
3. The harness parent executes until it enters one of the UnicornAFL APIs (`uc_afl_fuzz` or `uc_afl_fuzz_custom`). It then forks, producing another process (which we call the harness child).
53+
4. The harness child runs a loop that repeatedly executes the target with the Unicorn engine. Each round counts as one execution for `afl-fuzz`.
54+
5. When the user-specified `persistent_iters` count is reached, or the harness child process crashes (which is rare, since exceptions should already be captured by Unicorn), the harness child exits. The harness parent then forks a new harness child, which does the same thing.
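
The steps above can be modelled with a small simulation. This is only a conceptual sketch and not UnicornAFL's implementation: no real processes are forked, and the names `harness_child`, `harness_parent`, and `run_round` are hypothetical. It shows how the round budget (`persistent_iters` in the API) decides how many rounds each harness child executes.

```python
from typing import Callable, Iterable, Iterator, List


def harness_child(inputs: Iterator[bytes],
                  run_round: Callable[[bytes, int], bool],
                  persistent_iters: int) -> int:
    """Model one harness child and return how many rounds it executed.

    run_round(data, round_index) returns False to model a child crash.
    persistent_iters == 0 models "never exit unless the process crashes".
    """
    rounds = 0
    for data in inputs:
        ok = run_round(data, rounds)
        rounds += 1
        if not ok:
            break  # the child died; the parent will fork a new one
        if persistent_iters != 0 and rounds >= persistent_iters:
            break  # round budget used up; the child exits normally
    return rounds


def harness_parent(all_inputs: Iterable[bytes],
                   run_round: Callable[[bytes, int], bool],
                   persistent_iters: int) -> List[int]:
    """Model the harness parent: keep "forking" children until inputs run out."""
    shared = iter(all_inputs)
    rounds_per_child = []
    while True:
        executed = harness_child(shared, run_round, persistent_iters)
        if executed == 0:
            break  # no input left to execute
        rounds_per_child.append(executed)
    return rounds_per_child
```

With five inputs and a budget of 2, the model uses three children; with a budget of 0, a single child executes every round.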
55+
56+
Since the target is executed repeatedly in the harness child, it is very important that **you restore Unicorn's state after each round**, unless you can be sure that the target does not modify Unicorn's CPU and memory state during the round. To make things easier, you can set `persistent_iters` to 1, which falls back to the legacy forkserver-based fuzzing at the cost of being significantly slower.
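
One possible way to restore state between rounds is sketched below in Python. It assumes the Unicorn Python binding's `context_save`, `context_restore`, `mem_read`, and `mem_write` methods, and that you know which memory ranges the target may modify. The class name `UnicornSnapshot` is hypothetical and is not part of UnicornAFL.

```python
class UnicornSnapshot:
    """Saves the CPU context and selected memory ranges of a Unicorn instance."""

    def __init__(self, uc, ranges):
        # ranges: iterable of (address, size) pairs that the target may modify.
        self._context = uc.context_save()
        self._memory = [
            (address, bytes(uc.mem_read(address, size)))
            for address, size in ranges
        ]

    def restore(self, uc):
        # Put the registers back first, then every saved memory range.
        uc.context_restore(self._context)
        for address, content in self._memory:
            uc.mem_write(address, content)
```

A harness could take the snapshot once, right before calling the fuzzing API, and call `restore(uc)` at the start of its input-placing callback so that every round begins from the same state. Copying large memory ranges on every round costs time, so it helps to keep the list of ranges as small as the target allows.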
57+
58+
## CMPLOG and CMPCOV
59+
60+
UnicornAFL also supports CMPLOG and CMPCOV in AFL++. If you don't know these terms, please refer to the AFL++ documentation. In short, they aim to get past long comparisons such as `CMP RAX, 0x114514`.
61+
62+
To use CMPCOV mode, set the `UNICORN_AFL_CMPCOV=1` environment variable for `afl-fuzz`.
63+
64+
To use CMPLOG mode, add the `-c 0` option to `afl-fuzz`.
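
To show how these options combine, the following Python sketch assembles an `afl-fuzz` command line and environment. The helper `build_afl_fuzz_command` and its parameters are hypothetical. It uses only the options mentioned in this document (`-U`, `-i`, `-o`, `-c 0`, and `UNICORN_AFL_CMPCOV=1`).

```python
import os
from typing import Dict, List, Optional, Tuple


def build_afl_fuzz_command(
    harness_argv: List[str],
    input_dir: str,
    output_dir: str,
    *,
    unicorn_mode: bool = False,
    cmplog: bool = False,
    cmpcov: bool = False,
    base_env: Optional[Dict[str, str]] = None,
) -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for launching afl-fuzz on a UnicornAFL harness."""
    cmd = ["afl-fuzz"]
    if unicorn_mode:
        # Needed for Python harnesses and for C/C++ harnesses that link
        # libunicornafl.so dynamically.
        cmd.append("-U")
    cmd += ["-i", input_dir, "-o", output_dir]
    if cmplog:
        cmd += ["-c", "0"]
    cmd += ["--", *harness_argv]

    env = dict(os.environ if base_env is None else base_env)
    if cmpcov:
        env["UNICORN_AFL_CMPCOV"] = "1"
    return cmd, env
```

The result can be handed to `subprocess.run(cmd, env=env)`.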
65+
66+
## Which language should I choose to use?
67+
68+
The choice of language may have a small effect on fuzzing throughput, but keep in mind that the main overhead is the target itself.
69+
70+
Although not benchmarked, Rust may be slightly faster than C/C++ thanks to inlining and LTO. The Python version is much slower. However, since the language has only a small effect, it is more appropriate to choose the language that you are good at. Don't struggle with the language itself; fuzzing is all you need :)

docs/python-usage.md

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
1+
# Python Usage for UnicornAFL
2+
3+
To use UnicornAFL with Python, you should clone this repository and build it yourself:
4+
5+
```shell
6+
git clone --depth 1 https://github.com/AFLplusplus/unicornafl && cd unicornafl
7+
cargo build --release
8+
maturin build --release
9+
```
10+
11+
Before building this repo, make sure that you have installed the dependencies needed to build [Unicorn](https://github.com/unicorn-engine/unicorn), a stable Rust compiler (version 1.87.0 or later), and [maturin](https://www.maturin.rs).
12+
13+
After the build, there will be a wheel in `./target/wheels` that you can install.
14+
15+
## API usage
16+
17+
The UnicornAFL API is small but powerful. It consists of two functions: `uc_afl_fuzz` and `uc_afl_fuzz_custom`.
18+
19+
### Simplified API
20+
21+
`uc_afl_fuzz`
22+
23+
```python
24+
def uc_afl_fuzz(uc: Uc,
25+
input_file: str,
26+
place_input_callback: Callable,
27+
exits: List[int],
28+
validate_crash_callback: Callable = None,
29+
always_validate: bool = False,
30+
persistent_iters: int = 1,
31+
data: Any = None): ...
32+
```
33+
34+
`uc` is a Unicorn instance created in advance. See [Creating Unicorn Instance](#creating-unicorn-instance) below for more details.
35+
36+
`input_file` is the path to the input file. In fuzzing mode, pass `None` for this argument; the input seed directory is passed to `afl-fuzz` instead. In standalone mode, UnicornAFL reads its input from this file.
37+
38+
`place_input_callback` is the callback UnicornAFL uses to place the received input into Unicorn's memory space. It takes four arguments: the Unicorn instance (which you can use to read and write the emulated CPU and memory inside the callback), the input buffer, the persistent round (how many times this harness has executed without exiting and forking another child process), and the custom data. The callback returns a `bool` indicating whether this input is acceptable.
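
A minimal sketch of such a callback follows. The address and size constants are hypothetical and depend on how you mapped memory in your own Unicorn instance. The parameter order follows the description above, and `mem_write` is the Unicorn Python binding's method for writing emulated memory.

```python
INPUT_ADDRESS = 0x200000   # hypothetical address of a buffer you mapped yourself
INPUT_MAX_SIZE = 0x1000    # hypothetical size of that buffer


def place_input_callback(uc, input, persistent_round, data):
    """Copy the fuzzer-provided input into emulated memory."""
    # Reject inputs that are empty or would overflow the mapped buffer.
    if len(input) == 0 or len(input) > INPUT_MAX_SIZE:
        return False
    uc.mem_write(INPUT_ADDRESS, bytes(input))
    # If the target expects the length in a register, it could be written
    # here with uc.reg_write(...), using the register id of your architecture.
    return True
```

Returning `False` tells UnicornAFL that the input is not acceptable, which is a cheap way to filter out inputs that the target could never receive in practice.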
39+
40+
`exits` specifies the exit points for Unicorn. When the Unicorn instance reaches one of the given exit addresses, UnicornAFL moves on to the next round.
41+
42+
`validate_crash_callback` is the callback UnicornAFL invokes when an error is encountered while executing the harness. It takes five arguments: the Unicorn instance, a value indicating the Unicorn error raised while executing the harness, the input buffer, the persistent round, and the custom data. The callback returns a `bool`. If it returns `False`, the AFL++ main executable will not treat this round as a crash. This can be used to eliminate false positives during fuzzing.
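
The sketch below filters out errors raised inside address ranges that you already know to be harmless. The layout of `data` (the keys `pc_reg` and `ignored_ranges`) is a hypothetical convention chosen for this example. It also assumes that the program counter read after the error points into the faulting code. `reg_read` is the Unicorn Python binding's register accessor.

```python
def validate_crash_callback(uc, unicorn_result, input, persistent_round, data):
    """Report a crash unless the error happened in a known-benign range."""
    # data["pc_reg"] holds the program counter register id for the emulated
    # architecture (for x86-64, UC_X86_REG_RIP from unicorn.x86_const).
    pc = uc.reg_read(data["pc_reg"])
    for start, end in data["ignored_ranges"]:
        if start <= pc < end:
            return False  # not a real crash: AFL++ ignores this round
    return True
```

The custom `data` object is passed through unchanged from the fuzzing call, so the ranges can be adjusted without touching the callback.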
43+
44+
`always_validate` controls whether `validate_crash_callback` is invoked even if Unicorn does not encounter an error during execution.
45+
46+
`persistent_iters` specifies how many times this harness is executed persistently before the parent forks another child. For simplicity, you can pass `1` here, which means exiting and forking every time the harness finishes a round. If you want a more efficient harness, consider running persistently. Passing `0` means running persistently without ever exiting or forking, unless the process crashes.
47+
48+
`data` is a custom data object. It is passed as an argument to each callback listed above, so you can use it to maintain shared data across executions.
49+
50+
This function returns a `UcAflError` or the value `UC_AFL_RET_OK`. A return value other than `UC_AFL_RET_OK` means that something unexpected happened during fuzzing and needs your attention.
51+
52+
### Advanced API
53+
54+
`uc_afl_fuzz_custom`
55+
56+
```python
57+
def uc_afl_fuzz_custom(uc: Uc,
58+
input_file: str,
59+
place_input_callback: Callable,
60+
fuzzing_callback: Callable,
61+
validate_crash_callback: Callable = None,
62+
always_validate: bool = False,
63+
persistent_iters: int = 1,
64+
data: Any = None): ...
65+
```
66+
67+
Most of the arguments are the same as in the simplified API. The only difference is the `fuzzing_callback` argument, which takes the place of `exits`. UnicornAFL calls this function to start one execution round, and when the function returns, UnicornAFL knows that the round has ended. By default, UnicornAFL just uses `uc_emu_start()`.
68+
69+
### Creating Unicorn Instance
70+
71+
Before using the fuzzing APIs, you need to create the Unicorn instance yourself. Note that UnicornAFL does not need to know the actual target being fuzzed. Instead, you set up the target in the Unicorn instance manually (for example, by mapping the code into Unicorn's memory space).
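
A minimal sketch of this setup for an x86-64 target is shown below. Several things here are assumptions and not confirmed by this document: that the Python package is importable as `unicornafl`, that emulation starts from the program counter set before the call, and the addresses and the single-page layout, which are purely illustrative. The argument order of `uc_afl_fuzz` follows the signature shown above.

```python
CODE_ADDRESS = 0x100000    # hypothetical load address of the target code
INPUT_ADDRESS = 0x200000   # hypothetical address of the input buffer
REGION_SIZE = 0x1000       # one page for each region; code must fit into it


def place_input_callback(uc, input, persistent_round, data):
    """Copy the input into the buffer mapped by run_harness."""
    if len(input) > REGION_SIZE:
        return False
    uc.mem_write(INPUT_ADDRESS, bytes(input))
    return True


def run_harness(code, input_file=None):
    """Create the Unicorn instance, load the target, and start UnicornAFL."""
    # Imported here so that the module loads even without the packages.
    from unicorn import Uc, UC_ARCH_X86, UC_MODE_64
    from unicorn.x86_const import UC_X86_REG_RIP
    import unicornafl  # assumption: module name of the Python binding

    uc = Uc(UC_ARCH_X86, UC_MODE_64)
    # Map one region for the code and one for the input, then load the code.
    uc.mem_map(CODE_ADDRESS, REGION_SIZE)
    uc.mem_map(INPUT_ADDRESS, REGION_SIZE)
    uc.mem_write(CODE_ADDRESS, code)
    # Assumption: the round starts at the current program counter.
    uc.reg_write(UC_X86_REG_RIP, CODE_ADDRESS)

    return unicornafl.uc_afl_fuzz(
        uc,
        input_file,                   # None in fuzzing mode
        place_input_callback,
        [CODE_ADDRESS + len(code)],   # stop when execution runs off the code
        persistent_iters=1,           # simplest setting: fork for every round
    )
```

In standalone mode, `run_harness(code, "path/to/seed")` replays a single input. Under `afl-fuzz`, `input_file` is left as `None`.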
72+
73+
## Tips
74+
75+
### Use a different version of Unicorn
76+
77+
Note that the internals of UnicornAFL depend heavily on some of the newest Unicorn APIs. As a result, older versions of Unicorn may not work. If you still want to use your own version of Unicorn, modify the `Cargo.toml` in this repo.
78+
79+
First, find the following line:
80+
81+
```toml
82+
unicorn-engine = { git = "https://github.com/unicorn-engine/unicorn", branch = "dev" }
83+
```
84+
85+
If you want to use a Unicorn checkout on the local filesystem, change this line to
86+
87+
```toml
88+
unicorn-engine = { path = "/path/to/unicorn/bindings/rust" }
89+
```
90+
91+
Note that the `bindings/rust` suffix is necessary.
92+
93+
If you want to use a forked Unicorn or a Unicorn hosted on a remote Git server, change this line to
94+
95+
```toml
96+
unicorn-engine = { git = "http://my/own/unicorn/fork" }
97+
```
98+
99+
### Linking
100+
101+
To use UnicornAFL and Unicorn at the same time, make sure that the Unicorn version used by UnicornAFL is consistent with the version of the Unicorn Python package. You can then import the Unicorn package and the UnicornAFL package side by side.
102+
103+
When building the Python binding, we link the Unicorn shared library dynamically. As a result, using the UnicornAFL and Unicorn packages at the same time works as long as the Unicorn versions do not conflict.
104+
105+
### Debugging
106+
107+
UnicornAFL contains many log statements that can be used for debugging. To enable logging, compile this repo using
108+
109+
```shell
110+
cargo build --release --features env_logger
111+
```
112+
113+
Then, when running, pass `RUST_LOG=trace` in the environment. (`AFL_DEBUG=1` is also needed if you are using `afl-fuzz` to run the harness.)
