Commit ae8b97a

Add readme
1 parent bff46a4 commit ae8b97a

1 file changed, +99 -1 lines changed

README.md

Lines changed: 99 additions & 1 deletion
@@ -1,3 +1,101 @@
11
# wav-files-vad-api
22

3-
This project provides a simple API for Voice Activity Detection (VAD) on WAV audio files.
3+
A command-line tool for recursively performing Voice Activity Detection (VAD) on WAV audio files using one or more external API servers. It validates files against a specific format (mono channel, 16-bit PCM, 16kHz sample rate), processes them in parallel, preserves the input directory structure in the output, and provides summary statistics on completion.
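
The format check described above can be illustrated with the standard library alone. This is a minimal sketch, not the tool's implementation (which uses the `hound` crate): it assumes a canonical 44-byte RIFF/WAVE header in which the `fmt ` chunk directly follows the RIFF header, and the names `WavFormat`, `parse_wav_format`, `is_supported`, and `demo_header` are hypothetical.

```rust
/// Format fields read from a canonical RIFF/WAVE header (hypothetical helper type).
#[derive(Debug, PartialEq)]
struct WavFormat {
    audio_format: u16, // 1 means integer PCM
    channels: u16,
    sample_rate: u32,
    bits_per_sample: u16,
}

/// Parses the format fields, assuming the `fmt ` chunk starts at byte 12.
/// Real files may carry extra chunks first; this sketch rejects those.
fn parse_wav_format(header: &[u8]) -> Option<WavFormat> {
    if header.len() < 36
        || &header[0..4] != b"RIFF"
        || &header[8..12] != b"WAVE"
        || &header[12..16] != b"fmt "
    {
        return None;
    }
    let u16_at = |i: usize| u16::from_le_bytes([header[i], header[i + 1]]);
    let u32_at = |i: usize| {
        u32::from_le_bytes([header[i], header[i + 1], header[i + 2], header[i + 3]])
    };
    Some(WavFormat {
        audio_format: u16_at(20),
        channels: u16_at(22),
        sample_rate: u32_at(24),
        bits_per_sample: u16_at(34),
    })
}

/// True only for mono, 16-bit integer PCM at 16 kHz.
fn is_supported(fmt: &WavFormat) -> bool {
    fmt.audio_format == 1
        && fmt.channels == 1
        && fmt.sample_rate == 16_000
        && fmt.bits_per_sample == 16
}

/// Builds a 44-byte PCM header with an empty data chunk, for demonstration only.
fn demo_header(channels: u16, sample_rate: u32, bits: u16) -> Vec<u8> {
    let block_align = channels * bits / 8;
    let byte_rate = sample_rate * u32::from(block_align);
    let mut h = Vec::with_capacity(44);
    h.extend_from_slice(b"RIFF");
    h.extend_from_slice(&36u32.to_le_bytes()); // RIFF chunk size with no samples
    h.extend_from_slice(b"WAVE");
    h.extend_from_slice(b"fmt ");
    h.extend_from_slice(&16u32.to_le_bytes()); // size of the fmt chunk
    h.extend_from_slice(&1u16.to_le_bytes()); // integer PCM
    h.extend_from_slice(&channels.to_le_bytes());
    h.extend_from_slice(&sample_rate.to_le_bytes());
    h.extend_from_slice(&byte_rate.to_le_bytes());
    h.extend_from_slice(&block_align.to_le_bytes());
    h.extend_from_slice(&bits.to_le_bytes());
    h.extend_from_slice(b"data");
    h.extend_from_slice(&0u32.to_le_bytes());
    h
}

fn main() {
    for (channels, rate) in [(1u16, 16_000u32), (2, 44_100)] {
        let fmt = parse_wav_format(&demo_header(channels, rate, 16)).unwrap();
        println!("{:?} supported: {}", fmt, is_supported(&fmt));
    }
}
```

Files that fail a check like `is_supported` correspond to the "skipped" count in the summary.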
4+
5+
## Features
6+
7+
- **Recursive Scanning**: Walks the input directory tree to find all `.wav` files using `walkdir`.
8+
- **Format Validation**: Ensures WAV files meet the required specs (mono, 16-bit PCM, 16kHz) using the `hound` crate.
9+
- **Parallel Processing**: Leverages `rayon` to process files concurrently, with the degree of parallelism matching the number of provided API servers.
10+
- **API Integration**: Distributes load by sending JSON requests to a list of external VAD APIs via `ureq` and handles responses.
11+
- **Robust Error Handling**: Uses `anyhow` for contextual error propagation and clear logging.
12+
- **Directory Preservation**: Mirrors the input folder structure in the output directory.
13+
- **CLI-Friendly**: Built with `clap` for intuitive argument parsing and help output.
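
As an illustration of the directory-preservation feature, the sketch below maps an input file to the output directory that mirrors its position in the tree. It uses only the standard library; the function names `mirrored_output_path` and `mirrored_output_dir` are hypothetical and the tool's actual code may differ.

```rust
use std::path::{Path, PathBuf};

/// Maps `file` (found under `input_root`) to the same relative location
/// under `output_root`. Returns `None` if `file` is not inside `input_root`.
fn mirrored_output_path(input_root: &Path, output_root: &Path, file: &Path) -> Option<PathBuf> {
    let relative = file.strip_prefix(input_root).ok()?;
    Some(output_root.join(relative))
}

/// Directory that would hold the mirrored file; it would need to exist
/// (for example via `std::fs::create_dir_all`) before results are written.
fn mirrored_output_dir(input_root: &Path, output_root: &Path, file: &Path) -> Option<PathBuf> {
    mirrored_output_path(input_root, output_root, file)?
        .parent()
        .map(Path::to_path_buf)
}

fn main() {
    let dir = mirrored_output_dir(
        Path::new("raw_audio"),
        Path::new("processed_audio"),
        Path::new("raw_audio/session1/speaker2/clip.wav"),
    );
    println!("{:?}", dir);
}
```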
14+
15+
## Prerequisites
16+
17+
- Rust 1.85+ (stable channel; required by the `2024` edition)
18+
- An external API server running at the specified address(es), accepting POST requests with JSON payloads for VAD.
19+
- Expected request body: `{ "input_file": String, "output_dir": String, "model": Option<String> }`
20+
- Expected success response: HTTP status `200 OK`.
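
To make the request contract concrete, here is a sketch that builds the JSON body by hand with the standard library. The tool itself lists `serde` among its dependencies and presumably serializes a struct instead, so this is illustrative only. The helpers `json_escape` and `build_request_body` are hypothetical, and emitting `null` for a missing `model` is an assumption (it mirrors serde's default treatment of `Option::None`).

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds a body with the three documented fields.
/// Assumption: a missing model is sent as JSON `null`.
fn build_request_body(input_file: &str, output_dir: &str, model: Option<&str>) -> String {
    let model_json = match model {
        Some(name) => format!("\"{}\"", json_escape(name)),
        None => "null".to_string(),
    };
    format!(
        "{{\"input_file\":\"{}\",\"output_dir\":\"{}\",\"model\":{}}}",
        json_escape(input_file),
        json_escape(output_dir),
        model_json
    )
}

fn main() {
    let body = build_request_body("raw_audio/clip.wav", "processed_audio", None);
    println!("{}", body);
}
```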
21+
22+
## Installation
23+
24+
### From GitHub Releases
25+
26+
Statically-linked Linux binaries are available for download from the Releases page.
27+
28+
### From Source
29+
30+
1. Clone the repository:
31+
```bash
32+
git clone https://github.com/RustedBytes/wav-files-vad-api.git
33+
cd wav-files-vad-api
34+
```
35+
36+
2. Build the project:
37+
```bash
38+
cargo build --release
39+
```
40+
41+
The binary will be available at `target/release/wav-files-vad-api`.
42+
43+
## Usage
44+
45+
Run the tool with the required input and output directories, and at least one API server address.
46+
47+
```bash
48+
wav-files-vad-api /path/to/input/dir /path/to/output/dir --addr-api http://localhost:8000/vad
49+
```
50+
51+
### Arguments
52+
53+
- `INPUT_DIR`: Path to the directory containing WAV files (scanned recursively).
54+
- `OUTPUT_DIR`: Path to the directory where VAD output files will be saved (created if it doesn't exist).
55+
- `--addr-api <ADDR_API>`: A comma-separated list of VAD API server URLs. Work will be distributed among them. (Required)
56+
- `--model <MODEL>`: An optional model name to pass to the VAD API.
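
One way to picture how a comma-separated `--addr-api` value could spread work across servers is a round-robin assignment. The README does not specify the actual distribution strategy, so round-robin, whitespace trimming, and the names `parse_addresses` and `assign_round_robin` are all assumptions made for this sketch.

```rust
/// Splits a comma-separated list of URLs, trimming whitespace and
/// dropping empty entries (assumed behavior, not confirmed by the tool).
fn parse_addresses(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}

/// Pairs each file with a server in round-robin order.
/// Returns an empty plan when no servers are given.
fn assign_round_robin<'a>(files: &[&'a str], servers: &'a [String]) -> Vec<(&'a str, &'a str)> {
    if servers.is_empty() {
        return Vec::new();
    }
    files
        .iter()
        .enumerate()
        .map(|(i, file)| (*file, servers[i % servers.len()].as_str()))
        .collect()
}

fn main() {
    let servers = parse_addresses("http://127.0.0.1:8001/vad,http://127.0.0.1:8002/vad");
    let files = ["a.wav", "b.wav", "c.wav"];
    for (file, server) in assign_round_robin(&files, &servers) {
        println!("{} -> {}", file, server);
    }
}
```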
57+
58+
### Example
59+
60+
Process all valid WAV files in `./raw_audio/` and save results to `./processed_audio/` using two local API servers for parallel execution:
61+
62+
```bash
63+
./target/release/wav-files-vad-api ./raw_audio ./processed_audio --addr-api http://127.0.0.1:8001/vad,http://127.0.0.1:8002/vad
64+
```
65+
66+
Example output:
67+
```
68+
Skipping invalid WAV file: ./raw_audio/unsupported_format.wav
69+
Error processing ./raw_audio/corrupted.wav: Failed to open WAV file: ./raw_audio/corrupted.wav
70+
VAD failed for ./raw_audio/no_speech.wav: API returned status 500
71+
VAD complete: 42 files processed, 3 skipped.
72+
```
73+
74+
## Dependencies
75+
76+
This tool relies on the following crates (as defined in `Cargo.toml`):
77+
78+
| Crate | Purpose | Version |
79+
|---|---|---|
80+
| `anyhow` | Contextual error handling | `1.0` |
81+
| `clap` | CLI argument parsing | `4.5` |
82+
| `hound` | WAV file reading and validation | `3.5` |
83+
| `rayon` | Data parallelism | `1.11` |
84+
| `serde` | Serialization/deserialization of JSON payloads | `1.0` |
85+
| `ureq` | HTTP client for API requests | `3.1` |
86+
| `walkdir` | Recursive directory traversal | `2.5` |
87+
88+
## Contributing
89+
90+
1. Fork the repo.
91+
2. Create a feature branch (`git checkout -b feature/my-feature`).
92+
3. Commit changes (`git commit -am 'Add my feature'`).
93+
4. Push to the branch (`git push origin feature/my-feature`).
94+
5. Open a Pull Request.
95+
96+
Please ensure code is formatted with `cargo fmt` before submitting.
97+
98+
## License
99+
100+
This project is licensed under the MIT License - see the LICENSE file for details.
101+
