A Python GUI application for quantizing AI models and automatically uploading them to Hugging Face repositories. This tool converts models from SafeTensors format to various GGUF quantization formats with a user-friendly interface.
Support for 40+ quantization formats including:
- Standard formats: F32, F16, BF16, Q8_0, Q6_K
- Q5 variants: Q5_0, Q5_1, Q5_K, Q5_K_S, Q5_K_M
- Q4 variants: Q4_0, Q4_1, Q4_K, Q4_K_S, Q4_K_M, Q4_0_4_4, Q4_0_4_8, Q4_0_8_8
- Q3 variants: Q3_K, Q3_K_S, Q3_K_M, Q3_K_L
- Q2 variants: Q2_K, Q2_K_S
- Intelligent Quantization (IQ): IQ1_S, IQ1_M, IQ2_XXS through IQ4_XS
- Ternary Quantization (TQ): TQ1_0, TQ2_0
- Special formats: fp8_scaled_stochastic
- Clean, intuitive tkinter interface
- Scrollable quantization selection panel
- Real-time progress monitoring
- Comprehensive logging output
- Quick selection buttons (Select All, Deselect All, Select Common)
- Multi-threaded processing (the GUI remains responsive; see the sketch after this list)
- Selective quantization (choose only what you need)
- Upload control (enable/disable automatic uploads)
- Error handling and validation
- Progress tracking with stop functionality
- Batch processing of multiple quantization formats
- Automatic Hugging Face repository uploads
- Organized output file structure
- Commit messages for version tracking
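
The multi-threaded processing noted above follows the standard tkinter pattern: quantization runs on a background thread, and log lines reach the GUI through a queue polled with `after()`, so the window never blocks. A minimal sketch of the pattern (illustrative only, not the app's actual code):

```python
import queue
import threading
import tkinter as tk

log_queue = queue.Queue()

def worker():
    # Stand-in for the real quantization loop; each step reports a log line.
    for fmt in ("BF16", "Q8_0", "Q4_K_M"):
        log_queue.put(f"Quantizing to {fmt}...")
    log_queue.put("Done.")

root = tk.Tk()
log_box = tk.Text(root, height=10, width=60)
log_box.pack()

def drain_queue():
    # Runs on the GUI thread, so it is safe to touch widgets here.
    while not log_queue.empty():
        log_box.insert(tk.END, log_queue.get_nowait() + "\n")
    root.after(100, drain_queue)  # poll again in 100 ms

threading.Thread(target=worker, daemon=True).start()
drain_queue()
root.mainloop()
```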
- Python 3.11 or higher
- tkinter (usually included with Python)
- Required tools in your `tools/` directory (see the pipeline sketch after this list):
  - `convert.py` - Model conversion script
  - `llama-quantize.exe` - GGUF quantization tool
  - `convert_fp8_scaled_stochastic.py` - FP8 conversion script
- Hugging Face CLI configured with authentication
- There is an included batch file for installing the tools.
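
For orientation, the tools chain together in two steps: `convert.py` turns the SafeTensors model into a base GGUF, and `llama-quantize.exe` then produces each selected quantization from that base file. A hedged sketch of that pipeline; `convert.py`'s flags here are an assumption (check its `--help`), while `llama-quantize` takes input file, output file, and quantization type positionally:

```python
import subprocess
from pathlib import Path

tools = Path("tools")
src = Path("in/chroma/detailed/your-model.safetensors")
base = Path("out/your-model/your-model-BF16-author.gguf")
base.parent.mkdir(parents=True, exist_ok=True)

# Step 1: SafeTensors -> base GGUF. The --src/--dst flags are assumed;
# consult convert.py for its real interface.
subprocess.run(["python", str(tools / "convert.py"),
                "--src", str(src), "--dst", str(base)], check=True)

# Step 2: quantize the base GGUF to each selected format.
for fmt in ["Q8_0", "Q5_K_M", "Q4_K_M"]:
    out = base.with_name(f"your-model-{fmt}-author.gguf")
    subprocess.run([str(tools / "llama-quantize.exe"),
                    str(base), str(out), fmt], check=True)
```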
- Clone this repository:

  ```
  git clone https://github.com/marduk191/Diffusion_model_Quantize_and_upload_gui.git
  cd Diffusion_model_Quantize_and_upload_gui
  ```

- Install dependencies (run this inside your ComfyUI venv):

  ```
  pip install tkinter  # tkinter usually ships with Python; skip if already present
  ```

- Set up your directory structure:
```
project_root/
├── quantizer_gui.py
├── tools/
│   ├── convert.py
│   ├── llama-quantize.exe
│   └── convert_fp8_scaled_stochastic.py
├── in/
│   └── chroma/
│       └── detailed/
│           └── your-model.safetensors
└── out/
    └── (generated output folders)
```
- Configure Hugging Face CLI:

  ```
  huggingface-cli login
  ```

- Run the application:

  ```
  python quantizer_gui.py
  ```
- Configure your settings:
  - File Name: Name of your model file (without extension)
  - Author: Your name/username for file naming
  - Repository: Target Hugging Face repository (username/repo-name)
  - Base Path: Directory containing your project structure
  - Venv Path: Path to your Python virtual environment activation script
- Select quantization formats:
  - Use checkboxes to select desired formats
  - Use "Select Common" for the most popular formats
  - Use "Select All" to process all available formats
- Choose upload option:
  - ✅ Enabled: Quantize and upload automatically
  - ❌ Disabled: Quantize only (no upload)
- Click "Start Processing" and monitor the log output
You can run specific quantization types by unchecking unwanted formats. This is useful for:
- Testing new formats
- Re-running failed quantizations
- Processing only high-priority formats
Disable uploads to work offline or test quantizations:
- Uncheck "Enable automatic upload after quantization"
- All files will be saved locally in `out/model-name/`
- Upload manually later using the Hugging Face CLI (or the API sketch below)
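
If you prefer Python over the CLI for the later upload, the `huggingface_hub` API reuses the token stored by `huggingface-cli login`. A minimal sketch (repository and file names are placeholders):

```python
from huggingface_hub import HfApi

api = HfApi()  # picks up the token saved by `huggingface-cli login`
api.upload_file(
    path_or_fileobj="out/your-model/your-model-Q8_0-author.gguf",
    path_in_repo="your-model-Q8_0-author.gguf",
    repo_id="username/repo-name",
)
```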
Process multiple models by:
- Changing the file name
- Keeping other settings
- Running processing again
```
project_root/
└── in/
    └── chroma/
        └── detailed/
            └── your-model.safetensors
```
```
project_root/
└── out/
    └── your-model/
        ├── your-model-BF16-author.gguf
        ├── your-model-Q8_0-author.gguf
        ├── your-model-Q5_0-author.gguf
        ├── your-model-Q4_0-author.gguf
        └── your-model-fp8_scaled_stochastic-author.safetensors
```
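
The output names follow the pattern `{file-name}-{format}-{author}`, with the extension depending on the format. A hypothetical helper mirroring that pattern (not taken from the app's code):

```python
def output_name(file_name: str, fmt: str, author: str) -> str:
    # Mirrors the layout above; fp8_scaled_stochastic stays in SafeTensors.
    ext = "safetensors" if fmt == "fp8_scaled_stochastic" else "gguf"
    return f"{file_name}-{fmt}-{author}.{ext}"

print(output_name("your-model", "Q8_0", "author"))  # your-model-Q8_0-author.gguf
```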
| Format | Description | Use Case | File Size (vs. F32) |
|---|---|---|---|
| F32 | 32-bit float | Maximum quality, huge files | 100% |
| F16 | 16-bit float | High quality, large files | 50% |
| BF16 | Brain float 16 | Good quality, manageable size | 50% |
| Q8_0 | 8-bit quantization | Excellent quality/size balance | 25% |
| Q5_K_M | 5-bit K-quant medium | Good quality, smaller size | 20% |
| Q4_K_M | 4-bit K-quant medium | Decent quality, small size | 15% |
| Q4_0 | 4-bit standard | Basic quality, very small | 12% |
| IQ4_XS | Intelligent 4-bit | Better than Q4_0, similar size | 12% |
| Q2_K | 2-bit K-quant | Minimal quality, tiny files | 8% |
For most users, these formats provide the best balance:
- Q8_0: Near-original quality
- Q5_K_M: Excellent balance
- Q4_K_M: Good for limited storage
- Q4_0: Maximum compression
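
Because the table's sizes are relative to F32, you can estimate disk usage before selecting formats. A rough worked example using the ratios above, for a hypothetical 24 GB F32 model:

```python
# Ratios taken from the format table (fraction of the F32 size).
ratios = {"Q8_0": 0.25, "Q5_K_M": 0.20, "Q4_K_M": 0.15, "Q4_0": 0.12}
f32_gb = 24.0  # hypothetical full-precision size

for fmt, r in ratios.items():
    print(f"{fmt}: ~{f32_gb * r:.1f} GB")
# Q8_0 ~6.0 GB, Q5_K_M ~4.8 GB, Q4_K_M ~3.6 GB, Q4_0 ~2.9 GB
```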
"Input file not found"
- Check that your model file exists in
in/chroma/detailed/ - Verify the file name matches exactly (case-sensitive)
- Ensure the file has
.safetensorsextension
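
These checks can be scripted; a minimal sketch assuming the default layout (adjust `root` to your Base Path):

```python
from pathlib import Path

root = Path(".")  # your Base Path
target = root / "in" / "chroma" / "detailed" / "your-model.safetensors"

print("Input file found:", target.exists())
if target.parent.exists():
    # List what is really there to spot case or extension mismatches.
    print("Directory contents:", [p.name for p in target.parent.iterdir()])
else:
    print("Missing directory:", target.parent)
```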
"Base file not found for quantization"
- Make sure BF16 conversion completed successfully first
- Check that `convert.py` is working properly
- Verify the tools directory contains all required scripts
"Upload failed"
- Confirm the Hugging Face CLI is logged in: `huggingface-cli whoami`
- Check that the repository exists and you have write access
- Verify internet connection
"Command timed out"
- Large models may take longer than 5 minutes
- Increase the timeout in the code if needed (see the sketch below)
- Check system resources (RAM/CPU)
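
The 5-minute limit comes from the `timeout` argument of the Python `subprocess` call that runs each tool. If your models need more time, find the call resembling this sketch in the code and raise the value (the app's exact variable names may differ):

```python
import subprocess

# Illustrative command; the app builds these from your settings.
cmd = ["tools/llama-quantize.exe", "base.gguf", "out.gguf", "Q8_0"]

# timeout is in seconds: 300 = 5 minutes. Raising it to 3600 allows
# an hour per command for very large models.
subprocess.run(cmd, check=True, timeout=3600)
```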
- RAM Usage: Large models require significant RAM for quantization
- Storage: Ensure enough disk space for all output formats
- CPU: Multi-core CPUs will process quantizations faster
- Selection: Only select needed formats to save time
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Original batch script by marduk191
- GGUF quantization tools from the llama.cpp project
- Hugging Face for model hosting and CLI tools
- Python tkinter for the GUI framework
If you encounter any issues or have questions:
- Check the Issues page
- Create a new issue with detailed information
- Include log output and error messages
⭐ Star this repository if you find it helpful!