
Model Quantizer & Uploader GUI

A Python GUI application for quantizing AI models and automatically uploading them to Hugging Face repositories. This tool converts models from SafeTensors format to various GGUF quantization formats with a user-friendly interface.


Features

🎯 Multiple Quantization Formats

Support for 40+ quantization formats including:

  • Standard formats: F32, F16, BF16, Q8_0, Q6_K
  • Q5 variants: Q5_0, Q5_1, Q5_K, Q5_K_S, Q5_K_M
  • Q4 variants: Q4_0, Q4_1, Q4_K, Q4_K_S, Q4_K_M, Q4_0_4_4, Q4_0_4_8, Q4_0_8_8
  • Q3 variants: Q3_K, Q3_K_S, Q3_K_M, Q3_K_L
  • Q2 variants: Q2_K, Q2_K_S
  • Intelligent Quantization (IQ): IQ1_S, IQ1_M, IQ2_XXS through IQ4_XS
  • Ternary Quantization (TQ): TQ1_0, TQ2_0
  • Special formats: fp8_scaled_stochastic

🖥️ User-Friendly GUI

  • Clean, intuitive tkinter interface
  • Scrollable quantization selection panel
  • Real-time progress monitoring
  • Comprehensive logging output
  • Quick selection buttons (Select All, Deselect All, Select Common)

⚡ Smart Processing

  • Multi-threaded processing keeps the GUI responsive (see the sketch after this list)
  • Selective quantization (choose only what you need)
  • Upload control (enable/disable automatic uploads)
  • Error handling and validation
  • Progress tracking with stop functionality
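
This pattern keeps the interface responsive: each quantization command runs on a worker thread while log lines are marshalled back to the Tk main loop. A minimal sketch of the idea, with illustrative names (start_processing, log, cmd are not taken from the repo's code):

    import subprocess
    import threading
    import tkinter as tk

    def start_processing(root: tk.Tk, log: tk.Text, cmd: list[str]) -> None:
        """Run one quantization command on a worker thread; Tk stays responsive."""
        def worker():
            proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT, text=True)
            for line in proc.stdout:
                # Tk widgets are not thread-safe: hand lines back to the main loop.
                root.after(0, log.insert, tk.END, line)
            proc.wait()

        threading.Thread(target=worker, daemon=True).start()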

🚀 Automated Workflow

  • Batch processing of multiple quantization formats
  • Automatic Hugging Face repository uploads
  • Organized output file structure
  • Commit messages for version tracking

Screenshots

Main Interface

(Screenshot of the main application window)

Installation

Prerequisites

  • Python 3.11 or higher
  • tkinter (usually included with Python)
  • Required tools in your tools/ directory:
    • convert.py - Model conversion script
    • llama-quantize.exe - GGUF quantization tool
    • convert_fp8_scaled_stochastic.py - FP8 conversion script
  • Hugging Face CLI configured with authentication
  • An included batch file installs these tools.

Setup

  1. Clone this repository:
git clone https://github.com/marduk191/Diffusion_model_Quantize_and_upload_gui.git
cd Diffusion_model_Quantize_and_upload_gui
  2. Install dependencies. The application runs inside your ComfyUI venv. tkinter cannot be installed with pip; it ships with the standard Windows Python installers, and on Linux it comes from the system package manager if missing:
sudo apt install python3-tk  # Debian/Ubuntu, only if tkinter is not already available
  3. Set up your directory structure (a scripted version appears after this list):
project_root/
├── quantizer_gui.py
├── tools/
│   ├── convert.py
│   ├── llama-quantize.exe
│   └── convert_fp8_scaled_stochastic.py
├── in/
│   └── chroma/
│       └── detailed/
│           └── your-model.safetensors
└── out/
    └── (generated output folders)
  4. Configure the Hugging Face CLI:
huggingface-cli login
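
If you prefer to script step 3, a short Python snippet can create the same layout (paths mirror the tree above):

    from pathlib import Path

    # Create the layout shown in step 3; tools/ holds the conversion scripts.
    for d in ("tools", "in/chroma/detailed", "out"):
        Path(d).mkdir(parents=True, exist_ok=True)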

Usage

Quick Start

  1. Run the application:
python quantizer_gui.py
  2. Configure your settings:

    • File Name: Name of your model file (without extension)
    • Author: Your name/username for file naming
    • Repository: Target Hugging Face repository (username/repo-name)
    • Base Path: Directory containing your project structure
    • Venv Path: Path to your Python virtual environment activation script
  3. Select quantization formats:

    • Use checkboxes to select desired formats
    • Use "Select Common" for the most popular formats
    • Use "Select All" to process all available formats
  4. Choose the upload option:

    • ✅ Enabled: Quantize and upload automatically
    • ❌ Disabled: Quantize only (no upload)
  5. Click "Start Processing" and monitor the log output

Advanced Usage

Selective Processing

You can run specific quantization types by unchecking unwanted formats. This is useful for:

  • Testing new formats
  • Re-running failed quantizations
  • Processing only high-priority formats

Offline Mode

Disable uploads to work offline or test quantizations:

  • Uncheck "Enable automatic upload after quantization"
  • All files will be saved locally in out/model-name/
  • Upload manually later using the Hugging Face CLI or Python API (see the example below)
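
When you are back online, a folder of quantized files can be pushed with the huggingface_hub Python API; the repository and paths below are placeholders:

    from huggingface_hub import HfApi

    # Upload everything produced for one model; the repo must already exist.
    HfApi().upload_folder(
        folder_path="out/your-model",
        repo_id="username/repo-name",
        repo_type="model",
        commit_message="Add quantized GGUF files",
    )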

Batch Processing

Process multiple models by:

  1. Changing the File Name to the next model
  2. Keeping the other settings unchanged
  3. Running processing again

Directory Structure

Input Structure

project_root/
├── in/
│   └── chroma/
│       └── detailed/
│           └── your-model.safetensors

Output Structure

project_root/
├── out/
│   └── your-model/
│       ├── your-model-BF16-author.gguf
│       ├── your-model-Q8_0-author.gguf
│       ├── your-model-Q5_0-author.gguf
│       ├── your-model-Q4_0-author.gguf
│       └── your-model-fp8_scaled_stochastic-author.safetensors
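
Output names follow a fixed file-quant-author pattern, as the tree above shows. A one-line sketch of how such names are assembled (variable names mirror the GUI fields, not the repo's code):

    # e.g. "your-model-Q8_0-author.gguf"
    output_name = f"{file_name}-{quant_format}-{author}.gguf"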

Quantization Format Guide

Format | Description | Use Case | File Size
------ | ----------- | -------- | ---------
F32    | 32-bit float | Maximum quality, huge files | 100%
F16    | 16-bit float | High quality, large files | 50%
BF16   | Brain float 16 | Good quality, manageable size | 50%
Q8_0   | 8-bit quantization | Excellent quality/size balance | 25%
Q5_K_M | 5-bit K-quant medium | Good quality, smaller size | 20%
Q4_K_M | 4-bit K-quant medium | Decent quality, small size | 15%
Q4_0   | 4-bit standard | Basic quality, very small | 12%
IQ4_XS | Intelligent 4-bit | Better than Q4_0, similar size | 12%
Q2_K   | 2-bit K-quant | Minimal quality, tiny files | 8%

Recommended Formats

For most users, these formats provide the best balance:

  • Q8_0: Near-original quality
  • Q5_K_M: Excellent balance
  • Q4_K_M: Good for limited storage
  • Q4_0: Maximum compression
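
For reference, the prerequisites list tools/llama-quantize.exe as the GGUF quantization tool, and quantizing a format by hand follows llama.cpp's usual invocation (input file, output file, type). A sketch with illustrative paths, not the application's actual code:

    import subprocess

    # Convert the BF16 base GGUF to Q4_K_M using llama.cpp's llama-quantize.
    subprocess.run(
        ["tools/llama-quantize.exe",
         "out/your-model/your-model-BF16-author.gguf",
         "out/your-model/your-model-Q4_K_M-author.gguf",
         "Q4_K_M"],
        check=True,
    )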

Troubleshooting

Common Issues

"Input file not found"

  • Check that your model file exists in in/chroma/detailed/
  • Verify the file name matches exactly (case-sensitive)
  • Ensure the file has .safetensors extension

"Base file not found for quantization"

  • Make sure BF16 conversion completed successfully first
  • Check that convert.py is working properly
  • Verify the tools directory contains all required scripts

"Upload failed"

  • Confirm Hugging Face CLI is logged in: huggingface-cli whoami
  • Check repository exists and you have write access
  • Verify internet connection

"Command timed out"

  • Large models may take longer than 5 minutes
  • Increase timeout in the code if needed
  • Check system resources (RAM/CPU)
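
Where exactly the timeout lives depends on the application code, but in Python it is typically the timeout argument to subprocess.run, so the fix looks roughly like this (names and values are hypothetical):

    import subprocess

    TIMEOUT_SECONDS = 1800  # hypothetical: raised from a 5-minute (300 s) default

    def run_step(cmd: list[str]) -> None:
        # Raises subprocess.TimeoutExpired if the command exceeds the limit.
        subprocess.run(cmd, check=True, timeout=TIMEOUT_SECONDS)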

Performance Tips

  • RAM Usage: Large models require significant RAM for quantization
  • Storage: Ensure enough disk space for all output formats (see the check after this list)
  • CPU: Multi-core CPUs will process quantizations faster
  • Selection: Only select needed formats to save time
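
To size up the Storage tip before selecting many formats, a quick free-space check (the out/ path assumes the default layout):

    import shutil

    # Report free space on the drive holding the output directory.
    free_gb = shutil.disk_usage("out").free / 1e9
    print(f"{free_gb:.1f} GB free for quantized outputs")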

Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

Development Setup

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Original batch script by marduk191
  • GGUF quantization tools from the llama.cpp project
  • Hugging Face for model hosting and CLI tools
  • Python tkinter for the GUI framework

Support

If you encounter any issues or have questions:

  1. Check the Issues page
  2. Create a new issue with detailed information
  3. Include log output and error messages

⭐ Star this repository if you find it helpful!
