
Commit a9a6d6c

Merge pull request #64 from pescheckit/feature_added-ollama
Feature added ollama
2 parents 3985e05 + 6eca9b1 commit a9a6d6c

10 files changed: +687, -13 lines changed


README.md

Lines changed: 40 additions & 5 deletions
@@ -4,7 +4,7 @@
![PyPI](https://img.shields.io/pypi/v/gpt-po-translator?label=gpt-po-translator)
![Downloads](https://pepy.tech/badge/gpt-po-translator)

-**Translate gettext (.po) files using AI models.** Supports OpenAI, Azure OpenAI, Anthropic/Claude, and DeepSeek with automatic AI translation tagging.
+**Translate gettext (.po) files using AI models.** Supports OpenAI, Azure OpenAI, Anthropic/Claude, DeepSeek, and Ollama (local) with automatic AI translation tagging.

## 🚀 Quick Start

@@ -21,7 +21,8 @@ gpt-po-translator --folder ./locales --bulk

## ✨ Key Features

-- **Multiple AI providers** - OpenAI, Azure OpenAI, Anthropic/Claude, DeepSeek
+- **Multiple AI providers** - OpenAI, Azure OpenAI, Anthropic/Claude, DeepSeek, and Ollama (local)
+- **Privacy option** - Use Ollama for local, offline translations with no cloud API
- **AI translation tracking** - Auto-tags AI-generated translations with `#. AI-generated` comments
- **Bulk processing** - Efficient batch translation for large files
- **Smart language detection** - Auto-detects target languages from folder structure
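
As a side note on the `#. AI-generated` tagging mentioned in the features above, here is a minimal sketch of what a tagged entry could look like in a translated `.po` file (the msgid/msgstr pair is invented for illustration; only the comment format comes from the feature description):

```po
#. AI-generated
msgid "Welcome back"
msgstr "Willkommen zurück"
```
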
@@ -49,7 +50,7 @@ pip install -e .

## 🔧 Setup

-### API Keys
+### API Keys (Cloud Providers)

Choose your AI provider and set the corresponding API key:

@@ -69,6 +70,26 @@ export AZURE_OPENAI_ENDPOINT='https://your-resource.openai.azure.com/'
export AZURE_OPENAI_API_VERSION='2024-02-01'
```

+### Or Use Ollama (Local, No API Key Needed)
+
+```bash
+# 1. Install Ollama
+curl -fsSL https://ollama.com/install.sh | sh
+
+# 2. Pull a model
+ollama pull qwen2.5 # Best for multilingual (Arabic, Chinese, etc.)
+# OR
+ollama pull llama3.2 # Fast for European languages
+
+# 3. Translate (no API key required!)
+gpt-po-translator --provider ollama --folder ./locales
+
+# For non-Latin scripts, use qwen2.5 WITHOUT --bulk
+gpt-po-translator --provider ollama --model qwen2.5 --folder ./locales --lang ar
+```
+
+> **💡 Important:** For Ollama with **non-Latin languages** (Arabic, Chinese, Japanese, etc.), **omit the `--bulk` flag**. Single-item translation is more reliable because the model doesn't have to format responses as JSON.

## 💡 Usage Examples

### Basic Translation

@@ -93,16 +114,28 @@ gpt-po-translator --provider deepseek --folder ./locales --lang de

# Use Azure OpenAI with auto-detection
gpt-po-translator --provider azure_openai --folder ./locales --bulk
+
+# Use Ollama (local, private, free) - omit --bulk for non-Latin scripts
+gpt-po-translator --provider ollama --folder ./locales
```

### Docker Usage
```bash
-# Basic usage
+# Basic usage with OpenAI
docker run -v $(pwd):/data \
  -e OPENAI_API_KEY="your_key" \
  ghcr.io/pescheckit/python-gpt-po:latest \
  --folder /data --bulk

+# With Ollama (local, no API key needed)
+# Note: Omit --bulk for better quality with non-Latin scripts
+docker run --rm \
+  -v $(pwd):/data \
+  --network host \
+  ghcr.io/pescheckit/python-gpt-po:latest \
+  --provider ollama \
+  --folder /data
+
# With Azure OpenAI
docker run -v $(pwd):/data \
  -e AZURE_OPENAI_API_KEY="your_key" \

@@ -135,14 +168,16 @@ This helps you:
|--------|-------------|
| `--folder` | Path to .po files |
| `--lang` | Target languages (e.g., `de,fr,es`, `fr_CA`, `pt_BR`) |
-| `--provider` | AI provider: `openai`, `azure_openai`, `anthropic`, `deepseek` |
+| `--provider` | AI provider: `openai`, `azure_openai`, `anthropic`, `deepseek`, `ollama` |
| `--bulk` | Enable batch translation (recommended for large files) |
| `--bulksize` | Entries per batch (default: 50) |
| `--model` | Specific model to use |
| `--list-models` | Show available models |
| `--fix-fuzzy` | Translate fuzzy entries |
| `--folder-language` | Auto-detect languages from folders |
| `--no-ai-comment` | Disable AI tagging |
+| `--ollama-base-url` | Ollama server URL (default: `http://localhost:11434`) |
+| `--ollama-timeout` | Ollama timeout in seconds (default: 120) |
| `-v, --verbose` | Show progress information (use `-vv` for debug) |
| `-q, --quiet` | Only show errors |
| `--version` | Show version and exit |
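
Putting a few of these options together, a hypothetical invocation (every flag comes from the table above; the folder path is a placeholder):

```bash
# Auto-detect target languages from folder names, re-translate fuzzy entries,
# and print progress while running against the local Ollama provider
gpt-po-translator --provider ollama --folder ./locales \
  --folder-language --fix-fuzzy -v
```
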

docs/usage.md

Lines changed: 245 additions & 0 deletions
@@ -426,6 +426,251 @@ gpt-po-translator --folder ./locales --lang de --no-ai-comment

---

## Using Ollama (Local AI Provider)

### Overview

Ollama allows you to run AI models locally on your machine, providing:
- **Privacy**: All translations happen locally, no data sent to cloud services
- **Cost**: No API fees - completely free
- **Offline**: Works without internet connection
- **Control**: Full control over model and infrastructure

### Prerequisites

1. **Install Ollama**
   ```bash
   # macOS/Linux
   curl -fsSL https://ollama.com/install.sh | sh

   # Or download from https://ollama.com
   ```

2. **Pull a model**
   ```bash
   # For multilingual (Arabic, Chinese, etc.)
   ollama pull qwen2.5

   # For European languages only
   ollama pull llama3.2

   # Other options
   ollama pull llama3.1 # Better quality, slower
   ollama pull mistral # Good for European languages
   ```

3. **Start Ollama** (if not already running)
   ```bash
   ollama serve
   ```

### Basic Usage

```bash
# Latin scripts (English, French, Spanish, etc.) - can use bulk mode
gpt-po-translator --provider ollama --folder ./locales --bulk

# Non-Latin scripts (Arabic, Chinese, Japanese, etc.) - omit --bulk for better quality
gpt-po-translator --provider ollama --model qwen2.5 --folder ./locales --lang ar

# Specify a model
gpt-po-translator --provider ollama --model llama3.1 --folder ./locales

# List available models
gpt-po-translator --provider ollama --list-models
```

> **⚠️ Important:** For **non-Latin languages**, **omit the `--bulk` flag**. Local models struggle with JSON formatting for Arabic/Chinese/etc., resulting in poor translation quality or errors. Single-item mode is more reliable.

### Configuration

#### Option 1: Environment Variable

```bash
export OLLAMA_BASE_URL="http://localhost:11434"
gpt-po-translator --provider ollama --folder ./locales --bulk
```

#### Option 2: CLI Arguments

```bash
# Custom port
gpt-po-translator --provider ollama \
  --ollama-base-url http://localhost:8080 \
  --folder ./locales --bulk

# Increase timeout for slow models
gpt-po-translator --provider ollama \
  --ollama-timeout 300 \
  --folder ./locales --bulk
```

#### Option 3: Config File

Add to your `pyproject.toml`:

```toml
[tool.gpt-po-translator.provider.ollama]
base_url = "http://localhost:11434"
model = "llama3.2"
timeout = 120

[tool.gpt-po-translator]
bulk_mode = true
bulk_size = 50
```

Then simply run:
```bash
gpt-po-translator --provider ollama --folder ./locales
```

### Advanced Scenarios

#### Remote Ollama Server

Run Ollama on a different machine:

```bash
# On the Ollama server (192.168.1.100): bind to all interfaces
# (ollama serve reads its bind address from the OLLAMA_HOST environment variable)
OLLAMA_HOST=0.0.0.0 ollama serve

# On your machine
gpt-po-translator --provider ollama \
  --ollama-base-url http://192.168.1.100:11434 \
  --folder ./locales --bulk
```

Or set in `pyproject.toml`:
```toml
[tool.gpt-po-translator.provider.ollama]
base_url = "http://192.168.1.100:11434"
```

#### Docker with Ollama

Run Ollama on your host machine, then use Docker with `--network host`:

```bash
# 1. Start Ollama on host
ollama serve

# 2. Pull a model on host
ollama pull qwen2.5

# 3. Run translator in Docker (Linux/macOS)
docker run --rm \
  -v $(pwd):/data \
  --network host \
  ghcr.io/pescheckit/python-gpt-po:latest \
  --provider ollama \
  --folder /data

# macOS/Windows Docker Desktop: use host.docker.internal
docker run --rm \
  -v $(pwd):/data \
  ghcr.io/pescheckit/python-gpt-po:latest \
  --provider ollama \
  --ollama-base-url http://host.docker.internal:11434 \
  --folder /data
```

**With config file:**
```bash
# Add Ollama config to pyproject.toml in your project
docker run --rm \
  -v $(pwd):/data \
  -v $(pwd)/pyproject.toml:/data/pyproject.toml \
  --network host \
  ghcr.io/pescheckit/python-gpt-po:latest \
  --provider ollama \
  --folder /data
```

### Performance Considerations

**Pros:**
- No API costs
- Privacy and data control
- No rate limits
- Offline capability

**Cons:**
- Quality varies by model (may not match GPT-4)
- Requires local resources (RAM, GPU recommended)
- Initial setup needed (install Ollama, pull models)

**Performance Tips:**
1. **Use GPU**: Install Ollama with GPU support for 10-100x speedup
2. **Choose appropriate models**:
   - Small projects: `llama3.2` (fast, good quality)
   - Better quality: `llama3.1` (slower, better accuracy)
   - Multilingual: `qwen2.5` (excellent for non-Latin scripts like Arabic, Chinese, etc.)
   - Specialized: `mistral`, `gemma2`
3. **Increase timeout** for large models: `--ollama-timeout 300`
4. **Bulk mode vs Single mode**:
   - **Bulk mode (`--bulk`)**: Faster but requires model to return valid JSON - recommended for cloud providers
   - **Single mode (no `--bulk`)**: Slower but more reliable for local models, especially with non-Latin scripts
   - For Ollama with languages like Arabic/Chinese/Japanese, **omit `--bulk`** for better quality
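
To make the bulk-vs-single trade-off from tip 4 concrete, here is an illustrative pair of commands (all flags are documented in this guide; `fr` and `ar` are just example target languages):

```bash
# Bulk mode: entries are sent in JSON batches (50 per request by default)
gpt-po-translator --provider ollama --model llama3.2 --folder ./locales --lang fr --bulk --bulksize 50

# Single mode: one entry per request - slower, but more reliable for non-Latin scripts
gpt-po-translator --provider ollama --model qwen2.5 --folder ./locales --lang ar
```
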
### Recommended Models for Translation

| Model | Size | Speed | Quality | Best For |
|-------|------|-------|---------|----------|
| `llama3.2` | 3B | ⚡⚡⚡ Fast | ⭐⭐⭐ Good | General use, Latin scripts only |
| `llama3.1` | 8B | ⚡⚡ Medium | ⭐⭐⭐⭐ Better | Better quality, medium projects |
| `qwen2.5` | 7B | ⚡⚡ Medium | ⭐⭐⭐⭐ Excellent | **Multilingual** (Arabic, Chinese, etc.) |
| `mistral` | 7B | ⚡⚡ Medium | ⭐⭐⭐ Good | European languages |
| `gemma2` | 9B | ⚡ Slower | ⭐⭐⭐⭐ Better | High quality translations |

**Note:** For non-Latin scripts (Arabic, Chinese, Japanese, etc.), use `qwen2.5` or larger models **without the `--bulk` flag** for best results.

### Troubleshooting

**"Cannot connect to Ollama"**
```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama
ollama serve

# Check if running on a different port
ollama serve --help
```

**Slow translations**
- Use a GPU-enabled Ollama installation
- Choose a smaller model (`llama3.2` instead of `llama3.1`)
- Increase `--bulksize` to batch more entries together
- Close other applications to free up RAM

**Model not found**
```bash
# List installed models
ollama list

# Pull the model
ollama pull llama3.2
```

**Timeout errors**
```bash
# Increase timeout
gpt-po-translator --provider ollama --ollama-timeout 300 --folder ./locales
```

### Configuration Priority

Ollama settings are loaded in this order (highest to lowest):

1. **CLI arguments**: `--ollama-base-url`, `--ollama-timeout`
2. **Environment variables**: `OLLAMA_BASE_URL`
3. **Config file**: `pyproject.toml` under `[tool.gpt-po-translator.provider.ollama]`
4. **Defaults**: `http://localhost:11434`, timeout `120s`
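
As a quick illustration of this precedence (values are placeholders), a CLI flag overrides an environment variable, so the call below talks to port 8080 even though `OLLAMA_BASE_URL` points at the default port:

```bash
export OLLAMA_BASE_URL="http://localhost:11434"
gpt-po-translator --provider ollama \
  --ollama-base-url http://localhost:8080 \
  --folder ./locales
```
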
---

## Whitespace Handling in Translations

### Overview

python_gpt_po/models/enums.py

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ class ModelProvider(Enum):
    ANTHROPIC = "anthropic"
    DEEPSEEK = "deepseek"
    AZURE_OPENAI = "azure_openai"
+   OLLAMA = "ollama"


ModelProviderList = [provider.value for provider in ModelProvider]
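
For context (not part of the diff), a small sketch of how the new member surfaces through the helpers defined in this module; it assumes the package is importable as `python_gpt_po`, matching the file path above:

```python
from python_gpt_po.models.enums import ModelProvider, ModelProviderList

# "ollama" now appears alongside the other provider identifiers
assert "ollama" in ModelProviderList

# A CLI string such as --provider ollama maps back to the enum member by value
provider = ModelProvider("ollama")
print(provider)  # ModelProvider.OLLAMA
```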
