Commit 218ecf5

Merge pull request #66 from pescheckit/feature/configurable-default-context
Add configurable default context via CLI and environment variables
2 parents b1cf6a4 + 8c470ef commit 218ecf5

8 files changed: +404 -10 lines changed
README.md

Lines changed: 23 additions & 0 deletions
@@ -142,6 +142,28 @@ The tool extracts context from your PO files and passes it to the AI for more ac
 
 **Tip:** Use detailed context for best results: `msgctxt "status label (not verb)"` works better than just `msgctxt "status"`.
 
+### Default Context
+
+Provide a default context for entries without `msgctxt`:
+
+```bash
+# Via command-line
+gpt-po-translator --folder ./locales --default-context "web application" --bulk
+
+# Via environment variable
+export GPT_TRANSLATOR_CONTEXT="mobile app for iOS"
+gpt-po-translator --folder ./locales --bulk
+
+# Via pyproject.toml
+# Add to your pyproject.toml:
+[tool.gpt-po-translator]
+default_context = "e-commerce checkout flow"
+```
+
+**Priority:** CLI argument > Environment variable > Config file
+
+The default context is applied to entries without explicit `msgctxt`, while entries with `msgctxt` always take precedence.
+
 ## 🏷️ AI Translation Tracking
 
 **All AI translations are automatically tagged** for transparency and compliance:
@@ -172,6 +194,7 @@ This helps you:
 | `--list-models` | Show available models |
 | `--fix-fuzzy` | Translate fuzzy entries |
 | `--folder-language` | Auto-detect languages from folders |
+| `--default-context` | Default translation context for entries without msgctxt |
 | `--no-ai-comment` | Disable AI tagging |
 | `--ollama-base-url` | Ollama server URL (default: `http://localhost:11434`) |
 | `--ollama-timeout` | Ollama timeout in seconds (default: 120) |
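
The README addition names `pyproject.toml` as the lowest-priority source, but the `ConfigLoader` change itself is not shown on this page. The following is only a sketch of how such a lookup could work. It assumes Python 3.11+ (`tomllib`, with `tomli` as a fallback), and `read_default_context` is a hypothetical helper that searches the given folder and its parents. The project's actual `ConfigLoader.get_default_context` may behave differently.

```python
from __future__ import annotations

from pathlib import Path
from typing import Optional

try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:  # assumption: older interpreters have the tomli backport
    import tomli as tomllib


def read_default_context(start_dir: str) -> Optional[str]:
    """Return [tool.gpt-po-translator].default_context, or None if unset.

    Hypothetical helper: the first pyproject.toml found in start_dir or one
    of its parent directories is the only file that gets inspected.
    """
    current = Path(start_dir).resolve()
    for directory in (current, *current.parents):
        candidate = directory / "pyproject.toml"
        if not candidate.is_file():
            continue
        with candidate.open("rb") as handle:  # tomllib requires a binary file
            data = tomllib.load(handle)
        section = data.get("tool", {}).get("gpt-po-translator", {})
        value = section.get("default_context")
        # Treat missing, non-string, and blank values as "not configured".
        if isinstance(value, str) and value.strip():
            return value
        return None
    return None
```

Returning `None` for a blank value keeps the "no default context configured" case identical to the tool's original behavior.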

docs/usage.md

Lines changed: 63 additions & 6 deletions
@@ -171,13 +171,20 @@ Below is a detailed explanation of all command-line arguments:
 *Description:* Provides the DeepSeek API key directly.
 *Behind the scenes:* This key is required to make API calls to DeepSeek’s translation service.
 
-- **`--folder-language`**
-*Description:* Enables inferring the target language from the folder structure.
+- **`--folder-language`**
+*Description:* Enables inferring the target language from the folder structure.
 *Behind the scenes:* The tool inspects the path components (directory names) of each PO file and matches them against the provided language codes. Supports locale codes (e.g., folder `fr_CA` matches `-l fr_CA` for Canadian French, or falls back to `-l fr` for standard French).
 
-- **`--no-ai-comment`**
-*Description:* Disables the automatic addition of 'AI-generated' comments to translated entries.
-*Behind the scenes:* **By default (without this flag), every translation made by the AI is marked with a `#. AI-generated` comment in the PO file.** This flag prevents that marking, making AI translations indistinguishable from human translations in the file.
+- **`--default-context CONTEXT`**
+*Description:* Sets a default translation context for entries without `msgctxt`.
+*Behind the scenes:* When the tool encounters PO entries without explicit `msgctxt` context, it applies this default context to provide additional information to the AI. Entries with explicit `msgctxt` always take precedence. Can also be set via the `GPT_TRANSLATOR_CONTEXT` environment variable or `default_context` in `pyproject.toml`.
+*Priority:* CLI argument > Environment variable > Config file
+*Example:* `--default-context "web application UI"` helps the AI understand the context for all translations without specific msgctxt.
+*Note:* Use descriptive context (e.g., "e-commerce product page" rather than just "web") for best results.
+
+- **`--no-ai-comment`**
+*Description:* Disables the automatic addition of 'AI-generated' comments to translated entries.
+*Behind the scenes:* **By default (without this flag), every translation made by the AI is marked with a `#. AI-generated` comment in the PO file.** This flag prevents that marking, making AI translations indistinguishable from human translations in the file.
 *Note:* AI tagging is enabled by default for tracking, compliance, and quality assurance purposes.
 
 - **`-v, --verbose`**
@@ -777,6 +784,54 @@ Translate to German: Save
 
 Result: **"Speichern"** (button action) instead of **"Sparen"** (to save money)
 
+### Default Context
+
+For entries without explicit `msgctxt`, you can provide a default context that applies to all translations:
+
+#### Configuration Methods
+
+**1. Command-Line Argument (highest priority):**
+```bash
+gpt-po-translator --folder ./locales --default-context "web application" --bulk
+```
+
+**2. Environment Variable:**
+```bash
+export GPT_TRANSLATOR_CONTEXT="mobile app for iOS"
+gpt-po-translator --folder ./locales --bulk
+```
+
+**3. Configuration File (pyproject.toml):**
+```toml
+[tool.gpt-po-translator]
+default_context = "e-commerce checkout flow"
+```
+
+#### Priority Order
+CLI argument > Environment variable > Config file
+
+#### Behavior
+- Entries **with** `msgctxt` → Uses the explicit `msgctxt` (always takes precedence)
+- Entries **without** `msgctxt` → Uses the default context
+- No default context configured → No context provided (original behavior)
+
+#### Example
+```bash
+gpt-po-translator --folder ./locales --default-context "medical device interface" --lang de
+```
+
+With this setup:
+```po
+# Entry WITH msgctxt - uses "button"
+msgctxt "button"
+msgid "Start"
+msgstr "" → "Starten" (button action)
+
+# Entry WITHOUT msgctxt - uses default "medical device interface"
+msgid "Start"
+msgstr "" → "Start" (medical procedure start, preserving technical term)
+```
+
 ### Best Practices
 
 **✓ Good - Detailed, Explicit Context:**
@@ -796,7 +851,9 @@ msgstr "" → "Halten" (may still be wrong)
 **Key Points:**
 - **Be explicit** - Describe what you want AND what you don't want
 - **Provide examples** - Include similar terms or expected word forms
-- **Human review still needed** - msgctxt improves results but doesn't guarantee perfection
+- **Use default context for project-wide context** - Helps all translations understand domain (e.g., "legal contract", "gaming UI", "medical records")
+- **Use msgctxt for specific terms** - Override default with specific context when needed
+- **Human review still needed** - Context improves results but doesn't guarantee perfection
 
 ---

python_gpt_po/main.py

Lines changed: 18 additions & 1 deletion
@@ -6,6 +6,7 @@
 from __future__ import annotations
 
 import logging
+import os
 import sys
 import traceback
 from argparse import Namespace
@@ -20,6 +21,7 @@
 from .services.translation_service import TranslationService
 from .utils.cli import (auto_select_provider, create_language_mapping, get_provider_from_args, parse_args,
                         show_help_and_exit, validate_provider_key)
+from .utils.config_loader import ConfigLoader
 
 
 def setup_logging(verbose: int = 0, quiet: bool = False):
@@ -218,6 +220,20 @@ def main():
         logging.warning(
             "Note: --fuzzy flag is deprecated. Use --fix-fuzzy for safer fuzzy entry handling."
         )
+
+    # Get default context: Priority is CLI arg > Env var > Config file
+    default_context = None
+    if hasattr(args, 'default_context') and args.default_context:
+        default_context = args.default_context
+    elif os.getenv('GPT_TRANSLATOR_CONTEXT'):
+        default_context = os.getenv('GPT_TRANSLATOR_CONTEXT')
+    else:
+        # Try to get from config file
+        default_context = ConfigLoader.get_default_context(args.folder)
+
+    if default_context:
+        logging.info("Using default translation context: %s", default_context)
+
     # Create translation configuration
     flags = TranslationFlags(
         bulk_mode=args.bulk,
@@ -230,7 +246,8 @@ def main():
         provider_clients=provider_clients,
         provider=provider,
         model=model,
-        flags=flags
+        flags=flags,
+        default_context=default_context
     )
 
     # Process translations
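
The priority chain added to `main()` can be exercised in isolation. The sketch below restates it as a pure function. `resolve_default_context` and its parameters are hypothetical names introduced here for illustration and are not part of the commit. The environment and the config lookup are passed in so the order can be checked without touching real process state.

```python
from __future__ import annotations

from typing import Callable, Mapping, Optional


def resolve_default_context(
    cli_value: Optional[str],
    environ: Mapping[str, str],
    load_from_config: Callable[[], Optional[str]],
) -> Optional[str]:
    """Pick the default context: CLI argument > environment variable > config file."""
    if cli_value:
        return cli_value
    env_value = environ.get("GPT_TRANSLATOR_CONTEXT")
    if env_value:
        return env_value
    # The config source is only consulted when both higher-priority sources
    # are empty, mirroring the else branch in the diff above.
    return load_from_config()


if __name__ == "__main__":
    # Example: the environment value is used because no CLI value is given.
    print(resolve_default_context(None, {"GPT_TRANSLATOR_CONTEXT": "mobile app for iOS"}, lambda: None))
```

Passing the config lookup as a callable means the file is never read when the CLI flag or the environment variable already supplies a value.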

python_gpt_po/models/config.py

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,7 @@
 from __future__ import annotations
 
 from dataclasses import dataclass
-from typing import TYPE_CHECKING
+from typing import TYPE_CHECKING, Optional
 
 from .enums import ModelProvider
 
@@ -30,3 +30,4 @@ class TranslationConfig:
     provider: ModelProvider
     model: str
     flags: TranslationFlags
+    default_context: Optional[str] = None
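
Because the new field has a default and is appended last, code that builds the config without it should keep working. The sketch below shows this with `ExampleConfig`, a hypothetical stand-in that carries only two fields and is not the project's `TranslationConfig`.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleConfig:
    """Hypothetical, trimmed-down stand-in for TranslationConfig."""

    model: str
    # Dataclass fields with defaults must follow the fields without defaults,
    # which is why the new field sits at the end.
    default_context: Optional[str] = None


# Call sites written before the change still construct the object.
legacy = ExampleConfig(model="some-model")
# Newer call sites can opt in to a project-wide context.
updated = ExampleConfig(model="some-model", default_context="web application")
```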

python_gpt_po/services/translation_service.py

Lines changed: 15 additions & 2 deletions
@@ -855,13 +855,26 @@ def _prepare_translation_request(self, po_file, po_file_path, file_lang, detail_
         """Prepare a translation request from PO file data."""
         entries = [entry for entry in po_file if is_entry_untranslated(entry)]
         texts = [entry.msgid for entry in entries]
-        contexts = [entry.msgctxt if hasattr(entry, 'msgctxt') else None for entry in entries]
+
+        # Extract contexts from entries, falling back to default_context if not present
+        contexts = []
+        for entry in entries:
+            if hasattr(entry, 'msgctxt') and entry.msgctxt:
+                contexts.append(entry.msgctxt)
+            elif self.config.default_context:
+                contexts.append(self.config.default_context)
+            else:
+                contexts.append(None)
+
         detail_lang = detail_languages.get(file_lang) if detail_languages else None
 
         # Log context usage
         context_count = sum(1 for c in contexts if c)
+        default_context_count = sum(1 for c in contexts if c == self.config.default_context)
         if context_count > 0:
-            logging.debug("Found %d entries with msgctxt in %s", context_count, po_file_path)
+            logging.debug("Found %d entries with context in %s", context_count, po_file_path)
+        if default_context_count > 0:
+            logging.debug("Using default context for %d entries", default_context_count)
 
         # Check for and warn about whitespace in msgid
         whitespace_entries = [
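
The per-entry fallback can be sketched as a standalone function. `select_contexts` is a hypothetical name, and plain strings stand in for the PO entries' `msgctxt` values. This variant counts the fallback where it happens instead of comparing afterwards with `c == default_context`. That comparison also matches entries left at `None` when no default is configured, and explicit `msgctxt` values that happen to equal the default.

```python
from __future__ import annotations

from typing import Iterable, List, Optional, Tuple


def select_contexts(
    entry_contexts: Iterable[Optional[str]],
    default_context: Optional[str],
) -> Tuple[List[Optional[str]], int]:
    """Return the context per entry and how many entries used the default."""
    contexts: List[Optional[str]] = []
    default_used = 0
    for msgctxt in entry_contexts:
        if msgctxt:
            # An explicit msgctxt always takes precedence.
            contexts.append(msgctxt)
        elif default_context:
            contexts.append(default_context)
            default_used += 1
        else:
            # No msgctxt and no default: original behavior, no context.
            contexts.append(None)
    return contexts, default_used
```

With no default configured, the returned count is always zero, so a "using default context" log line would be skipped.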
