- Python 3.12 (⚠️ Do not use 3.13: compatibility issues)
- FFmpeg (required for Whisper to process audio)
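The two prerequisites above can be verified before installing anything else. The following is a minimal sketch, not part of the repository: the function name `check_prerequisites` is hypothetical, and it only checks the interpreter version and whether an `ffmpeg` executable is on `PATH`.

```python
import shutil
import sys


def check_prerequisites(version_info=None, ffmpeg_path=None):
    """Return a list of human-readable problems; an empty list means OK.

    Both arguments are optional overrides (hypothetical, for testing);
    by default the running interpreter and the current PATH are inspected.
    """
    if version_info is None:
        version_info = sys.version_info
    if ffmpeg_path is None:
        ffmpeg_path = shutil.which("ffmpeg")

    problems = []
    major, minor = version_info[0], version_info[1]
    # The README asks for 3.12 specifically and warns against 3.13.
    if (major, minor) != (3, 12):
        problems.append(f"Python 3.12 is required, found {major}.{minor}")
    # shutil.which returns None when the executable is not on PATH.
    if not ffmpeg_path:
        problems.append("ffmpeg was not found on PATH")
    return problems
```

Calling `check_prerequisites()` with no arguments inspects the current environment; printing each returned string gives a quick checklist of what is still missing.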
pyenv local 3.12.3 # ensures 3.12.x is used in this directory
python3.12 -m venv env
source env/bin/activate
brew install ffmpeg # For macOS
# OR
sudo apt install ffmpeg # For Ubuntu/Debian
pip install -r requirements.txt

For testing the API endpoints, you can use the following Postman collection:
uvicorn app.main:app --reload

- Swagger UI: http://localhost:8000/docs
- ReDoc UI: http://localhost:8000/redoc
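Once the server is running, the two documentation pages listed above can be probed from a script. This is a rough sketch using only the standard library; the helper names `docs_urls` and `is_reachable` are hypothetical, and the default base URL assumes the local `uvicorn` address shown above.

```python
import urllib.error
import urllib.request


def docs_urls(base_url="http://localhost:8000"):
    """Build the Swagger UI and ReDoc URLs for a given base URL."""
    base = base_url.rstrip("/")
    return {"swagger": f"{base}/docs", "redoc": f"{base}/redoc"}


def is_reachable(url, timeout=3.0):
    """Return True if a GET request to url succeeds with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

For example, looping over `docs_urls().items()` and calling `is_reachable(url)` on each entry reports whether the local server is serving both pages.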
Make sure your virtual environment is activated before running tests.
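One way to detect an inactive virtual environment from Python is to compare `sys.prefix` with `sys.base_prefix`, which differ inside a `venv`. The sketch below is illustrative only; the function name `is_virtualenv` is hypothetical and the repository may not include such a check.

```python
import sys


def is_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter runs inside a venv-style environment.

    The arguments are optional overrides (hypothetical, for testing);
    by default the running interpreter's prefixes are compared.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    # Inside a venv, sys.prefix points at the environment directory,
    # while sys.base_prefix still points at the base installation.
    return prefix != base_prefix
```

A guard such as `if not is_virtualenv(): raise SystemExit("activate env first")` could be placed at the top of a local helper script.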
pytest
pytest ./tests/unit
pytest ./tests/integration

You can use sample audio files from:
https://thevoiceovervoice.co.uk/female-voice-over-samples/
Deploy a Dockerized FastAPI service to Google Cloud Run with NVIDIA L4 GPUs. Images are stored in Artifact Registry and built with Cloud Build.
- A Google Cloud project (e.g. ruxailab-develop)
- gcloud CLI installed: Install guide
- Billing enabled on the GCP project
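The build and deploy commands later in this section all refer to the image as `$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG`. As a minimal sketch of how that reference and the GPU deploy arguments fit together, the snippet below assembles them in Python. The names `ImageRef` and `gpu_deploy_args` are hypothetical and not part of this repository; the flags mirror the GPU deploy command in this README, with `--set-env-vars` deliberately left out so that no API key ends up in code.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ImageRef:
    """Pieces of an Artifact Registry image reference (hypothetical helper)."""
    project_id: str
    region: str
    repo: str
    image: str
    tag: str

    def uri(self):
        # Same layout as "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG".
        return (
            f"{self.region}-docker.pkg.dev/"
            f"{self.project_id}/{self.repo}/{self.image}:{self.tag}"
        )

    def with_tag(self, tag):
        """Return a copy pointing at a different release tag."""
        return replace(self, tag=tag)


def gpu_deploy_args(service, ref):
    """Argument list mirroring the GPU deploy command in this README."""
    return [
        "gcloud", "beta", "run", "deploy", service,
        "--image", ref.uri(),
        "--region", ref.region,
        "--allow-unauthenticated",
        "--gpu", "1", "--gpu-type", "nvidia-l4",
        "--cpu", "4", "--memory", "16Gi",
        "--concurrency", "1", "--no-cpu-throttling",
        "--port", "8000",
    ]


ref = ImageRef("ruxailab-develop", "europe-west4", "containers",
               "transcription-api", "gpu-v1")
print(ref.uri())
print(" ".join(gpu_deploy_args("transcription-api-gpu", ref)))
```

Keeping the tag as a separate field makes a new release a one-line change, for example `ref.with_tag("gpu-v2")`.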
# Project / region / registry
PROJECT_ID="ruxailab-develop" # your-gcp-project
REGION="europe-west4" # choose a region near you / with GPU
REPO="containers" # Artifact Registry repo name
# Image naming
IMAGE="transcription-api"
TAG="gpu-v1" # change for each new release
# Cloud Run service name
export SERVICE="transcription-api-gpu"
gcloud auth login
# Set your active project & region
gcloud config set project "$PROJECT_ID"
gcloud config set run/region "$REGION"
gcloud services enable artifactregistry.googleapis.com run.googleapis.com cloudbuild.googleapis.com
gcloud artifacts repositories create "$REPO" --repository-format=docker --location="$REGION"
gcloud builds submit --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .
gcloud beta run deploy "$SERVICE" --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" --region "$REGION" --allow-unauthenticated --gpu 1 --gpu-type nvidia-l4 --cpu 4 --memory 16Gi --concurrency 1 --no-cpu-throttling --port 8000 --set-env-vars "DEVICE=cuda,OPENAI_API_KEY=YOUR_API_KEY_HERE"
export TAG="gpu-v2"
gcloud builds submit --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .
gcloud beta run deploy "$SERVICE" --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" --region "$REGION" --allow-unauthenticated --gpu 1 --gpu-type nvidia-l4 --cpu 4 --memory 16Gi --concurrency 1 --no-cpu-throttling --port 8000
export TAG="v1"
gcloud builds submit --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .
gcloud run deploy "transcription-api" --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" --region "$REGION" --allow-unauthenticated --cpu 2 --memory 2Gi --port 8000 --set-env-vars "DEVICE=cuda,OPENAI_API_KEY=YOUR_API_KEY_HERE"

This repository is part of the Google Summer of Code (GSoC) 2025 program.
- Contributor: Basma Elhoseny
- Mentors: Karine - Marc
- GSoC'25 Project Page: Transcription Tool for Usability Testing GSoC 25 Program
- Proof of Work: gsoc_2025_summary.md
This software is licensed under the MIT License. See the LICENSE file for more information.
© 2025 RUXAILAB.
