ruxailab/transcription-api

📝 Transcription API – Setup & Usage Guide

✅ Requirements

  • Python 3.12 (⚠️ Do not use 3.13 – compatibility issues)
  • FFmpeg (required for Whisper to process audio)
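Both requirements can be checked before starting setup. The sketch below is a minimal preflight check, assuming `python3.12` and `ffmpeg` are expected on `PATH`. The function names (`is_supported_python`, `have_cmd`, `preflight`) are illustrative and not part of this repository.

```shell
# Succeeds only for a 3.12.x version string such as "Python 3.12.3".
is_supported_python() {
  case "$1" in
    "Python 3.12."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeeds if the named command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Print a message for each missing requirement; print nothing if all is well.
preflight() {
  is_supported_python "$(python3.12 --version 2>&1)" || echo "Python 3.12.x not found"
  have_cmd ffmpeg || echo "ffmpeg not found on PATH"
}
```

Calling `preflight` in the shell you plan to use for setup prints nothing when both tools are available.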

⚙️ Setup Instructions

1. Set Python Version (optional; only needed if you use pyenv)

pyenv local 3.12.3  # ensures 3.12.x is used in this directory

2. Create & Activate Virtual Environment

python3.12 -m venv env
source env/bin/activate

3. Install FFmpeg

brew install ffmpeg  # For macOS
# OR
sudo apt install ffmpeg  # For Ubuntu/Debian

4. Install Python Dependencies

pip install -r requirements.txt

🌐 Postman Collection

For testing the API endpoints, you can use the following Postman collection:


🚀 Run the API Server

uvicorn app.main:app --reload
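To confirm the server came up, one option is to poll it from a second terminal. This is a sketch under assumptions: it uses uvicorn's default address `http://127.0.0.1:8000`, since the command above passes no `--host` or `--port`, and `/docs` is FastAPI's default interactive docs path, which this app may have changed or disabled. `wait_for_server` is a hypothetical helper name.

```shell
# Poll a URL until curl gets a response without an HTTP error status,
# or until the given number of attempts (default 10) is used up.
wait_for_server() {
  url="$1"
  attempts="${2:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors (status >= 400).
    if curl -fsS -o /dev/null "$url" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_server "http://127.0.0.1:8000/docs" 15 && echo "server is up"` waits up to about 15 seconds.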

🧺 Running Tests

Make sure your virtual environment is activated before running tests.

Run All Tests

pytest

Unit Tests Only

pytest ./tests/unit

Integration Tests Only

pytest ./tests/integration

🔊 Audio Sample Links (For Testing)

You can use sample audio files from:

🔗 https://thevoiceovervoice.co.uk/female-voice-over-samples/


🛠️ Deployment Guide

Deploy a Dockerized FastAPI service to Google Cloud Run with NVIDIA L4 GPUs. Images are stored in Artifact Registry and built with Cloud Build.

Prerequisites

  • A Google Cloud project (e.g. ruxailab-develop)
  • gcloud CLI installed (see Google Cloud's official install guide)
  • Billing enabled on the GCP project

Define deployment variables

# Project / region / registry
PROJECT_ID="ruxailab-develop"     # your-gcp-project
REGION="europe-west4"             # choose a region near you / with GPU
REPO="containers"                 # Artifact Registry repo name

# Image naming
IMAGE="transcription-api"
TAG="gpu-v1"                      # change for each new release

# Cloud Run service name
export SERVICE="transcription-api-gpu"

Authenticate & set project/region

gcloud auth login

# Set your active project & region
gcloud config set project "$PROJECT_ID"
gcloud config set run/region "$REGION"

Enable required APIs

gcloud services enable \
  artifactregistry.googleapis.com \
  run.googleapis.com \
  cloudbuild.googleapis.com

Create Artifact Registry (Docker)

gcloud artifacts repositories create "$REPO" \
  --repository-format=docker \
  --location="$REGION"
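Running the `create` command against a repository that already exists reports an error, so a setup script may want to check first. A minimal sketch, assuming the same `$REPO` and `$REGION` values defined earlier; `ensure_repo` is an illustrative name, not part of this project.

```shell
# Create the Docker repository only if `describe` cannot find it.
ensure_repo() {
  repo="$1"
  location="$2"
  if gcloud artifacts repositories describe "$repo" \
       --location="$location" >/dev/null 2>&1; then
    echo "Repository $repo already exists in $location"
  else
    gcloud artifacts repositories create "$repo" \
      --repository-format=docker \
      --location="$location"
  fi
}
```

It would be called as `ensure_repo "$REPO" "$REGION"`. Note that a failed `describe` can also mean missing permissions or an unauthenticated session, in which case the `create` call will surface the underlying error.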

Build & Push the Image (Cloud Build)

gcloud builds submit \
  --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .
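The image reference passed to `--tag` follows the Artifact Registry pattern `LOCATION-docker.pkg.dev/PROJECT/REPOSITORY/IMAGE:TAG`, and the same string appears again in each deploy command. As a sketch, it can be assembled once and reused; `image_uri` and `IMAGE_URI` are hypothetical names introduced here for illustration.

```shell
# Build an Artifact Registry image reference from its five parts:
# region, project ID, repository, image name, tag.
image_uri() {
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with the values used in this guide:
IMAGE_URI="$(image_uri "europe-west4" "ruxailab-develop" "containers" "transcription-api" "gpu-v1")"
echo "$IMAGE_URI"
# prints europe-west4-docker.pkg.dev/ruxailab-develop/containers/transcription-api:gpu-v1
```

With that in place, the build step could read `gcloud builds submit --tag "$IMAGE_URI" .`, and a change to `TAG` only has to be made in one place.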

Deploy to Cloud Run with GPU (L4)

gcloud beta run deploy "$SERVICE" \
  --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" \
  --region "$REGION" \
  --allow-unauthenticated \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --cpu 4 \
  --memory 16Gi \
  --concurrency 1 \
  --no-cpu-throttling \
  --port 8000 \
  --set-env-vars "DEVICE=cuda,OPENAI_API_KEY=YOUR_API_KEY_HERE"
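After a deploy finishes, the service's public URL can be read back from Cloud Run rather than copied from the console output. A small sketch, assuming the `$SERVICE` and `$REGION` values defined earlier; `service_url` is an illustrative helper name.

```shell
# Print the HTTPS URL of a deployed Cloud Run service.
# $1 = service name, $2 = region
service_url() {
  gcloud run services describe "$1" \
    --region "$2" \
    --format 'value(status.url)'
}
```

For example, `service_url "$SERVICE" "$REGION"` prints the base URL to use in Postman or curl requests against the deployed API.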

Updating to a New Version

export TAG="gpu-v2"
gcloud builds submit \
  --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .

gcloud beta run deploy "$SERVICE" \
  --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" \
  --region "$REGION" \
  --allow-unauthenticated \
  --gpu 1 \
  --gpu-type nvidia-l4 \
  --cpu 4 \
  --memory 16Gi \
  --concurrency 1 \
  --no-cpu-throttling \
  --port 8000

Optional: CPU-only Deployment

export TAG="v1"
gcloud builds submit \
  --tag "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" .

# No GPU is attached to this service, so DEVICE is set to cpu rather than cuda
gcloud run deploy "transcription-api" \
  --image "$REGION-docker.pkg.dev/$PROJECT_ID/$REPO/$IMAGE:$TAG" \
  --region "$REGION" \
  --allow-unauthenticated \
  --cpu 2 \
  --memory 2Gi \
  --port 8000 \
  --set-env-vars "DEVICE=cpu,OPENAI_API_KEY=YOUR_API_KEY_HERE"

GSoC Docs

This repository is part of the Google Summer of Code (GSoC) 2025 program.

🔗 Useful Links

License

This software is licensed under the MIT License. See the LICENSE file for more information.

© 2025 RUXAILAB.
