Hi team,
I’m very interested in trying out Jan Server for our company as a replacement for OpenWebUI (OWUI). We currently have OWUI operational, but as a small business (fewer than 5 employees), we’ve found it too complex to fine-tune and maintain. Jan AI’s simplicity — and the major improvements over the past few months — make it look like a fantastic alternative for SMBs like ours.
After reviewing the guides and Docker Compose examples, I’m trying to determine the optimal deployment architecture for our setup.
🧩 Overview
- Business type: Small business (<5 employees)
- Hardware: Mac M3 Ultra (used as inference machine)
- Infrastructure: Synology NAS (handles reverse proxy, MS365 EntraID SSO, file storage, and local data access)
Current setup:
- Synology manages reverse proxy + SSO
- Mac runs an exposed OpenAI-compatible MLX server to handle AI inference requests
We’d like to replicate this setup with Jan Server, ideally with a simpler, more maintainable configuration.
⚙️ Deployment Questions
I’ve reviewed your docker directory and would appreciate some guidance on how best to structure our setup:
- Should I use jan-server/docker/infrastructure.yml as the main template?
  - I’m unsure whether this file should also include the llm-api service.
- On the Mac M3 Ultra, should I run one or both of:
  - llm-api (/service-api.yml), and/or
  - vllm (/inference.yml)?
- How should I connect all the components together? Via the .env file? (I’ve put a rough sketch of what I mean after this list.)
- Do I still need Kong if the Synology NAS is already acting as our reverse proxy (terminating HTTPS connections and forwarding requests)?
  - It looks like Kong might still be required to handle API routing between services.
- Would it make more sense to run llm-api on the Synology NAS or on the Mac M3?
- Is the vllm inference setup on macOS tuned for Metal to utilize the M3 Ultra’s GPU and CPU efficiently?
  - Or should I continue using an MLX-based OpenAI-compatible server instead?
- Do I need Keycloak for OIDC/SSO integration with Microsoft Entra ID, or can Jan Server handle that natively?
- Side question: will Jan eventually expose the OpenAI v1/audio/transcriptions API (ASR)?
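
For the .env question above, here’s a minimal sketch of the wiring I have in mind (the llm-api service name comes from the repo’s compose files, but the variable names, hostname, and port are placeholders I made up, not confirmed against jan-server’s actual configuration):

```yaml
# Hypothetical override layered on top of the repo's compose files, e.g.
#   docker compose -f infrastructure.yml -f service-api.yml -f nas-override.yml up -d
services:
  llm-api:
    environment:
      # Placeholder keys -- the real variable names in jan-server's .env may differ.
      INFERENCE_BASE_URL: "http://mac-m3-ultra.local:8080/v1"  # Mac's OpenAI-compatible endpoint
      INFERENCE_API_KEY: "${MLX_API_KEY}"                      # read from the NAS-side .env
```

If that’s roughly how it’s meant to work, I’d just need the real variable names to point llm-api at the Mac instead of a local vllm container.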
I’d love to consolidate everything into Jan AI as our SMB AI platform.
🖥️ Desired Architecture
Ideally, we’d like:
- Synology NAS → runs Jan front-end, SSO integration, DB, chat history, and file management
- Mac M3 Ultra → handles all GPU-intensive inference tasks
Based on what I’ve read, my assumption is that:
- Synology would run a combined infrastructure.yml + service-api.yml setup (without inference),
- and the Mac would run inference.yml for model management and inference hosting.
Does that sound correct?
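
To make that concrete, here’s roughly what I picture running on the Synology side (file names are from the repo’s docker directory; whether the stacks can be combined like this is my assumption, and Compose’s include directive needs Docker Compose v2.20+):

```yaml
# nas-compose.yml -- hypothetical top-level file for the Synology side.
# It pulls in the infrastructure and API stacks but deliberately omits
# inference.yml, since the Mac M3 Ultra handles inference over the LAN.
include:
  - docker/infrastructure.yml
  - docker/service-api.yml
```

On the Mac side, since (as far as I know) containers on macOS can’t access the GPU via Metal, I’d expect to keep an MLX server running natively rather than using the dockerized vllm from inference.yml. Is that the intended approach?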
Also, would the Jan web GUI (and user sign-in) still be accessible via the default web interface on port 8000?
🙏 Closing
Thanks for building such an awesome product — it’s clearly come a long way recently.
I really appreciate your time in helping me confirm the best approach for an SMB setup like ours.
Kind regards,
David