Repositories list
25 repositories
Pixelle-Video
Public
🚀 AI Fully Automated Short Video Engine - An Open-Source Multimodal AIGC Solution based on ComfyUI + MCP + LLM https://pixelle.ai

Ovis-Image
Public
Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints.

Marco-Voice
Public

Ovis-U1
Public
A unified model that seamlessly integrates multimodal understanding, text-to-image generation, and image editing within a single powerful framework.

Agentic-ADK
Public

Diffusion-SDPO
Public
Diffusion-SDPO: Safeguarded Direct Preference Optimization for Diffusion Models

Marco-MT
Public

Marco-Bench
Public

Ovis
Public
A novel Multimodal Large Language Model (MLLM) architecture, designed to structurally align visual and textual embeddings.

CHATS
Public
- Awesome Unified Multimodal Models

TeEFusion
Public
TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance (ICCV 2025)

flashinfer
Public

UNIC-Adapter
Public

Parrot
Public
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.

Marco-o1
Public

TransBench
Public

TG-LLaVA
Public

Wings
Public
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]

M3Bench
Public

Meissonic
Public

AutoGPTQ
Public