|
1 | | -<h1 align="center">Databend: The Next-Gen Cloud [Data+AI] Analytics</h1> |
2 | | -<h2 align="center">SQL for All Data: structured, semi-structured & unstructured multimodal data</h2> |
| 1 | +<h1 align="center">Databend</h1> |
| 2 | +<h3 align="center">It's not only a Snowflake alternative, but a multimodal data warehouse for the AI era.</h3> |
| 3 | +<p align="center">Snowflake-compatible SQL for structured, semi-structured, geospatial, and vector data</p> |
3 | 4 |
|
4 | 5 | <div align="center"> |
5 | 6 |
|
6 | | -<h4 align="center"> |
7 | | - <a href="https://docs.databend.com/guides/cloud">Databend Serverless Cloud (beta)</a> | |
8 | | - <a href="https://docs.databend.com/">Documentation</a> | |
9 | | - <a href="https://benchmark.clickhouse.com/">Benchmarking</a> | |
10 | | - <a href="https://github.com/databendlabs/databend/issues/11868">Roadmap (v1.3)</a> |
11 | | -</h4> |
| 7 | +<a href="https://databend.com/">☁️ Try Cloud</a> • |
| 8 | +<a href="#quick-start">🚀 Quick Start</a> • |
| 9 | +<a href="https://docs.databend.com/">📖 Documentation</a> |
| 10 | + |
| 11 | +<br><br> |
12 | 12 |
|
13 | | -<div> |
14 | 13 | <a href="https://link.databend.com/join-slack"> |
15 | 14 | <img src="https://img.shields.io/badge/slack-databend-0abd59?logo=slack" alt="slack" /> |
16 | 15 | </a> |
17 | | - |
18 | 16 | <a href="https://github.com/databendlabs/databend/actions/workflows/release.yml"> |
19 | 17 | <img src="https://img.shields.io/github/actions/workflow/status/datafuselabs/databend/release.yml?branch=main" alt="CI Status" /> |
20 | 18 | </a> |
21 | | - |
22 | | -<img src="https://img.shields.io/badge/Platform-Linux%2C%20macOS%2C%20ARM-green.svg?style=flat" alt="Linux Platform" /> |
23 | | - |
24 | | -<a href="https://gurubase.io/g/databend"> |
25 | | -<img src="https://img.shields.io/badge/Gurubase-Ask%20Databend%20Guru-006BFF" alt="Gurubase" /> |
26 | | -</a> |
| 19 | +<img src="https://img.shields.io/badge/Platform-Linux%2C%20macOS%2C%20ARM-green.svg?style=flat" alt="Platform" /> |
27 | 20 |
|
28 | 21 | </div> |
29 | | -</div> |
| 22 | + |
| 23 | +<br> |
30 | 24 |
|
31 | 25 | <img src="https://github.com/databendlabs/databend/assets/172204/9997d8bc-6462-4dbd-90e3-527cf50a709c" alt="databend" /> |
32 | 26 |
|
33 | | -## The AI-Native Data Warehouse |
| 27 | +## Why Databend? |
34 | 28 |
|
35 | | -Databend is the **open-source alternative to Snowflake** with **near 100% SQL compatibility** and native AI capabilities. Built in Rust with MPP architecture and S3-native storage, Databend unifies structured tables, JSON documents, and vector embeddings in a single platform. Trusted by **world-class enterprises** managing **800+ petabytes** and **100+ million queries daily**. |
| 29 | +**Multimodal Data Warehouse**: Analyze structured, semi-structured, vector, and geospatial data with unified Snowflake-compatible SQL. |
36 | 30 |
|
37 | | -## Key Features |
| 31 | +**AI-Native Platform**: Built-in vector search, AI functions, embedding generation, and full-text search - no separate systems needed. |
38 | 32 |
|
39 | | -**Performance & Scale** |
40 | | -- **10x Faster**: Rust-powered vectorized execution with SIMD optimization |
41 | | -- **90% Cost Reduction**: S3-native storage eliminates proprietary overhead |
42 | | -- **Infinite Scale**: True compute-storage separation with elastic scaling |
43 | | -- **Production-Proven**: Powers financial analytics, ML pipelines, and real-time AI inference |
| 33 | +**10x Faster & 90% Cost Reduction**: Rust-powered vectorized execution with S3-native storage eliminates vendor lock-in and proprietary overhead. |
44 | 34 |
|
45 | | -**Enterprise Ready** |
46 | | -- **Snowflake Compatible**: Migrate with zero SQL rewrites |
47 | | -- **Multi-Cloud**: Deploy on AWS, Azure, GCP, or on-premise |
48 | | -- **Security**: Role-based access, data masking, audit logging |
49 | | -- **No Vendor Lock-in**: Complete data sovereignty and control |
| 35 | +**Deploy Anywhere, Connect Everything**: 100% open source - run locally with `pip install databend`, self-host, or use managed cloud clusters. All instances share the same data seamlessly. |
50 | 36 |
|
51 | | -## Performance Benchmarks |
| 37 | +**Production Proven**: Trusted by world-class enterprises managing 800+ petabytes and 100+ million queries daily. |
52 | 38 |
|
53 | | -[TPC-H Benchmark: Databend vs. Snowflake](https://docs.databend.com/guides/benchmark/tpch) | [Data Ingestion Benchmark](https://docs.databend.com/guides/benchmark/data-ingest) | [ClickBench Results](https://databend.com/blog/clickbench-databend-top) |
| 39 | +**Enterprise Ready**: Fine-grained access control, data masking, and audit logging with complete data sovereignty. |
54 | 40 |
|
55 | | -## Architecture |
| 41 | +## Quick Start |
56 | 42 |
|
57 | | - |
| 43 | +### Option 1: Databend Cloud Warehouse (Recommended) |
| 44 | +[Start with Databend Cloud](https://docs.databend.com/guides/cloud/) - Serverless warehouse clusters, production-ready in 60 seconds |
58 | 45 |
|
59 | | -**Unified Foundation**: S3-native storage + MPP query engine + elastic compute clusters |
| 46 | +### Option 2: Local Development with Python |
| 47 | +```bash |
| 48 | +pip install databend |
| 49 | +``` |
60 | 50 |
|
61 | | -### Universal Data Processing by Type |
62 | | -- **Structured**: Standard SQL with vectorized execution, ACID transactions, enterprise security, and BI integration |
63 | | -- **Semi-Structured**: [VARIANT data type](https://docs.databend.com/sql/sql-reference/data-types/variant) with [virtual columns](https://docs.databend.com/guides/performance/virtual-column) for zero-config automatic JSON acceleration |
64 | | -- **Unstructured**: [Vector data type](https://docs.databend.com/sql/sql-reference/data-types/vector) with HNSW indexing, [AI functions](https://docs.databend.com/sql/sql-functions/ai-functions/), and [full-text search](https://docs.databend.com/guides/performance/fulltext-index) for multimodal workloads |
| 51 | +```python |
| 52 | +import databend |
65 | 53 |
|
66 | | -## Quick Start |
| 54 | +ctx = databend.SessionContext() |
67 | 55 |
|
68 | | -### Cloud |
69 | | -[Start with Databend Cloud](https://docs.databend.com/guides/cloud/) - Production-ready in 60 seconds |
| 56 | +# Local table for quick testing |
| 57 | +ctx.sql("CREATE TABLE products (id INT, name STRING, price FLOAT)").collect() |
| 58 | +ctx.sql("INSERT INTO products VALUES (1, 'Laptop', 1299.99), (2, 'Phone', 899.50)").collect() |
| 59 | +ctx.sql("SELECT * FROM products").show() |
70 | 60 |
|
71 | | -### Self-Hosted |
72 | | -[Installation Guide](https://docs.databend.com/guides/deploy/QuickStart/) - Deploy anywhere with full control |
| 61 | +# S3 remote table (same as cloud warehouse) |
| 62 | +ctx.create_s3_connection("s3", "your_key", "your_secret") |
| 63 | +ctx.sql("CREATE TABLE sales (id INT, revenue FLOAT) 's3://bucket/sales/' CONNECTION=(connection_name='s3')").collect() |
| 64 | +ctx.sql("SELECT COUNT(*) FROM sales").show() |
| 65 | +``` |
73 | 66 |
|
74 | | -### Connect |
75 | | -[BendSQL CLI](https://docs.databend.com/guides/sql-clients/bendsql) | [Developers Guide](https://docs.databend.com/guides/sql-clients/developers/) |
| 67 | +### Option 3: Docker (Self-Host Experience) |
| 68 | +```bash |
| 69 | +docker run -p 8000:8000 datafuselabs/databend |
| 70 | +``` |
| 71 | +Experience the full warehouse capabilities locally - same features as cloud clusters. |
76 | 72 |
|
77 | | -## Products |
| 73 | +## Benchmarks |
78 | 74 |
|
79 | | -- **Open Source**: 100% open source, complete data sovereignty |
80 | | -- **[Databend Cloud](https://databend.com)**: Managed service with serverless autoscaling |
81 | | -- **Enterprise**: Advanced governance, compliance, and support |
| 75 | +**Performance**: [TPC-H vs Snowflake](https://docs.databend.com/guides/benchmark/tpch) | [ClickBench Results](https://www.databend.com/blog/category-product/clickbench-databend-top) |
| 76 | +**Cost**: [90% Cost Reduction](https://docs.databend.com/guides/benchmark/data-ingest) |
82 | 77 |
|
83 | | -## Community |
| 78 | +## Architecture |
84 | 79 |
|
85 | | -For guidance on using Databend, we recommend starting with the official documentation. If you need further assistance, explore the following community channels: |
| 80 | + |
86 | 81 |
|
87 | | -- [Slack](https://link.databend.com/join-slack) (For live discussion with the Community) |
88 | | -- [GitHub](https://github.com/databendlabs/databend) (Feature/Bug reports, Contributions) |
89 | | -- [Twitter](https://twitter.com/DatabendLabs/) (Get the news fast) |
90 | | -- [I'm feeling lucky](https://link.databend.com/i-m-feeling-lucky) (Pick up a good first issue now!) |
| 82 | +**Multimodal Cloud Warehouse**: Production clusters analyze structured, semi-structured, vector, and geospatial data with Snowflake-compatible SQL. Local development environments can attach to the same warehouse data for seamless development. |
91 | 83 |
|
92 | | -**Your merged code gets you into the `system.contributors` table. Forever.** |
| 84 | +## Use Cases |
93 | 85 |
|
94 | | -## Roadmap & License |
| 86 | +- **Data Analytics**: Snowflake alternative with significant cost reduction |
| 87 | +- **AI/ML Pipelines**: Vector search and AI functions built-in |
| 88 | +- **Real-time Analytics**: High-performance queries on petabyte-scale data |
| 89 | +- **Data Lake Analytics**: Query Parquet, CSV, TSV, NDJSON, Avro, ORC directly from S3 |
95 | 90 |
|
96 | | -- **Roadmap**: [2025 Development Plan](https://github.com/databendlabs/databend/issues/14167) |
97 | | -- **License**: [Apache License 2.0](licenses/Apache-2.0.txt) + [Elastic License 2.0](licenses/Elastic.txt) | [Licensing FAQs](https://docs.databend.com/guides/products/dee/license) |
| 91 | +## Community |
98 | 92 |
|
99 | | -## Acknowledgement |
| 93 | +- [📖 Documentation](https://docs.databend.com/) - Complete guides and references |
| 94 | +- [💬 Slack](https://link.databend.com/join-slack) - Live community discussion |
| 95 | +- [🐛 GitHub Issues](https://github.com/databendlabs/databend/issues) - Bug reports and feature requests |
| 96 | +- [🎯 Good First Issues](https://link.databend.com/i-m-feeling-lucky) - Start contributing today |
100 | 97 |
|
101 | | -**Inspiration**: [ClickHouse](https://github.com/clickhouse/clickhouse) and [Snowflake](https://docs.snowflake.com/en/user-guide/intro-key-concepts.html#snowflake-architecture) | **Foundation**: Apache Arrow | **Hosting**: [Vercel](https://vercel.com/?utm_source=databend&utm_campaign=oss) |
| 98 | +**Contributors get immortalized in the `system.contributors` table! 🏆** |
| 99 | + |
| 100 | +## 📄 License |
| 101 | + |
| 102 | +[Apache License 2.0](licenses/Apache-2.0.txt) + [Elastic License 2.0](licenses/Elastic.txt) |
| 103 | +[Licensing FAQs](https://docs.databend.com/guides/products/dee/license) |
102 | 104 |
|
103 | 105 | --- |
104 | 106 |
|
105 | | -*Built by engineers who redefine what's possible with data.* |
| 107 | +<div align="center"> |
| 108 | +<strong>Built by engineers who redefine what's possible with data</strong><br> |
| 109 | +<a href="https://databend.com">🌐 Website</a> • |
| 110 | +<a href="https://x.com/DatabendLabs">🐦 Twitter</a> • |
| 111 | +<a href="https://github.com/databendlabs/databend/issues/14167">🗺️ Roadmap</a> |
| 112 | +</div> |
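
The S3-backed table step in the Python quick start added by this diff can be wrapped in a small helper, sketched below under stated assumptions. `external_table_sql` and `run_quick_start` are hypothetical names introduced only for illustration. The `ctx` argument is assumed to be the `databend.SessionContext()` object from the quick start, exposing `.sql(...).collect()` and `.show()` as shown there. The helper only formats strings and does no escaping, so it should not be fed untrusted input.

```python
def external_table_sql(table, columns, s3_url, connection_name):
    """Build a CREATE TABLE statement for a table stored at an S3 location.

    Hypothetical helper: it mirrors the statement shape used in the quick
    start and performs no quoting or validation of its arguments.
    """
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"'{s3_url}' CONNECTION=(connection_name='{connection_name}')"
    )


def run_quick_start(ctx, s3_url, connection_name="s3"):
    """Create the S3-backed table and print its row count.

    Assumes `ctx` behaves like the session context in the quick start:
    `ctx.sql(query)` returns an object with `.collect()` and `.show()`.
    """
    statement = external_table_sql(
        "sales", [("id", "INT"), ("revenue", "FLOAT")], s3_url, connection_name
    )
    ctx.sql(statement).collect()
    ctx.sql("SELECT COUNT(*) FROM sales").show()
```

With the `databend` package installed and a connection registered as in the quick start, `run_quick_start(ctx, "s3://bucket/sales/")` would issue the same two statements as the README snippet.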