anima-kit/milvus-docker

Milvus Docker with Python

🔖 About This Project

TL;DR Learn how to use a vector database on your local machine to store and search your data. Then, you can use this setup as a tool to give to locally run AI agents 🤖.

This repo demonstrates how to set up a Milvus server in Docker on your local machine and use it within a Python environment to store and query your data. The Milvus server uses a MinIO server for data storage and an etcd server for metadata storage and coordination. It serves as part of the foundation for building AI agents by giving them the ability to retrieve relevant information from custom data.

The Docker setup for this repo is based on the official Docker setup from Milvus and the Python methods use the PyMilvus library. See the license section for more details.

This project is part of my broader goal to create tutorials and resources for building agents with LangChain and LangGraph. For more details about how to use this repo and other easily digestible modules to build agents, check it out here.

Now, let's get building!

🏁 Getting Started

  1. Make sure Docker is installed and running.

  2. Clone the repo, head there, then create a Python environment:

    git clone https://github.com/anima-kit/milvus-docker.git
    cd milvus-docker
    python -m venv venv

  3. Activate the Python environment (the command below is for Windows; on macOS or Linux, run source venv/bin/activate instead):

    venv/Scripts/activate

  4. Install the necessary Python libraries:

    pip install -r requirements.txt

  5. Build and start all the Docker containers:

    docker compose up -d

  6. Head to http://127.0.0.1:9091/webui/ to check out some useful Milvus client information.

  7. Run the test script to ensure the Milvus server can be reached through the PyMilvus library:

    python -m scripts.milvus_test

  8. When you're done, stop the Docker containers and clean up with:

    docker compose down
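
If the test script in step 7 fails, it can help to check whether the server itself is up before stopping the containers. The sketch below is a minimal, standard-library-only check. It assumes the compose file publishes port 9091 (the same port as the web UI in step 6) and that the server answers on the /healthz path used by the health check in the official Milvus Docker setup. The function name milvus_is_healthy is made up for this example and is not part of this repo.

```python
import http.client
import urllib.error
import urllib.request


def milvus_is_healthy(base_url="http://127.0.0.1:9091", timeout=3.0):
    """Return True if GET <base_url>/healthz answers with HTTP 200.

    Assumption: port 9091 is published by docker-compose.yml and serves
    the /healthz endpoint. Any connection problem is reported as False.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, non-2xx status, or a malformed reply
        return False


if __name__ == "__main__":
    print("Milvus healthy:", milvus_is_healthy())
```

If this prints False right after docker compose up -d, the containers may simply still be starting, so it is worth waiting a few seconds and trying again.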

📝 Example Use Cases

After setting everything up, you can now add and search your own data through the provided Python methods.

The main class to interact with the vector database is the MilvusClientInit class which is built on the PyMilvus library. Once this class is initialized, you can manage collections and collection data, as well as search a collection for given queries.

This repo demonstrates how to do a full-text search. In a future tutorial, I'll show how to do a hybrid search, which is a combination of a full-text search (sparse vectors) and a dense vector search to capture semantic meaning.
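
To build some intuition for what a full-text search does, here is a toy, pure-Python ranking sketch. It assumes BM25 scoring, which, as far as I know, is what the built-in Milvus full-text search is based on. It is only an illustration, not how Milvus is implemented: the tokenizer, the k1 and b defaults, and the names NOTES, tokenize, and bm25_rank are hypothetical and not part of this repo or of PyMilvus.

```python
import math
import re
from collections import Counter

# Sample notes (hypothetical data for this sketch)
NOTES = [
    "grocery list: bananas, bread, choco",
    "grocery: green beans",
    "todo list: start chatbot tutorial, network with community",
    "study list: langchain v1, lucid dreaming and asc",
    "study latest gradio implements",
    "My dream last night involved",
    "then I woke up, confused as to if I was still dreaming.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(docs, query, limit=2, k1=1.2, b=0.75):
    """Return up to `limit` docs that share a term with `query`, best BM25 score first."""
    if not docs:
        return []
    tokenized = [tokenize(doc) for doc in docs]
    avg_len = sum(len(tokens) for tokens in tokenized) / len(tokenized)
    # Number of documents that contain each term at least once
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    scored = []
    for idx, tokens in enumerate(tokenized):
        counts = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            tf = counts[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency weight
            idf = math.log(1 + (len(docs) - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency is damped and normalized by document length
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(tokens) / avg_len))
        if score > 0:
            scored.append((score, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [docs[idx] for _, idx in scored[:limit]]


if __name__ == "__main__":
    for query in ["grocery", "study", "dream"]:
        print(query, "->", bm25_rank(NOTES, query))
```

Because this toy tokenizer does no stemming, the query 'dream' only matches the note containing the exact word 'dream' and not the notes containing 'dreaming'. A real analyzer may behave differently.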

For example, to manage your data and perform a full-text search through a custom script, follow these steps:

  1. Do step 3 and step 5 of the 🏁 Getting Started section to activate the Python environment and run the Milvus server.

  2. Create a script named my-data-search-ex.py in the repo root (so that the pyfiles import resolves) with the following:

    # Import MilvusClientInit class
    from pyfiles.milvus_utils import MilvusClientInit
    
    # Initialize client
    client = MilvusClientInit()
    
    # Create collection
    collection_name = 'my_collection'
    client.create_collection(collection_name)
    
    # Create data and insert into collection
    my_data = [
        {'text': 'grocery list: bananas, bread, choco'},
        {'text': 'grocery: green beans'},
        {'text': 'todo list: start chatbot tutorial, network with community'},
        {'text': 'study list: langchain v1, lucid dreaming and asc'},
        {'text': 'study latest gradio implements'},
        {'text': 'My dream last night involved'},
        {'text': 'then I woke up, confused as to if I was still dreaming.'}
    ]
    client.insert(name=collection_name, data=my_data)
    
    # Define the maximum number of results and the search query
    num_results = 2
    query_list = ['grocery', 'study', 'dream']
    
    # Get results
    client.full_text_search(
        name=collection_name, 
        query_list=query_list, 
        limit=num_results
    )
    
    # (Optional) Delete collection when done to clean up
    client.drop_collection(name=collection_name)

  3. Run the script:

    python my-data-search-ex.py

  4. Do step 8 of the 🏁 Getting Started section to stop the containers and clean up when you're done.

Milvus also allows for a rich customization of search types and index parameters. For a more detailed discussion of what can be done with this repo and with Milvus in general, check out the companion tutorial here.
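
To give a feel for the kind of customization involved, here is a rough sketch of a full-text search written directly against PyMilvus, following the pattern in the Milvus full-text search documentation. It is not the code behind MilvusClientInit, and whether this repo configures things the same way is an assumption. The sketch also assumes Milvus 2.5 or later with a matching PyMilvus version and a server reachable at http://localhost:19530 (the default Milvus port). The names build_rows, full_text_search_sketch, and sketch_collection, along with the field names, are hypothetical.

```python
def build_rows(texts):
    """Wrap plain strings in the {'text': ...} row format used in the example above."""
    return [{"text": text} for text in texts]


def full_text_search_sketch(texts, queries, limit=2,
                            uri="http://localhost:19530",
                            collection_name="sketch_collection"):
    """Create a temporary collection, insert texts, run a BM25 search, then clean up."""
    # Imported lazily so build_rows works even without PyMilvus installed
    from pymilvus import DataType, Function, FunctionType, MilvusClient

    client = MilvusClient(uri=uri)  # assumed address; adjust to your compose file

    # Schema: auto-generated id, raw text, and a sparse vector filled in by Milvus
    schema = client.create_schema()
    schema.add_field(field_name="id", datatype=DataType.INT64,
                     is_primary=True, auto_id=True)
    schema.add_field(field_name="text", datatype=DataType.VARCHAR,
                     max_length=1000, enable_analyzer=True)
    schema.add_field(field_name="sparse", datatype=DataType.SPARSE_FLOAT_VECTOR)

    # BM25 function: converts the text field into the sparse vector field
    schema.add_function(Function(
        name="text_bm25_emb",
        input_field_names=["text"],
        output_field_names=["sparse"],
        function_type=FunctionType.BM25,
    ))

    # Index the sparse field with the BM25 metric
    index_params = client.prepare_index_params()
    index_params.add_index(field_name="sparse",
                           index_type="SPARSE_INVERTED_INDEX",
                           metric_type="BM25")

    client.create_collection(collection_name=collection_name,
                             schema=schema, index_params=index_params)
    try:
        client.insert(collection_name=collection_name, data=build_rows(texts))
        # Raw query strings are passed as data; Milvus applies BM25 server-side
        return client.search(collection_name=collection_name,
                             data=queries,
                             anns_field="sparse",
                             limit=limit,
                             output_fields=["text"])
    finally:
        # Drop the temporary collection even if insert or search fails
        client.drop_collection(collection_name=collection_name)
```

With the containers running, calling full_text_search_sketch(['grocery: green beans', 'study latest gradio implements'], ['grocery']) should return one list of hits per query. Depending on the collection's consistency level, rows inserted a moment ago may not show up in an immediate search.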

📚 Next Steps & Learning Resources

This project is part of a series on building AI agents. For a deeper dive, check out my tutorials. Topics include:

  • Setting up local servers (like this one) to power the agent
  • Example agent workflows (simple chatbots to specialized agents)
  • Implementing complex RAG techniques
  • Discussing various aspects of AI beyond agents

Want to learn how to expand this setup? Visit my portfolio to explore more tutorials and projects!

🏯 Project Structure

├── docker-compose.yml      # Docker configurations
├── pyfiles/                # Python source code
│   ├── logger.py           # Python logger for tracking progress
│   └── milvus_utils.py     # Python methods to use Milvus server
├── requirements.txt        # Required Python libraries for main app
├── requirements-dev.txt    # Required Python libraries for development
├── scripts/                # Example scripts to use Python methods
├── tests/                  # Testing suite
├── third-party/            # Milvus/PyMilvus licensing
└── validators/             # Validators for Python methods

⚙️ Tech

  • Milvus: Vector database setup in Docker
  • PyMilvus: Interacting with Milvus in Python
  • etcd: Metadata storage and coordination
  • MinIO: Data storage
  • Docker: Setup of all containers

🔗 Contributing

This repo is a work in progress. If you'd like to suggest or add improvements, fix bugs or typos etc., feel free to contribute. Check out the contributing guidelines to get started.

📑 License

This repo is licensed under MIT. However, note that the Docker setup for this repo is based on the official Docker setup from Milvus and the Python methods utilize the PyMilvus library. Both Milvus and PyMilvus are licensed under Apache 2.0. See the full Milvus license here and the full PyMilvus license here.
