Batch Processing Feature #40

## Overview
The goal is to add efficient batch processing to the MLX-VLM library. Users will be able to pass multiple images and text prompts in a single batch and receive one output per input, with higher throughput than processing the inputs one at a time.

Use cases:

1. Generating captions for a large dataset of images.
2. Localizing objects or regions in a batch of images based on textual descriptions.
3. Classifying a large number of images into predefined categories, considering accompanying text information.
4. Answering questions based on a batch of images (single and multiple question prompts).
5. Processing video, for example as a batch of frames.


Note: Tag @Blaizzy for code reviews and questions.

## Requirements

### Support batched inputs:

- Accept a batch of images as input, provided as a list or array of image objects.
- Accept a batch of text prompts as input, provided as a list or array of strings.
- Accept a single text prompt as input, provided as a string.
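
One way to reconcile the three input shapes above is to normalize them into image/prompt pairs before any model call. The sketch below is only an illustration: `normalize_batch_inputs` is a hypothetical helper, not an existing MLX-VLM function, and broadcasting a single string prompt to every image is an assumed interpretation of the last bullet.

```python
from typing import Any, List, Sequence, Tuple, Union


def normalize_batch_inputs(
    images: Sequence[Any],
    prompts: Union[str, Sequence[str]],
) -> List[Tuple[Any, str]]:
    """Pair every image with a prompt (hypothetical helper, not an MLX-VLM API)."""
    images = list(images)
    if isinstance(prompts, str):
        # Assumption: a single prompt applies to every image in the batch.
        prompts = [prompts] * len(images)
    else:
        prompts = list(prompts)
    if len(prompts) != len(images):
        raise ValueError(
            f"got {len(images)} images but {len(prompts)} prompts; "
            "pass one prompt per image or a single string"
        )
    return list(zip(images, prompts))
```

Rejecting mismatched lengths up front keeps the failure close to the caller's mistake instead of surfacing later inside the model.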


### Perform batch processing:

- Process the batch of images and text prompts together (in a single batched pass or asynchronously) using the MLX-VLM model.
- Utilize parallel processing or GPU acceleration to optimize batch processing performance.
- Ensure that the processing of one input in the batch does not affect the processing of other inputs.


### Generate batched outputs:

- Return the generated outputs for each input in the batch.
- Maintain the order of the outputs corresponding to the order of the inputs.
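
The ordering requirement matters most when inputs finish out of order, as they can under parallel execution. A minimal sketch, assuming a per-item callable `process_one` (a placeholder for whatever eventually runs the model) and a thread pool purely for illustration; whether threads are the right mechanism for an MLX model is not settled by this issue.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, List, Sequence


def process_preserving_order(
    items: Sequence[Any],
    process_one: Callable[[Any], Any],  # placeholder for the real per-item work
    max_workers: int = 4,
) -> List[Any]:
    """Run items concurrently and return outputs in input order."""
    outputs: List[Any] = [None] * len(items)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which input index each future belongs to.
        future_to_index = {
            pool.submit(process_one, item): index
            for index, item in enumerate(items)
        }
        # Futures complete in arbitrary order; the index puts each result
        # back in the slot of the input that produced it.
        for future in as_completed(future_to_index):
            outputs[future_to_index[future]] = future.result()
    return outputs
```

Writing each result into a preallocated slot keeps the output list aligned with the inputs regardless of completion order.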
- Support different output formats such as text, embeddings, or visual representations based on the specific task.


### Error handling:

- Handle errors gracefully during batch processing.
- Provide informative error messages for invalid inputs or processing failures.
- Continue processing the remaining inputs in the batch if an error occurs for a specific input.
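
The three error-handling points above can be sketched with a per-item result record that carries either an output or an error message. `ItemResult` and `process_with_isolation` are hypothetical names used for illustration, and `process_one` stands in for the real model call.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence


@dataclass
class ItemResult:
    """Outcome for one input (hypothetical type, not part of MLX-VLM)."""
    index: int
    output: Optional[Any] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def process_with_isolation(
    items: Sequence[Any],
    process_one: Callable[[Any], Any],
) -> List[ItemResult]:
    """Process every item; a failure on one item does not stop the rest."""
    results: List[ItemResult] = []
    for index, item in enumerate(items):
        try:
            results.append(ItemResult(index=index, output=process_one(item)))
        except Exception as exc:
            # Record which input failed and why, then keep going.
            message = f"{type(exc).__name__}: {exc}"
            results.append(ItemResult(index=index, error=message))
    return results
```

Callers can then filter on `ok` to separate successful outputs from failures while keeping the original input positions.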


### API design:

- Provide a clear and intuitive API for users to perform batch processing.
- Allow users to specify the maximum batch size supported by their system.
- Provide options to control the batch processing behavior, such as enabling/disabling parallel processing.
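
As a rough illustration of how these options might surface, the sketch below groups them in a small options object and splits the inputs into chunks no larger than the configured maximum. All names (`BatchOptions`, `run_batched`, `process_batch`) are hypothetical, and treating `parallel=False` as "process one input at a time" is an assumed interpretation, not a decision made in this issue.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterator, List, Sequence


@dataclass(frozen=True)
class BatchOptions:
    """User-facing knobs (hypothetical, for illustration only)."""
    max_batch_size: int = 8
    parallel: bool = True


def split_into_chunks(items: Sequence[Any], size: int) -> Iterator[List[Any]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def run_batched(
    items: Sequence[Any],
    process_batch: Callable[[List[Any]], List[Any]],  # placeholder model call
    options: BatchOptions = BatchOptions(),
) -> List[Any]:
    """Feed items to `process_batch` in chunks, keeping input order."""
    if options.max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    # Assumption: disabling parallel processing means one input per call.
    size = options.max_batch_size if options.parallel else 1
    outputs: List[Any] = []
    for chunk in split_into_chunks(items, size):
        chunk_outputs = process_batch(chunk)
        if len(chunk_outputs) != len(chunk):
            raise RuntimeError("process_batch must return one output per input")
        outputs.extend(chunk_outputs)
    return outputs
```

For example, `run_batched(prompts, process_batch, BatchOptions(max_batch_size=2))` would call `process_batch` with at most two inputs at a time.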


### Documentation and examples:

- Update the library documentation to include information about the batch processing feature.
- Provide code examples demonstrating how to use the batch processing API effectively.
- Include performance benchmarks and guidelines for optimal batch sizes based on system resources.


## Implementation

- Modify the existing input handling logic to accept batches of images and text prompts.
- Implement batch processing functionality using parallel processing techniques or GPU acceleration libraries.
- Optimize memory usage and performance for efficient batch processing.
- Update the output generation logic to handle batched outputs and maintain the correct order.
- Implement error handling mechanisms to gracefully handle and report errors during batch processing.
- Design and expose a user-friendly API for performing batch processing.
- Write unit tests to verify the correctness and performance of the batch processing implementation.
- Update the library documentation and provide code examples for using the batch processing feature.

## Testing

- Prepare a comprehensive test suite to validate the batch processing functionality.
- Test with different batch sizes and input variations to ensure robustness.
- Verify that the generated outputs match the expected results for each input in the batch.
- Measure the performance improvement gained by batch processing compared to individual processing.
- Conduct error handling tests to ensure graceful handling of invalid inputs and processing failures.
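
The output-matching and performance checks above could share a small harness along these lines. It is a sketch under assumptions: `process_one` and `process_batch` are placeholders for the individual and batched code paths, and the function name is hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Sequence


def compare_batch_to_individual(
    items: Sequence[Any],
    process_one: Callable[[Any], Any],
    process_batch: Callable[[Sequence[Any]], List[Any]],
) -> Dict[str, Any]:
    """Run both code paths, report mismatched positions and wall-clock time."""
    start = time.perf_counter()
    individual = [process_one(item) for item in items]
    individual_seconds = time.perf_counter() - start

    start = time.perf_counter()
    batched = process_batch(items)
    batch_seconds = time.perf_counter() - start

    # Indices where the batched output differs from the individual output.
    mismatches = [
        index
        for index, (single, batch) in enumerate(zip(individual, batched))
        if single != batch
    ]
    return {
        "same_length": len(individual) == len(batched),
        "mismatched_indices": mismatches,
        "individual_seconds": individual_seconds,
        "batch_seconds": batch_seconds,
    }
```

Exact equality is used here for simplicity; real model outputs may need a tolerance or a task-specific comparison, and a single timing run is only a rough indicator.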

## Delivery

- Integrate the batch processing feature into the existing MLX-VLM library codebase.
- Ensure backward compatibility with previous versions of the library.
- Provide release notes highlighting the new batch processing capability and any breaking changes, should any prove unavoidable.
- Update the library version number following semantic versioning conventions.
- Publish the updated library package to the relevant package repositories or distribution channels.

With this feature, MLX-VLM users will be able to process multiple inputs in a single batch, improving the library's performance and usability across vision-language tasks.
