Failed to run inference with FastVLM due to mismatched arguments #594

I tried to run inference with the [converted version](https://huggingface.co/mlx-community/FastVLM-0.5B-bf16) from mlx-community, but encountered the error below.

```shell
Traceback (most recent call last):
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/scripts/recognition/eval.py", line 114, in <module>
    dataset = dataset.map(_generate, desc="Generating Responses")
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 562, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 3341, in map
    for rank, done, content in Dataset._map_single(**unprocessed_kwargs):
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 3673, in _map_single
    for i, example in iter_outputs(shard_iterable):
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 3647, in iter_outputs
    yield i, apply_function(example, i, offset=offset)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/datasets/arrow_dataset.py", line 3570, in apply_function
    processed_inputs = function(*fn_args, *additional_args, **fn_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/scripts/recognition/eval.py", line 102, in _generate
    response = generate(
               ^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/mlx_vlm/generate.py", line 539, in generate
    for response in stream_generate(model, processor, prompt, image, audio, **kwargs):
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/mlx_vlm/generate.py", line 429, in stream_generate
    for n, (token, logprobs) in enumerate(
                                ^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/mlx_vlm/generate.py", line 319, in generate_step
    outputs = model(input_ids, pixel_values, cache=prompt_cache, mask=mask, **kwargs)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/mlx_vlm/models/fastvlm/fastvlm.py", line 166, in __call__
    logits = self.language_model(
             ^^^^^^^^^^^^^^^^^^^^
  File "/Users/hermeschen/Repo/work/taiwan-license-plate-recognition/.venv/lib/python3.12/site-packages/mlx_vlm/models/fastvlm/language.py", line 29, in __call__
    out = self.model(inputs, None, cache, inputs_embeds)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Qwen2Model.__call__() takes from 2 to 4 positional arguments but 5 were given
```
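
The argument-count mismatch in the last frame can be reproduced without mlx installed. This is only a rough sketch: `StandInQwen2Model` and its parameter names are hypothetical and are not taken from the mlx-lm source. Only the arity (one required and two optional arguments besides `self`) and the four-argument call come from the traceback.

```python
class StandInQwen2Model:
    """Hypothetical stand-in, NOT the real mlx-lm class.

    Only the arity is taken from the error message: self, one required
    argument, and two optional ones. The parameter names are assumptions.
    """

    def __call__(self, inputs, cache=None, input_embeddings=None):
        return inputs


model = StandInQwen2Model()

try:
    # Same shape as the call in mlx_vlm/models/fastvlm/language.py, line 29:
    # four positional arguments, which makes five once self is counted.
    model("input_ids", None, "cache", "inputs_embeds")
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If the real signature looks anything like this, the FastVLM wrapper in mlx-vlm 0.3.7 seems to pass one more positional argument than the `Qwen2Model` in the installed mlx-lm accepts, which would suggest a version mismatch between the two packages.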

I'm working on macOS 26, and my environment is as follows:
```shell
mlx                    0.30.0
mlx-lm                 0.28.3
mlx-metal              0.30.0
mlx-vlm                0.3.7
timm                   1.0.22
torch                  2.9.1
torchvision            0.24.1
transformers           4.57.1
```
