
Commit 361cd28

Author: herizhen
Commit message: Delete equals sign
Signed-off-by: herizhen <[email protected]>
1 parent 9ff0f26 commit 361cd28

File tree: 1 file changed (+1 / -1 lines)

docs/source/developer_guide/feature_guide/Multi_Token_Prediction.md

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@ MTP boosts inference performance by parallelizing the prediction of multiple tok
 ## How to Use MTP

 To enable MTP for DeepSeek-V3 models, add the following parameter when starting the service:

-`--speculative_config {"method": "deepseek_mtp", "num_speculative_tokens": 1, "disable_padded_drafter_batch": False}`
+`--speculative_config '{"method": "deepseek_mtp", "num_speculative_tokens": 1, "disable_padded_drafter_batch": False}'`

 - `num_speculative_tokens`: The number of speculative tokens, which enables the model to predict multiple tokens at once, if provided. It defaults to the number in the draft model config if present; otherwise it is required.
 - `disable_padded_drafter_batch`: Disables input padding for speculative decoding. If set to True, speculative input batches can contain sequences of different lengths, which may only be supported by certain attention backends. This currently only affects the MTP method of speculation; the default is False.
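Since the fix in this commit is about quoting the config value so the shell passes it through intact, a quick pre-flight check of the string can catch quoting or syntax mistakes before the server is launched. The sketch below is illustrative, not part of the commit: it uses lowercase `false` because strict JSON requires lowercase booleans, and the model name in the commented launch line is an assumed example.

```shell
# Sketch: sanity-check the speculative config string before launching.
# Single quotes keep the braces and inner double quotes intact in the shell.
# Note: strict JSON booleans are lowercase ("false"), written that way here.
SPEC_CONFIG='{"method": "deepseek_mtp", "num_speculative_tokens": 1, "disable_padded_drafter_batch": false}'

# Fail fast if the string is not valid JSON.
echo "$SPEC_CONFIG" | python3 -c 'import json, sys; json.load(sys.stdin)' \
    && echo "speculative_config OK"

# Example launch (model path is illustrative; flag shape taken from the diff):
# vllm serve deepseek-ai/DeepSeek-V3 --speculative_config "$SPEC_CONFIG"
```

Quoting the whole JSON object as a single shell word is what the `+` line of the diff adds; without it, the spaces inside the braces would split the value into multiple arguments.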
