[Performance]: [AutoDeploy] Benchmark Nemotron MoE FP8 ISL/OSL 8K/16K on H100, B200 #8782

@galagam

Description

Proposal to improve performance

No response

Report of performance regression

No response

Misc discussion on performance

No response

Your current environment (if you think it is necessary)

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Assignees

Labels

  • General perf<NV>: Broad performance issues not specific to a particular component
  • Performance: TRTLLM model inference speed, throughput, efficiency. Latency, benchmarks, regressions, opts.

Type

No type

Projects

Status

In progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
