@@ -12,56 +12,62 @@ title: ""
  <b>(Required)</b> <a href="https://arxiv.org/pdf/2407.21783">The Llama 3 Herd of Models</a> (Sections 2, 3.3, and 4.1), <br/><em>Llama Team, AI @ Meta</em>
  </li>
  {% include paper_item.html key="megatron-sc21" required=true %}
- {% include paper_item.html key="wlbllm-osdi25" required=false %}
 </ul>
 
+
 #### Scaling LLM Pre-Training
 <ul>
- {% include paper_item.html key="alpa-osdi22" required=true %}
- {% include paper_item.html key="partir-asplos25" required=false %}
- {% include paper_item.html key="rdma-sigcomm24" required=true %}
- {% include paper_item.html key="cassini-nsdi24" required=false %}
- {% include paper_item.html key="traincheck-osdi25" required=true %}
- {% include paper_item.html key="superbench-atc24" required=false %}
- {% include paper_item.html key="oobleck-sosp23" required=true %}
- {% include paper_item.html key="tenplex-sosp24" required=false %}
+ {% include paper_item.html key="wlbllm-osdi25" required="Context Parallelism" %}
+ {% include paper_item.html key="hotspa-sosp24" %}
+ {% include paper_item.html key="alpa-osdi22" required="Auto Parallelism" %}
+ {% include paper_item.html key="partir-asplos25" %}
+ {% include paper_item.html key="rdma-sigcomm24" required="Network" %}
+ {% include paper_item.html key="cassini-nsdi24" %}
+ {% include paper_item.html key="traincheck-osdi25" required="Silent Data Corruption" %}
+ {% include paper_item.html key="superbench-atc24" %}
+ {% include paper_item.html key="oobleck-sosp23" required="Fault-Tolerance" %}
+ {% include paper_item.html key="tenplex-sosp24" %}
 </ul>
 
 #### LLM Post-Training for Alignment
 <ul>
- {% include paper_item.html key="rlhfuse-nsdi25" required=true %}
- {% include paper_item.html key="hybridflow-eurosys25" required=false %}
- {% include paper_item.html key="areal-arxiv25" required=true %}
+ {% include paper_item.html key="hybridflow-eurosys25" required="Resource Efficiency" %}
+ {% include paper_item.html key="rlhfuse-nsdi25" %}
+ {% include paper_item.html key="areal-arxiv25" required="Async RL" %}
+ {% include paper_item.html key="asyncrlhf-iclr25" %}
 </ul>
 
 #### Efficient LLM Serving
 <ul>
- {% include paper_item.html key="pagedattention-sosp23" required=true %}
- {% include paper_item.html key="nanoflow-osdi25" required=true %}
- {% include paper_item.html key="sarathiserve-osdi24" required=false %}
- {% include paper_item.html key="distserve-osdi24" required=true %}
- {% include paper_item.html key="llumnix-osdi24" required=true %}
+ {% include paper_item.html key="pagedattention-sosp23" required="KV Cache Management" %}
+ {% include paper_item.html key="orca-osdi22" %}
+ {% include paper_item.html key="nanoflow-osdi25" required="Optimal Throughput" %}
+ {% include paper_item.html key="sarathiserve-osdi24" %}
+ {% include paper_item.html key="distserve-osdi24" required="Prefill/Decode Disaggregation" %}
+ {% include paper_item.html key="loongserve-sosp24" %}
+ {% include paper_item.html key="waferllm-osdi25" required="New Hardware" %}
+ {% include paper_item.html key="aqua-asplos25" %}
 </ul>
 
 #### Mixture-of-Experts
 <ul>
- {% include paper_item.html key="switch-jmlr22" required=true %}
- {% include paper_item.html key="moe-iclr17" required=false %}
- {% include paper_item.html key="fsmoe-asplos25" required=true %}
- {% include paper_item.html key="tutel -mlsys23" required=false %}
- {% include paper_item.html key="moelight-asplos25" required=true %}
- {% include paper_item.html key="pregatedmoe-isca24" required=false %}
- {% include paper_item.html key="readme-neurips24" required=false %}
+ {% include paper_item.html key="switch-jmlr22" required="MoE Motivation and Architecture" %}
+ {% include paper_item.html key="moe-iclr17" %}
+ {% include paper_item.html key="fsmoe-asplos25" required="Training" %}
+ {% include paper_item.html key="megablock -mlsys23" %}
+ {% include paper_item.html key="moelight-asplos25" required="Serving" %}
+ {% include paper_item.html key="pregatedmoe-isca24" %}
+ {% include paper_item.html key="readme-neurips24" %}
 </ul>
 
 ## Part 2 - GenAI: Beyond Simple Text Generation
 #### Multi-Modal Generation
 <ul>
  {% include paper_item.html key="illstablediff" required=true %}
- {% include paper_item.html key="approxcache-nsdi24" required=true %}
- {% include paper_item.html key="diffserve-mlsys24" required=false %}
- {% include paper_item.html key="cogvideox-iclr25" required=true %}
- {% include paper_item.html key="moviegen-arxiv24" required=false %}
+ {% include paper_item.html key="approxcache-nsdi24" required="Diffusion Model Serving" %}
+ {% include paper_item.html key="diffserve-mlsys24" %}
+ {% include paper_item.html key="cogvideox-iclr25" required="Video Gen Model" %}
+ {% include paper_item.html key="moviegen-arxiv24" %}
 </ul>
 
 #### Retrieval-Augmented Generation