diff --git a/finetuning/StarCoder2/inference.ipynb b/finetuning/StarCoder2/inference.ipynb
index ef11dc162..09a8cb6b3 100644
--- a/finetuning/StarCoder2/inference.ipynb
+++ b/finetuning/StarCoder2/inference.ipynb
@@ -11,7 +11,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
- "In the previous [notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/finetuning/StarCoder2/lora.ipynb), we show how to parameter efficiently finetune StarCoder2 model with a custom code (instruction, completion) pair dataset. We choose LoRA as our PEFT algorithnm and finetune for 50 interations. In this notebook, the goal is to demonstrate how to compile fintuned .nemo model into optimized TensorRT-LLM engines. The converted model engine can perform accelerated inference locally or be deployed to Triton Inference Server."
+ "In the previous [notebook](https://github.com/NVIDIA/GenerativeAIExamples/blob/main/finetuning/StarCoder2/lora.ipynb), we show how to parameter efficiently finetune StarCoder2 model with a custom code (instruction, completion) pair dataset. We choose LoRA as our PEFT algorithnm and finetune for 50 iterations. In this notebook, the goal is to demonstrate how to compile fintuned .nemo model into optimized TensorRT-LLM engines. The converted model engine can perform accelerated inference locally or be deployed to Triton Inference Server."
 ]
 },
 {