diff --git a/Gemma/Run_with_Ollama_Python.ipynb b/Gemma/Run_with_Ollama_Python.ipynb index 48967a6..1294a34 100644 --- a/Gemma/Run_with_Ollama_Python.ipynb +++ b/Gemma/Run_with_Ollama_Python.ipynb @@ -1,419 +1,419 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "j4qMNV_533ls" - }, - "source": [ - "##### Copyright 2024 Google LLC." - ] + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "j4qMNV_533ls" + }, + "source": [ + "##### Copyright 2024 Google LLC." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "cellView": "form", + "id": "urf-mQKk348O" + }, + "outputs": [], + "source": [ + "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iS5fT77w4ZNj" + }, + "source": [ + "# Gemma - Run with Ollama Python library\n", + "\n", + "Author: Sitam Meur\n", + "\n", + "* GitHub: [github.com/sitamgithub-MSIT](https://github.com/sitamgithub-MSIT/)\n", + "* X: [@sitammeur](https://x.com/sitammeur)\n", + "\n", + "Description: This notebook demonstrates how you can run inference on a Gemma model using [Ollama Python library](https://github.com/ollama/ollama-python). The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.\n", + "\n", + "\n", + " \n", + "
\n", + " Run in Google Colab\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FF6vOV_74aqj" + }, + "source": [ + "## Setup\n", + "\n", + "### Select the Colab runtime\n", + "To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:\n", + "\n", + "1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.\n", + "2. Select **Change runtime type**.\n", + "3. Under **Hardware accelerator**, select **T4 GPU**." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4tlnekw44gaq" + }, + "source": [ + "## Installation" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "AL7futP_4laS" + }, + "source": [ + "Install Ollama through the offical installation script." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "MuV7cWtcAoSV" + }, + "outputs": [], + "source": [ + "!curl -fsSL https://ollama.com/install.sh | sh" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "FpV183Rv6-1P" + }, + "source": [ + "Install Ollama Python library through the official Python client for Ollama." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0Mrj29SH-3OD" + }, + "outputs": [], + "source": [ + "!pip install -q ollama" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iNxJFvGIe48_" + }, + "source": [ + "## Start Ollama\n", + "\n", + "Start Ollama in background using nohup." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "5CX39xKVe9UN" + }, + "outputs": [], + "source": [ + "!nohup ollama serve > ollama.log &" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "6YfDqlyo46Rp" + }, + "source": [ + "## Prerequisites\n", + "\n", + "* Ollama should be installed and running. (This was already completed in previous steps.)\n", + "* Pull the gemma2 model to use with the library: `ollama pull gemma2:2b`\n", + " * See [Ollama.com](https://ollama.com/) for more information on the models available." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "oPU5dA1-B5Fn" + }, + "outputs": [], + "source": [ + "import ollama" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "NE1AWlucBza_" + }, + "outputs": [], + "source": [ + "ollama.pull('gemma2:2b')" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "KL5Kc6HaKjmF" + }, + "source": [ + "## Inference\n", + "\n", + "Run inference using Ollama Python library." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HN0nrhpmFUUB" + }, + "source": [ + "### Generate" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "AC5FsDQuFZfb" + }, + "outputs": [], + "source": [ + "import markdown\n", + "from ollama import generate\n", + "\n", + "# Generate a response to a prompt\n", + "response = generate(\"gemma2:2b\", \"Explain the process of photosynthesis.\")\n", + "print(response[\"response\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "p8v050JkFhuY" + }, + "source": [ + "#### Streaming Responses\n", + "\n", + "To enable response streaming, set `stream=True`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "jw8UD2v5Fms6" + }, + "outputs": [], + "source": [ + "# Stream the generated response\n", + "response = generate('gemma2:2b', 'Explain the process of photosynthesis.', stream=True)\n", + "\n", + "for part in response:\n", + " print(part['response'], end='', flush=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "utyVRlIYFvdm" + }, + "source": [ + "#### Async client\n", + "\n", + "To make asynchronous requests, use the `AsyncClient` class." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "JVvKQE0XF2kh" + }, + "outputs": [], + "source": [ + "import asyncio\n", + "import nest_asyncio\n", + "from ollama import AsyncClient\n", + "\n", + "nest_asyncio.apply()\n", + "\n", + "\n", + "async def generate():\n", + " \"\"\"\n", + " Asynchronously generates a response to a given prompt using the AsyncClient.\n", + "\n", + " This function creates an instance of AsyncClient and sends a request to generate\n", + " a response for the specified prompt. The response is then printed.\n", + " \"\"\"\n", + " # Create an instance of the AsyncClient\n", + " client = AsyncClient()\n", + "\n", + " # Send a request to generate a response to the prompt\n", + " response = await client.generate(\n", + " \"gemma2:2b\", \"Explain the process of photosynthesis.\"\n", + " )\n", + " print(response[\"response\"])\n", + "\n", + "# Run the generate function\n", + "asyncio.run(generate())" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "WFyi_zzWAwe7" + }, + "source": [ + "### Chat" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "UN20MoSv_S76" + }, + "outputs": [], + "source": [ + "from ollama import chat\n", + "\n", + "# Start a conversation with the model\n", + "messages = [\n", + " {\n", + " \"role\": \"user\",\n", + " \"content\": \"What is keras?\",\n", + " },\n", + "]\n", + "\n", + "# Get the model's response to the message\n", + "response = chat(\"gemma2:2b\", messages=messages)\n", + "print(response[\"message\"][\"content\"])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HZGsL1FI7kj8" + }, + "source": [ + "#### Streaming Responses\n", + "\n", + "To enable response streaming, set `stream=True`." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "UdxAuS1C7lm_" + }, + "outputs": [], + "source": [ + "# Stream the chat response\n", + "stream = chat(\n", + " model=\"gemma2:2b\",\n", + " messages=[{\"role\": \"user\", \"content\": \"What is keras?\"}],\n", + " stream=True,\n", + ")\n", + "\n", + "for chunk in stream:\n", + " print(chunk[\"message\"][\"content\"], end=\"\", flush=True)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "czWceNYOEizg" + }, + "source": [ + "#### Async client + Streaming\n", + "\n", + "To make asynchronous requests, use the `AsyncClient` class, and for streaming, use `stream=True`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "YJmd92z1IUVl" + }, + "outputs": [], + "source": [ + "import asyncio\n", + "import nest_asyncio\n", + "from ollama import AsyncClient\n", + "\n", + "nest_asyncio.apply()\n", + "\n", + "\n", + "async def chat():\n", + " \"\"\"\n", + " Asynchronously sends a chat message to the specified model and prints the response.\n", + "\n", + " This function sends a message with the role \"user\" and the content \"What is keras?\"\n", + " to the model \"gemma2:2b\" using the AsyncClient's chat method. The response is then streamed.\n", + " \"\"\"\n", + " # Define the message to send to the model\n", + " message = {\"role\": \"user\", \"content\": \"What is keras?\"}\n", + "\n", + " # Send the message to the model and print the response\n", + " async for part in await AsyncClient().chat(\n", + " model=\"gemma2:2b\", messages=[message], stream=True\n", + " ):\n", + " print(part[\"message\"][\"content\"], end=\"\", flush=True)\n", + "\n", + "# Run the chat function\n", + "asyncio.run(chat())" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "cDr8VzvGIXFC" + }, + "source": [ + "## Conclusion\n", + "\n", + "Congratulations! You have successfully run inference on a Gemma model using the Ollama Python library. You can now integrate this into your Python projects." + ] + } + ], + "metadata": { + "accelerator": "GPU", + "colab": { + "name": "Run_with_Ollama_Python.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "cellView": "form", - "id": "urf-mQKk348O" - }, - "outputs": [], - "source": [ - "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iS5fT77w4ZNj" - }, - "source": [ - "# Gemma - Run with Ollama Python library\n", - "\n", - "Author: Sitam Meur\n", - "\n", - "* GitHub: [github.com/sitamgithub-MSIT](https://github.com/sitamgithub-MSIT/)\n", - "* X: [@sitammeur](https://x.com/sitammeur)\n", - "\n", - "Description: This notebook demonstrates how you can run inference on a Gemma model using [Ollama Python library](https://github.com/ollama/ollama-python). The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.\n", - "\n", - "\n", - " \n", - "
\n", - " Run in Google Colab\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "FF6vOV_74aqj" - }, - "source": [ - "## Setup\n", - "\n", - "### Select the Colab runtime\n", - "To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:\n", - "\n", - "1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.\n", - "2. Select **Change runtime type**.\n", - "3. Under **Hardware accelerator**, select **T4 GPU**." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4tlnekw44gaq" - }, - "source": [ - "## Installation" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "AL7futP_4laS" - }, - "source": [ - "Install Ollama through the offical installation script." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "MuV7cWtcAoSV" - }, - "outputs": [], - "source": [ - "!curl -fsSL https://ollama.com/install.sh | sh" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "FpV183Rv6-1P" - }, - "source": [ - "Install Ollama Python library through the official Python client for Ollama." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0Mrj29SH-3OD" - }, - "outputs": [], - "source": [ - "!pip install -q ollama" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "iNxJFvGIe48_" - }, - "source": [ - "## Start Ollama\n", - "\n", - "Start Ollama in background using nohup." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "5CX39xKVe9UN" - }, - "outputs": [], - "source": [ - "!nohup ollama serve > ollama.log &" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "6YfDqlyo46Rp" - }, - "source": [ - "## Prerequisites\n", - "\n", - "* Ollama should be installed and running. (This was already completed in previous steps.)\n", - "* Pull the gemma2 model to use with the library: `ollama pull gemma2:2b`\n", - " * See [Ollama.com](https://ollama.com/) for more information on the models available." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "oPU5dA1-B5Fn" - }, - "outputs": [], - "source": [ - "import ollama" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "NE1AWlucBza_" - }, - "outputs": [], - "source": [ - "ollama.pull('gemma2:2b')" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "KL5Kc6HaKjmF" - }, - "source": [ - "## Inference\n", - "\n", - "Run inference using Ollama Python library." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HN0nrhpmFUUB" - }, - "source": [ - "### Generate" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "AC5FsDQuFZfb" - }, - "outputs": [], - "source": [ - "import markdown\n", - "from ollama import generate\n", - "\n", - "# Generate a response to a prompt\n", - "response = generate(\"gemma2:2b\", \"Explain the process of photosynthesis.\")\n", - "print(response[\"response\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "p8v050JkFhuY" - }, - "source": [ - "#### Streaming Responses\n", - "\n", - "To enable response streaming, set `stream=True`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "jw8UD2v5Fms6" - }, - "outputs": [], - "source": [ - "# Stream the generated response\n", - "response = generate('gemma2:2b', 'Explain the process of photosynthesis.', stream=True)\n", - "\n", - "for part in response:\n", - " print(part['response'], end='', flush=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "utyVRlIYFvdm" - }, - "source": [ - "#### Async client\n", - "\n", - "To make asynchronous requests, use the `AsyncClient` class." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "JVvKQE0XF2kh" - }, - "outputs": [], - "source": [ - "import asyncio\n", - "import nest_asyncio\n", - "from ollama import AsyncClient\n", - "\n", - "nest_asyncio.apply()\n", - "\n", - "\n", - "async def generate():\n", - " \"\"\"\n", - " Asynchronously generates a response to a given prompt using the AsyncClient.\n", - "\n", - " This function creates an instance of AsyncClient and sends a request to generate\n", - " a response for the specified prompt. The response is then printed.\n", - " \"\"\"\n", - " # Create an instance of the AsyncClient\n", - " client = AsyncClient()\n", - "\n", - " # Send a request to generate a response to the prompt\n", - " response = await client.generate(\n", - " \"gemma2:2b\", \"Explain the process of photosynthesis.\"\n", - " )\n", - " print(response[\"response\"])\n", - "\n", - "# Run the generate function\n", - "asyncio.run(generate())" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "WFyi_zzWAwe7" - }, - "source": [ - "### Chat" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UN20MoSv_S76" - }, - "outputs": [], - "source": [ - "from ollama import chat\n", - "\n", - "# Start a conversation with the model\n", - "messages = [\n", - " {\n", - " \"role\": \"user\",\n", - " \"content\": \"What is keras?\",\n", - " },\n", - "]\n", - "\n", - "# Get the model's response to the message\n", - "response = chat(\"gemma2:2b\", messages=messages)\n", - "print(response[\"message\"][\"content\"])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HZGsL1FI7kj8" - }, - "source": [ - "#### Streaming Responses\n", - "\n", - "To enable response streaming, set `stream=True`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "UdxAuS1C7lm_" - }, - "outputs": [], - "source": [ - "# Stream the chat response\n", - "stream = chat(\n", - " model=\"gemma2:2b\",\n", - " messages=[{\"role\": \"user\", \"content\": \"What is keras?\"}],\n", - " stream=True,\n", - ")\n", - "\n", - "for chunk in stream:\n", - " print(chunk[\"message\"][\"content\"], end=\"\", flush=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "czWceNYOEizg" - }, - "source": [ - "#### Async client + Streaming\n", - "\n", - "To make asynchronous requests, use the `AsyncClient` class, and for streaming, use `stream=True`." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "YJmd92z1IUVl" - }, - "outputs": [], - "source": [ - "import asyncio\n", - "import nest_asyncio\n", - "from ollama import AsyncClient\n", - "\n", - "nest_asyncio.apply()\n", - "\n", - "\n", - "async def chat():\n", - " \"\"\"\n", - " Asynchronously sends a chat message to the specified model and prints the response.\n", - "\n", - " This function sends a message with the role \"user\" and the content \"What is keras?\"\n", - " to the model \"gemma2:2b\" using the AsyncClient's chat method. The response is then streamed.\n", - " \"\"\"\n", - " # Define the message to send to the model\n", - " message = {\"role\": \"user\", \"content\": \"What is keras?\"}\n", - "\n", - " # Send the message to the model and print the response\n", - " async for part in await AsyncClient().chat(\n", - " model=\"gemma2:2b\", messages=[message], stream=True\n", - " ):\n", - " print(part[\"message\"][\"content\"], end=\"\", flush=True)\n", - "\n", - "# Run the chat function\n", - "asyncio.run(chat())" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "cDr8VzvGIXFC" - }, - "source": [ - "## Conclusion\n", - "\n", - "Congratulations! You have successfully run inference on a Gemma model using the Ollama Python library. You can now integrate this into your Python projects." - ] - } - ], - "metadata": { - "accelerator": "GPU", - "colab": { - "name": "Run_with_Ollama_Python.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 + "nbformat": 4, + "nbformat_minor": 0 }