diff --git a/Gemma/Run_with_Ollama_Python.ipynb b/Gemma/Run_with_Ollama_Python.ipynb
index 48967a6..1294a34 100644
--- a/Gemma/Run_with_Ollama_Python.ipynb
+++ b/Gemma/Run_with_Ollama_Python.ipynb
@@ -1,419 +1,419 @@
{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "j4qMNV_533ls"
- },
- "source": [
- "##### Copyright 2024 Google LLC."
- ]
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "j4qMNV_533ls"
+ },
+ "source": [
+ "##### Copyright 2024 Google LLC."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "cellView": "form",
+ "id": "urf-mQKk348O"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iS5fT77w4ZNj"
+ },
+ "source": [
+ "# Gemma - Run with Ollama Python library\n",
+ "\n",
+ "Author: Sitam Meur\n",
+ "\n",
+ "* GitHub: [github.com/sitamgithub-MSIT](https://github.com/sitamgithub-MSIT/)\n",
+ "* X: [@sitammeur](https://x.com/sitammeur)\n",
+ "\n",
+ "Description: This notebook demonstrates how you can run inference on a Gemma model using [Ollama Python library](https://github.com/ollama/ollama-python). The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.\n",
+ "\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FF6vOV_74aqj"
+ },
+ "source": [
+ "## Setup\n",
+ "\n",
+ "### Select the Colab runtime\n",
+ "To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:\n",
+ "\n",
+ "1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.\n",
+ "2. Select **Change runtime type**.\n",
+ "3. Under **Hardware accelerator**, select **T4 GPU**."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4tlnekw44gaq"
+ },
+ "source": [
+ "## Installation"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "AL7futP_4laS"
+ },
+ "source": [
+ "Install Ollama through the offical installation script."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "MuV7cWtcAoSV"
+ },
+ "outputs": [],
+ "source": [
+ "!curl -fsSL https://ollama.com/install.sh | sh"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "FpV183Rv6-1P"
+ },
+ "source": [
+ "Install Ollama Python library through the official Python client for Ollama."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "0Mrj29SH-3OD"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install -q ollama"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "iNxJFvGIe48_"
+ },
+ "source": [
+ "## Start Ollama\n",
+ "\n",
+ "Start Ollama in background using nohup."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "5CX39xKVe9UN"
+ },
+ "outputs": [],
+ "source": [
+ "!nohup ollama serve > ollama.log &"
+ ]
+ },
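+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The server may need a moment to come up. The cell below is a minimal readiness check that polls Ollama's default local endpoint (`http://127.0.0.1:11434`) before moving on; adjust the address if you run Ollama elsewhere."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import time\n",
+ "import urllib.request\n",
+ "\n",
+ "# Poll the default local Ollama endpoint until the background server responds,\n",
+ "# so the following cells don't race the `ollama serve` process.\n",
+ "for _ in range(30):\n",
+ "    try:\n",
+ "        urllib.request.urlopen('http://127.0.0.1:11434')\n",
+ "        print('Ollama server is ready.')\n",
+ "        break\n",
+ "    except OSError:\n",
+ "        time.sleep(1)"
+ ]
+ },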
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "6YfDqlyo46Rp"
+ },
+ "source": [
+ "## Prerequisites\n",
+ "\n",
+ "* Ollama should be installed and running. (This was already completed in previous steps.)\n",
+ "* Pull the gemma2 model to use with the library: `ollama pull gemma2:2b`\n",
+ " * See [Ollama.com](https://ollama.com/) for more information on the models available."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "oPU5dA1-B5Fn"
+ },
+ "outputs": [],
+ "source": [
+ "import ollama"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NE1AWlucBza_"
+ },
+ "outputs": [],
+ "source": [
+ "ollama.pull('gemma2:2b')"
+ ]
+ },
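+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a quick sanity check, you can list the models available locally to confirm the pull succeeded. The exact response shape can vary between `ollama-python` versions, so the cell below simply prints it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Optional check: list the locally available models to confirm the pull succeeded.\n",
+ "print(ollama.list())"
+ ]
+ },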
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KL5Kc6HaKjmF"
+ },
+ "source": [
+ "## Inference\n",
+ "\n",
+ "Run inference using Ollama Python library."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "HN0nrhpmFUUB"
+ },
+ "source": [
+ "### Generate"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "AC5FsDQuFZfb"
+ },
+ "outputs": [],
+ "source": [
+ "import markdown\n",
+ "from ollama import generate\n",
+ "\n",
+ "# Generate a response to a prompt\n",
+ "response = generate(\"gemma2:2b\", \"Explain the process of photosynthesis.\")\n",
+ "print(response[\"response\"])"
+ ]
+ },
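+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "You can also pass sampling options to `generate` through its `options` argument. The sketch below lowers the `temperature` and caps the response length with `num_predict`; these follow the standard Ollama model options, and the values here are only illustrative."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Pass sampling options (lower temperature, capped response length) via `options`.\n",
+ "response = generate(\n",
+ "    'gemma2:2b',\n",
+ "    'Explain the process of photosynthesis.',\n",
+ "    options={'temperature': 0.2, 'num_predict': 200},\n",
+ ")\n",
+ "print(response['response'])"
+ ]
+ },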
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "p8v050JkFhuY"
+ },
+ "source": [
+ "#### Streaming Responses\n",
+ "\n",
+ "To enable response streaming, set `stream=True`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "jw8UD2v5Fms6"
+ },
+ "outputs": [],
+ "source": [
+ "# Stream the generated response\n",
+ "response = generate('gemma2:2b', 'Explain the process of photosynthesis.', stream=True)\n",
+ "\n",
+ "for part in response:\n",
+ " print(part['response'], end='', flush=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "utyVRlIYFvdm"
+ },
+ "source": [
+ "#### Async client\n",
+ "\n",
+ "To make asynchronous requests, use the `AsyncClient` class."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "JVvKQE0XF2kh"
+ },
+ "outputs": [],
+ "source": [
+ "import asyncio\n",
+ "import nest_asyncio\n",
+ "from ollama import AsyncClient\n",
+ "\n",
+ "nest_asyncio.apply()\n",
+ "\n",
+ "\n",
+ "async def generate():\n",
+ " \"\"\"\n",
+ " Asynchronously generates a response to a given prompt using the AsyncClient.\n",
+ "\n",
+ " This function creates an instance of AsyncClient and sends a request to generate\n",
+ " a response for the specified prompt. The response is then printed.\n",
+ " \"\"\"\n",
+ " # Create an instance of the AsyncClient\n",
+ " client = AsyncClient()\n",
+ "\n",
+ " # Send a request to generate a response to the prompt\n",
+ " response = await client.generate(\n",
+ " \"gemma2:2b\", \"Explain the process of photosynthesis.\"\n",
+ " )\n",
+ " print(response[\"response\"])\n",
+ "\n",
+ "# Run the generate function\n",
+ "asyncio.run(generate())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "WFyi_zzWAwe7"
+ },
+ "source": [
+ "### Chat"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UN20MoSv_S76"
+ },
+ "outputs": [],
+ "source": [
+ "from ollama import chat\n",
+ "\n",
+ "# Start a conversation with the model\n",
+ "messages = [\n",
+ " {\n",
+ " \"role\": \"user\",\n",
+ " \"content\": \"What is keras?\",\n",
+ " },\n",
+ "]\n",
+ "\n",
+ "# Get the model's response to the message\n",
+ "response = chat(\"gemma2:2b\", messages=messages)\n",
+ "print(response[\"message\"][\"content\"])"
+ ]
+ },
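+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "The `chat` call is stateless, so to hold a multi-turn conversation you append the model's reply and your next question to the `messages` list before calling it again. The follow-up question below is only an illustrative example."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Keep the model's reply in the history, then ask a follow-up question\n",
+ "# with the full message list so the model sees the whole conversation.\n",
+ "messages.append(response['message'])\n",
+ "messages.append({'role': 'user', 'content': 'How is it different from TensorFlow?'})\n",
+ "\n",
+ "follow_up = chat('gemma2:2b', messages=messages)\n",
+ "print(follow_up['message']['content'])"
+ ]
+ },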
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "HZGsL1FI7kj8"
+ },
+ "source": [
+ "#### Streaming Responses\n",
+ "\n",
+ "To enable response streaming, set `stream=True`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UdxAuS1C7lm_"
+ },
+ "outputs": [],
+ "source": [
+ "# Stream the chat response\n",
+ "stream = chat(\n",
+ " model=\"gemma2:2b\",\n",
+ " messages=[{\"role\": \"user\", \"content\": \"What is keras?\"}],\n",
+ " stream=True,\n",
+ ")\n",
+ "\n",
+ "for chunk in stream:\n",
+ " print(chunk[\"message\"][\"content\"], end=\"\", flush=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "czWceNYOEizg"
+ },
+ "source": [
+ "#### Async client + Streaming\n",
+ "\n",
+ "To make asynchronous requests, use the `AsyncClient` class, and for streaming, use `stream=True`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "YJmd92z1IUVl"
+ },
+ "outputs": [],
+ "source": [
+ "import asyncio\n",
+ "import nest_asyncio\n",
+ "from ollama import AsyncClient\n",
+ "\n",
+ "nest_asyncio.apply()\n",
+ "\n",
+ "\n",
+ "async def chat():\n",
+ " \"\"\"\n",
+ " Asynchronously sends a chat message to the specified model and prints the response.\n",
+ "\n",
+ " This function sends a message with the role \"user\" and the content \"What is keras?\"\n",
+ " to the model \"gemma2:2b\" using the AsyncClient's chat method. The response is then streamed.\n",
+ " \"\"\"\n",
+ " # Define the message to send to the model\n",
+ " message = {\"role\": \"user\", \"content\": \"What is keras?\"}\n",
+ "\n",
+ " # Send the message to the model and print the response\n",
+ " async for part in await AsyncClient().chat(\n",
+ " model=\"gemma2:2b\", messages=[message], stream=True\n",
+ " ):\n",
+ " print(part[\"message\"][\"content\"], end=\"\", flush=True)\n",
+ "\n",
+ "# Run the chat function\n",
+ "asyncio.run(chat())"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cDr8VzvGIXFC"
+ },
+ "source": [
+ "## Conclusion\n",
+ "\n",
+ "Congratulations! You have successfully run inference on a Gemma model using the Ollama Python library. You can now integrate this into your Python projects."
+ ]
+ }
+ ],
+ "metadata": {
+ "accelerator": "GPU",
+ "colab": {
+ "name": "Run_with_Ollama_Python.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
},
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "cellView": "form",
- "id": "urf-mQKk348O"
- },
- "outputs": [],
- "source": [
- "# @title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
- "# you may not use this file except in compliance with the License.\n",
- "# You may obtain a copy of the License at\n",
- "#\n",
- "# https://www.apache.org/licenses/LICENSE-2.0\n",
- "#\n",
- "# Unless required by applicable law or agreed to in writing, software\n",
- "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
- "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
- "# See the License for the specific language governing permissions and\n",
- "# limitations under the License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "iS5fT77w4ZNj"
- },
- "source": [
- "# Gemma - Run with Ollama Python library\n",
- "\n",
- "Author: Sitam Meur\n",
- "\n",
- "* GitHub: [github.com/sitamgithub-MSIT](https://github.com/sitamgithub-MSIT/)\n",
- "* X: [@sitammeur](https://x.com/sitammeur)\n",
- "\n",
- "Description: This notebook demonstrates how you can run inference on a Gemma model using [Ollama Python library](https://github.com/ollama/ollama-python). The Ollama Python library provides the easiest way to integrate Python 3.8+ projects with Ollama.\n",
- "\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FF6vOV_74aqj"
- },
- "source": [
- "## Setup\n",
- "\n",
- "### Select the Colab runtime\n",
- "To complete this tutorial, you'll need to have a Colab runtime with sufficient resources to run the Gemma model. In this case, you can use a T4 GPU:\n",
- "\n",
- "1. In the upper-right of the Colab window, select **▾ (Additional connection options)**.\n",
- "2. Select **Change runtime type**.\n",
- "3. Under **Hardware accelerator**, select **T4 GPU**."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "4tlnekw44gaq"
- },
- "source": [
- "## Installation"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "AL7futP_4laS"
- },
- "source": [
- "Install Ollama through the offical installation script."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "MuV7cWtcAoSV"
- },
- "outputs": [],
- "source": [
- "!curl -fsSL https://ollama.com/install.sh | sh"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "FpV183Rv6-1P"
- },
- "source": [
- "Install Ollama Python library through the official Python client for Ollama."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "0Mrj29SH-3OD"
- },
- "outputs": [],
- "source": [
- "!pip install -q ollama"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "iNxJFvGIe48_"
- },
- "source": [
- "## Start Ollama\n",
- "\n",
- "Start Ollama in background using nohup."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "5CX39xKVe9UN"
- },
- "outputs": [],
- "source": [
- "!nohup ollama serve > ollama.log &"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "6YfDqlyo46Rp"
- },
- "source": [
- "## Prerequisites\n",
- "\n",
- "* Ollama should be installed and running. (This was already completed in previous steps.)\n",
- "* Pull the gemma2 model to use with the library: `ollama pull gemma2:2b`\n",
- " * See [Ollama.com](https://ollama.com/) for more information on the models available."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "oPU5dA1-B5Fn"
- },
- "outputs": [],
- "source": [
- "import ollama"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "NE1AWlucBza_"
- },
- "outputs": [],
- "source": [
- "ollama.pull('gemma2:2b')"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "KL5Kc6HaKjmF"
- },
- "source": [
- "## Inference\n",
- "\n",
- "Run inference using Ollama Python library."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "HN0nrhpmFUUB"
- },
- "source": [
- "### Generate"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "AC5FsDQuFZfb"
- },
- "outputs": [],
- "source": [
- "import markdown\n",
- "from ollama import generate\n",
- "\n",
- "# Generate a response to a prompt\n",
- "response = generate(\"gemma2:2b\", \"Explain the process of photosynthesis.\")\n",
- "print(response[\"response\"])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "p8v050JkFhuY"
- },
- "source": [
- "#### Streaming Responses\n",
- "\n",
- "To enable response streaming, set `stream=True`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "jw8UD2v5Fms6"
- },
- "outputs": [],
- "source": [
- "# Stream the generated response\n",
- "response = generate('gemma2:2b', 'Explain the process of photosynthesis.', stream=True)\n",
- "\n",
- "for part in response:\n",
- " print(part['response'], end='', flush=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "utyVRlIYFvdm"
- },
- "source": [
- "#### Async client\n",
- "\n",
- "To make asynchronous requests, use the `AsyncClient` class."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "JVvKQE0XF2kh"
- },
- "outputs": [],
- "source": [
- "import asyncio\n",
- "import nest_asyncio\n",
- "from ollama import AsyncClient\n",
- "\n",
- "nest_asyncio.apply()\n",
- "\n",
- "\n",
- "async def generate():\n",
- " \"\"\"\n",
- " Asynchronously generates a response to a given prompt using the AsyncClient.\n",
- "\n",
- " This function creates an instance of AsyncClient and sends a request to generate\n",
- " a response for the specified prompt. The response is then printed.\n",
- " \"\"\"\n",
- " # Create an instance of the AsyncClient\n",
- " client = AsyncClient()\n",
- "\n",
- " # Send a request to generate a response to the prompt\n",
- " response = await client.generate(\n",
- " \"gemma2:2b\", \"Explain the process of photosynthesis.\"\n",
- " )\n",
- " print(response[\"response\"])\n",
- "\n",
- "# Run the generate function\n",
- "asyncio.run(generate())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "WFyi_zzWAwe7"
- },
- "source": [
- "### Chat"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "UN20MoSv_S76"
- },
- "outputs": [],
- "source": [
- "from ollama import chat\n",
- "\n",
- "# Start a conversation with the model\n",
- "messages = [\n",
- " {\n",
- " \"role\": \"user\",\n",
- " \"content\": \"What is keras?\",\n",
- " },\n",
- "]\n",
- "\n",
- "# Get the model's response to the message\n",
- "response = chat(\"gemma2:2b\", messages=messages)\n",
- "print(response[\"message\"][\"content\"])"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "HZGsL1FI7kj8"
- },
- "source": [
- "#### Streaming Responses\n",
- "\n",
- "To enable response streaming, set `stream=True`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "UdxAuS1C7lm_"
- },
- "outputs": [],
- "source": [
- "# Stream the chat response\n",
- "stream = chat(\n",
- " model=\"gemma2:2b\",\n",
- " messages=[{\"role\": \"user\", \"content\": \"What is keras?\"}],\n",
- " stream=True,\n",
- ")\n",
- "\n",
- "for chunk in stream:\n",
- " print(chunk[\"message\"][\"content\"], end=\"\", flush=True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "czWceNYOEizg"
- },
- "source": [
- "#### Async client + Streaming\n",
- "\n",
- "To make asynchronous requests, use the `AsyncClient` class, and for streaming, use `stream=True`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "YJmd92z1IUVl"
- },
- "outputs": [],
- "source": [
- "import asyncio\n",
- "import nest_asyncio\n",
- "from ollama import AsyncClient\n",
- "\n",
- "nest_asyncio.apply()\n",
- "\n",
- "\n",
- "async def chat():\n",
- " \"\"\"\n",
- " Asynchronously sends a chat message to the specified model and prints the response.\n",
- "\n",
- " This function sends a message with the role \"user\" and the content \"What is keras?\"\n",
- " to the model \"gemma2:2b\" using the AsyncClient's chat method. The response is then streamed.\n",
- " \"\"\"\n",
- " # Define the message to send to the model\n",
- " message = {\"role\": \"user\", \"content\": \"What is keras?\"}\n",
- "\n",
- " # Send the message to the model and print the response\n",
- " async for part in await AsyncClient().chat(\n",
- " model=\"gemma2:2b\", messages=[message], stream=True\n",
- " ):\n",
- " print(part[\"message\"][\"content\"], end=\"\", flush=True)\n",
- "\n",
- "# Run the chat function\n",
- "asyncio.run(chat())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "cDr8VzvGIXFC"
- },
- "source": [
- "## Conclusion\n",
- "\n",
- "Congratulations! You have successfully run inference on a Gemma model using the Ollama Python library. You can now integrate this into your Python projects."
- ]
- }
- ],
- "metadata": {
- "accelerator": "GPU",
- "colab": {
- "name": "Run_with_Ollama_Python.ipynb",
- "toc_visible": true
- },
- "kernelspec": {
- "display_name": "Python 3",
- "name": "python3"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
+ "nbformat": 4,
+ "nbformat_minor": 0
}