### 1. Introduction:
Using Python extensions with ARM64 Python gives better performance when building GUI apps for Windows on Snapdragon (WoS) platforms. The Python 3.12.6 ARM64 version supports the following modules: PyQt6, OpenCV, NumPy, PyTorch*, Torchvision*, ONNX*, ONNX Runtime*. Developers can design apps that benefit from the rich Python ecosystem. <br>

**PyTorch, Torchvision, ONNX, ONNX Runtime: these currently need to be compiled from source code.* <br>
### 2. Python and common python extensions:
Get the ARM64 version 'python-3.12.6-arm64.exe' from the link below and install it. Make sure to add Python to your PATH environment variable.
If you need these Python extensions for ARM64 Python, you currently have to compile them yourself. If you need help compiling them, please contact us.
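Once installed, you can confirm that the ARM64 build is the interpreter on your PATH. This is a quick sanity check, not part of the official setup; it only reports what the active interpreter was built for (on WoS it is expected to print 'ARM64' and a 64-bit build):

```python
import platform
import struct

# Report the architecture of the interpreter currently on PATH.
# With python-3.12.6-arm64.exe installed on WoS, platform.machine()
# should report 'ARM64' and the build should be 64-bit.
arch = platform.machine()
bits = struct.calcsize("P") * 8  # pointer size in bits
print(f"Interpreter architecture: {arch}, {bits}-bit build")
```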
### 4. MSVC library:
You need the ARM64 version of 'msvcp140.dll' from the 'Microsoft Visual C++ 2022 Redistributable (Arm64)'. You can download and install it from here:
https://aka.ms/arm64previewredist/
### 5. Notes: <br>
a. For C++ (Visual Studio) projects, you need to set 'Runtime Library' to 'Multi-threaded DLL (/MD)'. Please refer to the link below for detailed information:
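If your C++ project is driven by CMake rather than a raw Visual Studio project, the same runtime-library choice can be made in CMakeLists.txt. A minimal sketch (the project and target names are placeholders; `CMAKE_MSVC_RUNTIME_LIBRARY` requires CMake 3.15 or newer):

```cmake
cmake_minimum_required(VERSION 3.15)
project(MyWoSApp CXX)

# Select the dynamic MSVC runtime: /MD for Release, /MDd for Debug.
set(CMAKE_MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>DLL")

add_executable(MyWoSApp main.cpp)
```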
*[python.md](python.md) can help set up the x64 Python environment automatically.
* On the WoS platform, ARM64 Python has better performance, but some Python extensions such as 'PyTorch' don't work with ARM64 Python today. For detailed help on setting up an environment that uses ARM64 Python, refer to [python_arm64.md](python_arm64.md)
## API from AppBuilder Python binding extension for Python projects.<br>
There are several Python classes provided by this extension:
This sample helps developers use C++ to build a Genie-based OpenAI-compatible API service on Windows on Snapdragon (WoS), Mobile and Linux platforms.
# GenieAPIService
Genie OpenAI Compatible API Service.
This is an OpenAI compatible API service that can be used to access the Genie AI model.
This service can be used on multiple platforms such as Android, Windows, Linux, etc.
# Source code
## Service:
The code under this folder is the C++ implementation of the service. It can be compiled for Windows, Android and Linux targets.
## Android:
The code under this folder is an Android app which can be used to launch the service on an Android device.
## Build Service from source code:
### Build GenieAPIServer for WoS:<br>
Install the Qualcomm® AI Runtime SDK, CMake, Visual Studio, etc., before you compile this service.<br>
```
Set QNN_SDK_ROOT=C:\Qualcomm\AIStack\QAIRT\2.34.0.250424\
cd samples\genie\c++\Service
mkdir build && cd build
cmake -S .. -B . -A ARM64
cmake --build . --config Release
```
### Build GenieAPIServer for Android: <br>
Install the Qualcomm® AI Runtime SDK, Android NDK, etc., before you compile this service.<br>
```
Set QNN_SDK_ROOT=C:\Qualcomm\AIStack\QAIRT\2.34.0.250424\
set PATH=%PATH%;C:\Programs\android-ndk-r26d\toolchains\llvm\prebuilt\windows-x86_64\bin
```
1. [Setup LLM models](https://github.com/quic/ai-engine-direct-helper/tree/main/samples/genie/python#step-3-download-models-and-tokenizer-files) first before running this service. <br>
2. Download the [GenieAPIService binary](https://github.com/quic/ai-engine-direct-helper/releases/download/v2.34.0/GenieAPIService_2.34.zip) and copy the subdirectory "GenieAPIService" to the path "ai-engine-direct-helper\samples".<br>
3. Run the following commands to launch the service (do *not* close this terminal window while the service is running).
### Run the service on Mobile (Snapdragon® 8 Elite Mobile device): <br>
1. Copy the subdirectory "GenieModels" from the "Android" folder of the [GenieAPIService binary](https://github.com/quic/ai-engine-direct-helper/releases/download/v2.34.0/GenieAPIService_2.34.zip) to the root path of the mobile sdcard.<br>
2. Copy your Qwen QNN model & tokenizer.json to "/sdcard/GenieModels/qwen2.0_7b".<br>
3. Modify the config file "/sdcard/GenieModels/qwen2.0_7b/config.json" if necessary.<br>
4. Install the GenieAPIService.apk on the mobile device and start it.<br>
* You can also try other models such as [IBM-Granite-v3.1-8B-Instruct](https://aihub.qualcomm.com/mobile/models/ibm_granite_v3_1_8b_instruct?domain=Generative+AI&useCase=Text+Generation), which is built for the "Snapdragon® 8 Elite Mobile" device. Create a subdirectory under "/sdcard/GenieModels/" for your model and customize its "config.json".
## Client Usage:
The service can be accessed at 'localhost:8910'; it is compatible with the OpenAI API.
Here is a Python client sample (you can run this client in a new terminal window):
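If the shipped GenieAPIClient.py is not at hand, a bare-bones client can be sketched with only the Python standard library. Note the assumptions here: the '/v1/chat/completions' route follows the OpenAI convention, and 'qwen2.0_7b' matches the model directory set up above; adjust both for your deployment.

```python
import json
import urllib.request

SERVICE_URL = "http://localhost:8910/v1/chat/completions"  # assumed OpenAI-style route

def build_chat_request(prompt, model="qwen2.0_7b"):
    """Build an OpenAI-style chat completion payload (model name is an assumption)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def ask(prompt):
    """Send one prompt to the running service and return the reply text."""
    data = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        SERVICE_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]

# With the service running, e.g.: print(ask("How to fish?"))
```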
Download the files for the [AI-Hub LLM models](https://github.com/quic/ai-engine-direct-helper/tree/main/samples/genie/python#ai-hub-llm-models) listed at the end of this page and save them to the following path. You need to unzip the 'weight_sharing_model_N_of_N.serialized.bin' files from the model package to the following path, and also copy the corresponding 'tokenizer.json' file to that directory.
If you want to change the relative path of the directory where the model files are located, you also need to modify the "config.json" file in the model's directory so that the tokenizer.json, htp_backend_ext_config.json and model file paths set in the configuration can be resolved correctly.
* You can also use your own QNN LLM model (if you have one). You can create a subdirectory in the path "ai-engine-direct-helper\samples\genie\python\models\" for your model and customize the "config.json" for your model. Then use your model name in the client application.
### Step 4: Switch to samples directory
Run the following commands in a Windows terminal:
```
cd ai-engine-direct-helper\samples
```
### Step 5: Run service
Run the following command to launch the Genie API Service (do *not* close this terminal window while the service is running):
```
python genie\python\GenieAPIService.py
```
### Step 6: Now you can access the API service
The default address for this API service is 'localhost:8910'; access this address from the client app.
You can try the following commands to generate text or images (run these Python clients in a new terminal window):
```
python genie\python\GenieAPIClient.py --prompt "How to fish?" --stream
python genie\python\GenieAPIClientImage.py --prompt "spectacular view of northern lights from Alaska"
```
When you run a client, you can see the current status of request processing on the server side. The first time you run an image generation request, the server may need to download the Stable Diffusion model from AI-Hub, which can take a long time.
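For reference, the image request can be sketched the same way as the chat request. This assumes the service mirrors the OpenAI images convention ('/v1/images/generations' returning base64 data); GenieAPIClientImage.py remains the authoritative client, so treat the route and field names here as assumptions.

```python
import json
import urllib.request

IMAGE_URL = "http://localhost:8910/v1/images/generations"  # assumed OpenAI-style route

def build_image_request(prompt, size="512x512"):
    """Build an OpenAI-style image generation payload (field names are assumptions)."""
    return {
        "prompt": prompt,
        "n": 1,
        "size": size,
        "response_format": "b64_json",
    }

# With the service running:
# data = json.dumps(build_image_request("northern lights over Alaska")).encode("utf-8")
# req = urllib.request.Request(IMAGE_URL, data=data,
#                              headers={"Content-Type": "application/json"})
# with urllib.request.urlopen(req) as resp:
#     image_b64 = json.load(resp)["data"][0]["b64_json"]
```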