This repository is a template for getting started with PyTorch on
Intel Gaudi accelerators. It is based on the ngc service of the
Cresset template.
The template is mostly targeted towards researchers who must change their development environment frequently and customize many aspects.
1. Install the Docker engine on the host machine. Instructions for Ubuntu hosts
2. Run make install-compose to install Docker Compose if necessary.
3. Install the Habana Docker Runtime on the host.
   3-1. Visit this link for installation instructions.
   3-2. Go to "Install Using Containers" and install according to the host platform.
- Press the green "Use this Template" button to create a new repository. Clone the newly created repository to your host.
- Run make env to create a .env file. This need only be done once per directory.
- Run make build to build the Docker image and start the container. Run this command whenever you wish to rebuild the Docker image.
- Run make up if the image has already been built and you do not wish to rebuild. Note that this deletes the previous container, discarding any changes made inside it.
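
The choice between the make targets above can be sketched as a small shell helper. This is only an illustration: `start_container` and the `MAKE_CMD` variable are hypothetical names that are not part of the template, and the sketch assumes it is run from the directory containing the Makefile.

```shell
# Hypothetical helper, not part of the template.
# MAKE_CMD defaults to "make"; set MAKE_CMD="echo make" for a dry run.
start_container() {
    rebuild="${1:-no}"
    make_cmd="${MAKE_CMD:-make}"

    # make env only needs to run once per directory, so skip it when .env exists.
    if [ ! -f .env ]; then
        $make_cmd env || return 1
    fi

    if [ "$rebuild" = "yes" ]; then
        # Build (or rebuild) the image and start the container.
        $make_cmd build
    else
        # Reuse the existing image; the previous container is replaced.
        $make_cmd up
    fi
}
```

Calling `start_container yes` requests a rebuild, while `start_container` with no argument only restarts the container from the existing image.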
The docker-compose.yaml file is responsible for configuring the build and runtime environments.
It reads the .env file to fetch variables and uses default values if the variables are not available in the .env file.
Where possible, configure variables by editing the .env file rather than editing the docker-compose.yaml file directly.
For example, to update the Synapse AI version for the project,
add the following lines to the .env file.
PYTORCH_VERSION=2.3.1
SYNAPSE_VERSION=1.17.1
Visit https://developer.habana.ai/catalog/pytorch-container for available Intel Gaudi PyTorch images.
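
If the .env file is updated from a script rather than by hand, a helper along the following lines may be useful. It is a minimal sketch: `set_env_var` is a hypothetical function that is not part of the template, and it assumes the .env file holds simple KEY=VALUE lines. Running the block as written updates the .env file in the current directory.

```shell
# Hypothetical helper: set KEY=VALUE in a dotenv file,
# replacing any existing entry for the same key.
set_env_var() {
    file="$1"; key="$2"; value="$3"
    touch "$file"
    # Drop any previous assignment of the key (grep exits non-zero
    # when nothing is left, which is fine here), then append the new one.
    grep -v "^${key}=" "$file" > "${file}.tmp" || true
    mv "${file}.tmp" "$file"
    printf '%s=%s\n' "$key" "$value" >> "$file"
}

set_env_var .env PYTORCH_VERSION 2.3.1
set_env_var .env SYNAPSE_VERSION 1.17.1
```

Because an existing entry is removed before the new one is appended, running the helper repeatedly leaves a single line per variable.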
- Run make over to add a new directory to the container.
- Edit the docker-compose.override.yaml file generated by the make over command to mount desired host directories to the container. Do not edit the docker-compose.yaml file if possible.
- Run make exec to enter an existing Docker container.
- Run tmux or related commands once inside the container to prevent interruptions from disconnects.
- Use ^d (Ctrl+D) to exit from the container. This will not stop the container, and tmux shells will continue to run.
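
One way to make reconnecting convenient is to attach to a named tmux session if it already exists and create it otherwise. The sketch below assumes tmux is available inside the container; `tmux_session` and the default session name `main` are hypothetical choices, not part of the template.

```shell
# Hypothetical helper to run inside the container after make exec.
# The session name "main" is an arbitrary default.
tmux_session() {
    name="${1:-main}"
    if tmux has-session -t "$name" 2>/dev/null; then
        # A session with this name survived an earlier disconnect: reattach.
        tmux attach-session -t "$name"
    else
        # No such session yet: start a new one.
        tmux new-session -s "$name"
    fi
}
```

Detaching with the tmux prefix key followed by d (Ctrl+B, then d, with default key bindings) leaves the session running for later reattachment.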
- To add apt packages to your project, edit the apt.requirements.txt file.
- To add conda or pip packages to your project, edit the environments.yaml file.
- Edit the pip.uninstalls.txt file to remove pre-installed pip packages on the image.
Note that pip packages installed on system Python have higher priority than
user-installed pip packages by design.
Use the pip.uninstalls.txt file to remove system pip packages if
custom versions of those packages are required.
To use the template inside an existing project, first clone this repository into a folder inside that project.
Then append HOST_ROOT=.. to the .env file to configure the parent directory as the root of the project on the host.
Other configurations using relative paths, such as HOST_ROOT=../..
for the parent of the parent directory, are also possible.
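
To check which directory a relative HOST_ROOT value points to before writing it to the .env file, the path can be resolved by hand. The helper below is a sketch under the assumption that relative paths are interpreted relative to the directory containing the template's docker-compose.yaml file; `resolve_host_root` is a hypothetical name, not part of the template.

```shell
# Hypothetical helper: print the absolute directory that a relative
# HOST_ROOT value refers to, starting from the template directory
# (second argument, defaulting to the current directory).
resolve_host_root() {
    host_root="$1"
    template_dir="${2:-.}"
    # The subshell keeps the caller's working directory unchanged.
    (cd "$template_dir" && cd "$host_root" && pwd)
}
```

For example, running `resolve_host_root ..` from the template folder prints its parent directory, and `resolve_host_root ../..` prints the parent of the parent.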