Commit 3d362bd

Add hyperparameter optimization tutorial
1 parent d6184db commit 3d362bd

1 file changed: 177 additions and 0 deletions
@@ -0,0 +1,177 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"id": "93eda882-3c30-46cf-a33c-0a6f29ca18b4",
6+
"metadata": {},
7+
"source": [
8+
"# Hyperparameter Optimization with Optuna in cellmaps_vnn\n",
9+
"\n",
10+
"This tutorial shows how to define a training configuration with search spaces for Optuna, run training using the config file, and then use the resulting optimized parameters for prediction."
11+
]
12+
},
13+
{
14+
"cell_type": "markdown",
15+
"id": "e4eb7598-d496-4864-ad70-44596bbfcfc8",
16+
"metadata": {},
17+
"source": [
18+
"### Step 1: Define Your Configuration\n",
19+
"Below we define a configuration dictionary for training.\n",
20+
"\n",
21+
"If a parameter is given a list of values, Optuna treats that list as its search space. A parameter given a single value stays fixed during training."
22+
]
23+
},
24+
{
25+
"cell_type": "code",
26+
"execution_count": null,
27+
"id": "3fe289d2-2a8b-46e3-a48a-e3d7021736dc",
28+
"metadata": {},
29+
"outputs": [],
30+
"source": [
31+
"# Define or modify your hyperparameter configuration here\n",
32+
"config = {\n",
33+
" # Training settings\n",
34+
" 'epoch': 20,\n",
35+
" 'cuda': 0,\n",
36+
" 'zscore_method': 'auc',\n",
37+
"\n",
38+
" # Optimization settings\n",
39+
" 'optimize': 1, # Set to 1 to enable Optuna optimization\n",
40+
" 'n_trials': 2, # Number of trials for Optuna\n",
41+
"\n",
42+
" # Parameters (a parameter given a list of values is included in the optimization)\n",
43+
" 'batchsize': [32, 64, 128],\n",
44+
" 'lr': [0.1, 0.01, 0.001],\n",
45+
" 'wd': [0.0001, 0.001, 0.01],\n",
46+
" 'alpha': 0.3,\n",
47+
" 'genotype_hiddens': 4,\n",
48+
" 'patience': 30,\n",
49+
" 'delta': [0.001, 0.002, 0.003],\n",
50+
" 'min_dropout_layer': 2,\n",
51+
" 'dropout_fraction': 0.3,\n",
52+
"\n",
53+
" # Input files\n",
54+
" 'training_data': '../examples/training_data.txt',\n",
55+
" 'predict_data': '../examples/test_data.txt',\n",
56+
" 'gene2id': '../examples/gene2ind.txt',\n",
57+
" 'cell2id': '../examples/cell2ind.txt',\n",
58+
" 'mutations': '../examples/cell2mutation.txt',\n",
59+
" 'cn_deletions': '../examples/cell2cndeletion.txt',\n",
60+
" 'cn_amplifications': '../examples/cell2cnamplification.txt'\n",
61+
"}"
62+
]
63+
},
64+
{
65+
"cell_type": "markdown",
66+
"id": "38754843-3c40-4f08-a081-3fa022ada41f",
67+
"metadata": {},
68+
"source": [
69+
"### Step 2: Save Configuration to YAML\n",
70+
"We'll save the configuration to a YAML file. The training pipeline will load this file and extract parameter values and ranges."
71+
]
72+
},
73+
{
74+
"cell_type": "code",
75+
"execution_count": null,
76+
"id": "6ad5102e-441c-4bc3-a95f-793c9580511c",
77+
"metadata": {},
78+
"outputs": [],
79+
"source": [
80+
"import yaml\n",
81+
"\n",
82+
"# Save to a YAML config file\n",
83+
"config_path = './vnn_config.yaml'\n",
84+
"\n",
85+
"with open(config_path, 'w') as f:\n",
86+
"    yaml.dump(config, f, default_flow_style=False, sort_keys=False)\n",
87+
"\n",
88+
"print(f'Configuration saved to {config_path}')"
89+
]
90+
},
91+
{
92+
"cell_type": "markdown",
93+
"id": "f00ed72c-33dc-42c8-82b6-df115eefa715",
94+
"metadata": {},
95+
"source": [
96+
"### Step 3: Train the VNN Model with Optuna\n",
97+
"Use **cellmaps_vnncmd.py train** and provide the config file via **--config_file**.\n",
98+
"This will automatically trigger Optuna-based optimization for any parameter listed with multiple values.\n",
99+
"\n",
100+
"After training completes, the output folder (out_train_optuna) will contain a config.yaml file: a flattened version of the original config that holds the best parameters found by Optuna."
101+
]
102+
},
103+
{
104+
"cell_type": "code",
105+
"execution_count": null,
106+
"id": "28c0456c-96b8-44b3-aa3e-6561d7d92788",
107+
"metadata": {},
108+
"outputs": [],
109+
"source": [
110+
"import subprocess\n",
111+
"\n",
112+
"train_out = 'out_train_optuna'\n",
113+
"inputdir = '../examples/'\n",
114+
"command = (\n",
115+
" f\"cellmaps_vnncmd.py train {train_out} --inputdir {inputdir} --config_file {config_path}\"\n",
116+
")\n",
117+
"subprocess.run(command, shell=True, check=True)"
118+
]
119+
},
120+
{
121+
"cell_type": "markdown",
122+
"id": "6f65a487-7fd0-45ca-bd32-f1b6b90466fc",
123+
"metadata": {},
124+
"source": [
125+
"### Step 4: Make Predictions with the Optimized Model\n",
126+
"Use the saved config.yaml from training (with best parameters) to perform prediction."
127+
]
128+
},
129+
{
130+
"cell_type": "code",
131+
"execution_count": null,
132+
"id": "058378cd-8ae9-433b-a19b-917c3ce18993",
133+
"metadata": {},
134+
"outputs": [],
135+
"source": [
136+
"test_out = './out_test'\n",
137+
"new_config = f\"{train_out}/config.yaml\"\n",
138+
"\n",
139+
"\n",
140+
"command = (\n",
141+
" f\"cellmaps_vnncmd.py predict {test_out} --inputdir {train_out} \"\n",
142+
" f\"--config_file {new_config}\"\n",
143+
")\n",
144+
"subprocess.run(command, shell=True, check=True)"
145+
]
146+
},
147+
{
148+
"cell_type": "code",
149+
"execution_count": null,
150+
"id": "1fdcac84-fbd2-4446-9a8c-df247e182a87",
151+
"metadata": {},
152+
"outputs": [],
153+
"source": []
154+
}
155+
],
156+
"metadata": {
157+
"kernelspec": {
158+
"display_name": "Python 3 (ipykernel)",
159+
"language": "python",
160+
"name": "python3"
161+
},
162+
"language_info": {
163+
"codemirror_mode": {
164+
"name": "ipython",
165+
"version": 3
166+
},
167+
"file_extension": ".py",
168+
"mimetype": "text/x-python",
169+
"name": "python",
170+
"nbconvert_exporter": "python",
171+
"pygments_lexer": "ipython3",
172+
"version": "3.8.18"
173+
}
174+
},
175+
"nbformat": 4,
176+
"nbformat_minor": 5
177+
}
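
The rule stated in Step 1 of the notebook (a list of values becomes a search space, a single value stays fixed) can be illustrated with a short sketch. The helper `split_config` is hypothetical and is not part of cellmaps_vnn; how the package actually hands the lists to Optuna is not shown in this commit. The sketch only separates the two kinds of parameters for a subset of the tutorial's config and counts how many combinations the lists span.

```python
import math

# Subset of the notebook's config; list values mark parameters to be searched.
config = {
    'epoch': 20,
    'alpha': 0.3,
    'n_trials': 2,
    'batchsize': [32, 64, 128],
    'lr': [0.1, 0.01, 0.001],
    'wd': [0.0001, 0.001, 0.01],
    'delta': [0.001, 0.002, 0.003],
}


def split_config(cfg):
    """Hypothetical helper: separate list-valued (searched) from fixed parameters."""
    search_space = {k: v for k, v in cfg.items() if isinstance(v, list)}
    fixed = {k: v for k, v in cfg.items() if not isinstance(v, list)}
    return search_space, fixed


search_space, fixed = split_config(config)

# Number of distinct combinations the list-valued parameters can form.
n_combinations = math.prod(len(values) for values in search_space.values())

print('searched:', sorted(search_space))
print('fixed:', sorted(fixed))
print('possible combinations:', n_combinations)
```

With the tutorial's values the four searched parameters span 81 combinations, so `n_trials` set to 2 evaluates at most two of them. That is enough for a quick demonstration, but a real search would likely need more trials.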
