
Conversation

@RubensZimbres
Contributor

Add VaultGemma Fine-tuning with Differential Privacy and Inference

Overview

This PR adds a complete pipeline for privacy-preserving fine-tuning and inference of VaultGemma 1B on medical data using LoRA adapters and differential privacy via Opacus.

What's Added

  • Complete training pipeline with 4-bit quantization, LoRA, and differential privacy
  • Inference code for loading and running fine-tuned models
  • Comprehensive README with setup instructions and usage examples
  • Notebook: VaultGemma_FineTuning_Inference_Huggingface.ipynb

Key Features

  • 4-bit quantization using BitsAndBytes NF4 for memory efficiency
  • LoRA fine-tuning targeting all projection layers with r=8, alpha=16 (see the configuration sketch after this list)
  • Differential privacy with configurable epsilon and delta budgets
  • Prompt masking to train only on response tokens
  • Automatic checkpointing based on loss thresholds
  • Cosine learning rate schedule with warmup
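
For orientation, here is a minimal sketch of how these pieces might fit together using the standard transformers/peft APIs. It is not the notebook's exact code: the target module names, LoRA dropout, and compute dtype are assumptions.

```python
# Sketch only: 4-bit NF4 quantization plus LoRA adapters on the projection layers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

model_path = "google/vaultgemma-1b"

quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quantization_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# LoRA adapters on all projection layers, r=8 / alpha=16 as listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,  # dropout value is an assumption
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```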

Technical Details

Training Configuration

  • Model: VaultGemma 1B (4-bit quantized)
  • Dataset: Medical Meadow Medical Flashcards (1000 samples)
  • Privacy Budget: ε=8.0, δ=1e-5
  • Batch Size: 1 with gradient accumulation of 8
  • Learning Rate: 2e-5 with cosine schedule (see the optimizer sketch after this list)
  • Train/Validation Split: 90/10
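
The following sketch illustrates how these hyperparameters could be wired up; it assumes `model` and a `train_dataset` from earlier cells, and the warmup fraction is an assumption rather than the notebook's setting.

```python
# Illustrative optimizer / LR-schedule wiring for the configuration listed above.
import torch
from torch.utils.data import DataLoader
from transformers import get_cosine_schedule_with_warmup

num_train_epochs = 2
gradient_accumulation_steps = 8
learning_rate = 2e-5

train_dataloader = DataLoader(train_dataset, batch_size=1, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)

steps_per_epoch = max(len(train_dataloader) // gradient_accumulation_steps, 1)
total_steps = steps_per_epoch * num_train_epochs
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),  # 10% warmup is an assumption
    num_training_steps=total_steps,
)
```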

Privacy Guarantees

The implementation provides (ε, δ)-differential privacy guarantees through gradient clipping (max norm: 1.0) and automatic privacy accounting via Opacus.
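
A sketch of how Opacus typically provides this, assuming the `model`, `optimizer`, and `train_dataloader` from the training setup; it mirrors the standard PrivacyEngine API rather than quoting the notebook.

```python
# Wrap the training objects for (ε, δ)-DP with per-sample gradient clipping.
from opacus import PrivacyEngine

target_epsilon = 8.0
target_delta = 1e-5
max_grad_norm = 1.0  # gradient clipping max norm

privacy_engine = PrivacyEngine()
model, optimizer, train_dataloader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_dataloader,
    epochs=num_train_epochs,
    target_epsilon=target_epsilon,
    target_delta=target_delta,
    max_grad_norm=max_grad_norm,
)

# After training, report the privacy budget actually spent
epsilon = privacy_engine.get_epsilon(delta=target_delta)
print(f"Final privacy cost: ε = {epsilon:.2f} for δ = {target_delta}")
```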

Inference

Includes simple inference functions with adjustable generation parameters (temperature, top_p, max_new_tokens) and support for single or batch processing.
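
A sketch of what such a helper could look like for batch processing, assuming the fine-tuned `model` and `tokenizer` are already loaded and the tokenizer has a pad token; the prompt template and generation defaults are assumptions.

```python
# Batch generation with adjustable sampling parameters (illustrative helper).
def generate_batch(questions, max_new_tokens=128, temperature=0.9, top_p=0.9):
    prompts = [
        f"Instruction:\nAnswer this question truthfully.\n\nQuestion:\n{q}\n\nResponse:\n"
        for q in questions
    ]
    tokenizer.padding_side = "left"  # left-pad for decoder-only generation
    inputs = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        top_p=top_p,
        do_sample=True,
    )
    # Strip the prompt tokens and decode only the generated continuation
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
    return tokenizer.batch_decode(new_tokens, skip_special_tokens=True)

answers = generate_batch([
    "What are the symptoms of diabetes?",
    "How is hypertension diagnosed?",
])
```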

Files Changed

  • VaultGemma_FineTuning_Inference_Huggingface.ipynb - Main training and inference notebook
  • README.md - Documentation

@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@gemini-code-assist

Summary of Changes

Hello @RubensZimbres, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request delivers a robust solution for fine-tuning the VaultGemma 1B language model, specifically tailored for medical question-answering, with a strong emphasis on data privacy. It integrates state-of-the-art techniques like LoRA and differential privacy to enable efficient and secure model adaptation, making it suitable for applications involving sensitive information. The changes provide a complete, documented pipeline for both training and inference.

Highlights

  • Privacy-Preserving Fine-Tuning: Introduces a complete pipeline for fine-tuning Google's VaultGemma 1B model with differential privacy using Opacus, ensuring privacy guarantees for sensitive data.
  • Efficient Model Adaptation: Leverages LoRA (Low-Rank Adaptation) with 4-bit quantization (BitsAndBytes NF4) for memory-efficient and parameter-efficient fine-tuning, targeting all projection layers of the model.
  • Medical Q&A Application: The fine-tuning is demonstrated on the Medical Meadow Medical Flashcards dataset, aiming to create a medical question-answering model.
  • Comprehensive Workflow: Includes a full workflow from dataset loading and tokenization (with prompt masking) to training with gradient accumulation, learning rate scheduling, automatic checkpointing, and inference capabilities.
  • Detailed Documentation: A new README.md provides an overview, features, repository structure, training details, quick start for inference, requirements, and explicit privacy guarantees.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a comprehensive notebook for fine-tuning VaultGemma with differential privacy. The implementation is well-structured, but there are several critical issues and inconsistencies that need to be addressed. My review focuses on a critical discrepancy in the reported privacy budget, errors in model checkpointing and loading logic that will cause failures, and various inconsistencies between the code and documentation. Addressing these points will significantly improve the reliability and usability of the notebook.

Comment on lines 697 to 698
"epsilon = privacy_engine.get_epsilon(delta=target_delta)\n",
"print(f\"Final privacy cost: ε = {epsilon:.2f} for δ = {target_delta}\")"

critical

There is a critical inconsistency in the reported privacy budget. The cell's output shows a final privacy cost of ε = 22.21 for δ = 0.01. However, the code is configured with target_epsilon = 3.0 and target_delta = 1e-5. This discrepancy suggests the output is from a different execution or there is a fundamental issue in the privacy accounting. The reported epsilon is also significantly higher than the target. This must be corrected to ensure the privacy claims of this notebook are valid.

" device_map=\"auto\",\n",
")\n",
"\n",
"adapter_path = \"./final_model\"\n",

critical

The path to the adapter for inference is hardcoded to ./final_model. However, the training loop saves checkpoints to a dynamically generated path based on the training loss (e.g., ./final_model_acc_...). This will cause a FileNotFoundError when running the inference cell. The path should be updated to point to a valid checkpoint saved during training.
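
One possible fix, assuming the training loop writes adapter directories named `./final_model_acc_<loss>` (the naming is inferred from this comment, not quoted from the notebook): resolve the most recently saved checkpoint instead of hardcoding the path.

```python
import glob
import os

# Pick the most recently written checkpoint directory, if any exists
checkpoints = sorted(glob.glob("./final_model_acc_*"), key=os.path.getmtime)
if not checkpoints:
    raise FileNotFoundError("No fine-tuned adapter checkpoints found; run the training loop first.")
adapter_path = checkpoints[-1]
```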

"\n",
"# Training hyperparameters\n",
"device = \"cuda\" if torch.cuda.is_available() else \"cpu\"\n",
"num_train_epochs = 2\n",

high

There is an inconsistency between the code and the documentation. The number of training epochs is set to 2 here, but the markdown cell in section 8 (b84fda3c) states, "The loop will run for 20 epochs." Please update either the code or the markdown to ensure they are consistent.

" log_message = f\"Step {global_step}: Train Loss = {avg_train_loss:.4f}\"\n",
" \n",
" # Save checkpoint if loss is below threshold\n",
" if avg_train_loss < 0.06:\n",

high

Checkpointing based on a hardcoded training loss threshold (avg_train_loss < 0.06) is unreliable. This condition may never be met, or it could be met too frequently, leading to no checkpoints or too many. A more robust strategy is to save checkpoints based on improvements in the validation loss or simply at the end of each epoch.
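
A sketch of the suggested alternative, assuming a `val_dataloader` from the 90/10 split with batches already on the right device; with Opacus, the wrapped module may need to be unwrapped before saving, which is omitted here.

```python
import torch

best_val_loss = float("inf")

def evaluate(model, val_dataloader):
    """Average loss over the validation set."""
    model.eval()
    total, count = 0.0, 0
    with torch.no_grad():
        for batch in val_dataloader:
            outputs = model(**batch)
            total += outputs.loss.item()
            count += 1
    model.train()
    return total / max(count, 1)

# Inside the training loop, e.g. at the end of each epoch:
val_loss = evaluate(model, val_dataloader)
if val_loss < best_val_loss:
    best_val_loss = val_loss
    model.save_pretrained("./final_model_best")  # keep only the best adapter
```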

Comment on lines 43 to 55
```python
from transformers import AutoModelForCausalLM, GemmaTokenizer
from peft import PeftModel

# Load model and adapters
model = AutoModelForCausalLM.from_pretrained("google/vaultgemma-1b")
tokenizer = GemmaTokenizer.from_pretrained("google/vaultgemma-1b")
peft_model = PeftModel.from_pretrained(model, "path/to/adapters")

# Generate response
question = "What are the symptoms of diabetes?"
response = generate_response(question)
```

medium

The code snippet in the Quick Start section is incomplete because the generate_response function is not defined. This will cause an error for users trying to run this example directly. Please include the function definition or add a note directing users to the notebook for the full implementation.
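
One possible definition to drop into the README, assuming the `peft_model` and `tokenizer` loaded in the quoted snippet; the generation defaults here are assumptions.

```python
def generate_response(question, max_new_tokens=128, temperature=0.9, top_p=0.9):
    prompt = f"Instruction:\nAnswer this question truthfully.\n\nQuestion:\n{question}\n\nResponse:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(peft_model.device)
    outputs = peft_model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        top_p=top_p,
        do_sample=True,
    )
    # Decode only the newly generated tokens
    return tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)
```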

Comment on lines 49 to 51
"!pip install -q -U transformers peft accelerate bitsandbytes datasets pandas\n",
"!pip install git+https://github.com/huggingface/[email protected]\n",
"!pip install kagglehub ipywidgets opacus -q"

medium

The transformers library is installed twice: first from pip and then immediately overwritten by an installation from a specific git commit. This is redundant. To streamline the setup, you can remove transformers from the first pip install command.

!pip install -q -U peft accelerate bitsandbytes datasets pandas
!pip install git+https://github.com/huggingface/[email protected]
!pip install kagglehub ipywidgets opacus -q

"\n",
"# Load medical dataset\n",
"medical_data = load_dataset(\"medalpaca/medical_meadow_medical_flashcards\", split=\"train\")\n",
"data = medical_data.to_pandas().head(1000)\n",

medium

The number of samples to be used from the dataset is hardcoded as 1000. This makes the notebook less flexible for experimentation. It would be better to define this as a configurable variable at the top of the cell or in a dedicated configuration section.

NUM_SAMPLES = 1000
data = medical_data.to_pandas().head(NUM_SAMPLES)

"model = AutoModelForCausalLM.from_pretrained(\n",
" model_path,\n",
" quantization_config=quantization_config,\n",
" torch_dtype=torch.bfloat16,\n",

medium

The torch_dtype argument is deprecated and will be removed in a future version of the transformers library. The notebook's output already includes a warning about this. You should use dtype instead for forward compatibility and to remove the warning.

    dtype=torch.bfloat16,

def generate_response(question, max_new_tokens=128, temperature=0.9, top_p=0.9):
    prompt = f"Instruction:\nAnswer this medical question concisely.\n\nQuestion:\n{question}\n\nResponse:\n"

medium

The prompt template used for inference (Answer this medical question concisely.) is different from the one used during training (Answer this question truthfully.). This inconsistency can lead to suboptimal model performance, as the model is being prompted in a way it was not trained for. For best results, the prompt templates for training and inference should be identical.

    prompt = f"Instruction:\nAnswer this question truthfully.\n\nQuestion:\n{question}\n\nResponse:\n"

@bebechien
Collaborator

Hi @RubensZimbres
Thanks for your contribution.
Since VaultGemma is a research-focused model, I think the Research folder is the correct place for this notebook.

So, could you make the following changes?

  1. move VaultGemma/VaultGemma_FineTuning_Inference_Huggingface.ipynb -> Research/[VaultGemma]FineTuning_with_Huggingface.ipynb
  2. run nbfmt script
$ python3 -m pip install -U --user git+https://github.com/tensorflow/docs
$ python3 -m tensorflow_docs.tools.nbfmt notebook.ipynb

@RubensZimbres
Contributor Author

RubensZimbres commented Oct 2, 2025

Done, @bebechien, and the gemini-code-assist issues have been addressed as well.

@bebechien (Collaborator) left a comment

lgtm!

@bebechien bebechien merged commit 99d909c into google-gemini:main Oct 2, 2025
3 checks passed