docs: add sub-billion slicing guides and config tool #259
base: main
Summary of Changes
Hello @Solventerritory, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a complete package of documentation and a utility script designed to enable the efficient deployment of Gemma 3n models on devices with limited resources, such as mobile phones and web browsers. It provides clear, actionable guidance for creating sub-billion parameter models, including a recommended 0.9B configuration, and explores the potential for extending these optimization techniques to the audio encoder. The changes aim to make Gemma 3n more accessible for a wider range of applications by offering tested and ready-to-use slicing configurations.
Code Review
This pull request is an excellent contribution, adding comprehensive documentation and a utility script for creating sub-billion parameter Gemma models. The guides are well-structured and provide valuable information for developers targeting resource-constrained environments. My review focuses on improving the correctness of code snippets, enhancing the maintainability of the documentation, and increasing the robustness of the Python script. The key suggestions involve ensuring model dimensions are integers, fixing minor inaccuracies in the documentation, and adding safeguards to the code.
Warning: Gemini encountered an error creating the review. You can try again by commenting
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@evansenter @markmcd @smsohan @justinmahood is it good to merge?
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a helper script for creating sub-billion parameter Gemma 3n models. The documentation is thorough and covers various aspects from quick-start to deep technical details. The Python script provides useful presets and validation. My main feedback is to address a recurring issue where floating-point numbers are used for FFN hidden dimensions, which will cause errors. I've left specific suggestions to cast these to integers. Additionally, there's a significant amount of duplicated content across the documentation files, which could be streamlined to improve maintainability by using a single source of truth for configurations.
1. Open: Gemma/[Gemma_3n]MatFormer_Lab.ipynb
2. In "Config details" cell, set:
   layers_to_skip = [19, 20, 21, 22, 23, 24, 25, 26, 27]
   ffn_hidden_dims = [2048*3]*10 + [2048*3.5]*9 + [2048*4]*7
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a utility script for creating sub-billion parameter Gemma 3n models. The documentation is exceptionally detailed and well-structured, providing clear, actionable guidance for developers. The inclusion of a Python script for generating and validating configurations is a fantastic addition that will greatly aid in creating custom models.
My review focuses on ensuring consistency and improving maintainability. The primary issue identified is the inconsistent use of int() when calculating ffn_hidden_dims in various code examples, which can lead to float values where integers are expected. I have also suggested some minor enhancements to the Python script to replace magic numbers with named constants, which will make the code easier to understand and maintain in the long run.
Overall, this is an excellent contribution that will be highly valuable for teams working on resource-constrained deployments.
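The two suggestions in this review (always cast to `int`, and prefer named constants over magic numbers) can be combined in one small helper. The following is only a sketch: `ffn_dims` and `FFN_PLAN_0_9B` are hypothetical names that do not come from the PR, and `BASE_HIDDEN_SIZE = 2048` is assumed from the `2048*...` expressions in the quoted snippets.

```python
# Assumed from the 2048*N expressions in the PR's snippets.
BASE_HIDDEN_SIZE = 2048


def ffn_dims(plan):
    """Expand [(multiplier, layer_count), ...] into a flat list of integer FFN widths."""
    dims = []
    for multiplier, count in plan:
        # int() is applied in one place, so a fractional multiplier such as 3.5
        # can never leak a float into the configuration.
        dims.extend([int(BASE_HIDDEN_SIZE * multiplier)] * count)
    return dims


# Hypothetical name for the 0.9B layout quoted in this PR:
# 10 layers at 3x, 9 layers at 3.5x, 7 layers at 4x.
FFN_PLAN_0_9B = [(3, 10), (3.5, 9), (4, 7)]

dims_0_9b = ffn_dims(FFN_PLAN_0_9B)
print(len(dims_0_9b), sorted(set(dims_0_9b)))  # 26 [6144, 7168, 8192]
```

Centralizing the cast means individual docs only need to state the multiplier plan, which may also reduce the duplication noted in an earlier review.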
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a Python helper script for creating and using sub-billion parameter Gemma 3n models. The documentation is exceptionally detailed, well-structured, and provides clear guidance for users with different levels of expertise, from quick-start guides to deep technical analyses. The Python script is a valuable tool for programmatically generating and validating model configurations.
My review includes a critical fix for a bug in the Python script that would cause it to crash, along with a few suggestions to improve clarity and correctness in the documentation and the script's output. Overall, this is an excellent contribution that will be highly valuable for developers working in resource-constrained environments.
def validate_config(config: Dict, base_model_num_layers: int = 35, base_hidden_size: int = 2048, max_ffn_dim: int = 16384) -> Tuple[bool, List[str]]:
    """
    Validate a configuration for logical consistency.

    Returns:
        (is_valid, list_of_errors)
    """
    errors = []

    expected_layers = BASE_MODEL_NUM_LAYERS - len(config["layers_to_skip"])
    if expected_layers != config["num_layers"]:
        errors.append(
            f"Layer mismatch: {expected_layers} expected but {config['num_layers']} specified"
        )

    if len(config["ffn_hidden_dims"]) != config["num_layers"]:
        errors.append(
            f"FFN dims length ({len(config['ffn_hidden_dims'])}) != num_layers ({config['num_layers']})"
        )

    if not all(isinstance(d, int) for d in config["ffn_hidden_dims"]):
        errors.append("FFN dimensions must be integers.")

    for dim in config["ffn_hidden_dims"]:
        if dim < BASE_HIDDEN_SIZE or dim > MAX_FFN_DIM:
            errors.append(f"FFN dimension {dim} outside reasonable range [{BASE_HIDDEN_SIZE}, {MAX_FFN_DIM}]")

    return len(errors) == 0, errors
The validate_config function will raise a NameError at runtime because it references undefined variables. The function parameters are defined in snake_case (e.g., base_model_num_layers), but they are used in UPPER_SNAKE_CASE (e.g., BASE_MODEL_NUM_LAYERS) inside the function. This needs to be corrected for the script to run.
def validate_config(config: Dict, base_model_num_layers: int = 35, base_hidden_size: int = 2048, max_ffn_dim: int = 16384) -> Tuple[bool, List[str]]:
    """
    Validate a configuration for logical consistency.

    Returns:
        (is_valid, list_of_errors)
    """
    errors = []

    expected_layers = base_model_num_layers - len(config["layers_to_skip"])
    if expected_layers != config["num_layers"]:
        errors.append(
            f"Layer mismatch: {expected_layers} expected but {config['num_layers']} specified"
        )

    if len(config["ffn_hidden_dims"]) != config["num_layers"]:
        errors.append(
            f"FFN dims length ({len(config['ffn_hidden_dims'])}) != num_layers ({config['num_layers']})"
        )

    if not all(isinstance(d, int) for d in config["ffn_hidden_dims"]):
        errors.append("FFN dimensions must be integers.")

    for dim in config["ffn_hidden_dims"]:
        if dim < base_hidden_size or dim > max_ffn_dim:
            errors.append(f"FFN dimension {dim} outside reasonable range [{base_hidden_size}, {max_ffn_dim}]")

    return len(errors) == 0, errors
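The failure mode described in this comment can be reproduced in isolation. The sketch below is not taken from the PR's script: `count_layers_buggy` and `count_layers_fixed` are hypothetical stand-ins that keep only the one expression at issue, and it assumes (as the review states) that no module-level `BASE_MODEL_NUM_LAYERS` is defined.

```python
def count_layers_buggy(skipped, base_model_num_layers=35):
    # Bug: the parameter is snake_case, but the body reads an
    # UPPER_SNAKE_CASE name that is not defined anywhere.
    return BASE_MODEL_NUM_LAYERS - len(skipped)


def count_layers_fixed(skipped, base_model_num_layers=35):
    # Fix: use the parameter that the signature actually declares.
    return base_model_num_layers - len(skipped)


skipped = list(range(19, 28))  # layers 19..27, as in the 0.9B config

try:
    count_layers_buggy(skipped)
    raised = False
except NameError:
    # Python resolves names at call time, so the error only appears
    # when the function runs, not when it is defined.
    raised = True

remaining = count_layers_fixed(skipped)
print(raised, remaining)  # True 26
```

Because the lookup happens at call time, a script with this bug imports cleanly and fails only once `validate_config` is invoked, which is why a quick smoke test of each helper is worth adding.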
…lventerritory/gemma-cookbook into feat/sub-billion-slicing-docs
/gemini review
This pull request introduces a comprehensive set of documentation and a Python helper script for creating sub-billion parameter Gemma 3n models, targeting resource-constrained environments. The documentation is extensive, providing a technical deep-dive, a quick-start guide, and performance benchmarks. The Python script is a valuable tool for managing and exporting slicing configurations.
My review focuses on ensuring consistency and accuracy across the newly added files. I've provided feedback on:
- Correcting a misleading comment and a typo in the documentation and Python script.
- Ensuring consistency in model configuration parameters across different files.
- Addressing minor redundancy and formatting issues in the documentation.
Overall, this is a very well-executed and valuable addition that will greatly help users looking to deploy Gemma on mobile and web platforms. The changes are clear and the provided guidance is excellent.
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a Python helper script for creating sub-billion parameter Gemma 3n models, targeting resource-constrained environments. The documentation is exceptionally thorough, covering technical analysis, quick-start guides, and tooling. The Python script is well-designed with clear configurations and validation functions. My review primarily focuses on enhancing the clarity, consistency, and correctness of code snippets within the documentation to ensure they are complete and runnable for a better user experience. I've identified several minor inconsistencies in configuration definitions and some incomplete code examples across the new markdown files.
/gemini review
Code Review
This pull request introduces extensive documentation and a helper script for creating smaller, sub-billion parameter Gemma models, which is a valuable addition for developers working in resource-constrained environments. The documentation is well-structured, comprehensive, and provides clear guidance. The Python script is a useful utility for generating and validating configurations. My review focuses on improving the accuracy and consistency of the information presented to avoid potential user confusion. I've identified a few areas for improvement, including a misleading calculation in the Python script that underestimates model sizes, a confusing file self-reference, and a hardcoded path in the documentation. Addressing these points will enhance the overall quality and usability of these new assets.
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a helper script for creating sub-billion parameter Gemma 3n models. The documentation is thorough, well-structured, and provides clear guidance for users looking to deploy these models in resource-constrained environments. The Python script is a great addition for programmatically generating and validating configurations.
My review includes a few suggestions to improve consistency and correctness in the provided configurations and documentation. Specifically, I've pointed out a configuration error in one of the text files, a formatting inconsistency in a markdown guide, and a missing data point in the Python script's configuration dictionaries that affects the generated output.
/gemini review
Code Review
This pull request introduces a comprehensive set of documentation and a helper script for creating sub-billion parameter Gemma 3n models, targeting resource-constrained environments. The documentation is well-structured, providing deep technical analysis, quick-start guides, and performance metrics. The Python script is a valuable tool for generating and validating model configurations. My review focuses on improving consistency across the different files, correcting minor errors in code snippets, and enhancing the utility of the helper script to ensure a clear and error-free experience for developers.
Configuration: 0.9B (26 layers)

layers_to_skip = [19, 20, 21, 22, 23, 24, 25, 26, 27]
ffn_hidden_dims = [2048*3]*10 + [2048*3.5]*9 + [2048*4]*7
The ffn_hidden_dims configuration for the 0.9B model appears to have a typo. The expression [2048*3.5]*9 will create a list of floats, but model configuration dimensions typically require integers. In other files and in the Python script, this is correctly written as [int(2048*3.5)]*9. To ensure consistency and prevent potential errors for users who copy-paste this configuration, it should be corrected.
ffn_hidden_dims = [2048*3]*10 + [int(2048*3.5)]*9 + [2048*4]*7
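A short check makes the difference between the two expressions concrete. This is an illustrative sketch only: `allocate` is a hypothetical stand-in for any downstream code that uses a dimension as a size, and the exact error a real model-slicing routine raises may differ.

```python
bad = [2048*3]*10 + [2048*3.5]*9 + [2048*4]*7
good = [2048*3]*10 + [int(2048*3.5)]*9 + [2048*4]*7

# The two lists compare equal, because 7168.0 == 7168 in Python,
# so a plain equality test will not catch the problem.
print(bad == good)  # True

# A type check does catch it: the nine middle entries are floats.
float_entries = [d for d in bad if not isinstance(d, int)]
print(len(float_entries), float_entries[0])  # 9 7168.0


def allocate(width):
    """Hypothetical consumer: builds a zero vector of the given width."""
    return [0.0] * width


try:
    allocate(bad[10])  # bad[10] is the first 3.5x entry, a float
    failed = False
except TypeError:
    # Sequence repetition requires an integer count.
    failed = True

print(failed, len(allocate(good[10])))  # True 7168
```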
# Best: 0.5B with extreme quantization
layers_to_skip = [12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
ffn_hidden_dims = [2048*2]*8 + [int(2048*2.5)]*7 + [2048*3]*5
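The layer arithmetic behind this 0.5B snippet can be checked by hand or with a few lines of Python. The sketch assumes a 35-layer base model, taken from the `base_model_num_layers` default in the `validate_config` function quoted in this PR; `BASE_LAYERS`, `kept_layers`, and `consistent` are names introduced here for illustration.

```python
# Assumption: 35 base layers, matching validate_config's default argument.
BASE_LAYERS = 35

layers_to_skip = [12, 13, 14, 15, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27]
ffn_hidden_dims = [2048*2]*8 + [int(2048*2.5)]*7 + [2048*3]*5

# 15 skipped layers leave 35 - 15 = 20 layers.
kept_layers = BASE_LAYERS - len(layers_to_skip)

# One FFN width per kept layer: 8 + 7 + 5 = 20 entries.
consistent = (
    kept_layers == len(ffn_hidden_dims)
    and len(set(layers_to_skip)) == len(layers_to_skip)  # no duplicate indices
    and all(0 <= i < BASE_LAYERS for i in layers_to_skip)  # indices in range
    and all(isinstance(d, int) for d in ffn_hidden_dims)
)

print(kept_layers, len(ffn_hidden_dims), consistent)  # 20 20 True
```

The same check applied to the 0.9B configuration gives 35 - 9 = 26 kept layers against 10 + 9 + 7 = 26 FFN widths.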
Summary
What: Adds documentation and a small tooling script to enable creation, validation, and export of sub‑billion (≤1B params) Gemma 3n submodels for resource‑constrained deployments (mobile/web). Includes a recommended 0.9B (26‑layer) config and guidance to extend slicing to the audio encoder.
Why: Provide teams with tested, ready‑to‑use slicing configs and an implementation path so Gemma can run efficiently on 4–6GB mobile devices and in browser environments.
#256
What I Changed
Docs:
- RESPONSE_SUB_BILLION_AND_AUDIO_SLICING.md: deep technical analysis, FFN/layer strategies, audio encoder design
- QUICK_START_SUB_BILLION_MODELS.md: step-by-step quickstart, troubleshooting, deployment tips
- FEATURE_REQUEST_RESPONSE_SUMMARY.md: executive summary, recommendations, FAQ
- README_SUB_BILLION_MODELS.md: navigation + TL;DR for quick adoption
- INDEX_SUB_BILLION_RESPONSE.txt: consolidated index of created assets
Tooling:
- custom_slicing_configs.py: runnable helper that lists presets, validates configs, and exports MatFormer Lab snippets
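To give a rough idea of what "exports MatFormer Lab snippets" could look like, here is a minimal sketch. It is not the code from custom_slicing_configs.py: `PRESETS`, the preset key, and `export_matformer_snippet` are hypothetical, and the output format simply mirrors the two assignments that the quick-start asks users to paste into the "Config details" cell.

```python
# Hypothetical preset table; only the 0.9B values quoted in this PR are used.
PRESETS = {
    "0.9B": {
        "layers_to_skip": [19, 20, 21, 22, 23, 24, 25, 26, 27],
        "ffn_hidden_dims": [2048*3]*10 + [int(2048*3.5)]*9 + [2048*4]*7,
    },
}


def export_matformer_snippet(config):
    """Render a config as two Python assignments that can be pasted into a notebook cell."""
    return "\n".join(
        [
            f"layers_to_skip = {config['layers_to_skip']!r}",
            f"ffn_hidden_dims = {config['ffn_hidden_dims']!r}",
        ]
    )


snippet = export_matformer_snippet(PRESETS["0.9B"])

# Round-trip check: executing the snippet must reproduce the preset exactly.
namespace = {}
exec(snippet, namespace)
round_trip_ok = (
    namespace["layers_to_skip"] == PRESETS["0.9B"]["layers_to_skip"]
    and namespace["ffn_hidden_dims"] == PRESETS["0.9B"]["ffn_hidden_dims"]
)
print(round_trip_ok)  # True
```

Emitting fully expanded integer lists, rather than expressions such as `[2048*3.5]*9`, would keep the pasted snippet free of the float issue raised in the reviews.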