Fix docs link issues (#627)

abukhoy · web-flow · commit 8a9839405b91 · 2025-11-20T12:17:51.000+05:30
Signed-off-by: Abukhoyer Shaik <abukhoye@qti.qualcomm.com>
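
Most hunks in this change apply the same rewrite: hosted docs pages move from the site root to a `source/` subdirectory, and any `#anchor` is kept. The commit does not say how the links were updated, so the following is only a minimal sketch of that pattern. The helper name `add_source_prefix` and the regex are hypothetical and not part of the repository.

```python
import re

# Base URL of the hosted docs, as it appears in the README links.
BASE = "https://quic.github.io/efficient-transformers/"

# Hypothetical pattern: an .html page directly under BASE that is not
# already under "source/". Any "#anchor" after ".html" is left untouched.
_OLD_LINK = re.compile(re.escape(BASE) + r"(?!source/)([\w\-]+\.html)")


def add_source_prefix(markdown: str) -> str:
    """Insert 'source/' between BASE and the page name in old-style links."""
    return _OLD_LINK.sub(lambda m: f"{BASE}source/{m.group(1)}", markdown)


if __name__ == "__main__":
    line = (
        "- [Validated Audio Models]"
        "(https://quic.github.io/efficient-transformers/validate.html#audio-models)"
    )
    print(add_source_prefix(line))
```

The negative lookahead makes the rewrite safe to run twice on the same file. Relative links such as `../../docs/source/qeff_autoclasses.md` and the script path fixes in this change are outside what this sketch covers.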
diff --git a/examples/audio/README.md b/examples/audio/README.md
@@ -82,6 +82,6 @@ This example:
 
 ## Documentation
 
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Validated Audio Models](https://quic.github.io/efficient-transformers/validate.html#audio-models)
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Validated Audio Models](https://quic.github.io/efficient-transformers/source/validate.html#audio-models)
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
diff --git a/examples/embeddings/README.md b/examples/embeddings/README.md
@@ -66,6 +66,6 @@ The example supports different pooling strategies:
 
 ## Documentation
 
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Validated Embedding Models](https://quic.github.io/efficient-transformers/validate.html#embedding-models)
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Validated Embedding Models](https://quic.github.io/efficient-transformers/source/validate.html#embedding-models)
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
diff --git a/examples/image_text_to_text/README.md b/examples/image_text_to_text/README.md
@@ -109,4 +109,4 @@ Some models have specialized examples demonstrating advanced features:
 
 ## Documentation
 - **Full Guide**: [VLM Documentation](../../docs/source/quick_start.md#vision-language-models)
-- **API Reference**: [QEFFAutoModelForImageTextToText](../../docs/source/qeff_autoclasses.md)
+- **API Reference**: [QEFFAutoModelForImageTextToText](../../docs/source/qeff_autoclasses.md#QEFFAutoModelForImageTextToText)
diff --git a/examples/peft/README.md b/examples/peft/README.md
@@ -77,7 +77,7 @@ qeff_model.unload_adapter("adapter_name")
 
 ## Documentation
 
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Validated Base Models](https://quic.github.io/efficient-transformers/validate.html#text-only-language-models)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Validated Base Models](https://quic.github.io/efficient-transformers/source/validate.html#text-only-language-models)
 - [PEFT Documentation](https://huggingface.co/docs/peft)
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html)
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
diff --git a/examples/performance/README.md b/examples/performance/README.md
@@ -37,6 +37,8 @@ python speculative_decoding/draft_based.py \
     --target-device-group 0,1 \
     --draft-device-group 2
 ```
+Note: this example currently has errors.
+
 
 #### prompt_lookup.py
 Prompt Lookup Decoding (PLD) - N-gram based speculation without a draft model.
@@ -57,6 +59,7 @@ Multi-projection speculative decoding (Turbo models).
 python speculative_decoding/multi_projection.py \
     --pretrained-model-name-or-path TinyLlama/TinyLlama-1.1B-Chat-v1.0
 ```
+Note: this example currently has an error.
 
 ### On-Device Sampling
 
@@ -102,6 +105,6 @@ python on_device_sampling.py \
 
 ## Documentation
 
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Performance Features](https://quic.github.io/efficient-transformers/features_enablement.html)
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Performance Features](https://quic.github.io/efficient-transformers/source/features_enablement.html)
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
diff --git a/examples/performance/compute_context_length/README.md b/examples/performance/compute_context_length/README.md
@@ -318,6 +318,6 @@ model = QEFFAutoModelForCausalLM.from_pretrained(
 
 ## Documentation
 
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Performance Features](https://quic.github.io/efficient-transformers/features_enablement.html)
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Performance Features](https://quic.github.io/efficient-transformers/source/features_enablement.html)
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html)
diff --git a/examples/performance/compute_context_length/qwen3moe_example/ccl_qwen3moe_inference.py b/examples/performance/compute_context_length/qwen3moe_example/ccl_qwen3moe_inference.py
diff --git a/examples/performance/cpp_execution/README.md b/examples/performance/cpp_execution/README.md
@@ -24,7 +24,7 @@ make -j 8
 cd ../../../  # Need to be in base folder - efficient-transformers to run below cmd
 
 # Run the python script to get the generated text
-python examples/cpp_execution/text_inference_using_cpp.py --model_name gpt2 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 14 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first
+python examples/performance/cpp_execution/text_inference_cpp.py --model_name gpt2 --batch_size 1 --prompt_len 32 --ctx_len 128 --mxfp6 --num_cores 14 --device_group [0] --prompt "My name is" --mos 1 --aic_enable_depth_first
 
 ```
 
diff --git a/examples/performance/speculative_decoding/README.md b/examples/performance/speculative_decoding/README.md
@@ -176,6 +176,6 @@ Avg number of accepted tokens = 2.8  # Speculation effectiveness
 
 ## Documentation
 
-- [Speculative Decoding Guide](https://quic.github.io/efficient-transformers/features_enablement.html#speculative-decoding)
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html)
-- [Performance Optimization](https://quic.github.io/efficient-transformers/features_enablement.html)
+- [Speculative Decoding Guide](https://quic.github.io/efficient-transformers/source/features_enablement.html#speculative-decoding)
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html)
+- [Performance Optimization](https://quic.github.io/efficient-transformers/source/features_enablement.html)
diff --git a/examples/text_generation/README.md b/examples/text_generation/README.md
@@ -144,7 +144,7 @@ python -m QEfficient.cloud.infer \
 2. Compiles to QPC 
 3. Executes inference with your prompt
 
-**CLI API Reference:** [`QEfficient.cloud.infer`](https://quic.github.io/efficient-transformers/cli_api.html#qefficient-cloud-infer)
+**CLI API Reference:** [`QEfficient.cloud.infer`](https://quic.github.io/efficient-transformers/source/cli_api.html#qefficient-cloud-infer)
 
 ### Step-by-Step Workflow
 
@@ -162,7 +162,7 @@ python -m QEfficient.cloud.export \
 
 This downloads the model and converts it to ONNX format. The ONNX model is saved in the QEfficient cache directory.
 
-**CLI API Reference:** [`QEfficient.cloud.export`](https://quic.github.io/efficient-transformers/cli_api.html#qefficient-cloud-export)
+**CLI API Reference:** [`QEfficient.cloud.export`](https://quic.github.io/efficient-transformers/source/cli_api.html#qefficient-cloud-export)
 
 #### Step 2: Compile Model to QPC
 
@@ -184,7 +184,7 @@ python -m QEfficient.cloud.compile \
 
 **Note:** The `compile` API is deprecated for direct use. Use the unified `infer` API instead for most use cases.
 
-**CLI API Reference:** [`QEfficient.cloud.compile`](https://quic.github.io/efficient-transformers/cli_api.html#qefficient-cloud-compile)
+**CLI API Reference:** [`QEfficient.cloud.compile`](https://quic.github.io/efficient-transformers/source/cli_api.html#qefficient-cloud-compile)
 
 #### Step 3: Execute Inference
 
@@ -200,7 +200,7 @@ python -m QEfficient.cloud.execute \
 
 This uses the pre-compiled QPC for fast inference. You can run this multiple times with different prompts without recompiling.
 
-**CLI API Reference:** [`QEfficient.cloud.execute`](https://quic.github.io/efficient-transformers/cli_api.html#qefficient-cloud-execute)
+**CLI API Reference:** [`QEfficient.cloud.execute`](https://quic.github.io/efficient-transformers/source/cli_api.html#qefficient-cloud-execute)
 
 ### Common CLI Parameters
 
@@ -239,7 +239,7 @@ python -m QEfficient.cloud.infer \
     --aic_enable_depth_first
 ```
 
-**Documentation:** [Multi-Qranium Inference](https://quic.github.io/efficient-transformers/features_enablement.html#multi-qranium-inference)
+**Documentation:** [Multi-Qranium Inference](https://quic.github.io/efficient-transformers/source/features_enablement.html#multi-qranium-inference)
 
 #### Continuous Batching
 
@@ -260,7 +260,7 @@ python -m QEfficient.cloud.infer \
 
 **Note:** Use pipe (`|`) to separate multiple prompts. When using continuous batching, do not specify `--batch_size`.
 
-**Documentation:** [Continuous Batching](https://quic.github.io/efficient-transformers/features_enablement.html#continuous-batching)
+**Documentation:** [Continuous Batching](https://quic.github.io/efficient-transformers/source/features_enablement.html#continuous-batching)
 
 #### Batch Processing from File
 
@@ -284,7 +284,7 @@ python -m QEfficient.cloud.infer \
 For a comprehensive collection of copy-paste ready CLI commands, run:
 
 ```bash
-bash examples/text_generation/cli_examples.sh
+bash cli_examples.sh
 ```
 
 This script demonstrates:
@@ -300,11 +300,11 @@ This script demonstrates:
 ## Additional Resources
 
 ### Documentation
-- [CLI API Reference](https://quic.github.io/efficient-transformers/cli_api.html) - Complete CLI command documentation
-- [Quick Start Guide](https://quic.github.io/efficient-transformers/quick_start.html) - Getting started with QEfficient
-- [Features Enablement](https://quic.github.io/efficient-transformers/features_enablement.html) - Advanced features guide
-- [QEff Auto Classes](https://quic.github.io/efficient-transformers/qeff_autoclasses.html) - Python API reference
-- [Validated Models](https://quic.github.io/efficient-transformers/validate.html#text-only-language-models) - Supported models list
+- [CLI API Reference](https://quic.github.io/efficient-transformers/source/cli_api.html) - Complete CLI command documentation
+- [Quick Start Guide](https://quic.github.io/efficient-transformers/source/quick_start.html) - Getting started with QEfficient
+- [Features Enablement](https://quic.github.io/efficient-transformers/source/features_enablement.html) - Advanced features guide
+- [QEff Auto Classes](https://quic.github.io/efficient-transformers/source/qeff_autoclasses.html) - Python API reference
+- [Validated Models](https://quic.github.io/efficient-transformers/source/validate.html) - Supported models list
 
 
 ### Model Storage