From fd0fb4a250b7a20d09af6659d34b70611fcbe6ea Mon Sep 17 00:00:00 2001
From: Bobby Wang
Date: Mon, 10 Nov 2025 14:38:55 +0800
Subject: [PATCH 1/3] Add doc for how to run nds power over Spark Connect

Signed-off-by: Bobby Wang
---
 nds/README.md | 56 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/nds/README.md b/nds/README.md
index b9ffb8e..b6378af 100644
--- a/nds/README.md
+++ b/nds/README.md
@@ -378,6 +378,62 @@ time.csv \
 --output_format parquet
 ```
 
+### Power Run over Spark Connect
+
+Power Run currently supports execution over Spark Connect, starting with Spark 4.0.0. However,
+you cannot run `nds_power.py` via Spark Connect using the associated `spark-submit-template`.
+Instead, execute it directly.
+
+Before proceeding, ensure `pyspark-client` is installed locally. For example:
+
+- Install `pyspark-client`
+
+```bash
+pip install pyspark-client==4.0.0
+```
+
+- Run `nds_power.py`
+
+```shell
+export SPARK_REMOTE=sc://localhost
+python nds_power.py \
+parquet_sf3k \
+./nds_query_streams/query_0.sql \
+time.csv \
+--output_prefix /data/query_output \
+--output_format parquet
+```
+
+Alternatively, you can import the APIs in a notebook and execute them as follows:
+
+``` shell
+
+from nds_power import gen_sql_from_stream, run_query_stream
+
+import os
+os.environ["SPARK_REMOTE"] = "sc://localhost"
+
+query_stream_file = "nds_query_streams/query_0.sql"
+nds_data_path = "parquet_sf3k"
+time_log_file = "time.csv"
+
+query_dict = gen_sql_from_stream(query_stream_file)
+
+run_query_stream(input_prefix=nds_data_path,
+                 property_file=None,
+                 query_dict=query_dict,
+                 time_log_output_path=time_log_file,
+                 extra_time_log_output_path=None,
+                 sub_queries=None,
+                 warmup_iterations=0,
+                 iterations=1,
+                 plan_types="logical",
+                 )
+```
+
+**Note:** The Python listener is disabled when running `nds_power.py` over Spark Connect, as py4j
+is not available in the Spark Connect environment.
+
 ### Throughput Run
 
 Throughput Run simulates the scenario that multiple query sessions are running simultaneously in

From 0142e6a3f595c36c5f28a9e79aa9d715b819c9d3 Mon Sep 17 00:00:00 2001
From: Bobby Wang
Date: Tue, 11 Nov 2025 10:01:41 +0800
Subject: [PATCH 2/3] Update nds/README.md

Co-authored-by: Gera Shegalov
---
 nds/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/nds/README.md b/nds/README.md
index b6378af..21af319 100644
--- a/nds/README.md
+++ b/nds/README.md
@@ -406,7 +406,7 @@ python nds_power.py \
 
 Alternatively, you can import the APIs in a notebook and execute them as follows:
 
-``` shell
+```python
 
 from nds_power import gen_sql_from_stream, run_query_stream
 

From 645f8667df1d58f6f35bbfa15fdeb1e1dc45fdd6 Mon Sep 17 00:00:00 2001
From: Bobby Wang
Date: Tue, 11 Nov 2025 10:21:28 +0800
Subject: [PATCH 3/3] comments

---
 nds/README.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/nds/README.md b/nds/README.md
index 21af319..2cff3bf 100644
--- a/nds/README.md
+++ b/nds/README.md
@@ -345,7 +345,10 @@ optional arguments:
                         query39_part2
 ```
 
-Example command to submit nds_power.py by spark-submit-template utility:
+#### Power Run with spark-submit
+
+Users can use the `spark-submit-template` script to run the Power Run with spark-submit.
+An example command to submit `nds_power.py` via the `spark-submit-template` utility is:
 
 ```bash
 ./spark-submit-template power_run_gpu.template \
@@ -378,7 +381,7 @@ time.csv \
 --output_format parquet
 ```
 
-### Power Run over Spark Connect
+#### Power Run over Spark Connect
 
 Power Run currently supports execution over Spark Connect, starting with Spark 4.0.0. However,
 you cannot run `nds_power.py` via Spark Connect using the associated `spark-submit-template`.
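
The Spark Connect workflow documented above hinges on exporting a well-formed `SPARK_REMOTE` URL before `nds_power.py` is launched. As a minimal sketch (not part of the patches: the `connect_url` helper is hypothetical, and 15002 is Spark's default Connect server port when none is given), the URL can be built and validated up front rather than hand-typed:

```python
# Illustrative sketch, not part of the patch series: build and validate a
# Spark Connect URL before exporting SPARK_REMOTE. The connect_url helper
# is hypothetical; 15002 is the default Spark Connect server port.
import os


def connect_url(host: str = "localhost", port: int = 15002) -> str:
    """Return a Spark Connect URL of the form sc://host:port."""
    if not host:
        raise ValueError("host must be non-empty")
    return f"sc://{host}:{port}"


if __name__ == "__main__":
    # Equivalent to `export SPARK_REMOTE=sc://localhost` in the README,
    # but with the default port spelled out explicitly.
    os.environ["SPARK_REMOTE"] = connect_url()
    print(os.environ["SPARK_REMOTE"])  # prints sc://localhost:15002
```

A bare `sc://localhost`, as used in the README, relies on the client filling in the default port, so the two forms should be interchangeable.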