Commit 66da440

Applied layout to nf-test and first part of scripting patterns
Still need to do the Summary section on scripting patterns but kid 2 wants me to play Roblox with her
1 parent f3a0d9b commit 66da440

2 files changed

+190
-88
lines changed

docs/side_quests/essential_scripting_patterns.md

Lines changed: 40 additions & 16 deletions
@@ -6,7 +6,10 @@ You can write a lot of Nextflow without venturing beyond basic syntax for variab
66

77
However, when you need to manipulate data, parse complex filenames, implement conditional logic, or build robust production workflows, it helps to think about two distinct aspects of your code: **dataflow** (channels, operators, processes, and workflows) and **scripting** (the code inside closures, functions, and process scripts). While this distinction is somewhat arbitrary—it's all Nextflow code—it provides a useful mental model for understanding when you're orchestrating your pipeline versus when you're manipulating data. Mastering both dramatically improves your ability to write clear, maintainable workflows.
88

9-
This side quest takes you on a hands-on journey from basic concepts to production-ready patterns. We'll transform a simple CSV-reading workflow into a sophisticated bioinformatics pipeline, evolving it step-by-step through realistic challenges:
9+
### Learning goals
10+
11+
This side quest takes you on a hands-on journey from basic concepts to production-ready patterns.
12+
We'll transform a simple CSV-reading workflow into a sophisticated bioinformatics pipeline, evolving it step-by-step through realistic challenges:
1013

1114
- **Understanding boundaries:** Distinguish between dataflow operations and scripting, and understand how they work together
1215
- **Data manipulation:** Extract, transform, and subset maps and collections using powerful operators
@@ -17,32 +20,40 @@ This side quest takes you on a hands-on journey from basic concepts to productio
1720
- **Safe operations:** Handle missing data gracefully with null-safe operators and validate inputs with clear error messages
1821
- **Configuration-based handlers:** Use workflow event handlers for logging, notifications, and lifecycle management
1922

20-
---
23+
### Prerequisites
2124

22-
## 0. Warmup
25+
Before taking on this side quest, you should:
2326

24-
### 0.1. Prerequisites
27+
- Have completed the [Hello Nextflow](../hello_nextflow/README.md) tutorial or equivalent beginner's course.
28+
- Be comfortable using basic Nextflow concepts and mechanisms (processes, channels, operators, working with files, metadata)
29+
- Have basic familiarity with common programming constructs (variables, maps, lists)
2530

26-
Before taking on this side quest you should:
31+
This tutorial will explain programming concepts as we encounter them, so you don't need extensive programming experience.
32+
We'll start with fundamental concepts and build up to advanced patterns.
2733

28-
- Complete the [Hello Nextflow](../hello_nextflow/README.md) tutorial or have equivalent experience
29-
- Understand basic Nextflow concepts (processes, channels, workflows)
30-
- Have basic familiarity with common programming constructs (variables, maps, lists)
34+
---
3135

32-
This tutorial will explain programming concepts as we encounter them, so you don't need extensive programming experience. We'll start with fundamental concepts and build up to advanced patterns.
36+
## 0. Get started
3337

34-
### 0.2. Starting Point
38+
#### Open the training codespace
3539

36-
Navigate to the project directory:
40+
If you haven't yet done so, make sure to open the training environment as described in [Environment Setup](../envsetup/index.md).
3741

38-
```bash title="Navigate to project directory"
42+
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)
43+
44+
#### Move into the project directory
45+
46+
Let's move into the directory where the files for this tutorial are located.
47+
48+
```bash
3949
cd side-quests/essential_scripting_patterns
4050
```
4151

42-
The `data` directory contains sample files and a main workflow file we'll evolve throughout.
52+
#### Review the materials
53+
54+
You'll find a main workflow file and a `data` directory containing example data files.
4355

4456
```console title="Directory contents"
45-
> tree
4657
.
4758
├── collect.nf
4859
├── data
@@ -57,8 +68,6 @@ The `data` directory contains sample files and a main workflow file we'll evolve
5768
│ ├── generate_report.nf
5869
│ └── trimgalore.nf
5970
└── nextflow.config
60-
61-
3 directories, 10 files
6271
```
6372

6473
Our sample CSV contains information about biological samples that need different processing based on their characteristics:
@@ -72,6 +81,21 @@ SAMPLE_003,human,kidney,45000000,data/sequences/SAMPLE_003_S3_L001_R1_001.fastq,
7281

7382
We'll use this realistic dataset to explore practical programming techniques that you'll encounter in real bioinformatics workflows.
7483

84+
<!-- TODO: Can we make this more domain-agnostic? -->
85+
86+
<!-- TODO: add an assignment statement? #### Review the assignment -->
87+
88+
#### Readiness checklist
89+
90+
Think you're ready to dive in?
91+
92+
- [ ] I understand the goal of this course and its prerequisites
93+
- [ ] My codespace is up and running
94+
- [ ] I've set my working directory appropriately
95+
<!-- - [ ] I understand the assignment -->
96+
97+
If you can check all the boxes, you're good to go.
98+
7599
---
76100

77101
## 1. Dataflow vs Scripting: Understanding the Boundaries

docs/side_quests/nf-test.md

Lines changed: 150 additions & 72 deletions
@@ -27,89 +27,145 @@ Testing individual processes is analogous to unit tests in other languages. Test
2727

2828
[**nf-test**](https://www.nf-test.com/) is a tool that allows you to write module-, workflow-, and pipeline-level tests. In short, it lets you systematically check that every individual part of the pipeline works as expected, _in isolation_.
2929

30-
In this part of the training, we're going to show you how to use nf-test to write module-level tests for the three processes in our pipeline.
30+
### Learning goals
31+
32+
In this side quest, you'll learn to use nf-test to write a workflow-level test for the pipeline as well as module-level tests for the three processes it calls on.
33+
34+
By the end of this side quest, you'll be able to use the following techniques effectively:
35+
36+
- Initialize nf-test in your project
37+
- Generate module-level and workflow-level tests
38+
- Add common types of assertions
39+
- Understand when to use snapshots vs. content assertions
40+
- Run tests for an entire project
41+
42+
These skills will help you implement a comprehensive testing strategy in your pipeline projects, ensuring they are more robust and maintainable.
43+
44+
### Prerequisites
45+
46+
Before taking on this side quest, you should:
47+
48+
- Have completed the [Hello Nextflow](../hello_nextflow/README.md) tutorial or equivalent beginner's course.
49+
- Be comfortable using basic Nextflow concepts and mechanisms (processes, channels, operators, working with files, metadata)
3150

3251
---
3352

34-
## 0. Warmup
53+
## 0. Get started
3554

36-
Let's move into the project directory.
55+
#### Open the training codespace
56+
57+
If you haven't yet done so, make sure to open the training environment as described in [Environment Setup](../envsetup/index.md).
58+
59+
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)
60+
61+
#### Move into the project directory
62+
63+
Let's move into the directory where the files for this tutorial are located.
3764

3865
```bash
3966
cd side-quests/nf-test
4067
```
4168

42-
The `nf-test` directory has the file content like:
69+
You can set VSCode to focus on this directory:
70+
71+
```bash
72+
code .
73+
```
74+
75+
#### Review the materials
76+
77+
You'll find a main workflow file and a CSV file called `greetings.csv` that contains the input to the pipeline.
4378

4479
```console title="Directory contents"
45-
nf-test
80+
.
4681
├── greetings.csv
47-
└──main.nf
82+
└── main.nf
4883
```
4984

5085
For a detailed description of the files, see the [warmup from Hello Nextflow](../hello_nextflow/00_orientation.md).
51-
The workflow we'll be testing is part of the workflow built in [Hello Workflow](../hello_nextflow/03_hello_workflow.md), and is composed of two processes: `sayHello` and `convertToUpper`:
5286

53-
```bash title="Workflow code"
54-
/*
55-
* Pipeline parameters
56-
*/
57-
params.input_file = "greetings.csv"
87+
The workflow we'll be testing is a subset of the Hello workflow built in [Hello Workflow](../hello_nextflow/03_hello_workflow.md).
5888

59-
/*
60-
* Use echo to print 'Hello World!' to standard out
61-
*/
62-
process sayHello {
89+
??? example "What does the Hello Nextflow workflow do?"
6390

64-
publishDir 'results', mode: 'copy'
91+
If you haven't done the [Hello Nextflow](../hello_nextflow/index.md) training, here's a quick overview of what this simple workflow does.
6592

66-
input:
67-
val greeting
93+
The workflow takes a CSV file containing greetings, runs four consecutive transformation steps on them, and outputs a single text file containing an ASCII picture of a fun character saying the greetings.
6894

69-
output:
70-
path "${greeting}-output.txt"
95+
The four steps are implemented as Nextflow processes (`sayHello`, `convertToUpper`, `collectGreetings`, and `cowpy`) stored in separate module files.
7196

72-
script:
73-
"""
74-
echo '$greeting' > '$greeting-output.txt'
75-
"""
76-
}
97+
1. **`sayHello`:** Writes each greeting to its own output file (e.g., "Hello-output.txt")
98+
2. **`convertToUpper`:** Converts each greeting to uppercase (e.g., "HELLO")
99+
3. **`collectGreetings`:** Collects all uppercase greetings into a single batch file
100+
4. **`cowpy`:** Generates ASCII art using the `cowpy` tool
77101

78-
/*
79-
* Use a text replace utility to convert the greeting to uppercase
80-
*/
81-
process convertToUpper {
102+
The results are published to a directory called `results/`, and the final output of the pipeline (when run with default parameters) is a plain text file containing ASCII art of a character saying the uppercased greetings.
82103

83-
publishDir 'results', mode: 'copy'
104+
In this side quest, we use an intermediate form of the Hello workflow that only contains the first two processes. <!-- TODO: change this to use the full finished workflow as suggested in https://github.com/nextflow-io/training/issues/735 -->
84105

85-
input:
86-
path input_file
106+
The subset we'll be working with is composed of two processes: `sayHello` and `convertToUpper`.
107+
You can see the full workflow code below.
87108

88-
output:
89-
path "UPPER-${input_file}"
109+
??? example "Workflow code"
90110

91-
script:
92-
"""
93-
cat '$input_file' | tr '[a-z]' '[A-Z]' > UPPER-${input_file}
94-
"""
95-
}
111+
```groovy title="main.nf"
112+
/*
113+
* Pipeline parameters
114+
*/
115+
params.input_file = "greetings.csv"
96116

97-
workflow {
117+
/*
118+
* Use echo to write the greeting to its own output file
119+
*/
120+
process sayHello {
98121

99-
// create a channel for inputs from a CSV file
100-
greeting_ch = channel.fromPath(params.input_file).splitCsv().flatten()
122+
publishDir 'results', mode: 'copy'
101123

102-
// emit a greeting
103-
sayHello(greeting_ch)
124+
input:
125+
val greeting
104126

105-
// convert the greeting to uppercase
106-
convertToUpper(sayHello.out)
107-
}
108-
```
127+
output:
128+
path "${greeting}-output.txt"
129+
130+
script:
131+
"""
132+
echo '$greeting' > '$greeting-output.txt'
133+
"""
134+
}
135+
136+
/*
137+
* Use a text replace utility to convert the greeting to uppercase
138+
*/
139+
process convertToUpper {
140+
141+
publishDir 'results', mode: 'copy'
142+
143+
input:
144+
path input_file
145+
146+
output:
147+
path "UPPER-${input_file}"
148+
149+
script:
150+
"""
151+
cat '$input_file' | tr '[a-z]' '[A-Z]' > UPPER-${input_file}
152+
"""
153+
}
109154

110-
We're going to assume an understanding of this workflow, but if you're not sure, you can refer back to [Hello Workflow](../hello_nextflow/03_hello_workflow.md).
155+
workflow {
111156

112-
### 0.1. Run the workflow
157+
// create a channel for inputs from a CSV file
158+
greeting_ch = channel.fromPath(params.input_file).splitCsv().flatten()
159+
160+
// emit a greeting
161+
sayHello(greeting_ch)
162+
163+
// convert the greeting to uppercase
164+
convertToUpper(sayHello.out)
165+
}
166+
```
167+
168+
#### Run the workflow
113169

114170
Let's run the workflow to make sure it's working as expected.
115171

@@ -137,15 +193,25 @@ Let's break down what just happened.
137193

138194
You ran the workflow with the default parameters, you confirmed it worked and you're happy with the results. This is the essence of testing. If you worked through the Hello Nextflow training course, you'll have noticed we always started every section by running the workflow we were using as a starting point, to confirm everything is set up correctly.
139195

140-
Testing software essentially does this process for us. Let's replace our simple `nextflow run main.nf` with a standardised test provided by nf-test.
196+
Testing software essentially does this process for us.
141197

142-
### Takeaway
198+
#### Review the assignment
143199

144-
You should be able to 'test' a pipeline by manually running it.
200+
Your challenge is to add standardized tests to this workflow using nf-test, in order to make it easy to verify that every part continues to work as expected in case any further changes are made.
145201

146-
### What's next?
202+
<!-- TODO: give a bit more details, similar to how it's done in the Metadata side quest -->
203+
204+
#### Readiness checklist
147205

148-
Initialize `nf-test`.
206+
Think you're ready to dive in?
207+
208+
- [ ] I understand the goal of this course and its prerequisites
209+
- [ ] My codespace is up and running
210+
- [ ] I've set my working directory appropriately
211+
- [ ] I've run the workflow successfully
212+
- [ ] I understand the assignment
213+
214+
If you can check all the boxes, you're good to go.
149215

150216
---
151217

@@ -1083,31 +1149,37 @@ SUCCESS: Executed 3 tests in 5.007s
10831149

10841150
Check that out! We ran 3 tests, 1 for each process and 1 for the whole pipeline with a single command. Imagine how powerful this is on a large codebase!
10851151

1086-
## 4. Summary
1152+
---
10871153

1088-
In this side quest, we've learned:
1154+
## Summary
10891155

1090-
1. How to initialize nf-test in a Nextflow project
1091-
2. How to write and run pipeline-level tests:
1092-
- Basic success testing
1093-
- Process count verification
1094-
- Output file existence checks
1095-
3. How to write and run process-level tests
1096-
4. Two approaches to output validation:
1097-
- Using snapshots for complete output verification
1098-
- Using direct content assertions for specific content checks
1099-
5. Best practices for test naming and organization
1100-
6. How to run all tests in a repository with a single command
1156+
In this side quest, you've learned to leverage nf-test's features to create and run tests for individual processes as well as end-to-end tests for the entire pipeline.
1157+
You're now aware of the two main approaches to output validation, snapshots and direct content assertions, and when to use each one.
1158+
You also know how to run tests either one by one or for an entire project.
11011159

1102-
Testing is a critical part of pipeline development that helps ensure:
1160+
Applying these techniques in your own work will enable you to ensure that:
11031161

11041162
- Your code works as expected
11051163
- Changes don't break existing functionality
11061164
- Other developers can contribute with confidence
11071165
- Problems can be identified and fixed quickly
11081166
- Output content matches expectations
11091167

1110-
### What's next?
1168+
### Key patterns
1169+
1170+
<!-- TODO: Can we add snippets of code below to illustrate? -->
1171+
1172+
1. Pipeline-level tests:
1173+
- Basic success testing
1174+
- Process count verification
1175+
- Output file existence checks
1176+
2. Process-level tests
1177+
3. Two approaches to output validation:
1178+
- Using snapshots for complete output verification
1179+
- Using direct content assertions for specific content checks
1180+
4. Running all tests in a repository with a single command
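
As a rough illustration of patterns 2 and 3, a process-level test for `sayHello` might look something like the sketch below. It assumes `main.nf` sits at the root of the project where nf-test was initialized. The test names and the `"hello"` input value are made up for this example rather than taken from the tutorial.

```groovy
nextflow_process {

    name "Test Process sayHello"
    script "main.nf"      // assumption: path is relative to the project root
    process "sayHello"

    // Snapshot approach: record the complete output once, compare on later runs
    test("Should match the stored snapshot") {

        when {
            process {
                """
                input[0] = "hello"
                """
            }
        }

        then {
            assert process.success
            assert snapshot(process.out).match()
        }
    }

    // Content assertion approach: check only the specific content we care about
    test("Should write the greeting to the output file") {

        when {
            process {
                """
                input[0] = "hello"
                """
            }
        }

        then {
            assert process.success
            // process.out[0][0] is the first file emitted on the first output channel
            assert path(process.out[0][0]).readLines().contains('hello')
        }
    }
}
```

The snapshot test flags any change to the output, which suits outputs that should stay identical between runs. The content assertion only fails if the expected line disappears, so it tolerates unrelated changes in the file.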
1181+
1182+
### Additional resources
11111183

11121184
Check out the [nf-test documentation](https://www.nf-test.com/) for more advanced testing features and best practices. You might want to:
11131185

@@ -1117,4 +1189,10 @@ Check out the [nf-test documentation](https://www.nf-test.com/) for more advance
11171189
- Learn about other types of tests like workflow and module tests
11181190
- Explore more advanced content validation techniques
11191191

1120-
Remember: Tests are living documentation of how your code should behave. The more tests you write, and the more specific your assertions are, the more confident you can be in your pipeline's reliability.
1192+
**Remember:** Tests are living documentation of how your code should behave. The more tests you write, and the more specific your assertions are, the more confident you can be in your pipeline's reliability.
1193+
1194+
---
1195+
1196+
## What's next?
1197+
1198+
Return to the [menu of Side Quests](./index.md) or click the button in the bottom right of the page to move on to the next topic in the list.
