docs/side_quests/essential_scripting_patterns.md
Lines changed: 40 additions & 16 deletions
@@ -6,7 +6,10 @@ You can write a lot of Nextflow without venturing beyond basic syntax for variab
6
6
7
7
However, when you need to manipulate data, parse complex filenames, implement conditional logic, or build robust production workflows, it helps to think about two distinct aspects of your code: **dataflow** (channels, operators, processes, and workflows) and **scripting** (the code inside closures, functions, and process scripts). While this distinction is somewhat arbitrary—it's all Nextflow code—it provides a useful mental model for understanding when you're orchestrating your pipeline versus when you're manipulating data. Mastering both dramatically improves your ability to write clear, maintainable workflows.
8
8
9
-
This side quest takes you on a hands-on journey from basic concepts to production-ready patterns. We'll transform a simple CSV-reading workflow into a sophisticated bioinformatics pipeline, evolving it step-by-step through realistic challenges:
9
+
### Learning goals
10
+
11
+
This side quest takes you on a hands-on journey from basic concepts to production-ready patterns.
12
+
We'll transform a simple CSV-reading workflow into a sophisticated bioinformatics pipeline, evolving it step-by-step through realistic challenges:
10
13
11
14
- **Understanding boundaries:** Distinguish between dataflow operations and scripting, and understand how they work together
12
15
- **Data manipulation:** Extract, transform, and subset maps and collections using powerful operators
@@ -17,32 +20,40 @@ This side quest takes you on a hands-on journey from basic concepts to productio
17
20
- **Safe operations:** Handle missing data gracefully with null-safe operators and validate inputs with clear error messages
18
21
- **Configuration-based handlers:** Use workflow event handlers for logging, notifications, and lifecycle management
19
22
20
-
---
23
+
### Prerequisites
21
24
22
-
## 0. Warmup
25
+
Before taking on this side quest, you should:
23
26
24
-
### 0.1. Prerequisites
27
+
- Have completed the [Hello Nextflow](../hello_nextflow/README.md) tutorial or equivalent beginner's course.
28
+
- Be comfortable using basic Nextflow concepts and mechanisms (processes, channels, operators, working with files, metadata)
29
+
- Have basic familiarity with common programming constructs (variables, maps, lists)
25
30
26
-
Before taking on this side quest you should:
31
+
This tutorial will explain programming concepts as we encounter them, so you don't need extensive programming experience.
32
+
We'll start with fundamental concepts and build up to advanced patterns.
27
33
28
-
- Complete the [Hello Nextflow](../hello_nextflow/README.md) tutorial or have equivalent experience
- Have basic familiarity with common programming constructs (variables, maps, lists)
34
+
---
31
35
32
-
This tutorial will explain programming concepts as we encounter them, so you don't need extensive programming experience. We'll start with fundamental concepts and build up to advanced patterns.
36
+
## 0. Get started
33
37
34
-
### 0.2. Starting Point
38
+
#### Open the training codespace
35
39
36
-
Navigate to the project directory:
40
+
If you haven't yet done so, make sure to open the training environment as described in the [Environment Setup](../envsetup/index.md).
37
41
38
-
```bash title="Navigate to project directory"
42
+
[Open in GitHub Codespaces](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)
43
+
44
+
#### Move into the project directory
45
+
46
+
Let's move into the directory where the files for this tutorial are located.
47
+
48
+
```bash
39
49
cd side-quests/essential_scripting_patterns
40
50
```
41
51
42
-
The `data` directory contains sample files and a main workflow file we'll evolve throughout.
52
+
#### Review the materials
53
+
54
+
You'll find a main workflow file and a `data` directory containing example data files.
43
55
44
56
```console title="Directory contents"
45
-
> tree
46
57
.
47
58
├── collect.nf
48
59
├── data
@@ -57,8 +68,6 @@ The `data` directory contains sample files and a main workflow file we'll evolve
57
68
│ ├── generate_report.nf
58
69
│ └── trimgalore.nf
59
70
└── nextflow.config
60
-
61
-
3 directories, 10 files
62
71
```
63
72
64
73
Our sample CSV contains information about biological samples that need different processing based on their characteristics:
@@ -27,89 +27,145 @@ Testing individual processes is analogous to unit tests in other languages. Test
27
27
28
28
[**nf-test**](https://www.nf-test.com/) is a tool that allows you to write module-, workflow-, and pipeline-level tests. In short, it allows you to systematically check that every individual part of the pipeline is working as expected, _in isolation_.
29
29
30
-
In this part of the training, we're going to show you how to use nf-test to write module-level tests for the three processes in our pipeline.
30
+
### Learning goals
31
+
32
+
In this side quest, you'll learn to use nf-test to write a workflow-level test for the pipeline as well as module-level tests for the three processes it calls on.
33
+
34
+
By the end of this side quest, you'll be able to use the following techniques effectively:
35
+
36
+
- Initialize nf-test in your project
37
+
- Generate module-level and workflow-level tests
38
+
- Add common types of assertions
39
+
- Understand when to use snapshots vs. content assertions
40
+
- Run tests for an entire project
41
+
42
+
These skills will help you implement a comprehensive testing strategy in your pipeline projects, ensuring they are more robust and maintainable.
43
+
44
+
### Prerequisites
45
+
46
+
Before taking on this side quest, you should:
47
+
48
+
- Have completed the [Hello Nextflow](../hello_nextflow/README.md) tutorial or equivalent beginner's course.
49
+
- Be comfortable using basic Nextflow concepts and mechanisms (processes, channels, operators, working with files, metadata)
31
50
32
51
---
33
52
34
-
## 0. Warmup
53
+
## 0. Get started
35
54
36
-
Let's move into the project directory.
55
+
#### Open the training codespace
56
+
57
+
If you haven't yet done so, make sure to open the training environment as described in the [Environment Setup](../envsetup/index.md).
58
+
59
+
[Open in GitHub Codespaces](https://codespaces.new/nextflow-io/training?quickstart=1&ref=master)
60
+
61
+
#### Move into the project directory
62
+
63
+
Let's move into the directory where the files for this tutorial are located.
37
64
38
65
```bash
39
66
cd side-quests/nf-test
40
67
```
41
68
42
-
The `nf-test` directory has the file content like:
69
+
You can set VSCode to focus on this directory:
70
+
71
+
```bash
72
+
code .
73
+
```
74
+
75
+
#### Review the materials
76
+
77
+
You'll find a main workflow file and a CSV file called `greetings.csv` that contains the input to the pipeline.
43
78
44
79
```console title="Directory contents"
45
-
nf-test
80
+
.
46
81
├── greetings.csv
47
-
└──main.nf
82
+
└──main.nf
48
83
```
49
84
50
85
For a detailed description of the files, see the [warmup from Hello Nextflow](../hello_nextflow/00_orientation.md).
51
-
The workflow we'll be testing is part of the workflow built in [Hello Workflow](../hello_nextflow/03_hello_workflow.md), and is composed of two processes: `sayHello` and `convertToUpper`:
52
86
53
-
```bash title="Workflow code"
54
-
/*
55
-
* Pipeline parameters
56
-
*/
57
-
params.input_file = "greetings.csv"
87
+
The workflow we'll be testing is a subset of the Hello workflow built in [Hello Workflow](../hello_nextflow/03_hello_workflow.md).
58
88
59
-
/*
60
-
* Use echo to print 'Hello World!' to standard out
61
-
*/
62
-
process sayHello {
89
+
??? example "What does the Hello Nextflow workflow do?"
63
90
64
-
publishDir 'results', mode: 'copy'
91
+
If you haven't done the [Hello Nextflow](../hello_nextflow/index.md) training, here's a quick overview of what this simple workflow does.
65
92
66
-
input:
67
-
val greeting
93
+
The workflow takes a CSV file containing greetings, runs four consecutive transformation steps on them, and outputs a single text file containing an ASCII picture of a fun character saying the greetings.
68
94
69
-
output:
70
-
path "${greeting}-output.txt"
95
+
The four steps are implemented as Nextflow processes (`sayHello`, `convertToUpper`, `collectGreetings`, and `cowpy`) stored in separate module files.
71
96
72
-
script:
73
-
"""
74
-
echo '$greeting' > '$greeting-output.txt'
75
-
"""
76
-
}
97
+
1. **`sayHello`:** Writes each greeting to its own output file (e.g., "Hello-output.txt")
98
+
2. **`convertToUpper`:** Converts each greeting to uppercase (e.g., "HELLO")
99
+
3. **`collectGreetings`:** Collects all uppercase greetings into a single batch file
100
+
4. **`cowpy`:** Generates ASCII art using the `cowpy` tool
77
101
78
-
/*
79
-
* Use a text replace utility to convert the greeting to uppercase
80
-
*/
81
-
process convertToUpper {
102
+
The results are published to a directory called `results/`, and the final output of the pipeline (when run with default parameters) is a plain text file containing ASCII art of a character saying the uppercased greetings.
82
103
83
-
publishDir 'results', mode: 'copy'
104
+
In this side quest, we use an intermediate form of the Hello workflow that only contains the first two processes. <!-- TODO: change this to use the full finished workflow as suggested in https://github.com/nextflow-io/training/issues/735 -->
84
105
85
-
input:
86
-
path input_file
106
+
The subset we'll be working with is composed of two processes: `sayHello` and `convertToUpper`.
We're going to assume an understanding of this workflow, but if you're not sure, you can refer back to [Hello Workflow](../hello_nextflow/03_hello_workflow.md).
Let's run the workflow to make sure it's working as expected.
115
171
@@ -137,15 +193,25 @@ Let's break down what just happened.
137
193
138
194
You ran the workflow with the default parameters, confirmed that it worked, and are happy with the results. This is the essence of testing. If you worked through the Hello Nextflow training course, you'll have noticed that we started every section by running the workflow we were using as a starting point, to confirm everything was set up correctly.
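
The manual check described here can be sketched as a small shell function. This is only an illustration and not part of the training materials: `check_outputs` is a hypothetical helper, and the `results` directory and `Hello-output.txt` file name are assumptions taken from the `sayHello` description, so adjust them to whatever your run actually produces.

```bash
#!/usr/bin/env bash
# Sketch of a manual "test": confirm that the expected output files exist.
# check_outputs is a hypothetical helper, not part of Nextflow or nf-test.

check_outputs() {
    # $1 = directory to inspect; remaining arguments = expected file names
    local dir="$1"
    shift
    local rc=0
    local name
    for name in "$@"; do
        if [ -f "$dir/$name" ]; then
            echo "ok: $name"
        else
            echo "missing: $name"
            rc=1
        fi
    done
    # Non-zero exit status if any expected file was absent
    return "$rc"
}

# Example usage after `nextflow run main.nf` (file name assumed):
# check_outputs results Hello-output.txt
```

A check like this only confirms that files exist. It says nothing about their content, which is the gap that assertions in a testing framework are meant to fill.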
139
195
140
-
Testing software essentially does this process for us. Let's replace our simple `nextflow run main.nf` with a standardised test provided by nf-test.
196
+
Testing software essentially does this process for us.
141
197
142
-
### Takeaway
198
+
#### Review the assignment
143
199
144
-
You should be able to 'test' a pipeline by manually running it.
200
+
Your challenge is to add standardized tests to this workflow using nf-test, in order to make it easy to verify that every part continues to work as expected in case any further changes are made.
145
201
146
-
### What's next?
202
+
<!-- TODO: give a bit more details, similar to how it's done in the Metadata side quest -->
203
+
204
+
#### Readiness checklist
147
205
148
-
Initialize `nf-test`.
206
+
Think you're ready to dive in?
207
+
208
+
- [ ] I understand the goal of this course and its prerequisites
209
+
- [ ] My codespace is up and running
210
+
- [ ] I've set my working directory appropriately
211
+
- [ ] I've run the workflow successfully
212
+
- [ ] I understand the assignment
213
+
214
+
If you can check all the boxes, you're good to go.
149
215
150
216
---
151
217
@@ -1083,31 +1149,37 @@ SUCCESS: Executed 3 tests in 5.007s
1083
1149
1084
1150
Check that out! We ran 3 tests, 1 for each process and 1 for the whole pipeline with a single command. Imagine how powerful this is on a large codebase!
1085
1151
1086
-
## 4. Summary
1152
+
---
1087
1153
1088
-
In this side quest, we've learned:
1154
+
## Summary
1089
1155
1090
-
1. How to initialize nf-test in a Nextflow project
1091
-
2. How to write and run pipeline-level tests:
1092
-
- Basic success testing
1093
-
- Process count verification
1094
-
- Output file existence checks
1095
-
3. How to write and run process-level tests
1096
-
4. Two approaches to output validation:
1097
-
- Using snapshots for complete output verification
1098
-
- Using direct content assertions for specific content checks
1099
-
5. Best practices for test naming and organization
1100
-
6. How to run all tests in a repository with a single command
1156
+
In this side quest, you've learned to leverage nf-test's features to create and run tests for individual processes as well as end-to-end tests for the entire pipeline.
1157
+
You're now aware of the two main approaches to output validation, snapshots and direct content assertions, and when to use each one.
1158
+
You also know how to run tests either one by one or for an entire project.
1101
1159
1102
-
Testing is a critical part of pipeline development that helps ensure:
1160
+
Applying these techniques in your own work will enable you to ensure that:
1103
1161
1104
1162
- Your code works as expected
1105
1163
- Changes don't break existing functionality
1106
1164
- Other developers can contribute with confidence
1107
1165
- Problems can be identified and fixed quickly
1108
1166
- Output content matches expectations
1109
1167
1110
-
### What's next?
1168
+
### Key patterns
1169
+
1170
+
<!-- TODO: Can we add snippets of code below to illustrate? -->
1171
+
1172
+
1. Pipeline-level tests:
1173
+
- Basic success testing
1174
+
- Process count verification
1175
+
- Output file existence checks
1176
+
2. Process-level tests
1177
+
3. Two approaches to output validation:
1178
+
- Using snapshots for complete output verification
1179
+
- Using direct content assertions for specific content checks
1180
+
4. Running all tests in a repository with a single command
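
As a rough sketch of the last pattern, the single-command run could be wrapped in a shell function like the one below. It assumes nf-test is installed and has already been initialized in the project. `run_all_tests` and the `NF_TEST_BIN` variable are hypothetical names introduced here for illustration; only `nf-test test .` is the actual command being wrapped.

```bash
#!/usr/bin/env bash
# Sketch: run every nf-test test in the project with one command.
# run_all_tests and NF_TEST_BIN are hypothetical names used for illustration;
# NF_TEST_BIN defaults to the nf-test executable.

run_all_tests() {
    local bin="${NF_TEST_BIN:-nf-test}"
    # Stop with a clear message if the executable is not available.
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "error: $bin not found on PATH" >&2
        return 127
    fi
    # Passing "." asks nf-test to look for tests under the current directory.
    "$bin" test .
}

# Example usage from the project root:
# run_all_tests
```

Because the function returns nf-test's exit status, it could also serve as a step in a CI script, although that goes beyond what this side quest covers.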
1181
+
1182
+
### Additional resources
1111
1183
1112
1184
Check out the [nf-test documentation](https://www.nf-test.com/) for more advanced testing features and best practices. You might want to:
1113
1185
@@ -1117,4 +1189,10 @@ Check out the [nf-test documentation](https://www.nf-test.com/) for more advance
1117
1189
- Learn about other types of tests like workflow and module tests
1118
1190
- Explore more advanced content validation techniques
1119
1191
1120
-
Remember: Tests are living documentation of how your code should behave. The more tests you write, and the more specific your assertions are, the more confident you can be in your pipeline's reliability.
1192
+
**Remember:** Tests are living documentation of how your code should behave. The more tests you write, and the more specific your assertions are, the more confident you can be in your pipeline's reliability.
1193
+
1194
+
---
1195
+
1196
+
## What's next?
1197
+
1198
+
Return to the [menu of Side Quests](./index.md) or click the button in the bottom right of the page to move on to the next topic in the list.