Migrating to static types
Nextflow 25.10 introduces the ability to use static types in a Nextflow pipeline. This tutorial demonstrates how to migrate to static types using the rnaseq-nf pipeline as an example.
Static types in Nextflow 25.10 are optional. All existing code will continue to work.
Overview
Static types allow you to specify the types of variables and parameters in Nextflow code. This language feature serves two purposes:
- Better documentation: Type annotations serve as code documentation, making your code more precise and easier to understand.
- Better validation: The Nextflow language server uses these type annotations to perform type checking, which allows it to identify type-related errors during development without requiring code execution.
While Nextflow inherited type annotations from Groovy, they were limited to functions and local variables and were not supported for Nextflow-specific constructs such as processes, workflows, and pipeline parameters. Additionally, the Groovy type system is significantly larger and more complex than necessary for Nextflow pipelines.
Nextflow 25.10 provides a native way to specify types at every level of a pipeline, from pipeline parameters to process inputs and outputs, using the standard types in the Nextflow standard library.
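For example, a function can annotate its parameter and return types (a minimal sketch; the function itself is hypothetical):
def sampleLabel(id: String, replicate: Integer) -> String {
    // hypothetical helper: build a label from a sample ID and replicate number
    return "${id}_rep${replicate}"
}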
Developer tooling
Static types work best with the Nextflow language server and Nextflow VS Code extension.
See Environment setup for instructions on how to set up VS Code and the Nextflow extension.
Type checking
Static type checking is currently available through the language server as an experimental feature.
Automatic migration
The Nextflow VS Code extension provides a command for automatically migrating Nextflow pipelines to static types. To migrate a script, open the Command Palette, search for Convert script to static types, and select it.
The extension can also convert an entire project using the Convert pipeline to static types command.
The conversion consists of the following steps:
- Legacy parameter declarations are converted to a params block. If a nextflow_schema.json file is present, it is used to infer parameter types.
- Legacy processes are converted to typed processes.
The language server uses the following rules when attempting to convert a legacy process:
- When input or output types cannot be inferred (e.g., val inputs and outputs), the type remains unspecified and the language server reports an error. However, if a process includes an nf-core meta.yml, the language server uses it to infer the appropriate type.
- File inputs (file and path qualifiers) are converted to Path or Set<Path> based on (1) the arity option or (2) the stage name when specified. To ensure accurate conversion, specify the arity option in the legacy syntax and review the converted code to verify that the correct type is used.
- File outputs (file and path qualifiers) are converted to file() or files() based on the arity option when specified. To ensure accurate conversion, specify the arity option in the legacy syntax and review the converted code to verify that the correct output function is used.
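For example, specifying arity in the legacy syntax guides the conversion of file inputs and outputs (a sketch based on the rules above; the process and file patterns are hypothetical):
// legacy syntax: arity marks 'reads' as a collection and the report as a single file
process example {
    input:
    path reads, arity: '1..*'

    output:
    path 'report.html', arity: '1'

    // ...
}

// converted: the collection becomes Set<Path>, the single file becomes file()
process example {
    input:
    reads: Set<Path>

    output:
    file('report.html')

    // ...
}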
Example: rnaseq-nf
This section demonstrates how to migrate a pipeline to static types using the rnaseq-nf pipeline as an example. The completed migration is available in the preview-25-10 branch.
See Getting started with rnaseq-nf for an introduction to the rnaseq-nf pipeline.
While much of this migration can be performed automatically by the language server, this walkthrough is intended to provide a concrete picture of a pipeline that uses static types. Always review generated code for correctness.
Migrating pipeline parameters
The pipeline defines the following parameters in the main script using the legacy syntax:
params.reads = "$baseDir/data/ggal/ggal_gut_{1,2}.fq"
params.transcriptome = "$baseDir/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"
params.outdir = "results"
params.multiqc = "$baseDir/multiqc"
The pipeline also has a nextflow_schema.json schema with the following properties:
"reads": {
"type": "string",
"description": "The input read-pair files",
"default": "${projectDir}/data/ggal/ggal_gut_{1,2}.fq"
},
"transcriptome": {
"type": "string",
"format": "file-path",
"description": "The input transcriptome file",
"default": "${projectDir}/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"
},
"outdir": {
"type": "string",
"format": "directory-path",
"description": "The output directory where the results will be saved",
"default": "results"
},
"multiqc": {
"type": "string",
"format": "directory-path",
"description": "Directory containing the configuration for MultiQC",
"default": "${projectDir}/multiqc"
}
To migrate the pipeline parameters, use the schema and legacy parameters to define the equivalent params block:
params {
    // The input read-pair files
    reads: String = "${projectDir}/data/ggal/ggal_gut_{1,2}.fq"

    // The input transcriptome file
    transcriptome: Path = "${projectDir}/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"

    // The output directory where the results will be saved
    outdir: Path = 'results'

    // Directory containing the configuration for MultiQC
    multiqc: Path = "${projectDir}/multiqc"
}
See Parameters for more information about the params block.
Parameters used only in the config file should be declared there rather than in the script. Since rnaseq-nf has no such parameters, all of its parameters are declared in the script.
The rnaseq-nf pipeline initializes the reads and transcriptome parameters to a test dataset by default, as it is designed as a toy example. In practice, defaults for test data should be defined in a config profile (e.g., test).
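For example, the test-data defaults could be moved to a test profile in the config file (a hypothetical sketch; rnaseq-nf keeps them in the script for convenience):
// hypothetical test profile in nextflow.config providing test-data defaults
profiles {
    test {
        params.reads = "${projectDir}/data/ggal/ggal_gut_{1,2}.fq"
        params.transcriptome = "${projectDir}/data/ggal/ggal_1_48850000_49020000.Ggal71.500bpflank.fa"
    }
}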
Migrating workflows
The type of each workflow input can be inferred by examining how the workflow is called.
workflow RNASEQ {
    take:
    read_pairs_ch
    transcriptome

    main:
    // ...

    emit:
    fastqc = fastqc_ch
    quant = quant_ch
}
The RNASEQ workflow is called by the entry workflow with the following arguments:
workflow {
    read_pairs_ch = channel.fromFilePairs(params.reads, checkIfExists: true, flat: true)
    (fastqc_ch, quant_ch) = RNASEQ(read_pairs_ch, params.transcriptome)
    // ...
}
You can determine the type of each input as follows:
- The channel read_pairs_ch has type Channel<E>, where E is the type of each value in the channel. The fromFilePairs() factory with flat: true emits tuples containing a sample ID and two file paths. Therefore, the type of read_pairs_ch is Channel<Tuple<String,Path,Path>>.
- The parameter params.transcriptome has type Path, as defined in the params block.
To update the workflow inputs, specify the input types as follows:
workflow RNASEQ {
    take:
    read_pairs_ch: Channel<Tuple<String,Path,Path>>
    transcriptome: Path

    // ...
}
Type annotations can become long for tuples with many elements. Future Nextflow versions will introduce alternative data types (i.e., record types) that have more concise annotations and provide a better overall experience with static types. The focus of Nextflow 25.10 is to enable type checking for existing code with minimal changes.
You can also specify types for workflow outputs. This is typically not required for type checking, since the output type can usually be inferred from the input types and workflow logic. However, explicit type annotations are still useful as a sanity check: if the declared output type doesn't match the assigned value's type, the language server can report an error.
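For example, if an emit were annotated with a type that does not match the assigned channel, the language server could flag it (a hypothetical sketch):
emit:
// hypothetical mistake: fastqc_ch is Channel<Tuple<String,Path>>, so the
// declared type below doesn't match and the language server reports an error
fastqc: Channel<Path> = fastqc_ch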
You can use the same approach as for the inputs to determine the type of each output:
- The variable fastqc_ch comes from the output of FASTQC. The process is called with a channel (read_pairs_ch), so it also emits a channel. Each task returns a tuple containing an ID and an output file, so the element type is Tuple<String,Path>. Therefore, the type of fastqc_ch is Channel<Tuple<String,Path>>.
- The variable quant_ch comes from the output of QUANT. The process is called with a channel (read_pairs_ch), so it also emits a channel. Each task returns a tuple containing an ID and an output file, so the element type is Tuple<String,Path>. Therefore, the type of quant_ch is Channel<Tuple<String,Path>>.
To update the workflow outputs, specify the output types as follows:
workflow RNASEQ {
    take:
    read_pairs_ch: Channel<Tuple<String,Path,Path>>
    transcriptome: Path

    main:
    index = INDEX(transcriptome)
    fastqc_ch = FASTQC(read_pairs_ch)
    quant_ch = QUANT(index, read_pairs_ch)

    emit:
    fastqc: Channel<Tuple<String,Path>> = fastqc_ch
    quant: Channel<Tuple<String,Path>> = quant_ch
}
Migrating processes
See Processes (typed) for an overview of typed process inputs and outputs.
FASTQC
The FASTQC process is defined with the following inputs and outputs:
process FASTQC {
    // ...

    input:
    tuple val(id), path(fastq_1), path(fastq_2)

    output:
    path "fastqc_${id}_logs"

    // ...
}
To migrate this process, rewrite the inputs and outputs as follows:
process FASTQC {
    // ...

    input:
    (id, fastq_1, fastq_2): Tuple<String,Path,Path>

    output:
    file("fastqc_${id}_logs")

    // ...
}
In the above:
- Inputs of type Path are treated like path inputs in the legacy syntax. Additionally, if fastq_1 or fastq_2 were marked as Path?, they could be null, which is not supported by path inputs.
- Outputs are normally defined as assignments, similar to workflow emits. In this case, however, the name was omitted since there is only one output (see the sketch below).
- Values in the output: section can use several specialized functions for process outputs. Here, file() is the process output function (not the standard library function).
Other process sections, such as the directives and the script: block, are not shown because they do not require changes. As long as the inputs and outputs declare and reference the same variable names and file patterns, the other process sections will behave the same as before.
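For reference, if FASTQC declared more than one output, each output would be named with an assignment (a hypothetical sketch):
output:
// hypothetical named outputs: each emit gets a name, like workflow emits
logs = file("fastqc_${id}_logs")
sample = id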
QUANT
The QUANT process is defined with the following inputs and outputs:
process QUANT {
    // ...

    input:
    tuple val(id), path(fastq_1), path(fastq_2)
    path index

    output:
    path "quant_${id}"

    // ...
}
To migrate this process, rewrite the inputs and outputs as follows:
process QUANT {
    // ...

    input:
    (id, fastq_1, fastq_2): Tuple<String,Path,Path>
    index: Path

    output:
    file("quant_${id}")

    // ...
}
MULTIQC
The MULTIQC process is defined with the following inputs and outputs:
process MULTIQC {
    // ...

    input:
    path '*'
    path config

    output:
    path 'multiqc_report.html'

    // ...
}
To migrate this process, rewrite the inputs and outputs as follows:
process MULTIQC {
    // ...

    input:
    logs: Set<Path>
    config: Path

    stage:
    stageAs logs, '*'

    output:
    file('multiqc_report.html')

    // ...
}
In a typed process, file patterns for path inputs must be declared using a stage directive. In this example, the first input uses the variable name logs, and the stageAs directive stages the input using the glob pattern *.
In this case, you can omit the stage directive because * matches Nextflow's default staging behavior. Inputs of type Path or a Path collection (e.g., Set<Path>) are staged by default using the pattern '*'.
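With the default pattern, the declaration can therefore be simplified (a sketch of the equivalent inputs):
input:
// equivalent: the stage section is omitted because '*' is the default pattern
logs: Set<Path>
config: Path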
In the legacy syntax, you can use the arity option to specify whether a path qualifier expects a single file or collection of files. When using typed inputs and outputs, the type determines this behavior, i.e., Path vs Set<Path>.
While List<Path> and Bag<Path> are also valid path collection types, Set<Path> is recommended because it most accurately represents an unordered collection of files. You should only use List<Path> when you want the collection to be ordered.
INDEX
Apply the same migration principles from the previous processes to migrate INDEX.
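In rnaseq-nf, INDEX takes a single transcriptome file and emits an index directory, so the migrated inputs and outputs would look as follows (a sketch; review the result as with any conversion):
process INDEX {
    // ...

    input:
    transcriptome: Path

    output:
    file('index')

    // ...
}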
Preparing for static types
While static types can be adopted progressively with existing code, some coding patterns cannot be statically typed and will prevent the type checker from validating your entire pipeline.
This section provides best practices for writing Nextflow code that works well with static types.
Use the strict syntax
The strict syntax remains optional in Nextflow 25.10. However, it is required in order to use type annotations. Before you migrate to static types, ensure your code adheres to the strict syntax using nextflow lint or the language server.
Avoid deprecated patterns
When preparing for the strict syntax, try to address deprecation warnings as much as possible. For example:
Channel.from(1, 2, 3).map { it * 2 }    // deprecated
channel.of(1, 2, 3).map { v -> v * 2 }  // best practice
The above example shows how to avoid three deprecated patterns:
- Channel to access channel factories (use channel instead)
- The channel.from factory (use channel.of or channel.fromList instead)
- The implicit it closure parameter (declare the parameter explicitly instead)
Avoid set and tap operators
Nextflow provides three ways to assign a channel: a standard assignment, the set operator, and the tap operator:
ch = channel.of(1, 2, 3)             // standard assignment
channel.of(10, 20, 30).set { ch }    // set
channel.of(10, 20, 30).tap { ch }    // tap
However, the type checker cannot validate code that uses set or tap. Use standard assignments in your dataflow logic to enable full type checking.
Avoid | and & operators
The special operators | and & provide shorthands for writing dataflow logic:
channel.of('Hello', 'Hola', 'Ciao')
    | greet
    | map { v -> v.toUpperCase() }
    | view
However, the type checker cannot validate code that uses these special operators. Use standard assignments and method calls instead:
ch_input = channel.of('Hello', 'Hola', 'Ciao')
ch_greet = greet(ch_input)
ch_greet
    .map { v -> v.toUpperCase() }
    .view()
Avoid each input qualifier
The each input qualifier is not supported by typed processes. Use the combine operator to create a single tuple channel instead. For example:
process align {
    input:
    path seq
    each mode

    script:
    """
    t_coffee -in $seq -mode $mode > result
    """
}

workflow {
    sequences = channel.fromPath('*.fa')
    methods = ['regular', 'espresso', 'psicoffee']
    align(sequences, methods)
}
Rewrite the script to use the combine operator. It becomes:
process align {
    input:
    tuple path(seq), val(mode)

    script:
    """
    t_coffee -in $seq -mode $mode > result
    """
}

workflow {
    sequences = channel.fromPath('*.fa')
    methods = ['regular', 'espresso', 'psicoffee']
    align(sequences.combine(methods))
}
The each qualifier is discouraged in modern Nextflow code. While it provides a convenient shorthand for combining multiple inputs, it couples the process definition with external workflow logic. Since the introduction of DSL2, Nextflow aims to treat processes as standalone modules that are decoupled from workflow logic.
Avoid functions with ambiguous return types
Some standard library functions and operators have ambiguous return types. While you can still use them with static types, the type checker will not be able to fully validate your code.
You can tell whether a function has an ambiguous return type by inspecting the function signature. The return type is ambiguous if it is ? or contains ? as a type argument. For example, the flatten operator has the following signature:
def flatten() -> Channel<?>
Use alternatives that have well-defined return types. For example, use flatMap instead of flatten. The commonly-used operators listed under Channels have well-defined return types and are recommended for use with static types.
If you cannot avoid an ambiguously-typed function, you can use a hard cast to specify the expected type, provided you know what the type will be at runtime. For example:
channel.fromPath('samplesheet.csv')
    .splitCsv()
    .view()
The splitCsv operator has an ambiguous return type (Channel<?>). It emits lists by default but emits maps when header is enabled, making the return type dependent on runtime conditions.
To address this issue:
- Replace the splitCsv operation with a flatMap operation that calls the splitCsv method (defined for Path).
- Add a map operation that casts the emitted rows as List<String>.
It becomes:
channel.fromPath('samplesheet.csv')
    .flatMap { csv -> csv.splitCsv() }
    .map { row -> row as List<String> }
    .view()
The resulting channel has type Channel<List<String>>, and the type checker can validate any downstream code that uses this channel.
Additional resources
See the following links to learn more about static types: