Subject: CVS commit: pkgsrc/biology/racon
From: Brook Milligan
Date: 2021-05-26 20:53:40
Message id: 20210526185340.0F770FA95@cvs.NetBSD.org
Log Message:
biology/racon: add racon 1.4.3
## Description
Racon is intended as a standalone consensus module to correct raw
contigs generated by rapid assembly methods which do not include a
consensus step. The goal of Racon is to generate genomic consensus
which is of similar or better quality compared to the output generated
by assembly methods which employ both error correction and consensus
steps, while providing a speedup of several times compared to those
methods. It supports data produced by both Pacific Biosciences and
Oxford Nanopore Technologies.
Racon can be used as a polishing tool after assembly with **either
Illumina data or data produced by third-generation
sequencing**. The type of input data is detected automatically.
Racon takes as input only three files: contigs in FASTA/FASTQ format,
reads in FASTA/FASTQ format and overlaps/alignments between the reads
and the contigs in MHAP/PAF/SAM format. The output is a set of polished
contigs in FASTA format printed to stdout. All input files **can be
compressed with gzip** (which will have an impact on parsing time).
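
As a minimal sketch of the three-input, stdout-output interface described above, the following Python helper builds a racon command line and redirects the polished contigs to a file. The positional order (reads, overlaps, target contigs) and the `-t` thread option are assumptions about the installed version and should be checked against `racon --help`; the helper names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_racon_command(reads, overlaps, contigs, threads=1):
    """Return the argument list for a racon polishing run.

    Assumption: racon takes reads, overlaps, then target contigs as
    positional arguments, and -t sets the thread count.
    """
    return ["racon", "-t", str(threads), str(reads), str(overlaps), str(contigs)]


def polish(reads, overlaps, contigs, output, threads=1):
    """Run racon and save the FASTA it prints to stdout in `output`."""
    cmd = build_racon_command(reads, overlaps, contigs, threads)
    with open(output, "w") as out:
        # check=True raises CalledProcessError if racon exits non-zero.
        subprocess.run(cmd, stdout=out, check=True)
    return Path(output)
```

A call such as `polish("reads.fastq.gz", "overlaps.paf", "contigs.fasta", "polished.fasta", threads=4)` would then leave the polished contigs in `polished.fasta`, assuming racon is on `PATH`.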
Racon can also be used as a read error-correction tool. In this
scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps
between reads **including dual overlaps**.
A **wrapper script** is also available to make Racon easier to use on
large datasets. It has the same interface as racon but adds two
features on top of it. Sequences can be **subsampled** to decrease the
total execution time (accuracy might be lower), while target sequences
can be **split** into smaller chunks and run sequentially to decrease
memory consumption. Both features can be used at the same time.
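
The two ideas behind the wrapper, subsampling reads and splitting targets into chunks, can be illustrated with a short Python sketch. This is not the wrapper's implementation: the function names, the random per-record subsampling, and the base-count limit per chunk are all assumptions chosen only to show the general technique.

```python
import random


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, parts = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_targets(records, max_bases):
    """Group records into batches holding at most max_bases bases.

    A single record longer than max_bases is placed in its own batch.
    """
    batch, size = [], 0
    for header, seq in records:
        if batch and size + len(seq) > max_bases:
            yield batch
            batch, size = [], 0
        batch.append((header, seq))
        size += len(seq)
    if batch:
        yield batch


def subsample(records, fraction, seed=0):
    """Keep each record with probability `fraction` (hypothetical scheme)."""
    rng = random.Random(seed)
    return [rec for rec in records if rng.random() < fraction]
```

Each batch from `split_targets` could be written to its own FASTA file and polished in turn, which bounds peak memory by the chunk size at the cost of running sequentially.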
Files: