./biology/racon, Genomic consensus builder

Branch: CURRENT, Version: 1.4.3, Package name: racon-1.4.3, Maintainer: pkgsrc-users


Master sites:

Filesize: 50262.517 KB

CVS history:


2021-10-26 12:03:45 by Nia Alarie | Files touched by this commit (73)
Log message:
biology: Replace RMD160 checksums with BLAKE2s checksums

All checksums have been double-checked against existing RMD160 and
SHA512 hashes

2021-10-07 15:19:44 by Nia Alarie | Files touched by this commit (73)
Log message:
biology: Remove SHA1 hashes for distfiles

2021-05-26 20:53:40 by Brook Milligan | Files touched by this commit (4)
Log message:
biology/racon: add racon 1.4.3

## Description

Racon is intended as a standalone consensus module to correct raw
contigs generated by rapid assembly methods that do not include a
consensus step. Its goal is to generate a genomic consensus of similar
or better quality than the output of assembly methods that employ both
error correction and consensus steps, while running several times
faster than those methods. It supports data produced by both Pacific
Biosciences and Oxford Nanopore Technologies.

Racon can be used as a polishing tool after assembly with **either
Illumina data or data produced by third-generation
sequencing**. The type of input data is detected automatically.

Racon takes as input only three files: contigs in FASTA/FASTQ format,
reads in FASTA/FASTQ format and overlaps/alignments between the reads
and the contigs in MHAP/PAF/SAM format. Output is a set of polished
contigs in FASTA format printed to stdout. All input files **can be
compressed with gzip** (which will affect parsing time).
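
As a rough illustration of the three-file interface, the sketch below builds a racon command line and redirects the polished FASTA from stdout into a file. The positional order (reads, overlaps, contigs) and the `-t` thread option follow the usage text in upstream racon's README rather than anything stated on this page, and the helper and file names are hypothetical, so check `racon --help` on the installed version before relying on it.

```python
import subprocess
from pathlib import Path


def build_racon_command(reads, overlaps, contigs, threads=1):
    """Return the argv list for a racon polishing run.

    Assumed positional order (from upstream usage text):
    <sequences> <overlaps> <target sequences>.
    """
    return ["racon", "-t", str(threads), str(reads), str(overlaps), str(contigs)]


def polish(reads, overlaps, contigs, output, threads=1):
    """Run racon and write the polished contigs (its stdout) to `output`."""
    cmd = build_racon_command(reads, overlaps, contigs, threads)
    with Path(output).open("wb") as out:
        # racon prints FASTA to stdout, so redirect it into the output file.
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical file names; gzip-compressed inputs are passed the same way.
    print(" ".join(build_racon_command(
        "reads.fastq.gz", "overlaps.paf", "contigs.fasta", threads=4)))
```

Keeping command construction separate from execution makes it possible to inspect or log the exact command before a long polishing run starts.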

Racon can also be used as a read error-correction tool. In this
scenario, the MHAP/PAF/SAM file needs to contain pairwise overlaps
between reads **including dual overlaps**.
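
One way to sanity-check an overlap file before read correction is sketched below. It assumes PAF input (query name in column 1, target name in column 6) and interprets "dual overlaps" as meaning that every A-to-B record has a matching B-to-A record. That interpretation and the function names are assumptions for illustration, not part of racon.

```python
import gzip


def read_paf_pairs(path):
    """Return the set of (query, target) name pairs found in a PAF file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    pairs = set()
    with opener(path, "rt") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 12:
                continue  # skip blank or malformed lines
            pairs.add((fields[0], fields[5]))
    return pairs


def missing_dual_overlaps(pairs):
    """Return pairs (a, b) whose reverse (b, a) is absent, ignoring self-overlaps."""
    return sorted((a, b) for a, b in pairs if a != b and (b, a) not in pairs)
```

An empty result from `missing_dual_overlaps(read_paf_pairs("overlaps.paf"))` would suggest, under the assumption above, that the file contains both directions of every overlap.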

A **wrapper script** is also available to make racon easier to use on
large datasets. It has the same interface as racon but adds two
features on top of it. Sequences can be **subsampled** to decrease the
total execution time (accuracy might be lower), while target sequences
can be **split** into smaller chunks that are run sequentially to
decrease memory consumption. Both features can be used together.
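
The two wrapper features can be pictured with the minimal sketch below. It is not the wrapper's actual implementation: the function names, the coverage-based stopping rule, and the chunking by total sequence length are all assumptions chosen to illustrate the idea of subsampling reads and splitting targets.

```python
import random


def subsample_reads(reads, reference_length, coverage, seed=0):
    """Randomly pick reads until their total length reaches reference_length * coverage."""
    target_bases = reference_length * coverage
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    picked, total = [], 0
    for name, seq in shuffled:
        if total >= target_bases:
            break
        picked.append((name, seq))
        total += len(seq)
    return picked


def split_targets(targets, max_chunk_bases):
    """Group target sequences into consecutive chunks of at most max_chunk_bases.

    A single sequence longer than the limit gets a chunk of its own.
    """
    chunks, current, current_size = [], [], 0
    for name, seq in targets:
        if current and current_size + len(seq) > max_chunk_bases:
            chunks.append(current)
            current, current_size = [], 0
        current.append((name, seq))
        current_size += len(seq)
    if current:
        chunks.append(current)
    return chunks
```

In such a scheme each chunk would be polished in its own racon run against the (possibly subsampled) reads, and the per-chunk FASTA outputs concatenated afterwards, so peak memory depends on the chunk size rather than on the whole target set.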