Subject: ITP: long-read-assembler -- assembly from long reads against reference genome
Package: wnpp
Owner: Steffen Moeller <moel...@debian.org>
Severity: wishlist

* Package name    : long-read-assembler
  Version         : 1.3.2
  Upstream Author : Bonnie Phan Wolfe <bonn...@usc.edu>
* URL             : https://github.com/ChaissonLab/LRA
* License         : USC-RL-1.0
  Programming Lang: C
  Description     : assembly from long reads against reference genome

 Machines that determine the DNA sequence do not provide answers en
 bloc, but as many comparatively short reads that by chance also
 overlap. From these, the complete genome is pieced together
 (assembled). This is harder than one may think because of redundancies
 in the genome: if a read is too short, you cannot tell where it comes
 from. A reference genome helps, but people differ, e.g. by local
 duplications, and you may be interested in diseases that involve
 chromosomal rearrangements.
 .
 lra is a sequence alignment program that aligns long reads from
 single-molecule sequencing (SMS) instruments, or megabase-scale
 contigs from SMS assemblies. These technologies provide reads that are
 1000 or 10k times longer than what can be achieved with Sanger
 sequencing, which helps the assembly.
 .
 lra applies seed chaining by sparse dynamic programming with a concave
 gap function to read and assembly alignment, and extends it to handle
 inversions. The lra alignment approach increases sensitivity and
 specificity for SV discovery, particularly for variants above 1kb and
 when discovering variation from ONT reads, while having runtimes that
 are comparable (1.05-3.76×) to current methods. When applied to
 calling variation from *de novo* assembly contigs, there is a 3.2%
 increase in Truvari F1 score compared to minimap2+htsbox.

Remark: This package is maintained by Steffen Moeller at
   https://salsa.debian.org/med-team/long-read-assembler
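
For readers unfamiliar with the term "seed chaining with a concave gap function" used in the description, here is a minimal sketch of the idea. It is not taken from lra: the function names, the tuple layout of the seeds and the constants in gap_cost are assumptions made for illustration, and it uses a plain quadratic dynamic program rather than the sparse one lra is described as using. A concave penalty grows more slowly as the gap gets longer, so one large indel (a structural variant) costs less than many small ones of the same total size.

```python
import math

def gap_cost(d, a=1.0, b=2.0):
    """Concave penalty for a diagonal shift of d bases.
    The constants a and b are hypothetical, chosen only for illustration."""
    return 0.0 if d == 0 else a + b * math.log2(1 + d)

def chain(anchors):
    """Chain exact-match seeds given as (ref_pos, read_pos, length) tuples.

    Returns (best_score, list_of_chained_anchors). Simple O(n^2) dynamic
    programming; a sparse formulation would avoid the inner loop over all
    predecessors.
    """
    anchors = sorted(anchors)
    n = len(anchors)
    if n == 0:
        return 0.0, []
    score = [float(a[2]) for a in anchors]  # a chain may start at any seed
    prev = [-1] * n
    for j in range(n):
        rj, qj, lj = anchors[j]
        for i in range(j):
            ri, qi, li = anchors[i]
            # The predecessor must end before seed j starts on both sequences.
            if ri + li > rj or qi + li > qj:
                continue
            # Difference between the reference gap and the read gap.
            d = abs((rj - ri) - (qj - qi))
            cand = score[i] + lj - gap_cost(d)
            if cand > score[j]:
                score[j] = cand
                prev[j] = i
    # Trace back from the best-scoring seed.
    best = max(range(n), key=score.__getitem__)
    path = []
    k = best
    while k != -1:
        path.append(anchors[k])
        k = prev[k]
    path.reverse()
    return score[best], path

if __name__ == "__main__":
    seeds = [(0, 0, 10), (20, 20, 10), (45, 40, 10), (1000, 15, 10)]
    print(chain(seeds))
```

In the example seeds, the third seed is shifted by 5 bases relative to the diagonal of the first two and is still chained, while the seed at reference position 1000 is left out because joining it would cost more than it gains.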