2021-03-24 16:22:29 by Jason Bacon | Files touched by this commit (4) |
Log message: biology/vcf-split: import vcf-split-0.1.1. Vcf-split splits a multi-sample VCF into single-sample VCFs, writing thousands of output files simultaneously. Parsing the TOPMed human chromosome 1 BCF with bcftools takes two days, so extracting the 137,977 samples one at a time or using thousands of parallel readers of the same file is impractical. Vcf-split solves this by generating thousands of single-sample outputs during a single sweep through the multi-sample input. |
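
The single-sweep idea in the log message can be illustrated with a minimal Python sketch. This is not vcf-split's actual implementation: the function name `split_vcf` and the `open_output` callback are hypothetical, and the sketch assumes uncompressed, tab-separated VCF text with the nine standard fixed columns ahead of the sample columns.

```python
from contextlib import ExitStack


def split_vcf(lines, open_output):
    """Split a multi-sample VCF into one output per sample in a single pass.

    lines       -- iterable of VCF text lines (for example an open file)
    open_output -- hypothetical callback: sample name -> writable context manager
    Returns the list of sample names found in the #CHROM header line.
    """
    meta = []        # "##" meta-information lines, copied to every output
    samples = []
    outputs = None   # one open output per sample, created at the #CHROM line
    with ExitStack() as stack:
        for raw in lines:
            line = raw.rstrip("\n")
            if line.startswith("##"):
                meta.append(line)
            elif line.startswith("#CHROM"):
                cols = line.split("\t")
                fixed, samples = cols[:9], cols[9:]
                # Open every output up front so the input is read only once.
                outputs = [stack.enter_context(open_output(s)) for s in samples]
                for out, sample in zip(outputs, samples):
                    for m in meta:
                        out.write(m + "\n")
                    out.write("\t".join(fixed + [sample]) + "\n")
            elif line:
                if outputs is None:
                    raise ValueError("data line found before the #CHROM header")
                cols = line.split("\t")
                if len(cols) - 9 != len(outputs):
                    raise ValueError("sample column count does not match header")
                prefix = "\t".join(cols[:9])
                # Fan the record out: each output gets the fixed columns
                # plus its own sample column.
                for out, genotype in zip(outputs, cols[9:]):
                    out.write(prefix + "\t" + genotype + "\n")
    return samples
```

A call such as `split_vcf(open("multi.vcf"), lambda s: open(s + ".vcf", "w"))` would write one file per sample (the file names here are placeholders). Holding thousands of outputs open at once generally requires a per-process open-file limit larger than the number of samples.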