Downloads

The latest version of L_RNA_scaffolder, including the Documentation, is available for download.

Case 1: Scaffolding zebrafish genome

The final zebrafish assembly files from our manuscript can be downloaded below:
zebrafish complete genome of Zv_9: Zv_9.tar.gz;
zebrafish initial genome contigs: zebrafish_contig.tar.gz;
zebrafish genome scaffolded with our method: zebrafish_L_RNA_scaffolder.tar.gz;

Case 2: Scaffolding human genome

2.1 What data sets were used in the comparison?
Four different whole-genome mate-pair libraries were used:
Library 1: 2 kb mate-pair library
Fastq read file 1
Fastq read file 2
Library 2: 5 kb mate-pair library
Fastq read file 1
Fastq read file 2
Library 3: 10 kb mate-pair library
Fastq read file 1
Fastq read file 2
Library 4: 35 kb mate-pair library
Fastq read file 1
Fastq read file 2

2.2 What genome scaffolders were used in the comparison?
We evaluated the performance of the following genome scaffolders on our data sets:
1. SSPACE
2. SOPRA
3. SOAPdenovo
4. MIP scaffolder
5. Opera

2.3 Final assembly files in the comparison

human complete genome of Hg19: hg19.tar.gz;
human initial genome contigs: human_contig.tar.gz;
human genome scaffolded with our method: human_L_RNA_scaffolder.tar.gz;
human genome scaffolded using SSPACE with only 2kb library: SSPACE2K.scaf;
human genome scaffolded using SSPACE with only 5kb library: SSPACE5K.scaf;
human genome scaffolded using SSPACE with only 10kb library: SSPACE10K.scaf;
human genome scaffolded using SSPACE with only 35kb library: SSPACE35K.scaf;
human genome scaffolded using SOPRA with only 2kb library: SOPRA2K.scaf;
human genome scaffolded using SOPRA with only 5kb library: SOPRA5K.scaf;
human genome scaffolded using SOPRA with only 10kb library: SOPRA10K.scaf;
human genome scaffolded using Opera with only 2kb library: Opera2K.scaf;
human genome scaffolded using Opera with only 5kb library: Opera5K.scaf;
human genome scaffolded using Opera with only 10kb library: Opera10K.scaf;
human genome scaffolded using Opera with only 35kb library: Opera35K.scaf;
human genome scaffolded using MIP scaffolder with only 2kb library: Mip2K.scaf;
human genome scaffolded using MIP scaffolder with only 5kb library: Mip5K.scaf;
human genome scaffolded using MIP scaffolder with only 10kb library: Mip10K.scaf;
human genome scaffolded using MIP scaffolder with only 35kb library: Mip35K.scaf;
human genome scaffolded using SOAPdenovo with only 2kb library: soapdenovo2K.scaf;
human genome scaffolded using SOAPdenovo with only 5kb library: soapdenovo5K.scaf;
human genome scaffolded using SOAPdenovo with only 10kb library: soapdenovo10K.scaf;
human genome scaffolded using SOAPdenovo with only 35kb library: soapdenovo35K.scaf;

2.4 Scaffolding human genomes using paired-end RNA-seq data
Two sets of paired-end RNA-seq data, SRR324684 and SRR324685, were downloaded from the NCBI SRA database.

We scaffolded the human genome contigs with Illumina RNA-seq data in two ways.

In the first approach, the FASTQ sequences were converted to FASTA format and then aligned to the human contigs using BLAT. The resulting alignments were passed to L_RNA_scaffolder for scaffolding. The scaffolding result is available here.
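The FASTQ-to-FASTA conversion in the first approach can be sketched as below. This is only an illustration, not the script used for the manuscript: the function name `fastq_to_fasta` and all file names are hypothetical, and the sketch assumes each FASTQ record occupies exactly four lines (no wrapped sequences).

```shell
# Convert four-line FASTQ records on stdin to FASTA on stdout.
# Assumption: every record is exactly 4 lines
# (header, sequence, "+" separator, qualities).
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # "@name" becomes ">name"
         NR % 4 == 2 { print }'                    # sequence line, unchanged
}

# Hypothetical usage (file names are placeholders):
#   fastq_to_fasta < reads_1.fastq > reads_1.fasta
# BLAT takes the database, the query and the output file as positional
# arguments, so the alignment step would look roughly like:
#   blat human_contig.fasta reads_1.fasta reads_1.psl
```

The line position within each record is used instead of matching on a leading "@", because quality strings can also begin with "@". Any additional BLAT options should be taken from the L_RNA_scaffolder Documentation rather than from this sketch.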

In the second approach, the RNA-seq data were first assembled using Trinity. The de novo assembled transcripts were aligned to the contigs using BLAT, and the alignment output guided the genome scaffolding. The scaffolding result is available here.

These results demonstrate that L_RNA_scaffolder is suitable for Illumina RNA-seq data.

2.5 Scaffolding human genomes using PacBio RNA-seq data
To demonstrate that our method is applicable to PacBio RNA-seq data, PacBio long RNA-seq reads (available here) from human brain cerebellum were used to scaffold the human genome with our method. The scaffolding result is available here.

This result indicates that our method is of practical use for genome scaffolding with new single-molecule sequencing technologies.

Case 3: Scaffolding pearl oyster genome
the pearl oyster initial genome: pearl_genome.tar.gz;
the pearl oyster genome scaffolded with our method: pearl_L_RNA_scaffolder.tar.gz;

If you would like to experiment further, we provide example data based on the C. elegans genome that you can try scaffolding. The example data include the PSL results generated by aligning 427,703 ESTs/mRNAs to the C. elegans contigs. After downloading the example data, install the L_RNA_scaffolder program into the example directory and type "sh L_RNA_scaffolder/L_RNA_scaffolder.sh -d L_RNA_scaffolder/ -i exapmle/C_elegans.psl -j exapmle/C_elegans.contig.fasta -e 20000". The scaffolding result is written to a file named "L_RNA_scaffolder.fasta".
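
After running the example command, one quick sanity check is to compare the number of sequences in the input contigs with the number in the result. The sketch below is only a suggestion: the helper name `count_fasta_records` is hypothetical, and it assumes the result file is plain FASTA with one ">" header per sequence.

```shell
# Count FASTA records (header lines starting with ">") in a file.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical usage; the paths are copied from the example command:
#   contigs=$(count_fasta_records exapmle/C_elegans.contig.fasta)
#   scaffolds=$(count_fasta_records L_RNA_scaffolder.fasta)
#   echo "contigs: $contigs, scaffolds: $scaffolds"
```

If the result file also contains the contigs that were not joined (an assumption, not something stated on this page), a lower record count than in the input suggests that contigs were merged into scaffolds.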

If you have any trouble using L_RNA_scaffolder, please first follow the steps in our Documentation before contacting us.
