Downloads

The latest version of PEP_scaffolder, including the Documentation, is available for download.

Case 1: Scaffolding human genome

1.1 Data Source

The human assembly files used in our manuscript can be downloaded below:
complete genome of hg38: hg38.tar.gz;
human initial genome contigs: contigs.tar.gz;

Different sources of proteins used in scaffolding include:
Library 1: human proteins from Swiss-Prot database
uniprot_sprot_human.fasta.gz
Library 2: human proteins from TrEMBL database
uniprot_trembl_human.fasta.gz
Library 3: mammalian proteins from Swiss-Prot database
uniprot_sprot_mammals.fasta.gz
Library 4: mammalian proteins from TrEMBL database
uniprot_trembl_mammals.fasta.gz
Library 5: rodent proteins from Swiss-Prot database
uniprot_sprot_rodents.fasta.gz
Library 6: rodent proteins from TrEMBL database
uniprot_trembl_rodents.fasta.gz

1.2 How to evaluate the accuracy of assembly?
Using the order and orientation of contigs on the hg38 assembly as the gold standard (available here), we counted the scaffolded links that were correct and calculated the proportion of correct links among all links.
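
The proportion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the evaluation script used in the manuscript: the record layouts (Placement, Link) are hypothetical, and a link is assumed to be correct when both contigs lie on the same chromosome and their relative order and orientations agree with the gold standard on either strand.

```python
from typing import Dict, List, Tuple

# Hypothetical record layouts; the real gold-standard and connection files may differ.
Placement = Tuple[str, int, str]  # (chromosome, rank along the chromosome, strand "+" or "-")
Link = Tuple[str, str, str, str]  # (contig_a, strand_a, contig_b, strand_b)


def flip(strand: str) -> str:
    """Return the opposite strand."""
    return "-" if strand == "+" else "+"


def is_correct(link: Link, gold: Dict[str, Placement]) -> bool:
    """Assumed criterion: same chromosome, consistent order and orientation."""
    a, strand_a, b, strand_b = link
    if a not in gold or b not in gold:
        return False
    chrom_a, rank_a, gold_a = gold[a]
    chrom_b, rank_b, gold_b = gold[b]
    if chrom_a != chrom_b or rank_a == rank_b:
        return False
    if rank_a < rank_b:
        # Link read along the forward strand of the reference.
        return strand_a == gold_a and strand_b == gold_b
    # Link read along the reverse strand: order and both orientations are flipped.
    return strand_a == flip(gold_a) and strand_b == flip(gold_b)


def link_accuracy(links: List[Link], gold: Dict[str, Placement]) -> float:
    """Proportion of correct links among all links (0.0 if there are none)."""
    if not links:
        return 0.0
    return sum(is_correct(link, gold) for link in links) / len(links)
```

Requiring only consistent order, rather than strict adjacency, is a design choice of this sketch; a stricter evaluation could additionally require rank_b to equal rank_a + 1.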

1.3 Which scaffolders are used in the comparison?
We compared the performance of PEP_scaffolder against the following genome scaffolders:
1. SWiPS
2. ESPRIT

1.4 Final assembly files in the comparison

human complete genome of hg38: hg38.tar.gz;
human initial genome contigs: human_contig.tar.gz;
human genome scaffolded with human Swiss-Prot proteins: ushc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human TrEMBL proteins: uthc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with mammal Swiss-Prot proteins: usmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with mammal TrEMBL proteins: utmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with rodent Swiss-Prot proteins: usrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with rodent TrEMBL proteins: utrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human and mammal Swiss-Prot proteins: ushmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human and mammal TrEMBL proteins: uthmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human, mammal and rodent Swiss-Prot proteins: ushmrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human, mammal and rodent TrEMBL proteins: uthmrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with human Swiss-Prot and TrEMBL proteins: usthc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with mammal Swiss-Prot and TrEMBL proteins: ustmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with rodent Swiss-Prot and TrEMBL proteins: ustrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with Swiss-Prot and TrEMBL proteins of human and mammal: usthmc_PEP_scaffolder.fasta; (PEP_scaffolder connections)
human genome scaffolded with Swiss-Prot and TrEMBL proteins of human, mammal and rodent: usthmrc_PEP_scaffolder.fasta; (PEP_scaffolder connections)

Case 2: Scaffolding fly genome
the fly genome contigs: fly contig fasta;
the fly Ensembl proteins: fly Ensembl protein;
the fly genome scaffolded using PEP_scaffolder: fly PEP_scaffolder assembly;
the psl alignment file of fly Ensembl proteins to fly contigs: psl file;
the contig connections generated by PEP_scaffolder: fly PEP_scaffolder connections;
the contig connections generated by ESPRIT: fly ESPRIT connections;
the contig connections generated by SWiPS: fly SWiPS connections;

If you would like to experiment further, download the fly genome contigs and the psl alignment file. After downloading the example data, install the PEP_scaffolder program into the example directory and run "sh PEP_scaffolder/PEP_scaffolder.sh -d PEP_scaffolder/ -i fly.psl -j fly_contig.fasta". The scaffolding result is written to a file named "PEP_scaffolder.fasta".
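
For scripted runs, the same command can be wrapped in a small Python helper. This is a minimal sketch, not part of PEP_scaffolder: build_command and run_scaffolder are hypothetical names, the meaning attached to each flag is inferred from the example values above, and the output is assumed to appear as "PEP_scaffolder.fasta" in the current working directory.

```python
import subprocess
from pathlib import Path


def build_command(install_dir: str, psl: str, contigs: str) -> list:
    """Assemble the command line quoted in the text above."""
    base = install_dir.rstrip("/")
    return [
        "sh", f"{base}/PEP_scaffolder.sh",
        "-d", f"{base}/",  # PEP_scaffolder directory (inferred from the example)
        "-i", psl,         # psl alignment of proteins to contigs
        "-j", contigs,     # contig FASTA file
    ]


def run_scaffolder(install_dir: str, psl: str, contigs: str) -> Path:
    """Check the inputs, run the scaffolder, and return the expected result path."""
    script = f"{install_dir.rstrip('/')}/PEP_scaffolder.sh"
    for required in (script, psl, contigs):
        if not Path(required).is_file():
            raise FileNotFoundError(f"missing input: {required}")
    # check=True raises CalledProcessError if the script exits with an error.
    subprocess.run(build_command(install_dir, psl, contigs), check=True)
    return Path("PEP_scaffolder.fasta")


# Example call with the fly data from this page:
# result = run_scaffolder("PEP_scaffolder", "fly.psl", "fly_contig.fasta")
```

Checking that the input files exist before launching the script gives a clearer error than a failure inside the shell script itself.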

If you have any trouble using PEP_scaffolder, please follow the steps in our Documentation before contacting us.
