- I have managed computing clusters for many years, so I thought there should be a Bash version.
- I coded this one this morning; ToxPy, ToxAnnotator and ToxShell are now all integrated tools for comparative genomics.
Toxshell for Comparative Genomics
This is for server side HPC rendering and comparison for ToxDB species
Gaurav Sablok
codeprog@icloud.com
toxshell analyzer
enter the gff1 file:ToxoDB-65_TgondiiME49.gff
enter the gff2 file:ToxoDB-66_TgondiiME49.gff
Preparing the analysis for the comparative genomics of ToxDB
The number of the protein coding genes in the first gff1 file are:
46
The number of the protein coding genes in the second gff2 file are:
41
The simplified coordinates of the protein_coding_genes are written to gff_1_simplified and gff_2_simplified
The ids of the genes which differ between the two annotations are in the file gff_compare_dissimilar_ids and their individual ids are present in gff_1_ids and gff_2_ids
The similar genes are written in the file: gff_compare_similar_ids
The shared ids from the first GFF are written to the gff_1_final_compare
The shared ids from the second GFF are written to the gff_2_final_compare
final_sorted_ids_start_end_difference
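
The counting, simplification and ID comparison steps reported above can be sketched roughly as follows. This is a minimal illustration, not ToxShell's own code. It assumes the GFF3 files mark genes with `protein_coding_gene` in column 3 and carry an `ID=` attribute in column 9. The function names (`simplify_gff`, `count_genes`, `compare_ids`) and the column layout of the simplified table are hypothetical; only the output file names are taken from the run above.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not ToxShell's own code): print one tab-separated row
# "id, seqid, start, end, strand" per protein_coding_gene feature of a GFF3 file.
simplify_gff() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        $0 !~ /^#/ && $3 == "protein_coding_gene" {
            id = ""
            n = split($9, attrs, ";")          # attributes are ";"-separated
            for (i = 1; i <= n; i++)
                if (attrs[i] ~ /^ID=/) { id = substr(attrs[i], 4); break }
            print id, $1, $4, $5, $7
        }' "$1"
}

# Number of protein_coding_gene rows in one GFF3 file.
count_genes() {
    simplify_gff "$1" | awk 'END { print NR }'
}

# Write the simplified tables, the per-file ID lists, and the shared and
# non-shared ID lists into the output directory given as the third argument.
compare_ids() {
    local gff1=$1 gff2=$2 outdir=$3
    simplify_gff "$gff1" > "$outdir/gff_1_simplified"
    simplify_gff "$gff2" > "$outdir/gff_2_simplified"
    cut -f1 "$outdir/gff_1_simplified" | LC_ALL=C sort -u > "$outdir/gff_1_ids"
    cut -f1 "$outdir/gff_2_simplified" | LC_ALL=C sort -u > "$outdir/gff_2_ids"
    # comm -12 keeps lines present in both sorted lists.
    LC_ALL=C comm -12 "$outdir/gff_1_ids" "$outdir/gff_2_ids" \
        > "$outdir/gff_compare_similar_ids"
    # comm -3 keeps lines unique to either list; strip its column-indent tab.
    LC_ALL=C comm -3 "$outdir/gff_1_ids" "$outdir/gff_2_ids" | tr -d '\t' \
        > "$outdir/gff_compare_dissimilar_ids"
}
```

With these definitions loaded, `count_genes ToxoDB-65_TgondiiME49.gff` would print the gene count for the first file, and `compare_ids ToxoDB-65_TgondiiME49.gff ToxoDB-66_TgondiiME49.gff .` would write the ID lists into the current directory. Both `sort` and `comm` run under `LC_ALL=C` so that they agree on the ordering.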
Final Files
The final files for the comparative analysis are:
1. Simplified GFF 1: gff_1_simplified containing the simplified version
2. Simplified GFF 2: gff_2_simplified containing the simplified version
3. Similar genes: gff_compare_similar_ids
4. Dissimilar genes: gff_compare_dissimilar_ids
5. Shared ids from the first GFF: gff_1_final_compare
6. Shared ids from the second GFF: gff_2_final_compare
7. Genes that differ in the first GFF: gff_1_final_differ
8. Genes that differ in the second GFF: gff_2_final_differ
9. Genes that differ in the first GFF and lie on the negative strand: gff_1_final_differ_negative_strand
10. Genes that differ in the first GFF and lie on the positive strand: gff_1_final_differ_positve_strand
11. Genes that differ in the second GFF and lie on the negative strand: gff_2_final_differ_negative_strand
12. Genes that differ in the second GFF and lie on the positive strand: gff_2_final_differ_positive_strand
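
One way the differing-gene and strand-specific files could be derived is sketched below. It rests on two assumptions that the text above does not confirm. First, that a gene "differs" when its ID is present in both annotations but its start or end coordinate has changed. Second, that the simplified tables are tab-separated with the columns id, seqid, start, end, strand. The function names `differing_rows` and `split_by_strand` are hypothetical and are not part of ToxShell.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print the rows of table $2 whose ID also occurs in
# table $1 but with a different start or end coordinate.
# Assumed columns (tab-separated): id, seqid, start, end, strand.
differing_rows() {
    awk -F'\t' '
        FILENAME == ARGV[1] { s[$1] = $3; e[$1] = $4; next }   # reference table
        ($1 in s) && (s[$1] != $3 || e[$1] != $4)               # changed coordinates
    ' "$1" "$2"
}

# Hypothetical helper: split table $1 by strand (column 5) into the
# positive-strand file $2 and the negative-strand file $3.
split_by_strand() {
    : > "$2"    # create or empty both outputs so they exist even with no rows
    : > "$3"
    awk -F'\t' -v plus="$2" -v minus="$3" '
        $5 == "+" { print > plus }
        $5 == "-" { print > minus }
    ' "$1"
}
```

Under those assumptions, `differing_rows gff_1_simplified gff_2_simplified > gff_2_final_differ` would list the changed genes as they appear in the second annotation, and swapping the two arguments would give the view from the first one. `split_by_strand` takes explicit output paths, so the file names listed above can be passed through unchanged.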