2 | 2023-11-13T11:12:18.883Z | drVM | Assembly Tool | drVM: a new tool for efficient genome assembly of known eukaryotic viruses from metagenomes | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5466706/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
3 | 2023-11-13T11:12:18.883Z | GenomeDetective Virus | Assembly Tool | | | | https://www.genomedetective.com/app/typingtool/virus/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
4 | 2023-11-13T11:12:18.883Z | IVA | Assembly Tool | de-novo assembly, needs to be incorporated in pipeline with host sequence removal, e.g., shiver | | | http://sanger-pathogens.github.io/iva/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
5 | 2023-11-13T11:12:18.883Z | IVAR | Assembly Tool | Designed for mapping-based "assembly" of amplicon sequencing data | | https://github.com/andersen-lab/ivar | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
6 | 2023-11-13T11:12:18.883Z | metaViralSpades | Assembly Tool | | | | https://academic.oup.com/bioinformatics/article-abstract/36/14/4126/5837667 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
7 | 2023-11-13T11:12:18.883Z | rnaViralSpades | Assembly Tool | | | | https://www.biorxiv.org/content/10.1101/2020.07.28.224584v1 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
8 | 2023-11-13T11:12:18.883Z | savage | Assembly Tool | | | https://bitbucket.org/jbaaijens/savage/src/master/ | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
9 | 2023-11-13T11:12:18.883Z | v-pipe | Assembly Tool | | | https://github.com/cbg-ethz/V-pipe/tree/ | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
10 | 2023-11-13T11:12:18.883Z | vicuna | Assembly Tool | | | | https://www.broadinstitute.org/viral-genomics/vicuna | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
11 | 2023-11-13T11:12:18.883Z | VIP | Assembly Tool | Phage VIrion Protein classification based on chaos game representation and Vision Transformer; Both | | | https://github.com/KennthShang/PhaVIP; https://github.com/keylabivdc/VIP/ | https://www.nature.com/articles/srep23774 | Rob Edwards' Viral Bioinfo Tools; Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
12 | 2023-11-13T11:12:18.883Z | viral-ngs | Assembly Tool | | | | https://viral-ngs.readthedocs.io/en/latest/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
13 | 2023-11-13T11:12:18.883Z | VirusTAP | Assembly Tool | WEBSERVER - No option to register | | | https://gph.niid.go.jp/virustap/system_in | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
14 | 2023-11-13T11:12:18.883Z | Choice of assembly software has a critical impact on virome characterisation | Benchmark | Phage assembly benchmark | | | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-019-0626-5/tables/1 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
15 | 2023-11-13T11:12:18.883Z | Evaluation of computational phage detection tools for metagenomic datasets | Bioinformatics | Phage detection in metagenomes tools benchmark | | | https://www.frontiersin.org/articles/10.3389/fmicb.2023.1078760/full | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
16 | 2023-11-13T11:12:18.883Z | MaGplotR | CRISPR | Virus | CRISPR Screens | | | https://github.com/alematia/MaGplotR | https://www.biorxiv.org/content/10.1101/2023.01.12.523725v1 | Rob Edwards' Viral Bioinfo Tools | | 20230112 | | | | | | |
17 | 2023-11-13T11:12:18.883Z | SpacePHARER | CRISPR | Phage | CRISPR Spacer Phage-Host Pair Finder | | | spacepharer.soedinglab.org | https://www.biorxiv.org/content/10.1101/2020.05.15.090266v1 | Rob Edwards' Viral Bioinfo Tools | | 20220906 | | | | | | |
18 | 2023-11-13T11:12:18.883Z | BVBRC | Cyberinfrastructure-supported virus tools | Both | Website | | | | http://bvbrc.org | Rob Edwards' Viral Bioinfo Tools | | Actively developed | | | | | | |
19 | 2023-11-13T11:12:18.883Z | iVirus 2.0 | Cyberinfrastructure-supported virus tools | Phage | integrating iVirus apps on CyVerse and KBase | | | CyVerse (http://tinyurl.com/4ndkt4n2), KBase (https://kbase.us/applist/) | https://www.nature.com/articles/s43705-021-00083-3 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
20 | 2023-11-13T11:12:18.883Z | PhageAI | Data repository, life cycle, taxonomy and proteins structure prediction, phage similarity, phage annotation | Phage | NLP, ML | | | | https://www.biorxiv.org/content/10.1101/2020.07.11.198606v1
https://app.phage.ai/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
21 | 2024-01-05T17:21:26.566Z | DePP | Depolymerase finder | Phage | | https://timskvortsov.github.io/WebDePP/ | https://doi.org/10.1186%2Fs12859-023-05341-w | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
22 | 2023-11-13T11:12:18.883Z | PhageDPO | Depolymerase finder | Phage | SVM and ANN | | bit.ly/phagedpo | | Rob Edwards' Viral Bioinfo Tools | | 2022 | | | | | | |
23 | 2023-11-13T11:12:18.883Z | OLGenie | Diversity and selection analysis | Both | Program for estimating dN/dS in overlapping genes (OLGs); inferring purifying selection in alternative reading frames; intrahost; within-host; evolution; selection; nucleotide diversity | | | https://github.com/chasewnelson/OLGenie | https://academic.oup.com/mbe/article/37/8/2440/5815567 | Rob Edwards' Viral Bioinfo Tools | | 20221202 | | | | | | |
24 | 2023-11-13T11:12:18.883Z | SNPGenie | Diversity and selection analysis | Both | Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data; intrahost; within-host; evolution; selection; nucleotide diversity | | | https://github.com/chasewnelson/snpgenie | https://academic.oup.com/bioinformatics/article/31/22/3709/241742 | Rob Edwards' Viral Bioinfo Tools | | 20230822 | | | | | | |
25 | 2023-11-13T11:12:18.883Z | VCFgenie | Diversity and selection analysis | Both | Program for reproducibly filtering VCF files and eliminating false positive variants; intrahost; within-host; evolution; selection; nucleotide diversity | In revision | | https://github.com/chasewnelson/VCFgenie | | Rob Edwards' Viral Bioinfo Tools | | 20220825 | | | | | | |
26 | 2023-11-13T11:12:18.883Z | VIPERA | Evolutionary analysis | Virus | Phylogenetic and population genetics-based analysis of intra-patient SARS-CoV-2 | | | https://github.com/PathoGenOmics-Lab/VIPERA | https://doi.org/10.1101/2023.10.24.561010 | Rob Edwards' Viral Bioinfo Tools | | 20231108 | | | | | | |
27 | 2023-11-13T11:12:18.883Z | MetaCerberus | Genome and virome annotation | Both | HMM-based with Ray MPP | | | https://github.com/raw-lab/MetaCerberus | https://www.biorxiv.org/content/10.1101/2023.08.10.552700v1 | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
28 | 2023-11-13T11:12:18.883Z | DRAMv | Genome annotation | Phage | Distilling and refining annotation of metabolism | | | https://github.com/WrightonLabCSU/DRAM | https://academic.oup.com/nar/article/48/16/8883/5884738 | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
29 | 2023-11-13T11:12:18.883Z | PhANNs | Genome annotation | Phage | | | PhANNs | https://journals.plos.org/ploscompbiol/article/authors?id=10.1371/journal.pcbi.1007845 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
30 | 2023-11-13T11:12:18.883Z | Pharokka | Genome annotation | Phage | | | https://github.com/gbouras13/pharokka | https://doi.org/10.1093/bioinformatics/btac776 | Rob Edwards' Viral Bioinfo Tools | | 20230124 | | | | | | |
31 | 2023-11-13T11:12:18.883Z | coronaSPAdes | Genome assembly | Both | HMM-synteny guided assembly (works for all viruses) | | | https://github.com/ablab/spades/tree/metaviral_publication | https://academic.oup.com/bioinformatics/article/38/1/1/6354349 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
32 | 2023-11-13T11:12:18.883Z | metaviralSPAdes | Genome assembly | Both | MetaviralSPAdes: assembly of viruses from metagenomic data | Bioinformatics | Oxford Academic | | https://github.com/ablab/spades/tree/metaviral_publication | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
33 | 2023-11-13T11:12:18.883Z | multiPhATE | Genome comparison | Phage | | https://github.com/carolzhou/multiPhATE | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
34 | 2023-11-13T11:12:18.883Z | PhageClouds | Genome comparison | Phage | network graphs | | | | https://doi.org/10.1089/phage.2021.0008 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
35 | 2023-11-13T11:12:18.883Z | CheckV | Genome completeness | Both | CheckV: assessing the quality of metagenome-assembled viral genomes | | https://bitbucket.org/berkeleylab/checkv/src/master/ | https://www.biorxiv.org/content/10.1101/2020.05.06.081778v1 | Rob Edwards' Viral Bioinfo Tools | | 20220906 | | | | | | |
36 | 2023-11-13T11:12:18.883Z | viralComplete | Genome completeness | Both | | https://github.com/ablab/viralComplete/ | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
37 | 2023-11-13T11:12:18.883Z | viralVerify | Genome completeness | Both | | https://github.com/ablab/viralVerify/ | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
38 | 2023-11-13T11:12:18.883Z | BacteriophageHostPrediction | Host prediction | Phage | | | https://github.com/dimiboeckaerts/BacteriophageHostPrediction | https://www.nature.com/articles/s41598-021-81063-4 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
39 | 2023-11-13T11:12:18.883Z | CHERRY | Host prediction | Phage | | | https://github.com/KennthShang/CHERRY | https://academic.oup.com/bib/article/23/5/bbac182/6589865 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
40 | 2023-11-13T11:12:18.883Z | CrisprOpenDB | Host prediction | Phage | | | https://github.com/edzuf/CrisprOpenDB | https://doi.org/10.1093/nar/gkab133 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
41 | 2023-11-13T11:12:18.883Z | DeePaC | Host prediction | Both | CNN, ResNet, Shapley values (interpretability) | | | | https://academic.oup.com/nargab/article/3/1/lqab004/6125551,
https://academic.oup.com/bib/article/22/6/bbab269/6326527,
https://academic.oup.com/bioinformatics/article/38/Supplement_2/ii168/6702016
https://gitlab.com/dacs-hpi/deepac | Rob Edwards' Viral Bioinfo Tools | | 20221216 | | | | | | |
42 | 2023-11-13T11:12:18.883Z | DeePaC-Live | Host prediction | Both | ResNet | | | | https://academic.oup.com/bib/article/22/6/bbab269/6326527
https://gitlab.com/dacs-hpi/deepac-live | Rob Edwards' Viral Bioinfo Tools | | 20210123 | | | | | | |
43 | 2023-11-13T11:12:18.883Z | DeepHost | Host prediction | Phage | CNN |
**Description:**
DeepHost is a phage host prediction tool. | | https://github.com/deepomicslab/DeepHost | https://academic.oup.com/bib/article-abstract/23/1/bbab385/6374063?redirectedFrom=fulltext | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | | 20220804 | | | | | | |
44 | 2023-11-13T11:12:18.883Z | HostG | Host prediction | Phage | GCN | | | https://github.com/KennthShang/HostG | https://bmcbiol.biomedcentral.com/articles/10.1186/s12915-021-01180-4 | Rob Edwards' Viral Bioinfo Tools | | 20220316 | | | | | | |
45 | 2023-11-13T11:12:18.883Z | HostPhinder | Host prediction | Phage | k-mers | | | https://github.com/julvi/HostPhinder | https://pubmed.ncbi.nlm.nih.gov/27153081/ | Rob Edwards' Viral Bioinfo Tools | | 20200902 | | | | | | |
46 | 2023-11-13T11:12:18.883Z | ILMF-VH | Host prediction | Phage | | | https://github.com/liudan111/ILMF-VH | https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3082-0 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
47 | 2023-11-13T11:12:18.883Z | iPHoP | Host prediction | Phage | | | https://bitbucket.org/srouxjgi/iphop | https://www.biorxiv.org/content/10.1101/2022.07.28.501908v1.abstract | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
48 | 2023-11-13T11:12:18.883Z | MVP | Host prediction | Both | | | | https://academic.oup.com/nar/article/46/D1/D700/4643372?login=true
http://mvp.medgenius.info/home | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
49 | 2023-11-13T11:12:18.883Z | PHERI | Host prediction | Phage | PHERI | | https://github.com/andynet/pheri | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
50 | 2023-11-13T11:12:18.883Z | PHIAF | Host prediction | Phage | GAN | | | https://github.com/BioMedicalBigDataMiningLab/PHIAF | https://academic.oup.com/bib/article-abstract/23/1/bbab348/6362109 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
51 | 2023-11-13T11:12:18.883Z | PHISDetector | Host prediction | Phage | | | http://www.microbiome-bigdata.com/PHISDetector/index/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
52 | 2023-11-13T11:12:18.883Z | PHIST | Host prediction | Phage | k-mers | | | https://github.com/refresh-bio/phist | https://academic.oup.com/bioinformatics/article/38/5/1447/6460800 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
53 | 2023-11-13T11:12:18.883Z | PHP | Host prediction | Phage | | https://github.com/congyulu-bioinfo/PHP | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
54 | 2023-11-13T11:12:18.883Z | PredPHI | Host prediction | Phage | | https://github.com/xialab-ahu/PredPHI | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
55 | 2023-11-13T11:12:18.883Z | RaFaH | Host prediction | Phage | | | | https://www.sciencedirect.com/science/article/pii/S2666389921001008
https://sourceforge.net/projects/rafah/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
56 | 2023-11-13T11:12:18.883Z | vHulk | Host prediction | Phage |
**Description:**
**Phage Host Prediction using high level features and neural networks**
Metagenomics and sequencing techniques have greatly improved in the last five years and, as a consequence, the amount of data from microbial communities is enormous. An important part of the microbial community are phages, which have their own ecological roles in the environment. Besides that, they have also been given a possible clinically relevant role as terminators of multidrug-resistant bacterial infections. A lot of basic research still needs to be done in the phage therapy field, and part of this research involves gathering knowledge about new phages present in the environment as well as about their relationships with clinically relevant bacterial pathogens.
With this scenario in mind, we have developed vHULK, a user-friendly tool for predicting phage hosts given their complete or partial genome in FASTA format. Our tool outputs an ensemble prediction at the genus or species level based on the scores of four different neural network models. Each model was trained with more than 4,000 genomes whose phage-host relationship was known. vHULK also outputs a measure of entropy for each final prediction, which we have demonstrated to be correlated with the prediction's accuracy. The user might understand this value as additional information on how certain vHULK is about a particular prediction. We also suspect that phages with higher entropy values may have a broad host range, but that hypothesis remains to be tested. Accuracy on test datasets was >99% for predictions at the genus level and >98% at the species level. vHULK currently supports predictions for 52 different prokaryotic host species and 61 different genera. | | https://github.com/LaboratorioBioinformatica/vHULK | https://www.biorxiv.org/content/10.1101/2020.12.06.413476v1
https://www.biorxiv.org/content/10.1101/2020.12.06.413476v1.full | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
57 | 2023-11-13T11:12:18.883Z | VIDHOP | Host prediction | Both | Deep learning | | | https://github.com/flomock/vidhop | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7454304/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
58 | 2023-11-13T11:12:18.883Z | VirHostMatcher | Host prediction | Phage | oligonucleotide frequency based distance and dissimilarity measures | | | https://github.com/jessieren/VirHostMatcher | https://pubmed.ncbi.nlm.nih.gov/27899557/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
59 | 2023-11-13T11:12:18.883Z | VirHostMatcher-Net | Host prediction | Virus |
**Description:**
Metagenomic sequencing has greatly enhanced the discovery of viral genomic sequences; however, it remains challenging to identify the host(s) of these new viruses. We developed VirHostMatcher-Net, a flexible, network-based, Markov random field framework for predicting virus–prokaryote interactions using multiple, integrated features: CRISPR sequences and alignment-free similarity measures (s2* and WIsH). Evaluation of this method on a benchmark set of 1462 known virus–prokaryote pairs yielded host prediction accuracy of 59% and 86% at the genus and phylum levels, representing 16–27% and 6–10% improvement, respectively, over previous single-feature prediction approaches. We applied our host prediction tool to crAssphage, a human gut phage, and two metagenomic virus datasets: marine viruses and viral contigs recovered from globally distributed, diverse habitats. Host predictions were frequently consistent with those of previous studies, but more importantly, this new tool made many more confident predictions than previous tools, up to nearly 3-fold more (n > 27 000), greatly expanding the diversity of known virus–host interactions. | | https://github.com/WeiliWw/VirHostMatcher-Net | https://academic.oup.com/nargab/article/2/2/lqaa044/5861484 | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | https://trello.com/1/cards/61948b43aa653c636dd10832/attachments/61ba828f6d63dd22924262d0/download/image.png | | | | | | | |
60 | 2023-11-13T11:12:18.883Z | VirMatcher | Host prediction | Phage | Leveraging multiple methods and assigning a confidence score | | | https://bitbucket.org/MAVERICLab/virmatcher/src/master/ | https://www.cell.com/cell-host-microbe/fulltext/S1931-3128(20)30456-X | Rob Edwards' Viral Bioinfo Tools | | 20220429 | | | | | | |
61 | 2023-11-13T11:12:18.883Z | Virus Host DB | Host prediction | Both | | | | https://pubmed.ncbi.nlm.nih.gov/26938550/
https://www.genome.jp/virushostdb/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
62 | 2023-11-13T11:12:18.883Z | Virus Host Predict | Host prediction | Both | | https://github.com/youngfran/virus_host_predict | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
63 | 2023-11-13T11:12:18.883Z | WIsH | Host prediction | Phage | | | https://github.com/soedinglab/WIsH | https://academic.oup.com/bioinformatics/article/33/19/3113/3964377 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
64 | 2023-11-13T11:12:18.883Z | ReadItAndKeep | Host Removal Tool | | | https://github.com/GenomePathogenAnalysisService/read-it-and-keep | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
65 | 2023-11-13T11:12:18.883Z | shiver | Host Removal Tool | | | https://github.com/ChrisHIV/shiver | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
66 | 2023-11-13T11:12:18.883Z | DRAD | Identify Integrated Viruses | Phage | Dinucleotide Relative Abundance difference | | | Does not exist any more | https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0001193 | Rob Edwards' Viral Bioinfo Tools | | None | | | | | | |
67 | 2023-11-13T11:12:18.883Z | geNomad | Identify Integrated Viruses | Both | | https://github.com/apcamargo/genomad | | Rob Edwards' Viral Bioinfo Tools | | 20221015 | | | | | | |
68 | 2023-11-13T11:12:18.883Z | hafeZ | Identify Integrated Viruses | Phage | Readmapping | | | https://github.com/Chrisjrt/hafeZ | https://www.biorxiv.org/content/10.1101/2021.07.21.453177v1 | Rob Edwards' Viral Bioinfo Tools | | 20211004 | | | | | | |
69 | 2023-11-13T11:12:18.883Z | LysoPhD | Identify Integrated Viruses | Phage | | | No code available | https://ieeexplore.ieee.org/document/8983280 | Rob Edwards' Viral Bioinfo Tools | | None | | | | | | |
70 | 2023-11-13T11:12:18.883Z | phage_finder | Identify Integrated Viruses | Phage | | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1635311/
http://phage-finder.sourceforge.net/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
71 | 2023-11-13T11:12:18.883Z | phageboost | Identify Integrated Viruses | Phage | boost ml | | | | https://www.biorxiv.org/content/10.1101/2020.08.09.243022v1
http://phageboost.ml | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
72 | 2023-11-13T11:12:18.883Z | PhageWeb | Identify Integrated Viruses | Phage | | | | https://www.frontiersin.org/articles/10.3389/fgene.2018.00644/full
http://computationalbiology.ufpa.br/phageweb/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
73 | 2023-11-13T11:12:18.883Z | PHASTER | Identify Integrated Viruses | Phage |
**Description:**
PHASTER (PHAge Search Tool Enhanced Release) is a significant upgrade to the popular PHAST web server for the rapid identification and annotation of prophage sequences within bacterial genomes and plasmids. While the steps in the phage identification pipeline in PHASTER remain largely the same as in the original PHAST, numerous software improvements and significant hardware enhancements have now made PHASTER faster, more efficient, more visually appealing and much more user friendly. In particular, PHASTER is now 4.3X faster than PHAST when analyzing a typical bacterial genome. More specifically, software optimizations have made the backend of PHASTER 2.7X faster than PHAST. Likewise, the addition of more than 120 CPUs to the PHASTER compute cluster has greatly reduced processing times. PHASTER can now process a typical bacterial genome in 3 minutes from the raw sequence alone, or in 1.5 minutes when given a pre-annotated GenBank file. A number of other optimizations have been implemented, including automated algorithms to reduce the size and redundancy of PHASTER’s databases, improvements in handling multiple (metagenomic) queries and high user traffic, and the ability to perform automated look-ups against >14,000 previously PHAST/PHASTER annotated bacterial genomes (which can lead to complete phage annotations in seconds as opposed to minutes). PHASTER’s web interface has also been entirely rewritten. A new graphical genome browser has been added, gene/genome visualization tools have been improved, and the graphical interface is now more modern, robust, and user-friendly. | | | https://pubmed.ncbi.nlm.nih.gov/27141966/
https://phaster.ca/ | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | | | | | | | | |
74 | 2023-11-13T11:12:18.883Z | Phigaro | Identify Integrated Viruses | Phage | Phigaro: high throughput prophage sequence annotation | | https://github.com/bobeobibo/phigaro | https://www.biorxiv.org/content/10.1101/598243v1 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
75 | 2023-11-13T11:12:18.883Z | PhiSpy | Identify Integrated Viruses | Phage | PhiSpy: a novel algorithm for finding prophages in bacterial genomes that combines similarity- and composition-based strategies | | https://github.com/linsalrob/PhiSpy | | Rob Edwards' Viral Bioinfo Tools | | 20220202 | | | | | | |
76 | 2023-11-13T11:12:18.883Z | Prophage Hunter | Identify Integrated Viruses | Phage | logistic regression | | | | https://academic.oup.com/nar/article/47/W1/W74/5494712
https://pro-hunter.bgi.com/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
77 | 2023-11-13T11:12:18.883Z | Prophet | Identify Integrated Viruses | Phage | | | https://github.com/jaumlrc/ProphET | https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0223364 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
78 | 2023-11-13T11:12:18.883Z | Prophinder | Identify Integrated Viruses | Phage | | | | https://academic.oup.com/bioinformatics/article/24/6/863/194494
http://aclame.ulb.ac.be/Tools/Prophinder/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
79 | 2023-11-13T11:12:18.883Z | VAPiD | Identify Integrated Viruses | Virus | | | https://github.com/rcs333/VAPiD | https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-2606-y | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
80 | 2023-11-13T11:12:18.883Z | viralintegration | Identify Integrated Viruses | Virus | Nextflow pipeline | | https://github.com/nf-core/viralintegration | | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
81 | 2023-11-13T11:12:18.883Z | BACPHLIP | Lifestyle classification | Phage | Random Forest classifier |
**Description:**
Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of the previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the classifier is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model. | | https://github.com/adamhockenberry/bacphlip
https://github.com/adamhockenberry/bacphlip-model-dev | https://pubmed.ncbi.nlm.nih.gov/33996289/ | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | | 20210128 | | | | | | |
82 | 2023-11-13T11:12:18.883Z | PHACTS | Lifestyle classification | Phage |
**Description:**
PHACTS-0.3.tar.gz
**Abstract**
*Motivation*: Bacteriophages have two distinct lifestyles: virulent and temperate. The virulent lifestyle has many implications for phage therapy, genomics and microbiology. Determining which lifestyle a newly sequenced phage falls into is currently determined using standard culturing techniques. Such laboratory work is not only costly and time consuming, but also cannot be used on phage genomes constructed from environmental sequencing. Therefore, a computational method that utilizes the sequence data of phage genomes is needed.
*Results*: Phage Classification Tool Set (PHACTS) utilizes a novel similarity algorithm and a supervised Random Forest classifier to make a prediction whether the lifestyle of a phage, described by its proteome, is virulent or temperate. The similarity algorithm creates a training set from phages with known lifestyles and along with the lifestyle annotation, trains a Random Forest to classify the lifestyle of a phage. PHACTS predictions are shown to have a 99% precision rate.
*Availability and implementation*: PHACTS was implemented in the PERL programming language and utilizes the FASTA program (Pearson and Lipman, 1988) and the R programming language library 'Random Forest' (Liaw and Weiner, 2010). The PHACTS software is open source and is available as a downloadable stand-alone version or can be accessed online as a user-friendly web interface. The source code, help files and online version are available at http://www.phantome.org/PHACTS/. | | | https://pubmed.ncbi.nlm.nih.gov/22238260/
https://edwards.sdsu.edu/PHACTS/
https://edwards.sdsu.edu/PHACTS/PHACTS-0.3.tar.gz
http://www.phantome.org/PHACTS/ | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | | | | | | | | |
83 | 2023-11-13T11:12:18.883Z | ViralMSA | Multiple Sequence Alignment | Virus | Python script that wraps around read mappers (e.g. Minimap2) | | | https://github.com/niemasd/ViralMSA | https://doi.org/10.1093/bioinformatics/btaa743 | Rob Edwards' Viral Bioinfo Tools | | Actively developed | | | | | | |
84 | 2023-11-13T11:12:18.883Z | Phanotate | Phage genes | Phage |
**Description:**
PHANOTATE is a tool to annotate phage genomes. It assumes that non-coding bases in a phage genome are disadvantageous, and then populates a weighted graph to find the optimal path through the six frames of the DNA, where open reading frames are beneficial paths and gaps and overlaps are penalized paths. | | https://github.com/deprekate/PHANOTATE | https://academic.oup.com/bioinformatics/article/35/22/4537/5480131 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
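As a rough illustration of the idea in the PHANOTATE description above (ORFs as rewarded edges, gaps and overlaps as penalized edges, and a best path through the resulting weighted graph), here is a toy Python sketch. It is not PHANOTATE's code: the function name `best_orf_path`, the `gap_penalty` and `overlap_penalty` parameters, and all score values are assumptions made for illustration, and strands and reading frames are ignored entirely.

```python
from typing import List, Tuple

# Hypothetical ORF record: (start, end, score), half-open coordinates.
# A higher score means the ORF looks more gene-like (values are made up).
Orf = Tuple[int, int, float]


def best_orf_path(
    orfs: List[Orf],
    genome_len: int,
    gap_penalty: float = 0.05,     # assumed cost per uncovered base
    overlap_penalty: float = 0.2,  # assumed cost per doubly covered base
) -> Tuple[float, List[Orf]]:
    """Return (cost, chosen ORFs) for the cheapest path across the genome."""
    orfs = sorted(orfs, key=lambda o: (o[0], o[1]))

    def connect(prev_end: int, next_start: int) -> float:
        # Cost of the edge joining two consecutive path elements:
        # a gap if they do not touch, an overlap if they intersect.
        d = next_start - prev_end
        return d * gap_penalty if d >= 0 else -d * overlap_penalty

    n = len(orfs)
    cost = [0.0] * n   # cost[i]: cheapest path that ends with ORF i
    back = [-1] * n    # back[i]: previous ORF on that path (-1 = genome start)
    for i, (start, end, score) in enumerate(orfs):
        best, prev = connect(0, start), -1
        for j in range(i):
            prev_end = orfs[j][1]
            if prev_end >= end:
                continue  # require forward progress along the genome
            cand = cost[j] + connect(prev_end, start)
            if cand < best:
                best, prev = cand, j
        cost[i] = best - score  # an ORF is a beneficial (negative-cost) edge
        back[i] = prev

    # Close the path at the genome end; an empty path is one long gap.
    total, last = connect(0, genome_len), -1
    for i in range(n):
        cand = cost[i] + connect(orfs[i][1], genome_len)
        if cand < total:
            total, last = cand, i

    path: List[Orf] = []
    while last != -1:
        path.append(orfs[last])
        last = back[last]
    return total, path[::-1]
```

With the made-up input `[(0, 300, 10.0), (290, 600, 8.0), (100, 250, 1.0), (650, 1000, 9.0)]` on a 1000-base genome, the sketch keeps the three strong ORFs and drops the weak one nested in the first, because a short overlap costs less than leaving the region uncovered. The real tool works on six frames and uses its own weighting scheme, so treat this only as a picture of the graph formulation.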
85 | 2023-11-13T11:12:18.883Z | PHROGs | Phage genes | Phage | | | | https://academic.oup.com/nargab/article/3/3/lqab067/6342220 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
86 | 2023-11-13T11:12:18.883Z | PHRED | Phage receptors | Phage | | | No longer available | https://academic.oup.com/femsle/article/363/4/fnw002/1845417 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
87 | 2023-11-13T11:12:18.883Z | PlaqueSizeTool | Plaque size calculation | Phage | Based on the optimized Computer Vision library | | | https://github.com/ellinium/plaque_size_tool | https://www.sciencedirect.com/science/article/pii/S004268222100115X?via%3Dihub | Rob Edwards' Viral Bioinfo Tools | | 2022 | | | | | | |
88 | 2023-11-13T11:12:18.883Z | PlaqueSizeTool (colab version) | Plaque size calculation | Phage | Based on the optimized Computer Vision library | | | | https://www.sciencedirect.com/science/article/pii/S004268222100115X?via%3Dihub
https://colab.research.google.com/drive/1HJe8V26l7n82zX8vJ7bO5C8-xrs_aWuq?usp=sharing | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
89 | 2023-11-13T11:12:18.883Z | PhageTerm | Predicting phage packaging mechanism | Phage | Read mapping | | | | https://www.nature.com/articles/s41598-017-07910-5
https://gitlab.pasteur.fr/vlegrand/ptv/-/releases | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
90 | 2023-11-13T11:12:18.883Z | Virus-Host Interaction Predictor (VHIP) | Prediction | G. Eric Bastien and colleagues have developed a [machine learning model called Virus-Host Interaction Predictor (VHIP)](https://www.biorxiv.org/content/10.1101/2023.11.03.565433v1) to predict virus-host interactions and reconstruct complex virus-host networks in natural systems. | G. Eric Bastien and others | | https://www.biorxiv.org/content/10.1101/2023.11.03.565433v1 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
91 | 2023-11-13T11:12:18.883Z | PhagePromoter | Promoters | Phage | artificial neural network (ANN), support vector machines (SVM) | | | https://github.com/martaS95/PhagePromoter | https://academic.oup.com/bioinformatics/article/35/24/5301/5540317 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
92 | 2023-11-13T11:12:18.883Z | DeepVHPPI | Protein:Protein Interactions | Virus | | | https://github.com/QData/DeepVHPPI | https://dl.acm.org/doi/abs/10.1145/3459930.3469527 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
93 | 2023-11-13T11:12:18.883Z | PhageRBPdetect | RBP | Phage | HMMs & machine learning | | | | https://www.mdpi.com/1999-4915/14/6/1329 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
94 | 2023-11-13T11:12:18.883Z | EVBC Virus Bioinformatics Tools | Resource | A collection of useful tools in virus bioinformatics curated by the European Virus Bioinformatics Center. Please note that the EVBC does not maintain the tools | EVBC | | https://evirusbioinfc.notion.site/evirusbioinfc/18e21bc49827484b8a2f84463cb40b8d?v=92e7eb6703be4720abf17a901bc9a947 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
95 | 2023-11-13T11:12:18.883Z | MGE detection tools | Resource | A collection of bacteria/virus tools | | | https://docs.google.com/spreadsheets/d/1dL5o524IX_-hJB6iYV1FB4QrK_U5KgFcfM4rZDZV_Dw/edit#gid=0 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
96 | 2024-01-23T14:08:28.106Z | Phage Kitchen | Resource | Comparison and categorization of MANY phage bioinformatics tools | Nouri Ben Zakour | https://github.com/nbenzakour/phage-kitchen | | Nouri Ben Zakour | | | | | | | | |
97 | 2024-01-23T14:08:28.106Z | Phage prediction tools | Resource | GitHub repo accompanying paper: "Gauge your phage: benchmarking of bacteriophage identification tools in metagenomic sequencing data" by Siu Fung Stanley Ho, Nicole E. Wheeler, Andrew D. Millard & Willem van Schaik | | https://github.com/sxh1136/Phage_tools | https://doi.org/10.1186/s40168-023-01533-x | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
98 | 2023-11-13T11:12:18.883Z | Rob Edwards' Viral Bioinformatics Tools | Resource | Periodically updated open spreadsheet of bioinformatics tools; owned by Rob Edwards | Rob Edwards | | https://docs.google.com/spreadsheets/d/1ClNgip08olKK-oBMMlPHBwIcilqSxsan8MEaYphUei4/edit#gid=1636291468 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
99 | 2023-11-13T11:12:18.883Z | TE Hub Repeat Databases | Resource | A list of databases for the storage of sequences and metadata associated with repetitive, mobile and selfish DNA | Tyler Elliott | | https://tehub.org/en/resources/repeat_databases | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
100 | 2023-11-13T11:12:18.883Z | Testing (5) Prophage finding tools | Resource | Comparison of five (text updated with 5th tool) prophage finding tools for bacterial genomics — Phispy, VirSorter, Phigaro, ProphET, PHASTER | | | https://nickp60.github.io/weird_one_offs/testing_3_prophage_finders/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
101 | 2023-11-13T11:12:18.883Z | VGEA | RNA viral assembly toolkit | Both | snakemake workflow | | | https://github.com/pauloluniyi/VGEA | https://peerj.com/articles/12129/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
102 | 2023-11-13T11:12:18.883Z | palmID | RNA Virus (RdRp) search tool | Virus | Website / R | | | | https://peerj.com/articles/14055/
https://serratus.io/palmid | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
103 | 2023-11-13T11:12:18.883Z | RdRp-scan | RNA Virus (RdRp) search tool | Both | search against the RdRp database | | | https://github.com/JustineCharon/RdRp-scan/ | https://academic.oup.com/ve/article/8/2/veac082/6679729?login=true | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
104 | 2023-11-13T11:12:18.883Z | rdrpsearch | RNA Virus (RdRp) search tool | Both | Iterative HMM search of viral RdRp | | | | https://www.science.org/doi/abs/10.1126/science.abm5847
https://zenodo.org/record/5731488#.Y-6yFXbMKUk | Rob Edwards' Viral Bioinfo Tools | | 20211127 | | | | | | |
105 | 2023-11-13T11:12:18.883Z | CHVD | Sequence Database | Both | | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8201803/
https://zenodo.org/record/4498884#.Y2Q9sHZBxD8 | Rob Edwards' Viral Bioinfo Tools | | 20210203 | | | | | | |
106 | 2023-11-13T11:12:18.883Z | Earth Virome | Sequence Database | Both | | | | https://www.nature.com/articles/nprot.2017.063
https://portal.nersc.gov/dna/microbial/prokpubs/EarthVirome_DP/ | Rob Edwards' Viral Bioinfo Tools | | 20151210 | | | | | | |
107 | 2023-11-13T11:12:18.883Z | GOV-RNA | Sequence Database | Both | RNA viruses from the Global Ocean | | | | https://www.science.org/doi/abs/10.1126/science.abm5847
https://datacommons.cyverse.org/browse/iplant/home/shared/iVirus/ZayedWainainaDominguez-Huerta_RNAevolution_Dec2021 | Rob Edwards' Viral Bioinfo Tools | | 20211206 | | | | | | |
108 | 2023-11-13T11:12:18.883Z | GOV2.0 | Sequence Database | Both | DNA viruses from the Global Ocean | | | | https://www.cell.com/cell/fulltext/S0092-8674(19)30341-1
https://datacommons.cyverse.org/browse/iplant/home/shared/iVirus/GOV2.0 | Rob Edwards' Viral Bioinfo Tools | | 20190424 | | | | | | |
109 | 2023-11-13T11:12:18.883Z | GPDB | Sequence Database | Both | | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7895897/?report=reader
http://ftp.ebi.ac.uk/pub/databases/metagenomics/genome_sets/gut_phage_database/ | Rob Edwards' Viral Bioinfo Tools | | 20201029 | | | | | | |
110 | 2023-11-13T11:12:18.883Z | GVD | Sequence Database | Both | | | | https://www.sciencedirect.com/science/article/pii/S193131282030456X
https://datacommons.cyverse.org/browse/iplant/home/shared/iVirus/Gregory_and_Zablocki_GVD_Jul2020 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
111 | 2023-11-13T11:12:18.883Z | KEGG Virus | Sequence Database | Both | | | https://www.genome.jp/kegg/genome/virus.html | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
112 | 2023-11-13T11:12:18.883Z | mMGE | Sequence Database | Both | mobile genetic element database | | | https://mai.fudan.edu.cn/mgedb/client/index.html#/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
113 | 2023-11-13T11:12:18.883Z | PhagesDB | Sequence Database | | | | https://phagesdb.org/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
114 | 2023-11-13T11:12:18.883Z | Viruses.String | Sequence Database | Both | | | http://viruses.string-db.org/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
115 | 2023-11-13T11:12:18.883Z | FAVITES | Simulate networks | Both | Simulate contact networks, transmission networks, phylogenies, and sequences | | | https://github.com/niemasd/FAVITES | https://doi.org/10.1093/bioinformatics/bty921 | Rob Edwards' Viral Bioinfo Tools | | 20221124 | | | | | | |
116 | 2023-11-13T11:12:18.883Z | FAVITES-Lite | Simulate networks | Both | Simulate contact networks, transmission networks, phylogenies, and sequences | TBD | | https://github.com/niemasd/FAVITES-Lite | | Rob Edwards' Viral Bioinfo Tools | | Actively developed | | | | | | |
117 | 2023-11-13T11:12:18.883Z | efam | Viral orthologous groups | Both | Consensus viral identification, network-based clustering, metaproteomics | | | | https://academic.oup.com/bioinformatics/article/37/22/4202/6300514
https://datacommons.cyverse.org/browse/iplant/home/shared/iVirus/Zayed_efam_2020.1 | Rob Edwards' Viral Bioinfo Tools | | 20210505 | | | | | | |
118 | 2023-11-13T11:12:18.883Z | pVOGs | Viral orthologous groups | Phage | | | | https://academic.oup.com/nar/article/45/D1/D491/2333930
http://dmk-brain.ecn.uiowa.edu/pVOGs/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
119 | 2023-11-13T11:12:18.883Z | VogDB | Viral orthologous groups | Both | | | http://vogdb.org/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
120 | 2023-11-13T11:12:18.883Z | VStrains | Viral strain reconstruction | Virus | | | https://github.com/metagentools/VStrains | https://www.biorxiv.org/content/10.1101/2022.10.21.513181v2 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
121 | 2023-11-13T11:12:18.883Z | vAMPirus | Virus amplicon sequencing | Both | Nextflow pipeline | | | https://github.com/Aveglia/vAMPirus | https://www.authorea.com/users/584435/articles/623635-vampirus-a-versatile-amplicon-processing-and-analysis-program-for-studying-viruses?commit=4bde44de2b3f3816288a47c0a72ec4075e6438cc | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
122 | 2023-11-13T11:12:18.883Z | COBRA | Virus genome improvement | Both | | | | https://www.biorxiv.org/content/10.1101/2023.05.30.542503v2.abstract | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
123 | 2023-11-30T14:44:02.217Z | Cenote-Taker2 | Virus identification in metagenomes | Both | | | https://github.com/mtisza1/Cenote-Taker2 | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7816666/pdf/veaa100.pdf | Rob Edwards' Viral Bioinfo Tools | | 20220719 | | | | | | |
124 | 2023-11-13T11:12:18.883Z | CoCoNet | Virus identification in metagenomes | Virus | Neural networks | | | https://github.com/Puumanamana/CoCoNet | https://academic.oup.com/bioinformatics/article/37/18/2803/6211156 | Rob Edwards' Viral Bioinfo Tools | | 20211022 | | | | | | |
125 | 2023-11-13T11:12:18.883Z | crassus | Virus identification in metagenomes | Phage | snakemake workflow | | https://github.com/dcarrillox/CrassUS | | Rob Edwards' Viral Bioinfo Tools | | 20220704 | | | | | | |
126 | 2023-11-13T11:12:18.883Z | DBSCAN-SWA | Virus identification in metagenomes | Phage | DBSCAN | | | https://github.com/HIT-ImmunologyLab/DBSCAN-SWA/ | https://www.frontiersin.org/articles/10.3389/fgene.2022.885048/full | Rob Edwards' Viral Bioinfo Tools | | 20221103 | | | | | | |
127 | 2023-11-13T11:12:18.883Z | DeepVirFinder | Virus identification in metagenomes | Both | neural network | Identifying viruses from metagenomic data by deep learning | | https://github.com/jessieren/DeepVirFinder | https://arxiv.org/pdf/1806.07810.pdf | Rob Edwards' Viral Bioinfo Tools | | 20221008 | | | | | | |
128 | 2023-11-13T11:12:18.883Z | DePhT | Virus identification in metagenomes | Phage | | | https://github.com/chg60/DEPhT | https://academic.oup.com/nar/article/50/13/e75/6572362 | Rob Edwards' Viral Bioinfo Tools | | 20220930 | | | | | | |
129 | 2023-11-13T11:12:18.883Z | FastViromeExplorer | Virus identification in metagenomes | Both | | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5768174/
https://code.vt.edu/saima5/FastViromeExplorer | Rob Edwards' Viral Bioinfo Tools | | 20180220 | | | | | | |
130 | 2023-11-13T11:12:18.883Z | GenomePeek | Virus identification in metagenomes | Phage | | | | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4476108/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
131 | 2023-11-13T11:12:18.883Z | hecatomb | Virus identification in metagenomes | Both |
**Description:**
A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes to enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences / contigs. | | https://github.com/shandley/hecatomb | https://www.biorxiv.org/content/10.1101/2022.05.15.492003v2
https://hecatomb.readthedocs.io/en/latest/ | Rob Edwards' Viral Bioinfo Tools | | 20220902 | | | | | | |
132 | 2023-11-13T11:12:18.883Z | HoloVir | Virus identification in metagenomes | Both | | | https://github.com/plaffy/HoloVir | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4899465/ | Rob Edwards' Viral Bioinfo Tools | | 20181113 | | | | | | |
133 | 2023-11-13T11:12:18.883Z | INHERIT | Virus identification in metagenomes | Phage | Embedding (BERT) | | | https://github.com/Celestial-Bai/INHERIT | https://academic.oup.com/bioinformatics/article/38/18/4264/6654586 | Rob Edwards' Viral Bioinfo Tools | | 20221024 | | | | | | |
134 | 2023-11-13T11:12:18.883Z | isling | Virus identification in metagenomes | Virus | Split read alignment | | | https://github.com/szsctt/intvi_other-tools | https://www.sciencedirect.com/science/article/pii/S0022283621006458 | Rob Edwards' Viral Bioinfo Tools | | 20210811 | | | | | | |
135 | 2023-11-13T11:12:18.883Z | Jaeger | Virus identification in metagenomes | Phage | | https://github.com/Yasas1994/Jaeger | | Rob Edwards' Viral Bioinfo Tools | | 20230210 | | | | | | |
136 | 2023-11-13T11:12:18.883Z | Jovian | Virus identification in metagenomes | Virus | | https://github.com/DennisSchmitz/Jovian | | Rob Edwards' Viral Bioinfo Tools | | 20210604 | | | | | | |
137 | 2023-11-30T14:44:01.213Z | LazyPipe | Virus identification in metagenomes | Both | | | | https://academic.oup.com/ve/article/6/2/veaa091/6017186?login=false
https://www.helsinki.fi/en/projects/lazypipe | Rob Edwards' Viral Bioinfo Tools | | 20200706 | | | | | | |
138 | 2023-11-13T11:12:18.883Z | MARVEL | Virus identification in metagenomes | Phage | random forest | MARVEL, a Tool for Prediction of Bacteriophage Sequences in Metagenomic Bins | | https://github.com/LaboratorioBioinformatica/MARVEL | https://www.frontiersin.org/articles/10.3389/fgene.2018.00304/full | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
139 | 2023-11-13T11:12:18.883Z | metaPhage | Virus identification in metagenomes | Both | pipeline | | | https://mattiapandolfovr.github.io/MetaPhage/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
140 | 2023-11-13T11:12:18.883Z | MetaPhinder | Virus identification in metagenomes | Both | MetaPhinder—Identifying Bacteriophage Sequences in Metagenomic Data Sets | | https://github.com/vanessajurtz/MetaPhinder | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5042410/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
141 | 2023-11-13T11:12:18.883Z | Phables | Virus identification in metagenomes | Phage | Flow decomposition on assembly graphs | | | https://github.com/Vini2/phables | https://biorxiv.org/cgi/content/short/2023.04.04.535632v1 | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
142 | 2023-11-13T11:12:18.883Z | Phage tools | Virus identification in metagenomes | Phage | | https://github.com/sxh1136/Phage_tools | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
143 | 2023-11-13T11:12:18.883Z | PHAMB | Virus identification in metagenomes | Phage | Random forest | | | https://github.com/RasmussenLab/phamb | https://www.nature.com/articles/s41467-022-28581-5 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
144 | 2023-11-13T11:12:18.883Z | phaMers | Virus identification in metagenomes | Phage | kmers + machine learning | | | https://github.com/jondeaton/PhaMers | https://doi.org/10.1002/adbi.201900108 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
145 | 2023-11-13T11:12:18.883Z | Phanta | Virus identification in metagenomes | Both | K-mer read based classification, snakemake workflow | | | https://github.com/bhattlab/phanta | https://www.biorxiv.org/content/10.1101/2022.08.05.502982v1.full | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
146 | 2023-11-13T11:12:18.883Z | PIGv | Virus identification in metagenomes | Giant virus | Metabat binning, k-mer scoring, marker genes | | https://github.com/BenMinch/PIGv | | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
147 | 2023-11-13T11:12:18.883Z | PPR-Meta | Virus identification in metagenomes | Phage | neural network - CNN | PPR-Meta: a tool for identifying phages and plasmids from metagenomic fragments using deep learning | | https://github.com/zhenchengfang/PPR-Meta | https://doi.org/10.1093/gigascience/giz066 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
148 | 2023-11-13T11:12:18.883Z | Prophage Tracer | Virus identification in metagenomes | Phage | Split read alignment | | | | https://academic.oup.com/nar/article/49/22/e128/6374144 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
149 | 2023-11-13T11:12:18.883Z | Seeker | Virus identification in metagenomes | Phage* | LSTM | Seeker: alignment-free identification of bacteriophage genomes by deep learning | | https://github.com/gussow/seeker | https://academic.oup.com/nar/article/48/21/e121/5921300 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
150 | 2023-11-13T11:12:18.883Z | Serratus | Virus identification in metagenomes | Both | Website | | | | https://www.nature.com/articles/s41586-021-04332-2
https://serratus.io/ | Rob Edwards' Viral Bioinfo Tools | | 2023 | | | | | | |
151 | 2023-11-13T11:12:18.883Z | VFM | Virus identification in metagenomes | Phage | | | https://github.com/liuql2019/VFM | https://ieeexplore.ieee.org/document/8924706 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
152 | 2023-11-13T11:12:18.883Z | VIBRANT | Virus identification in metagenomes | Both | Automated recovery, annotation and curation of microbial viruses, and evaluation of virome function from genomic sequences | | https://github.com/AnantharamanLab/VIBRANT | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00867-0 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
153 | 2023-11-13T11:12:18.883Z | VIGA | Virus identification in metagenomes | Both | | https://github.com/EGTortuero/viga/tree/developer | | Rob Edwards' Viral Bioinfo Tools | | 2022 | | | | | | |
154 | 2023-11-13T11:12:18.883Z | ViralCC | Virus identification in metagenomes | Both | | | https://github.com/dyxstat/ViralCC | https://www.nature.com/articles/s41467-023-35945-y | Rob Edwards' Viral Bioinfo Tools | | 2022 | | | | | | |
155 | 2023-11-13T11:12:18.883Z | ViralConsensus | Virus identification in metagenomes | Virus | Viral consensus sequence calling | | | https://github.com/niemasd/ViralConsensus | https://doi.org/10.1101/2020.11.10.377499 | Rob Edwards' Viral Bioinfo Tools | | Actively developed | | | | | | |
156 | 2023-11-13T11:12:18.883Z | viralMetagenomicsPipeline | Virus identification in metagenomes | Snakemake | | https://github.com/wclose/viralMetagenomicsPipeline | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
157 | 2023-11-13T11:12:18.883Z | ViralWasm | Virus identification in metagenomes | Virus | WebAssembly | | | | https://zenodo.org/doi/10.5281/zenodo.8427588
https://niema-lab.github.io/ViralWasm | Rob Edwards' Viral Bioinfo Tools | | Actively developed | | | | | | |
158 | 2023-11-13T11:12:18.883Z | viraMiner | Virus identification in metagenomes | Both | CNN classifier | | | https://github.com/NeuroCSUT/ViraMiner | https://doi.org/10.1371/journal.pone.0222271 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
159 | 2023-11-13T11:12:18.883Z | virAnnot | Virus identification in metagenomes | Virus | pipeline | | | https://github.com/marieBvr/virAnnot | https://doi.org/10.1094/PBIOMES-07-19-0037-A | Rob Edwards' Viral Bioinfo Tools | | 2022 | | | | | | |
160 | 2023-11-13T11:12:18.883Z | virFinder | Virus identification in metagenomes | Both | neural network, machine learning | | | https://github.com/jessieren/VirFinder | https://doi.org/10.1186/s40168-017-0283-5 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
161 | 2023-11-13T11:12:18.883Z | Virhunter | Virus identification in metagenomes | Virus | | | https://github.com/cbib/virhunter | https://www.frontiersin.org/articles/10.3389/fbinf.2022.867111/full | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
162 | 2023-11-13T11:12:18.883Z | VirMine | Virus identification in metagenomes | Both | | | https://github.com/thatzopoulos/virMine | https://peerj.com/articles/6695/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
163 | 2023-11-13T11:12:18.883Z | virMiner | Virus identification in metagenomes | Both | random forest | | | https://github.com/TingtZHENG/VirMiner | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425642/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
164 | 2023-11-13T11:12:18.883Z | VirNet | Virus identification in metagenomes | Phage | Deep attention model for viral reads identification | | https://github.com/alyosama/virnet | https://doi.org/10.1109/ICCES.2018.8639400 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
165 | 2023-11-13T11:12:18.883Z | VirSorter | Virus identification in metagenomes | Phage | VirSorter: mining viral signal from microbial genomic data | | https://github.com/simroux/VirSorter | https://peerj.com/articles/985/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
166 | 2023-11-13T11:12:18.883Z | VirSorter2 | Virus identification in metagenomes | Both | Random Forest | | | https://bitbucket.org/MAVERICLab/virsorter2/src/master/ | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00990-y | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
167 | 2023-11-13T11:12:18.883Z | Virtifier | Virus identification in metagenomes | Both | LSTM neural network | | | https://github.com/crazyinter/Seq2Vec | https://academic.oup.com/bioinformatics/article/38/5/1216/6462188 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
168 | 2023-11-13T11:12:18.883Z | virus_prediction | Virus identification in metagenomes | Both | Nextflow | | https://github.com/rujinlong/virus_prediction | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
169 | 2023-11-13T11:12:18.883Z | ViruSpy | Virus identification in metagenomes | Both | | https://github.com/NCBI-Hackathons/ViruSpy | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
170 | 2023-11-13T11:12:18.883Z | VirusSeeker | Virus identification in metagenomes | Both | | | | https://www.sciencedirect.com/science/article/pii/S0042682217300053?via%3Dihub
https://wupathlabs.wustl.edu/virusseeker/ | Rob Edwards' Viral Bioinfo Tools | | 20160824 | | | | | | |
171 | 2023-11-30T14:46:12.093Z | What_the_phage | Virus identification in metagenomes | Phage | Nextflow | | | https://github.com/replikation/What_the_Phage | https://academic.oup.com/gigascience/article/doi/10.1093/gigascience/giac110/6833029 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
172 | 2023-11-13T11:12:18.883Z | vRhyme | Virus identification in metagenomes | Both | Machine learning | vRhyme enables binning of viral genomes from metagenomes (Nucleic Acids Research) | | https://github.com/AnantharamanLab/vRhyme | | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
173 | 2023-11-13T11:12:18.883Z | Deep6 | Virus identification in metatranscriptomes | Both | Machine Learning | | | https://github.com/janfelix/Deep6 | https://journals.asm.org/doi/10.1128/mra.01079-22 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
174 | 2023-11-13T11:12:18.883Z | BERTax | Virus taxonomy | Both | | | https://github.com/f-kretschmer/bertax | https://www.pnas.org/doi/full/10.1073/pnas.2122636119 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
175 | 2023-11-13T11:12:18.883Z | Classiphages 2.0 | Virus taxonomy | Phage | ANN | | | No code available | https://www.biorxiv.org/content/10.1101/558171v1 | Rob Edwards' Viral Bioinfo Tools | | None | | | | | | |
176 | 2023-11-13T11:12:18.883Z | GraViTy | Virus taxonomy | Both | HMMs and genome organisation models | | | https://github.com/PAiewsakun/GRAViTy | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-018-0422-7 | Rob Edwards' Viral Bioinfo Tools | | 20200224 | | | | | | |
177 | 2023-11-13T11:12:18.883Z | PhaGCN | Virus taxonomy | Phage | GCN | | | https://github.com/KennthShang/PhaGCN | https://academic.oup.com/bioinformatics/article/37/Supplement_1/i25/6319660 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
178 | 2023-11-13T11:12:18.883Z | vConTACT | Virus taxonomy | Both | Whole-genome gene-sharing networks for virus taxonomy | | | https://bitbucket.org/MAVERICLab/vcontact/src/master/ | https://peerj.com/articles/3243/ | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
179 | 2023-11-13T11:12:18.883Z | vConTACT2.0 | Virus taxonomy | Both | Whole-genome gene-sharing networks for virus taxonomy | | | https://bitbucket.org/MAVERICLab/vcontact2/src/master/ | https://www.nature.com/articles/s41587-019-0100-8 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
180 | 2023-11-13T11:12:18.883Z | VICTOR | Virus taxonomy | Phage | | | https://github.com/vdclab/vdclab-wiki/blob/master/VICTOR.md | https://academic.oup.com/bioinformatics/article/33/21/3396/3933260 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
181 | 2023-11-13T11:12:18.883Z | VIPtree | Virus taxonomy | Phage | | | https://github.com/yosuken/ViPTreeGen | https://academic.oup.com/bioinformatics/article/33/21/3396/3933260 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
182 | 2023-11-13T11:12:18.883Z | VIRIDIC | Virus taxonomy | Phage | | | | https://www.mdpi.com/1999-4915/12/11/1268 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
183 | 2023-11-13T11:12:18.883Z | VIRify | Virus taxonomy | Both | VIRify is a recently developed pipeline for the detection, annotation, and taxonomic classification of viral contigs in metagenomic and metatranscriptomic assemblies. The pipeline is part of the repertoire of analysis services offered by MGnify. VIRify’s taxonomic classification relies on the detection of taxon-specific profile hidden Markov models (HMMs), built upon a set of 22,014 orthologous protein domains and referred to as ViPhOGs.
The pipeline is implemented and available in CWL and Nextflow. | | https://github.com/EBI-Metagenomics/emg-viral-pipeline | https://www.researchgate.net/publication/362871600_VIRify_an_integrated_detection_annotation_and_taxonomic_classification_pipeline_using_virus-specific_protein_profile_hidden_Markov_models/link/6304eb9961e4553b95322c97/download | Rob Edwards' Viral Bioinfo Tools; Phage Kitchen | https://trello.com/1/cards/6183429621514236f30e5700/attachments/618342c9a3769e0a1ada4128/download/chart.png | | | | | | | |
184 | 2023-11-13T11:12:18.883Z | VirusTaxo | Virus taxonomy | Virus | k-mer enrichment method | | | https://github.com/nahid18/virustaxo-wf | https://www.sciencedirect.com/science/article/pii/S0888754322001598?via%3Dihub | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
185 | 2023-11-13T11:12:18.883Z | VPF Tools | Virus taxonomy | Both | | | https://github.com/biocom-uib/vpf-tools | https://academic.oup.com/bioinformatics/article/37/13/1805/6104829 | Rob Edwards' Viral Bioinfo Tools | | | | | | | | |
186 | 2023-11-29T15:14:06.683Z | American Type Culture Collection (ATCC) (US) | | | | | https://www.atcc.org/microbe-products/bacteriology-and-archaea/bacteriophages#t=productTab&numberOfResults=24 | Phage Kitchen | | | | | | | | |
187 | 2023-11-13T11:12:18.883Z | Annotation & classification | | | | | | | | | | | | | | |
188 | 2023-11-13T11:12:18.883Z | Apollo | | | | | | | | | | | | | | |
189 | 2023-11-13T11:12:18.883Z | ARAGORN | | | | | | | | | | | | | | |
190 | 2023-11-13T11:12:18.883Z | Assembly | | | | | | | | | | | | | | |
191 | 2023-11-13T11:12:18.883Z | BALROG - Bacterial Annotation by Learned Representation Of Genes | |
**Description:**
Balrog is a prokaryotic gene finder based on a Temporal Convolutional Network. We took a data-driven approach to prokaryotic gene finding, relying on the large and diverse collection of already-sequenced genomes. By training a single, universal model of bacterial genes on protein sequences from many different species, we were able to match the sensitivity of current gene finders while reducing the overall number of gene predictions. Balrog does not need to be refit on any new genome. | | https://github.com/salzberg-lab/Balrog | https://www.biorxiv.org/content/10.1101/2020.09.06.285304v1 | Phage Kitchen | | | | | | | | |
192 | 2023-11-13T11:12:18.883Z | barrnap | | | | | | | | | | | | | | |
193 | 2023-11-13T11:12:18.883Z | Baylor College of Medicine (US) | | | | | https://www.bcm.edu/departments/biochemistry-and-molecular-biology/faculty-staff/researchers/bacteria-and-phage | Phage Kitchen | | | | | | | | |
194 | 2023-11-13T11:12:18.883Z | BBtools | | | | | | | | | | | | | | |
195 | 2023-11-13T11:12:18.883Z | blast-DB-phage | | | | | | | | | | | | | | |
196 | 2023-11-13T11:12:18.883Z | blastn | | | | | | | | | | | | | | |
197 | 2023-11-13T11:12:18.883Z | blastp | | | | | | | | | | | | | | |
198 | 2023-11-13T11:12:18.883Z | blastx | | | | | | | | | | | | | | |
199 | 2023-11-13T11:12:18.883Z | Bowtie2 | | | | | | | | | | | | | | |
200 | 2023-11-13T11:12:18.883Z | C++ | | | | | | | | | | | | | | |
201 | 2023-11-13T11:12:18.883Z | Category | | | | | | | | | | | | | | |
202 | 2023-11-13T11:12:18.883Z | CAZY | | | | | | | | | | | | | | |
203 | 2023-11-13T11:12:18.883Z | CD-HIT | | | | | | | | | | | | | | |
204 | 2023-11-13T11:12:18.883Z | CDD | | | | | | | | | | | | | | |
205 | 2023-11-13T11:12:18.883Z | Cenote_Unlimited_Breadsticks | | Unlimited Breadsticks uses probabilistic models (i.e. HMMs) of virus hallmark genes to identify virus sequences from any dataset of contigs (e.g. metagenomic assemblies) or genomes (e.g. bacterial genomes). Optionally, Unlimited Breadsticks will use gene content information to remove flanking cellular chromosomes from contigs representing putative prophages. Generally, the prophage-cellular chromosome boundary will be identified within 100 nt - 2000 nt of the actual location.
+ The code is currently functional. Feel free to consume Unlimited Breadsticks at will.
+ Minor update to handle very large contig files AND update to HMM databases on June 16th, 2021
Unlimited Breadsticks is derived from Cenote-Taker 2, but several time-consuming computations are skipped in order to analyze datasets as quickly as possible. Also, Unlimited Breadsticks only takes approximately 16 minutes to download and install (Cenote-Taker 2 takes about 2 hours due to large databases required for thorough sequence annotation). See installation instructions below.
**Limitations**
Compared to Cenote-Taker 2, there are a few limitations.
* Unlimited Breadsticks does not do post-hallmark-gene-identification computations to flag plasmid and conjugative element sequences that occasionally slip through.
* Unlimited Breadsticks does not make genome maps for manual inspection of putative viruses.
* Contigs are not extensively annotated by Unlimited Breadsticks. No genome maps are created. | | https://github.com/mtisza1/Cenote_Unlimited_Breadsticks | | Phage Kitchen | | | | | | | | |
206 | 2023-11-13T11:12:18.883Z | Cenote-Taker DB | | | | | | | | | | | | | | |
207 | 2023-11-13T11:12:18.883Z | Cenote-Taker2 vs DeepVirFinder - VirSorter2 - VIGA | | | | | https://academic.oup.com/ve/article/7/1/veaa100/6055568 | Phage Kitchen | | | | | | | | |
208 | 2023-11-13T11:12:18.883Z | Cenote-Taker2: Discover and Annotate Divergent Viral Contigs | | Cenote-Taker 2 is a dual function bioinformatics tool. First, Cenote-Taker 2 discovers/predicts virus sequences from any kind of genome or metagenomic assembly. Second, virus sequences/genomes are annotated with a variety of sequence features, genes, and taxonomy. Either the discovery or the annotation module can be used independently.
Cenote-Taker 2 democratizes virus discovery and sequence annotation. | | https://github.com/mtisza1/Cenote-Taker2 | https://academic.oup.com/ve/article/7/1/veaa100/6055568 | Phage Kitchen | https://trello.com/1/cards/6178a8d0da64201e6344ef18/attachments/6178a996f0d585697c7bd27e/download/m_veaa100f1.jpeg | | | | | | | |
209 | 2023-11-13T11:12:18.883Z | CFU.AI | | CFU Counting with Artificial Intelligence
A Deep-Learning based counting tool that offers accurate and robust analyses | | | http://www.cfu.ai/ | Phage Kitchen | https://trello.com/1/cards/618336f7f8e6df32edda8cf7/attachments/61833795717a46409bb85e40/download/app_3.png | | | | | | | |
210 | 2023-11-13T11:12:18.883Z | checkV | | | | | | | | | | | | | | |
211 | 2023-11-13T11:12:18.883Z | CheckV - assesses the quality and completeness of metagenome-assembled viral genomes | | Here we present CheckV, an automated pipeline for identifying closed viral genomes, estimating the completeness of genome fragments and removing flanking host regions from integrated proviruses. CheckV estimates completeness by comparing sequences with a large database of complete viral genomes, including 76,262 identified from a systematic search of publicly available metagenomes, metatranscriptomes and metaviromes. After validation on mock datasets and comparison to existing methods, we applied CheckV to large and diverse collections of metagenome-assembled viral sequences, including IMG/VR and the Global Ocean Virome. | | https://bitbucket.org/berkeleylab/CheckV | https://portal.nersc.gov/CheckV/
https://www.nature.com/articles/s41587-020-00774-7 | Phage Kitchen | https://trello.com/1/cards/61831d1ed4b96d70640d6719/attachments/61831db1aaf79980868421d9/download/checkv.png | | | | | | | |
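The completeness idea in the CheckV entry above can be illustrated with a minimal sketch. It assumes the expected genome length has already been taken from the closest complete reference genome, and approximates completeness as contig length divided by that expected length. The function names and the 90% / 50% tier cut-offs are illustrative assumptions for this sketch, not CheckV's code or API.

```python
def estimate_completeness(contig_length: int, expected_genome_length: int) -> float:
    """Return an approximate completeness percentage, capped at 100."""
    if contig_length <= 0 or expected_genome_length <= 0:
        raise ValueError("lengths must be positive")
    return min(100.0, 100.0 * contig_length / expected_genome_length)


def quality_tier(completeness: float) -> str:
    """Map a completeness percentage to a coarse tier (assumed cut-offs)."""
    if completeness >= 90.0:
        return "high"
    if completeness >= 50.0:
        return "medium"
    return "low"


if __name__ == "__main__":
    # A 30 kb contig whose closest complete reference is 40 kb long.
    pct = estimate_completeness(30_000, 40_000)
    print(f"{pct:.1f}% -> {quality_tier(pct)}")
```

A real pipeline would also need to pick the reference and handle contigs longer than expected, which is why the value is capped here.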
212 | 2023-11-13T11:12:18.883Z | CheckV vs Vibrant, VirSorter, PhiSpy, Phigaro | | | | | https://www.nature.com/articles/s41587-020-00774-7 | Phage Kitchen | | | | | | | | |
213 | 2023-11-13T11:12:18.883Z | Chromomap | | | | | | | | | | | | | | |
214 | 2023-11-13T11:12:18.883Z | chromomap | | | | | | | | | | | | | | |
215 | 2023-11-13T11:12:18.883Z | Circlator | | | | | | | | | | | | | | |
216 | 2023-11-13T11:12:18.883Z | Clustal(W and or O) | | | | | | | | | | | | | | |
217 | 2023-11-13T11:12:18.883Z | Clustering/comparison | | | | | | | | | | | | | | |
218 | 2023-11-13T11:12:18.883Z | COG | | | | | | | | | | | | | | |
219 | 2023-11-13T11:12:18.883Z | CPT Galaxy | | At the Center for Phage Technology (CPT), we developed a suite of phage-oriented tools housed in open, user-friendly web-based interfaces. A Galaxy platform conducts computationally intensive analyses and Apollo, a collaborative genome annotation editor, visualizes the results of these analyses. The collection includes open source applications such as the BLAST+ suite, InterProScan, and several gene callers, as well as unique tools developed at the CPT that allow maximum user flexibility. We describe in detail programs for finding Shine-Dalgarno sequences, resources used for confident identification of lysis genes such as spanins, and methods used for identifying interrupted genes that contain frameshifts or introns. At the CPT, genome annotation is separated into two robust segments that are facilitated through the automated execution of many tools chained together in an operation called a workflow. First, the structural annotation workflow results in gene and other feature calls. This is followed by a functional annotation workflow that combines sequence comparisons and conserved domain searching, which is contextualized to allow integrated evidence assessment in functional prediction. Finally, we describe a workflow used for comparative genomics. Using this multi-purpose platform enables researchers to easily and accurately annotate an entire phage genome.
The portal can be accessed at https://cpt.tamu.edu/galaxy-pub, with accompanying user training material. | | | https://cpt.tamu.edu/galaxy-pub
https://cpt.tamu.edu/training-material/
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008214 | Phage Kitchen | https://trello.com/1/cards/6202ff7745a7472839e0b529/attachments/6202ffb95b625372d32d41c2/download/image.png | | | | | | | |
220 | 2023-11-13T11:12:18.883Z | Cuffcompare | | | | | | | | | | | | | | |
221 | 2023-11-13T11:12:18.883Z | CWL (nextflow, snakemake) | | | | | | | | | | | | | | |
222 | 2023-11-13T11:12:18.883Z | Database reference | | | | | | | | | | | | | | |
223 | 2023-11-13T11:12:18.883Z | Databases | | | | | | | | | | | | | | |
224 | 2023-11-13T11:12:18.883Z | dbCAN | | | | | | | | | | | | | | |
225 | 2023-11-13T11:12:18.883Z | DeePhage: a tool for identifying temperate phage-derived and virulent phage-derived sequence in metavirome data using deep learning | | DeePhage is designed to classify metavirome sequences as temperate phage-derived or virulent phage-derived. The program calculates a score reflecting the likelihood that each input fragment is temperate phage-derived or virulent phage-derived. DeePhage can run either on a virtual machine or on a physical host. For non-computer professionals, we recommend running the virtual machine version of DeePhage on a local PC. In this way, users do not need to install any dependency packages. If a GPU is available, you can also choose to run the physical host version. This version can automatically speed up with a GPU and is more suitable for handling large-scale data. The program is also available at http://cqb.pku.edu.cn/ZhuLab/DeePhage/. | | | http://cqb.pku.edu.cn/ZhuLab/DeePhage/ | Phage Kitchen | | | | | | | | |
226 | 2023-11-13T11:12:18.883Z | DeepVirFinder: Identifying viruses from metagenomic data by deep learning | | DeepVirFinder predicts viral sequences using a deep learning method. The method has good prediction accuracy for short viral sequences, so it can be used to predict sequences from metagenomic data.
DeepVirFinder significantly improves the prediction accuracy compared to our k-mer based method VirFinder by using convolutional neural networks (CNNs). A CNN can automatically learn genomic patterns from the viral and prokaryotic sequences and simultaneously build a predictive model based on the learned genomic patterns. The learned patterns are represented in the form of weight matrices of size 4 by k, where k is the length of the pattern. This representation is similar to the position weight matrix (PWM), the commonly used representation of biological motifs, which is also of size 4 by k, with each column specifying the probabilities of having the 4 nucleotides at that position. When only one type of nucleotide can be chosen at each position with probability 1, the motif degenerates to a k-mer. Thus, the CNN is a natural generalization of the k-mer based model. The more flexible CNN model indeed outperforms the k-mer based model on the viral sequence prediction problem. | | https://github.com/jessieren/DeepVirFinder | https://link.springer.com/article/10.1007/s40484-019-0187-4
https://link.springer.com/content/pdf/10.1007/s40484-019-0187-4.pdf | Phage Kitchen | | | | | | | | |
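The relationship between a 4-by-k weight matrix and a k-mer described in the DeepVirFinder entry can be made concrete with a small sketch. It is plain Python rather than DeepVirFinder code, and the helper names are hypothetical. A filter whose columns each put weight 1 on a single nucleotide scores k exactly where the sequence contains that k-mer, so exact k-mer matching is a special case of sliding such a filter along the sequence.

```python
NUCLEOTIDES = "ACGT"


def one_hot_filter(kmer):
    """Build a 4-by-k weight matrix with a 1 for the chosen base in each column."""
    return [[1.0 if base == n else 0.0 for base in kmer] for n in NUCLEOTIDES]


def filter_scores(sequence, weights):
    """Slide a 4-by-k filter along the sequence and return one score per window.

    Assumes the sequence contains only A, C, G and T.
    """
    k = len(weights[0])
    row = {n: i for i, n in enumerate(NUCLEOTIDES)}
    scores = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        scores.append(sum(weights[row[b]][j] for j, b in enumerate(window)))
    return scores


if __name__ == "__main__":
    weights = one_hot_filter("GAT")
    scores = filter_scores("AGATTGAT", weights)
    print(scores)  # [0.0, 3.0, 1.0, 0.0, 0.0, 3.0]
    # Windows scoring k (= 3) are exact occurrences of the k-mer "GAT".
    print([i for i, s in enumerate(scores) if s == 3.0])  # [1, 5]
```

Replacing the 0/1 entries with learned real-valued weights gives the more flexible pattern detector that the description attributes to the CNN.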
227 | 2023-11-13T11:12:18.883Z | Demovir | | Democratic taxonomic classification of viral contigs to Order and Family level
When performing metagenomic sequencing of Virus-Like Particles (VLPs), the majority of returned sequences often bear little to no homology to reference sequences (Viral Dark Matter). Frequently it may be useful to know which viral taxonomic group these novel viruses are likely to belong to, as this will give information about nucleic acid type, size and behaviour.
DemoVir will classify viral contigs to the Order or Family taxonomic level by comparing genes on the amino acid level against the viral subset of the TrEMBL database, and then taking a vote of the Order and Family hits. Homology searches are performed by Usearch in order to increase speed. This type of method has previously been implemented in multiple published virome studies, but to our knowledge none have performed benchmarking or made it available as a simple executable script that is easily downloaded and installed.
**Note - DemoVir is for classification of sequences into viral families and orders only and should not be used for discriminating viral contigs from bacterial/archaeal/eukaryotic sequences in a metagenomic sample.** | | https://github.com/feargalr/Demovir | | Phage Kitchen | | | | | | | | |
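The voting step described in the Demovir entry can be sketched as follows. This is an illustration of the general idea only: the function name, the "Unassigned" label and the 50% threshold are assumptions made for this sketch, not Demovir's actual implementation. Each gene on a contig contributes the family (or order) of its best database hit, and the contig takes the label that wins the vote.

```python
from collections import Counter


def majority_vote(hit_taxa, min_fraction=0.5):
    """Return the most common label if it wins at least min_fraction of the votes.

    hit_taxa holds one label per gene; None or "" means the gene had no hit.
    """
    labels = [t for t in hit_taxa if t]
    if not labels:
        return "Unassigned"
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) >= min_fraction else "Unassigned"


if __name__ == "__main__":
    # Four genes on one contig: two agree, one disagrees, one has no hit.
    print(majority_vote(["Siphoviridae", "Siphoviridae", "Myoviridae", None]))
```

The same function can be called once with family labels and once with order labels to obtain both ranks for a contig.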
228 | 2023-11-13T11:12:18.883Z | DEPhT - Detection and Extraction of Phages Tool | | DEPhT is a new tool for identifying prophages in bacteria, and was developed with a particular interest in being able to rapidly scan hundreds to thousands of genomes and accurately extract complete (likely active) prophages from them.
A detailed manuscript has been submitted to Nucleic Acids Research, but in brief DEPhT works by using genome architecture (rather than homology) to identify genomic regions likely to contain a prophage. Any regions with phage-like architecture (characterized as regions with high gene density and few transcription direction changes) are then further scrutinized using two passes of homology detection.
* The first pass identifies genes on putative prophages that are homologs of (species/clade/genus-level) conserved bacterial genes, and uses any such genes to disrupt the prophage prediction.
* The second pass (disabled in the 'fast' runmode) identifies genes on putative prophages that are homologs of conserved, functionally annotated phage genes.
* Finally, prophage regions that got through the previous filters are subjected to a BLASTN-based attL/attR detection scheme that gives DEPhT better boundary detection than any tool we are aware of. | | | https://trello.com/c/jkFJv63E/87-depth-detection-and-extraction-of-phages-tool | Phage Kitchen | https://trello.com/1/cards/61e73e0bf7184b8e5955dcac/attachments/61e73f3ba79a440a155436c2/download/image.png | | | | | | | |
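The architecture-based screen described in the DEPhT entry (high gene density, few changes in transcription direction) can be sketched with a sliding window over annotated genes. The window size, thresholds and function names below are hypothetical choices for illustration and are not taken from DEPhT itself.

```python
from typing import List, Tuple

Gene = Tuple[int, int, str]  # (start, end, strand), strand is "+" or "-"


def window_stats(genes: List[Gene]) -> Tuple[float, int]:
    """Return (genes per kb, number of strand changes) for a run of genes."""
    span = max(end for _, end, _ in genes) - min(start for start, _, _ in genes)
    changes = sum(1 for a, b in zip(genes, genes[1:]) if a[2] != b[2])
    if span <= 0:
        return 0.0, changes
    return len(genes) / (span / 1000.0), changes


def phage_like_windows(genes: List[Gene], size: int = 5,
                       min_density: float = 1.0, max_changes: int = 1) -> List[int]:
    """Return the start index of each window of `size` genes that looks phage-like."""
    ordered = sorted(genes)
    hits = []
    for i in range(len(ordered) - size + 1):
        density, changes = window_stats(ordered[i:i + size])
        if density >= min_density and changes <= max_changes:
            hits.append(i)
    return hits


if __name__ == "__main__":
    genes = [
        (0, 900, "+"), (950, 1800, "+"), (1850, 2700, "+"),  # dense, same strand
        (5000, 5600, "-"), (9000, 9500, "+"), (14000, 14400, "-"),  # sparse, alternating
    ]
    print(phage_like_windows(genes, size=3))  # [0]
```

Windows flagged this way would only be candidates; the homology passes and attL/attR detection listed above are separate steps that this sketch does not attempt.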
229 | 2023-11-13T11:12:18.883Z | DIAMOND | | | | | | | | | | | | | | |
230 | 2023-11-13T11:12:18.883Z | Digital Phagogram | | | | | https://trello.com/c/RpFe431X/90-digital-phagogram | Phage Kitchen | https://trello.com/1/cards/61e754a4fa07561408df7647/attachments/61e754a7b8f5a2897553f1ab/download/image.png | | | | | | | |
231 | 2023-11-13T11:12:18.883Z | Distance-based | | | | | | | | | | | | | | |
232 | 2023-11-13T11:12:18.883Z | DRAM - Distilled and Refined Annotation of Metabolism | | DRAM (Distilled and Refined Annotation of Metabolism) is a tool for annotating metagenome-assembled genomes (MAGs) and VirSorter-identified viral contigs. DRAM annotates MAGs and viral contigs using KEGG (if provided by the user), UniRef90, PFAM, dbCAN, RefSeq viral, VOGDB and the MEROPS peptidase database, as well as custom user databases. DRAM is run in two stages: first an annotation step to assign database identifiers to genes, and then a distill step to curate these annotations into useful functional categories. Additionally, viral contigs are further analyzed to identify potential AMGs. This is done by assigning an auxiliary score and flags representing the confidence that a gene is both metabolic and viral.
For more detail on DRAM and how DRAM works please see our paper as well as the wiki. | | https://github.com/shafferm/DRAM
https://github.com/shafferm/DRAM/wiki | https://academic.oup.com/nar/article/48/16/8883/5884738 | Phage Kitchen | https://trello.com/1/cards/61932c07bad28b1fa56ae5d7/attachments/61932eab8d2cac726d8f7ed2/download/image.png | | | | | | | |
233 | 2023-11-13T11:12:18.883Z | EDGE bioinformatics - Empowering the Development of Genomics Expertise | | EDGE bioinformatics is intended to help truly democratize the use of Next Generation Sequencing for exploring genomes and metagenomes. Given that bioinformatic analysis is now the rate-limiting factor in genomics, we developed EDGE bioinformatics with a user-friendly interface that allows scientists to perform a number of tailored analyses using many cutting-edge tools. A complete version of EDGE is available as a variety of packages that can fit individual needs, including source code, or images in VMware and Docker formats. For basic information about EDGE, visit the EDGE ABCs, which provide a brief overview of EDGE, the various workflows, and the computational environment constraints for local use. | | | https://edgebioinformatics.org/ | Phage Kitchen | https://trello.com/1/cards/6202fd9fc544f30bcf2439eb/attachments/6202fdc23b3fb3251f8e3378/download/image.png | | | | | | | |
234 | 2023-11-13T11:12:18.883Z | Efam | | | | | | | | | | | | | | |
235 | 2023-11-13T11:12:18.883Z | Efam-XC | | | | | | | | | | | | | | |
236 | 2023-11-13T11:12:18.883Z | eggNOG | | | | | | | | | | | | | | |
237 | 2023-11-13T11:12:18.883Z | Eligo | | | | | https://trello.com/c/nvGNnVhR/3-eligo | Phage Kitchen | | | | | | | | |
238 | 2023-11-13T11:12:18.883Z | FactoMineR (PCA) | | | | | | | | | | | | | | |
239 | 2023-11-13T11:12:18.883Z | Fagenbank (Netherlands) | | | | | https://www.fagenbank.nl/english/ | Phage Kitchen | | | | | | | | |
240 | 2023-11-13T11:12:18.883Z | FASTA | | | | | | | | | | | | | | |
241 | 2023-11-13T11:12:18.883Z | FastME | | | | | | | | | | | | | | |
242 | 2023-11-13T11:12:18.883Z | Fastp | | | | | | | | | | | | | | |
243 | 2023-11-13T11:12:18.883Z | FastQC | | | | | | | | | | | | | | |
244 | 2023-11-13T11:12:18.883Z | Felix d'Herelle Reference Center for Bacterial Viruses (CANADA) | | | | | https://www.phage.ulaval.ca/en/phages-catalog/ | Phage Kitchen | | | | | | | | |
245 | 2023-11-13T11:12:18.883Z | FIGfams | | | | | | | | | | | | | | |
246 | 2023-11-13T11:12:18.883Z | Functional annotation | | | | | | | | | | | | | | |
247 | 2023-11-13T11:12:18.883Z | Gene and accessory prediction | | | | | | | | | | | | | | |
248 | 2023-11-13T11:12:18.883Z | Genemark(s) | | | | | | | | | | | | | | |
249 | 2023-11-13T11:12:41.087Z | Genome detective - an automated system for virus identification from high-throughput sequencing data | | Genome Detective is an easy to use web-based software application that assembles the genomes of viruses quickly and accurately. The application uses a novel alignment method that constructs genomes by reference-based linking of de novo contigs by combining amino-acids and nucleotide scores. The software was optimized using synthetic datasets to represent the great diversity of virus genomes. The application was then validated with next generation sequencing data of hundreds of viruses. User time is minimal and it is limited to the time required to upload the data.
Availability and implementation: available online at http://www.genomedetective.com/app/typingtool/virus/. Supplementary data are available at Bioinformatics online. | | | https://oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/bioinformatics/35/5/10.1093_bioinformatics_bty695/2/bty695_supplementary_information.docx
http://www.genomedetective.com/app/typingtool/virus/
https://doi.org/10.1093/bioinformatics/bty695 | Phage Kitchen | https://trello.com/1/cards/6201e4fffa436475e2f6fa30/attachments/6201e58d1fb7793fba2806ff/download/image.png | | | | | | | |
250 | 2023-11-13T11:12:18.883Z | German Collection of Microorganisms and Cell Cultures (GERMANY) | | | | | https://www.dsmz.de/collection/catalogue/microorganisms/special-groups-of-organisms/phages | Phage Kitchen | | | | | | | | |
251 | 2023-11-13T11:12:18.883Z | Glimmer | | | | | | | | | | | | | | |
252 | 2023-11-13T11:12:18.883Z | GOM (Genome Organisation Models) | | | | | | | | | | | | | | |
253 | 2023-11-13T11:12:18.883Z | GRAViTy: Genome Relationships Applied to Virus Taxonomy | | GRAViTy - "Genome Relationships Applied to Virus Taxonomy" is an analysis pipeline that is effective at reproducing the current assignments of viruses at family level as well as inter-family groupings into Orders (Aiewsakun and Simmonds, 2018). It can additionally be used to correctly differentiate assigned viruses from unassigned viruses and classify them into correct taxonomic groups. The method provides a rapid and objective means to explore metagenomic viral diversity and make informed recommendations for classification that are consistent with the current ICTV taxonomic framework. Methods like GRAViTy are increasingly required as the vast diversity of viruses found in metagenomic sequence datasets is explored. | | https://github.com/PAiewsakun/GRAViTy | https://www.microbiologyresearch.org/content/journal/jgv/10.1099/jgv.0.001110
https://link.springer.com/article/10.1007%2Fs00705-018-3938-z
https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-018-0422-7
http://gravity.cvr.gla.ac.uk/ | Phage Kitchen | https://trello.com/1/cards/6178b2f91667bb2c47a57b47/attachments/6178b33f7a97de18b8fc8033/download/gravity_overview_small.png | | | | | | | |
254 | 2023-11-13T11:12:18.883Z | Haskell | | | | | | | | | | | | | | |
255 | 2023-11-13T11:12:18.883Z | Hecatomb | | A hecatomb is a great sacrifice or an extensive loss. Hecatomb the software empowers an analyst to make data-driven decisions to 'sacrifice' false-positive viral reads from metagenomes to enrich for true-positive viral reads. This process frequently results in a great loss of suspected viral sequences / contigs. | | https://github.com/shandley/hecatomb | https://hecatomb.readthedocs.io/en/latest/ | Phage Kitchen | https://trello.com/1/cards/61bac43d4ed8e71713a429d7/attachments/61bac51c7ec5eb88bf53a977/download/image.png | | | | | | | |
256 | 2023-11-13T11:12:18.883Z | hhmake | | | | | | | | | | | | | | |
257 | 2023-11-13T11:12:18.883Z | hhsuite | | | | | | | | | | | | | | |
258 | 2023-11-13T11:12:18.883Z | hmmer3 | | nhmmer: DNA homology search with profile HMMs | | | | | | | | | | | | |
259 | 2023-11-13T11:12:18.883Z | Hmmscan | | | | | | | | | | | | | | |
260 | 2023-11-13T11:12:18.883Z | hmmsearch | | | | | | | | | | | | | | |
261 | 2023-11-13T11:12:18.883Z | Host prediction | | | | | | | | | | | | | | |
262 | 2023-11-13T11:12:18.883Z | Host Taxon Predictor | | | | | | | | | | | | | | |
263 | 2023-11-13T11:12:18.883Z | HTP - Host Taxon Predictor | | The initial repo was split into two parts. This part contains software designed to fetch complete viral genomic reference sequences from NCBI Nucleotide, get each viral host's lineage from NCBI Taxonomy and transform the sequence into some features. The second part, available at https://github.com/wojciech-galan/viruses_classifier, has been designed to infer the host of a previously unknown virus.
Recent advances in metagenomics provided a valuable alternative to culture-based approaches for better sampling viral diversity. However, some newly identified viruses lack sequence similarity to any previously sequenced ones, and cannot be easily assigned to their hosts. Here we present a bioinformatic approach to this problem. We developed classifiers capable of distinguishing eukaryotic viruses from phages, achieving almost 95% prediction accuracy. The classifiers are wrapped in the Host Taxon Predictor (HTP) software, written in Python, which is freely available at https://github.com/wojciech-galan/viruses_classifier. HTP’s performance was later demonstrated on a collection of newly identified viral genomes and genome fragments. In summary, HTP is a culture- and alignment-free approach for distinguishing between phages and eukaryotic viruses. We have also shown that it is possible to further extend our method to go up the evolutionary tree and predict whether a virus can infect narrower taxa. | | https://github.com/wojciech-galan/Viral_feature_extractor
https://github.com/wojciech-galan/viruses_classifier | https://www.nature.com/articles/s41598-019-39847-2 | Phage Kitchen | | | | | | | | |
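The HTP entry says only that sequences are transformed "into some features" and does not list them. As one hedged illustration of what an alignment-free feature vector can look like, the sketch below computes relative k-mer frequencies. The choice of k-mer frequencies and the function name are assumptions for this example, not a description of the features HTP actually uses.

```python
from itertools import product


def kmer_frequencies(sequence: str, k: int = 2) -> dict:
    """Return relative frequencies of all ACGT k-mers in the sequence."""
    seq = sequence.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows with ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return {kmer: (c / total if total else 0.0) for kmer, c in counts.items()}


if __name__ == "__main__":
    features = kmer_frequencies("ACGTAC", k=2)
    print(features["AC"], features["CG"])  # 0.4 0.2
```

A fixed-length vector like this (16 values for k = 2) can be passed to any standard classifier without aligning the query genome to a reference.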
264 | 2023-11-13T11:12:18.883Z | Identification | | | | | | | | | | | | | | |
265 | 2023-11-13T11:12:18.883Z | Identification/classification | | | | | | | | | | | | | | |
266 | 2023-11-13T11:12:18.883Z | IMG/VR | | | | | | | | | | | | | | |
267 | 2023-11-13T11:12:18.883Z | IMG/VR v3: an integrated ecological and evolutionary framework for interrogating genomes of uncultivated viruses | | Viruses are integral components of all ecosystems and microbiomes on Earth. Through pervasive infections of their cellular hosts, viruses can reshape microbial community structure and drive global nutrient cycling. Over the past decade, viral sequences identified from genomes and metagenomes have provided an unprecedented view of viral genome diversity in nature. Since 2016, the IMG/VR database has provided access to the largest collection of viral sequences obtained from (meta)genomes. Here, we present the third version of IMG/VR, composed of 18 373 cultivated and 2 314 329 uncultivated viral genomes (UViGs), nearly tripling the total number of sequences compared to the previous version. These clustered into 935 362 viral Operational Taxonomic Units (vOTUs), including 188 930 with two or more members. UViGs in IMG/VR are now reported as single viral contigs, integrated proviruses or genome bins, and are annotated with a new standardized pipeline including genome quality estimation using CheckV, taxonomic classification reflecting the latest ICTV update, and expanded host taxonomy prediction. The new IMG/VR interface enables users to efficiently browse, search, and select UViGs based on genome features and/or sequence similarity.
IMG/VR v3 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR. | | | https://img.jgi.doe.gov/vr
https://genome.jgi.doe.gov/portal/IMG_VR
https://doi.org/10.1093/nar/gkaa946 | Phage Kitchen | https://trello.com/1/cards/6178b4a04f04fc85bf214923/attachments/6178b4d75f1f0e06e212ce16/download/gkaa946fig1.jpeg | | | | | | | |
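The counts quoted in the IMG/VR v3 entry can be combined with a few lines of arithmetic. The snippet uses only the numbers given in the description; the variable names are chosen here for illustration.

```python
cultivated = 18_373          # cultivated viral genomes
uncultivated = 2_314_329     # uncultivated viral genomes (UViGs)
votus = 935_362              # viral Operational Taxonomic Units
votus_multi = 188_930        # vOTUs with two or more members

total_genomes = cultivated + uncultivated
singleton_votus = votus - votus_multi
multi_fraction = votus_multi / votus

print(total_genomes)            # 2332702
print(singleton_votus)          # 746432
print(f"{multi_fraction:.1%}")  # 20.2%
```

So roughly one in five vOTUs has more than one member, and the remaining four in five are singletons.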
268 | 2023-11-13T11:12:18.883Z | infernal (rRNA) | | | | | | | | | | | | | | |
269 | 2023-11-13T11:12:18.883Z | INPHARED - INfrastructure for a PHAge REference Database | | Providing up-to-date bacteriophage genome databases, metrics and useful input files for a number of bioinformatic pipelines including vConTACT2 and MASH. The aim is to produce a useful starting point for viral genomics and meta-omics.
Citation:
If you find our database useful, please see our recently published paper in PHAGE:
Cook R, Brown N, Redgwell T, Rihtman B, Barnes M, Clokie M, Stekel DJ, Hobman JL, Jones MA, Millard A. INfrastructure for a PHAge REference Database: Identification of Large-Scale Biases in the Current Collection of Cultured Phage Genomes. PHAGE. 2021. Available from: http://doi.org/10.1089/phage.2021.0007 | | https://github.com/RyanCook94/inphared | http://doi.org/10.1089/phage.2021.0007 | Phage Kitchen | | | | | | | | |
270 | 2023-11-13T11:12:18.883Z | INPHARED - INfrastructure for a PHAge REference Database: Identification of large-scale biases in the current collection of phage genomes. | | inphared.pl (INfrastructure for a PHAge REference Database) is a Perl script that downloads and filters phage genomes from GenBank to provide the most complete phage genome database possible.
It provides up-to-date bacteriophage genome databases, metrics and useful input files for a number of bioinformatic pipelines including vConTACT2 and MASH. The aim is to produce a useful starting point for viral genomics and meta-omics. | | https://github.com/RyanCook94/inphared | https://leicester.figshare.com/articles/dataset/INPHARED_DATABASE/14242085
https://doi.org/10.1101/2021.05.01.442102
https://link.springer.com/protocol/10.1007/978-1-4939-7343-9_17
https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC3965101/ | Phage Kitchen | https://trello.com/1/cards/6178b384f7c0026429e621f4/attachments/618b21c1df5339278f3169eb/download/image.png | | | | | | | |
271 | 2023-11-13T11:12:18.883Z | INSDC | | | | | | | | | | | | | | |
272 | 2023-11-13T11:12:18.883Z | Interpro | | | | | | | | | | | | | | |
273 | 2023-11-13T11:12:18.883Z | interproscan | | | | | | | | | | | | | | |
274 | 2023-11-13T11:12:18.883Z | IRF-finder | | | | | | | | | | | | | | |
275 | 2023-11-13T11:12:18.883Z | Israeli Biobank | | | | | https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC7277922/ | Phage Kitchen | | | | | | | | |
276 | 2023-11-13T11:12:18.883Z | Jackhmmer | | | | | | | | | | | | | | |
277 | 2023-11-13T11:12:18.883Z | Kaiju | | | | | | | | | | | | | | |
278 | 2023-11-13T11:12:18.883Z | KEGG | | | | | | | | | | | | | | |
279 | 2023-11-13T11:12:18.883Z | kmer-DB | | | | | | | | | | | | | | |
280 | 2023-11-13T11:12:18.883Z | Kraken(2) | | | | | | | | | | | | | | |
281 | 2023-11-13T11:12:18.883Z | Krona | | | | | | | | | | | | | | |
282 | 2023-11-13T11:12:18.883Z | Languages | | | | | | | | | | | | | | |
283 | 2023-11-13T11:12:18.883Z | LASTZ (circularity) | | | | | | | | | | | | | | |
284 | 2023-11-13T11:12:18.883Z | Lifestyle prediction | | | | | | | | | | | | | | |
285 | 2023-11-13T11:12:18.883Z | LipoP | | | | | | | | | | | | | | |
286 | 2023-11-13T11:12:18.883Z | Mapping | | | | | | | | | | | | | | |
287 | 2023-11-13T11:12:18.883Z | MARVEL, a Tool for Prediction of Bacteriophage Sequences in Metagenomic Bins | | Here we present MARVEL, a tool for prediction of double-stranded DNA bacteriophage sequences in metagenomic bins. MARVEL uses a random forest machine learning approach. | | https://github.com/LaboratorioBioinformatica/MARVEL | https://www.frontiersin.org/articles/10.3389/fgene.2018.00304/full | Phage Kitchen | | | | | | | | |
288 | 2023-11-13T11:12:18.883Z | MASH | | | | | | | | | | | | | | |
289 | 2023-11-13T11:12:18.883Z | mashmap | | | | | | | | | | | | | | |
290 | 2023-11-13T11:12:18.883Z | MATLAB | | | | | | | | | | | | | | |
291 | 2023-11-13T11:12:18.883Z | MAVRICH - Bacteriophage evolution differs by host, lifestyle and genome | | Bacteriophages play key roles in microbial evolution, marine nutrient cycling and human disease. Phages are genetically diverse, and their genome architectures are characteristically mosaic, driven by horizontal gene transfer with other phages and host genomes. As a consequence, phage evolution is complex and their genomes are composed of genes with distinct and varied evolutionary histories. However, there are conflicting perspectives on the roles of mosaicism and the extent to which it generates a spectrum of genome diversity or genetically discrete populations. Here, we show that bacteriophages evolve within two general evolutionary modes that differ in the extent of horizontal gene transfer by an order of magnitude. Temperate phages distribute into high and low gene flux modes, whereas lytic phages share only the lower gene flux mode. The evolutionary modes are also a function of the bacterial host and different proportions of temperate and lytic phages are distributed in either mode depending on the host phylum. Groups of genetically related phages fall into either the high or low gene flux modes, suggesting there are genetic as well as ecological drivers of horizontal gene transfer rates. Consequently, genome mosaicism varies depending on the host, lifestyle and genetic constitution of phages. | | | https://www.nature.com/articles/nmicrobiol2017112 | Phage Kitchen | https://trello.com/1/cards/618224f2f47cb25bb3df551f/attachments/61a71964841b764308607aa1/download/image.png | | | | | | | |
292 | 2023-11-13T11:12:18.883Z | MCL | | | | | | | | | | | | | | |
293 | 2023-11-13T11:12:18.883Z | megahit | | | | | | | | | | | | | | |
294 | 2023-11-13T11:12:18.883Z | MEROPS | | | | | | | | | | | | | | |
295 | 2023-11-13T11:12:18.883Z | metabat2 | | | | | | | | | | | | | | |
296 | 2023-11-13T11:12:18.883Z | MetaGeneAnnotator | | | | | | | | | | | | | | |
297 | 2023-11-13T11:12:18.883Z | Metagenome enabled | | | | | | | | | | | | | | |
298 | 2023-12-26T17:22:05.670Z | MetaPhage: an Automated Pipeline for Analyzing, Annotating, and Classifying Bacteriophages in Metagenomics Sequencing Data | | To assist the nonspecialist in the decision-making process and facilitate workflow management, we present here MetaPhage (MP), a fully automated computational pipeline for quality control, assembly, and phage detection as well as classification and quantification of these phages in metagenomics data. The pipeline is modular and enables the user to skip some of the steps and recover analysis in the event of execution errors. To guarantee scalability and reproducibility, MetaPhage was implemented in Nextflow (NF) (8), a workflow manager that uses software containers to allow easy installation. The pipeline can be run on a single computer or parallelized on a high-performance computing (HPC) cluster. MetaPhage also implements a novel algorithm that delivers automatic taxonomic classification of phage contigs from the vConTACT2 (9) network graph implemented in the workflow. Results for each step of the analysis are reported in a rich and easy-to-read HTML report that can be opened and inspected in any web browser. | | https://github.com/MattiaPandolfoVR/MetaPhage | https://doi.org/10.1128/msystems.00741-22 | Phage Kitchen | https://trello.com/1/cards/63225df719c457038e5d5017/attachments/63225e779c42e5017272e5e1/download/image.png | | | | | | | |
299 | 2023-12-26T17:21:42.920Z | MetaPhinder: Identifying Bacteriophage Sequences in Metagenomic Data Sets | | Bacteriophages are the most abundant biological entity on the planet, but at the same time do not account for much of the genetic material isolated from most environments due to their small genome sizes. They also show great genetic diversity and mosaic genomes making it challenging to analyze and understand them. Here we present MetaPhinder, a method to identify assembled genomic fragments (i.e., contigs) of phage origin in metagenomic data sets. The method is based on a comparison to a database of whole genome bacteriophage sequences, integrating hits to multiple genomes to accommodate the mosaic genome structure of many bacteriophages. The method is demonstrated to out-perform both BLAST methods based on single hits and methods based on k-mer comparisons. MetaPhinder is available as a web service at the Center for Genomic Epidemiology (https://cge.cbs.dtu.dk/services/MetaPhinder/), while the source code can be downloaded from https://bitbucket.org/genomicepidemiology/metaphinder or https://github.com/vanessajurtz/MetaPhinder. | | https://bitbucket.org/genomicepidemiology/metaphinder
https://github.com/vanessajurtz/MetaPhinder | https://cge.cbs.dtu.dk/services/MetaPhinder/
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0163111 | Phage Kitchen | | | | | | | | |
300 | 2023-11-13T11:12:18.883Z | metaSPAdes | | | | | | | | | | | | | | |
301 | 2023-11-13T11:12:18.883Z | METAVIRALSPADES: assembly of viruses from metagenomic data | | We describe a METAVIRALSPADES tool for identifying viral genomes in metagenomic assembly graphs that is based on analyzing variations in the coverage depth between viruses and bacterial chromosomes. We benchmarked METAVIRALSPADES on diverse metagenomic datasets, verified our predictions using a set of virus-specific Hidden Markov Models and demonstrated that it improves on the state-of-the-art viral identification pipelines.
**Availability and implementation**
METAVIRALSPADES includes VIRALASSEMBLY, VIRALVERIFY and VIRALCOMPLETE modules that are available as standalone packages: https://github.com/ablab/spades/tree/metaviral_publication, https://github.com/ablab/viralVerify/ and https://github.com/ablab/viralComplete/. | | https://github.com/ablab/spades/tree/metaviral_publication
https://github.com/ablab/viralVerify/
https://github.com/ablab/viralComplete/ | https://academic.oup.com/bioinformatics/article/36/14/4126/5837667 | Phage Kitchen | | | | | | | | |
302 | 2023-11-13T11:12:18.883Z | MetaWRAP - a flexible pipeline for genome-resolved metagenomic data analysis | | MetaWRAP aims to be an easy-to-use metagenomic wrapper suite that accomplishes the core tasks of metagenomic analysis from start to finish: read quality control, assembly, visualization, taxonomic profiling, extracting draft genomes (binning), and functional annotation. Additionally, metaWRAP takes bin extraction and analysis to the next level (see module overview below). While there is no single best approach for processing metagenomic data, metaWRAP is meant to be a fast and simple approach before you delve deeper into parameterization of your analysis. MetaWRAP can be applied to a variety of environments, including gut, water, and soil microbiomes (see metaWRAP paper for benchmarks). Each individual module of metaWRAP is a standalone program, which means you can use only the modules you are interested in for your data. | | https://github.com/bxlab/metaWRAP | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-018-0541-1 | Phage Kitchen | https://trello.com/1/cards/6201e658cb39c1591b145040/attachments/6201e6664ee0d5422a604141/download/image.png | | | | | | | |
303 | 2023-11-13T11:12:18.883Z | MGnify: the microbiome analysis resource in 2020 | | MGnify (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations. | | | http://www.ebi.ac.uk/metagenomics
https://academic.oup.com/nar/article/48/D1/D570/5614179 | Phage Kitchen | | | | | | | | |
304 | 2023-11-13T11:12:18.883Z | MGV - Metagenomic Gut Virus catalogue viral detection pipeline | | The Metagenomic Gut Virus catalogue improves detection of viruses in stool metagenomes and accounts for nearly 40% of CRISPR spacers found in human gut Bacteria and Archaea. We also produced a catalogue of 459,375 viral protein clusters to explore the functional potential of the gut virome.
Access to the full catalogue of viral genomes, protein clusters, diversity-generating retroelements and CRISPR spacers is provided without restrictions at https://portal.nersc.gov/MGV. Any requests for further data should be directed to the corresponding authors. | | https://github.com/snayfach/MGV | https://www.nature.com/articles/s41564-021-00928-6
https://portal.nersc.gov/MGV | Phage Kitchen | https://trello.com/1/cards/6183207b8771288e6a739eab/attachments/61832100ae71055d17d6d6b8/download/MGV.png | | | | | | | |
305 | 2023-11-13T11:12:18.883Z | MGV/aaicluster | | | | | | | | | | | | | | |
306 | 2023-11-13T11:12:18.883Z | MGV/anicluster | | | | | | | | | | | | | | |
307 | 2023-11-13T11:12:18.883Z | MGV/crispr_spacers | | | | | | | | | | | | | | |
308 | 2023-11-13T11:12:18.883Z | MGV/marker_gene_tree | | | | | | | | | | | | | | |
309 | 2023-11-13T11:12:18.883Z | MGV/snptree | | | | | | | | | | | | | | |
310 | 2023-11-13T11:12:18.883Z | MIST | | | | | | | | | | | | | | |
311 | 2023-11-13T11:12:18.883Z | ML-based (outside of HMMs) | | | | | | | | | | | | | | |
312 | 2023-11-13T11:12:18.883Z | MMSeqs2 (HMM profile search) | | | | | | | | | | | | | | |
313 | 2023-12-26T17:30:47.559Z | MultiPHATE2 - bioinformatics pipeline for functional annotation of phage isolates | | MULTIPHATE2 - >
MULTIPHATE - >
PHANNOTATE ->
**ABOUT THE MULTI-PHATE PIPELINE DRIVER**
MultiPhATE is a command-line program that runs gene finding and the PhATE annotation code over user-specified phage genomes, then performs gene-by-gene comparisons among the genomes. The multiPhate.py code takes a single argument consisting of a configuration file (hereafter referred to as, multiPhate.config; use the file sample.multiPhate.config as starting point) and uses it to specify annotation parameters. Then, multiPhate.py invokes the PhATE pipeline for each genome. See below for the types of annotations that PhATE performs. If two or more genomes are specified by the user, then multiPhATE will run the CompareGeneProfiles code to identify corresponding genes among the genomes.
**ABOUT THE PHATE PIPELINE**
PhATE is a fully automated computational pipeline for identifying and annotating phage genes in genome sequence. PhATE is written in Python 3.7, and runs on Linux and Mac operating systems. Code execution is controlled by a configuration file, which can be tailored to run specific gene finders and to blast sequences against specific phage- and virus-centric data sets, in addition to more generic (genome, protein) data sets. See below for the specific databases that are accommodated. PhATE runs at least one gene finding algorithm, then annotates the genome, gene, and protein sequences using nucleotide and protein blast flavors and a set of fasta sequence databases, and uses hmm searches (phmmer, jackhmmer) against these same fasta databases. It also runs hmmscan against the pVOG and VOG hmm profile databases. If more than one gene finder is run, PhATE will provide a side-by-side comparison of the genes called by each gene caller. The user specifies the preferred gene caller, and the genes and proteins predicted by that caller are annotated using blast against the supporting databases (or, the user may specify one of the comparison gene sets: superset, consensus, or commoncore, for functional annotation). Classification of each protein sequence into a pVOG or VOG group is followed by generation of an alignment-ready fasta file. By convention, genome sequence files end with extension, ".fasta"; gene nucleotide fasta files end with, ".fnt", and cds amino-acid fasta files end with, ".faa".
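The file-extension convention described above can be sketched in Python as follows. This is only an illustration and is not taken from the PhATE code base: the function names and the category labels (`genome`, `gene_nucleotide`, `cds_amino_acid`) are hypothetical, and matching is assumed to be case-sensitive.

```python
from pathlib import Path

# Extension convention as quoted in the PhATE description;
# the labels on the right are hypothetical names chosen for this sketch.
PHATE_EXTENSIONS = {
    ".fasta": "genome",
    ".fnt": "gene_nucleotide",
    ".faa": "cds_amino_acid",
}


def classify_sequence_file(path):
    """Return the sequence category implied by the file extension."""
    return PHATE_EXTENSIONS.get(Path(path).suffix, "unknown")


def group_by_category(paths):
    """Group file paths by the category implied by their extension."""
    groups = {}
    for p in paths:
        groups.setdefault(classify_sequence_file(p), []).append(p)
    return groups


if __name__ == "__main__":
    files = ["phageA.fasta", "phageA.fnt", "phageA.faa", "notes.txt"]
    for category, members in group_by_category(files).items():
        print(category, members)
```

A helper like this could be used to check an input directory before launching a run, so that misnamed files are caught early.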
| | https://github.com/carolzhou/multiPhATE2 | http://dx.doi.org/10.1093/g3journal/jkab074
https://doi.org/10.1093/bioinformatics/btz258
https://doi.org/10.1093/bioinformatics/btz265 | Phage Kitchen | https://trello.com/1/cards/6178a033dfc62f89a9a87e65/attachments/618231016a709d5d597ec768/download/m_jkab074f1.jpeg | | | | | | | |
314 | 2023-11-13T11:12:18.883Z | multiqc | | | | | | | | | | | | | | |
315 | 2023-11-13T11:12:18.883Z | MUSCLE | | | | | | | | | | | | | | |
316 | 2023-11-13T11:12:18.883Z | Naming phages - literature | | | | | https://trello.com/c/FI9EF6ut/18-naming-phages-literature | Phage Kitchen | https://trello.com/1/cards/6178b1e32fc0055c58b5fc6a/attachments/6178b2141e3d2f8f9d59cbec/download/viruses-09-00070-v2.pdf | | | | | | | |
317 | 2023-11-13T11:12:18.883Z | National Collection of Type Cultures (UK) | | | | | https://www.bacteriophage.news/database/uk-national-collection-of-type-cultures/ | Phage Kitchen | | | | | | | | |
318 | 2023-11-13T11:12:18.883Z | NCBI (nr, refseq, taxonomy...) | | | | | | | | | | | | | | |
319 | 2023-11-13T11:12:18.883Z | OnePetri: accelerating common bacteriophage Petri dish assays with computer vision | | OnePetri uses machine learning models & computer vision to automatically detect Petri dishes and plaques, count plaques, and perform common assay calculations with these values (plaque/titration assay).
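As a rough illustration of the kind of titration calculation mentioned above, the sketch below applies the standard plaque-assay formula, titer = plaques / (dilution × volume plated). It is not taken from OnePetri's source code; the function name and parameters are hypothetical.

```python
def titer_pfu_per_ml(plaque_count, dilution, volume_ml):
    """Return the phage titer in PFU/mL.

    plaque_count: number of plaques counted on the plate
    dilution:     fraction of the original sample in the plated dilution (e.g. 1e-6)
    volume_ml:    volume of that dilution plated, in mL
    """
    if plaque_count < 0:
        raise ValueError("plaque_count must be non-negative")
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume_ml must be positive")
    return plaque_count / (dilution * volume_ml)


if __name__ == "__main__":
    # Example: 42 plaques on the 10^-6 plate, with 0.1 mL plated.
    print(f"{titer_pfu_per_ml(42, 1e-6, 0.1):.2e} PFU/mL")
```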
Note that as of now, OnePetri only works with circular Petri dishes; however, other shapes (square & rectangle) may be added if sufficient training images can be obtained. Additionally, the models used in the app require one plate per dilution, and as such, spot assays are not currently supported.
All image processing & detection is done locally on-device, with no need for an internet connection once the app has been installed. As such, OnePetri does not collect, store, or transmit any user data or images. Updates are likely to be released regularly, so regular access to the internet is strongly recommended. | | https://github.com/mshamash/OnePetri
https://github.com/mshamash/onepetri-benchmark | https://www.biorxiv.org/content/10.1101/2021.09.27.460959v1
https://onepetri.ai/ | Phage Kitchen | https://trello.com/1/cards/6183367f52ca4f0ab766ba64/attachments/61833718be698a10bdd5cc90/download/F2.large.jpg | | | | | | | |
320 | 2023-11-13T11:12:18.883Z | OPTSIL (clustering) | | | | | | | | | | | | | | |
321 | 2023-11-13T11:12:18.883Z | Other tools | | | | | | | | | | | | | | |
322 | 2023-11-13T11:12:18.883Z | Paper - A Roadmap for Genome-Based Phage Taxonomy | | Bacteriophage (phage) taxonomy has been in flux since its inception over four decades ago. Genome sequencing has put pressure on the classification system and recent years have seen significant changes to phage taxonomy. Here, we reflect on the state of phage taxonomy and provide a roadmap for the future, including the abolition of the order Caudovirales and the families Myoviridae, Podoviridae, and Siphoviridae. Furthermore, we specify guidelines for the demarcation of species, genus, subfamily and family-level ranks of tailed phage taxonomy. | | | https://www.mdpi.com/1999-4915/13/3/506 | Phage Kitchen | | | | | | | | |
323 | 2023-11-13T11:12:18.883Z | Paper - Assessing Illumina technology for the high-throughput sequencing of bacteriophage genomes | | We assessed the suitability of Illumina technology for high-throughput sequencing and subsequent assembly of phage genomes. In silico datasets reveal that 30× coverage is sufficient to correctly assemble the complete genome of ~98.5% of known phages, with experimental data confirming that the majority of phage genomes can be assembled at 30× coverage. Furthermore, in silico data demonstrate it is possible to co-sequence multiple phages from different hosts, without introducing assembly errors. | | | https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC4893331/ | Phage Kitchen | | | | | | | | |
324 | 2023-11-13T11:12:18.883Z | Paper - Virome Sequencing of the Human Intestinal Mucosal–Luminal Interface | | While the human gut virome has been increasingly explored in recent years, nearly all studies have been limited to fecal sampling. The mucosal–luminal interface has been established as a viable sample type for profiling the microbial biogeography of the gastrointestinal tract.
We have developed a protocol to extract nucleic acids from viruses at the mucosal–luminal interface of the proximal and distal colon. Colonic viromes from pediatric patients with Crohn's disease demonstrated high interpatient diversity and low but significant intrapatient variation between sites. Whole metagenomics was also performed to explore virome–bacteriome interactions and to compare the viral communities observed in virome and whole metagenomic sequencing. A site-specific study of the human gut virome is a necessary step to advance our understanding of virome–bacteriome–host interactions in human diseases.
**Keywords:** virome, bacteriophage, phage, microbiome, gut mucosa, phageome, gut microbiome | | | https://trello.com/c/lvmH9lW7/76-paper-virome-sequencing-of-the-human-intestinal-mucosal-luminal-interface | Phage Kitchen | https://trello.com/1/cards/618af7a5f7fa9d85b78501aa/attachments/618af83e82b11c348d3c8337/download/Image_1.JPEG | | | | | | | |
325 | 2023-11-13T11:12:18.883Z | PATRIC | | | | | https://trello.com/c/rE3cchQQ/80-patric | Phage Kitchen | | | | | | | | |
326 | 2023-11-13T11:12:18.883Z | PDB | | | | | | | | | | | | | | |
327 | 2023-11-13T11:12:18.883Z | pdm_utils - SEA-PHAGES program to create, update, and maintain MySQL phage genomics databases | | `pdm_utils` is a Python package designed in combination with a pre-defined MySQL database schema in order to facilitate the creation, management, and manipulation of phage genomics databases in the SEA-PHAGES program. The package is directly connected to the structure of the MySQL database, and it provides several types of functionality:
1. Python library including:
   a. Classes to store/parse phage genomes, interact with a local MySQL genomics database, and manage the process of making database changes.
   b. Functions and methods to manipulate those classes as well as interact with several databases and servers, including PhagesDB, GenBank, PECAAN, and MySQL.
2. A command line toolkit to process data and maintain a phage genomics database.
`pdm_utils` is useful for:
1. The Hatfull lab to maintain MySQL phage genomics databases in the SEA-PHAGES program (current pipeline).
2. Researchers to evaluate new genome annotations (flat file QC).
3. Researchers to directly access and retrieve phage genomics data from any compatible MySQL database (tutorial).
4. Researchers to create custom MySQL phage genomics databases.
5. Developers to build downstream data analysis tools (tutorial). | | https://github.com/SEA-PHAGES/pdm_utils | | Phage Kitchen | https://trello.com/1/cards/61833a9614f91804d55da1af/attachments/61833adfdc032a1f16a4d66b/download/schema_10_map.jpg | | | | | | | |
328 | 2023-11-13T11:12:18.883Z | Perl | | | | | | | | | | | | | | |
329 | 2023-11-13T11:12:18.883Z | PFAM | | | | | | | | | | | | | | |
330 | 2023-11-13T11:12:18.883Z | PhaGCN - GCN based model classifier | | PhaGCN is a GCN based model, which can learn the species masking feature via deep learning classifier, for new Phage taxonomy classification. To use PhaGCN, you only need to input your contigs to the program. | | https://github.com/KennthShang/PhaGCN | https://academic.oup.com/bioinformatics/article/37/Supplement_1/i25/6319660 | Phage Kitchen | https://trello.com/1/cards/61833bc5e728741ec9f6b8b5/attachments/61ba7f0925743d0f0be8deee/download/image.png | | | | | | | |
331 | 2023-11-13T11:12:18.883Z | Phage Commander, an Application for Rapid Gene Identification in Bacteriophage Genomes Using Multiple Programs | | We present the Phage Commander application for rapid identification of bacteriophage genes using multiple gene identification programs. Phage Commander runs a bacteriophage genome sequence through nine gene identification programs (and an additional program for identification of tRNAs) and integrates the results within a single output table. Phage Commander also generates formatted output files for direct export to National Center for Biotechnology Information GenBank or genome visualization programs such as DNA Master. | | https://github.com/sarah-harris/PhageCommander | https://www.liebertpub.com/doi/full/10.1089/phage.2020.0044 | Phage Kitchen | | | | | | | | |
332 | 2023-11-13T11:12:18.883Z | Phage related tools listing - git repos | | | | https://github.com/sxh1136/Phage_tools
https://github.com/voorloopnul/awesome-phages | | Phage Kitchen | | | | | | | | |
333 | 2023-12-26T17:30:18.414Z | PhageAI - PhageAI is an Artificial Intelligence application for your daily Phage Research | | PhageAI is an application that simultaneously represents a repository of knowledge of bacteriophages and a tool to analyse genomes with Artificial Intelligence support.
## Framework modules
Set of methods related to:
* `lifecycle` - bacteriophage lifecycle prediction:
  * `.predict(fasta_path)` - returns the bacteriophage lifecycle prediction class (Virulent, Temperate or Chronic) with probability (%);
* `taxonomy` - bacteriophage taxonomy order, family and genus prediction (TBA);
* `topology` - bacteriophage genome topology prediction (TBA);
* `repository` - set of methods related to the PhageAI bacteriophage repository:
  * `.get_record(value)` - returns a dict with bacteriophage metadata
  * `.get_top10_similar_phages(value)` - returns a list of dicts containing the top-10 most similar bacteriophages
------
Machine Learning algorithms can process enormous amounts of data in relatively short time in order to find connections and dependencies that are unobvious for human beings. Correctly designed applications based on AI are able to vastly improve and speed up the work of the domain experts.
Models based on DNA contextual vectorization and Deep Neural Networks are particularly effective when it comes to analysis of genomic data. The system that we propose aims to use the phages sequences uploaded to the database to build a model which is able to predict if a bacteriophage is virulent, temperate or chronic with a high probability.
One of the key system modules is the bacteriophages repository with a clean web interface that allows to browse, upload and share data with other users. The gathered knowledge about the bacteriophages is not only valuable on its own but also because of the ability to train the ever-improving Machine Learning models.
Detection of virulent or temperate features is only one of the first tasks that can be solved with Artificial Intelligence. The combination of Biology, Natural Language Processing and Machine Learning allows us to create algorithms for genomic data processing that could eventually turn out to be effective in a wide range of problems with focus on classification and information extraced from DNA.
| | https://github.com/phageaisa/phageai | https://phage.ai/accounts/login/?next=/ | Phage Kitchen | | | | | | | | |
334 | 2023-11-13T11:12:18.883Z | PhagePromoter - Predicting promoters in phage genomes | | The growing interest in phages as antibacterial agents has led to an increase in the number of sequenced phage genomes, increasing the need for intuitive bioinformatics tools for performing genome annotation. The identification of phage promoters is indeed the most difficult step of this process. Due to the lack of online tools for phage promoter prediction, we developed PhagePromoter, a tool for locating promoters in phage genomes, using machine learning methods. This is the first online tool for predicting promoters that uses phage promoter data and the first to identify both host and phage promoters with different motifs.
Availability and implementation
This tool was integrated in the Galaxy framework and is available online at https://bit.ly/2Dfebfv.
| | | https://bit.ly/2Dfebfv
https://academic.oup.com/bioinformatics/article/35/24/5301/5540317 | Phage Kitchen | | | | | | | | |
335 | 2023-11-13T11:12:18.883Z | phageReceptor - phage-host receptor interactions | | phageReceptor is a database of phage-host receptor interactions, which included 427 pairs of phage-host receptor interactions, 341 unique viral species/sub-species, and 69 bacterial species. Based on phageReceptor, we systematically analyzed the associations between phage-host receptor interactions, and characterized the phage protein receptors by structure, function, protein-protein interaction and expression. | | | https://dx.doi.org/10.1093/BIOINFORMATICS/BTAA123
http://www.computationalbiology.cn/phageReceptor/index.html | Phage Kitchen | | | | | | | | |
336 | 2023-11-13T11:12:18.883Z | PhageTerm: a tool for fast and accurate determination of phage termini and packaging mechanism using next-generation sequencing data | | In this work, we demonstrate how it is possible to recover more information from sequencing data than just the phage genome. We developed a theoretical and statistical framework to determine DNA termini and phage packaging mechanisms using NGS data. Our method relies on the detection of biases in the number of reads, which are observable at natural DNA termini compared with the rest of the phage genome. We implemented our method with the creation of the software PhageTerm and validated it using a set of phages with well-established packaging mechanisms representative of the termini diversity, i.e. 5′cos (Lambda), 3′cos (HK97), pac (P1), headful without a pac site (T4), DTR (T7) and host fragment (Mu). In addition, we determined the termini of nine Clostridium difficile phages and six phages whose sequences were retrieved from the Sequence Read Archive. PhageTerm is freely available (https://sourceforge.net/projects/phageterm), as a Galaxy ToolShed and on a Galaxy-based server (https://galaxy.pasteur.fr). | | | https://sourceforge.net/projects/phageterm
https://galaxy.pasteur.fr
https://www.nature.com/articles/s41598-017-07910-5 | Phage Kitchen | | | | | | | | |
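The read-start bias that the PhageTerm entry describes can be illustrated with a toy sketch. This is not PhageTerm's statistical framework: it assumes the reads have already been mapped and reduced to a list of 0-based start positions, and the function name `candidate_termini` and the `min_fold` threshold are hypothetical choices made only for illustration.

```python
from collections import Counter


def candidate_termini(read_starts, genome_length, min_fold=10.0):
    """Return positions whose read-start count far exceeds the genome-wide mean.

    read_starts: iterable of 0-based mapped start positions (assumed input).
    min_fold: hypothetical fold-change threshold, not a PhageTerm parameter.
    """
    counts = Counter(read_starts)
    # Mean number of read starts per genome position.
    mean_starts = sum(counts.values()) / genome_length
    if mean_starts == 0:
        return []
    return sorted(
        pos for pos, n in counts.items() if n / mean_starts >= min_fold
    )


# Toy data: a uniform background (one start every 10 bases)
# plus a pile-up of read starts at position 500.
starts = list(range(0, 1000, 10)) + [500] * 40
print(candidate_termini(starts, genome_length=1000))
```

A real analysis would also need to handle both strands and coverage depth, which this sketch ignores.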
337 | 2023-11-13T11:12:18.883Z | PhageTermVirome - High-throughput identification of viral termini and packaging mechanisms in virome datasets | | Here, we introduce PhageTermVirome (PTV) as a tool for the easy and rapid high-throughput determination of phage termini and packaging mechanisms using modern large-scale metagenomics datasets. We successfully tested the PTV algorithm on a mock virome dataset and then used it on two real virome datasets to achieve the rapid identification of more than 100 phage termini and packaging mechanisms, with just a few hours of computing time. Because PTV allows the identification of free fully formed viral particles (by recognition of termini present only in encapsidated DNA), it can also complement other virus identification software to predict the true viral origin of contigs in viral metagenomics datasets. PTV is a novel and unique tool for high-throughput characterization of phage genomes, including phage termini identification and characterization of genome packaging mechanisms. This software should help researchers better visualize, map and study the virosphere. PTV is freely available for downloading and installation at https://gitlab.pasteur.fr/vlegrand/ptv.
| | | https://gitlab.pasteur.fr/vlegrand/ptv
https://www.nature.com/articles/s41598-021-97867-3 | Phage Kitchen | https://trello.com/1/cards/61836d4bd8ac780c3af6b120/attachments/61836d7560fe9587c9cef8cd/download/image.png | | | | | | | |
338 | 2023-11-13T11:12:18.883Z | Phamerator & BYU-Phamerator | | **Phamerator**
Phamerator is a comparative genomics and genome exploration tool designed and written by Dr. Steve Cresawn of James Madison University.
In 2017, Phamerator transitioned from a Linux-based program to a cross-platform web-based program. It is available at https://phamerator.org/.
-----------
**BYU-Phamerator**
Here we describe modifications to the phage comparative genomics software program, Phamerator, provide public access to the code, and include instructions for creating custom Phamerator databases. We further report genomic analysis techniques to determine phage packaging strategies and identification of the physical ends of phage genomes.
Results
The original Phamerator code can be successfully modified and custom databases can be generated using the instructions we provide. Results of genome map comparisons within a custom database reveal obstacles in performing the comparisons if a published genome has an incorrect complementarity or an incorrect location of the first base of the genome, which are common issues in GenBank-downloaded sequence files. To address these issues, we review phage packaging strategies and provide results that demonstrate identification of the genome start location and orientation using raw sequencing data and software programs such as PAUSE and Consed to establish the location of the physical ends of the genome. These results include determination of exact direct terminal repeats (DTRs) or cohesive ends, or whether phages may use a headful packaging strategy. Phylogenetic analysis using ClustalO and phamily circles in Phamerator demonstrates that the large terminase gene can be used to identify the phage packaging strategy and thereby aid in identifying the physical ends of the genome.
| | https://github.com/scresawn/Phamerator | https://phagesdb.org/Phamerator/faq/
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Abstract&list_uids=21991981
https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-016-3018-2
https://phamerator.org/ | Phage Kitchen | https://trello.com/1/cards/61836f7dfcd95552ff48227e/attachments/6183734af82ab95ee1eaccd3/download/image.png | | | | | | | |
339 | 2023-11-13T11:12:18.883Z | PhANNs - a fast and accurate tool and web server to classify phage structural proteins | | PhANNs is a tool to classify any phage ORF as one of 10 structural protein classes, or as "other". It predicts the structural class of a phage ORF by running an ensemble of Artificial Neural Networks that we created against a fasta file of protein sequences. If you upload a multi-fasta file, we’ll provide you estimates of the structural classes of all the proteins, and we’ll let you download the sequences for each class as a fasta file.
---------------------
For any given bacteriophage genome or phage-derived sequences in metagenomic data sets, we are unable to assign a function to 50–90% of genes, or more. Structural protein-encoding genes constitute a large fraction of the average phage genome and are among the most divergent and difficult-to-identify genes using homology-based methods. To understand the functions encoded by phages, their contributions to their environments, and to help gauge their utility as potential phage therapy agents, we have developed a new approach to classify phage ORFs into ten major classes of structural proteins or into an “other” category. The resulting tool is named PhANNs (Phage Artificial Neural Networks). We built a database of 538,213 manually curated phage protein sequences that we split into eleven subsets (10 for cross-validation, one for testing) using a novel clustering method that ensures there are no homologous proteins between sets yet maintains the maximum sequence diversity for training. An Artificial Neural Network ensemble trained on features extracted from those sets reached a test F1-score of 0.875 and test accuracy of 86.2%. PhANNs can rapidly classify proteins into one of the ten structural classes or, if not predicted to fall in one of the ten classes, as “other,” providing a new approach for functional annotation of phage proteins. PhANNs is open source and can be run from our web server or installed locally.
| | https://github.com/Adrian-Cantu/PhANNs | http://edwards.sdsu.edu/phanns
https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007845 | Phage Kitchen | | | | | | | | |
340 | 2023-11-13T11:12:18.883Z | PHANOTATE | | PHANOTATE is a tool to annotate phage genomes. It assumes that non-coding bases in a phage genome are disadvantageous, and then populates a weighted graph to find the optimal path through the six frames of the DNA, where open reading frames are beneficial paths, while gaps and overlaps are penalized paths. | | https://github.com/deprekate/PHANOTATE | https://academic.oup.com/bioinformatics/article/35/22/4537/5480131 | Phage Kitchen | | | | | | | | |
341 | 2023-11-13T11:12:18.883Z | Phantome | | | | | | | | | | | | | | |
342 | 2023-11-13T11:12:18.883Z | phap - Phage Host Analysis Pipeline | | PHAP wraps the execution of various phage-host prediction tools.
Overview
Features
* Uses Singularity containers for the execution of all tools.
* When possible (i.e., the image is not larger than a few GB), tools and their dependencies are bundled in the same container. This means you do not need to get models or any other external databases, unless otherwise specified.
* Intermediate processing steps are handled by Conda environments, to ensure smooth and reproducible execution.
* Outputs the last common ancestor of the predictions from all tools, per contig, based on the predicted taxonomy.
Wrapped tools
* HTP
* RaFAh
* vHuLK
* VirHostMatcher-Net
* WIsH
| | https://github.com/MGXlab/phap
https://github.com/wojciech-galan/viruses_classifier
https://github.com/LaboratorioBioinformatica/vHULK
https://github.com/WeiliWw/VirHostMatcher-Net
https://github.com/soedinglab/WIsH | https://sourceforge.net/projects/rafah/ | Phage Kitchen | | | | | | | | |
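The "last common ancestor of all tools, per contig" output mentioned in the phap entry can be sketched as a longest-shared-prefix computation over lineages. This is a minimal illustration and not phap's own code: the function name `last_common_ancestor`, the tool labels and the example lineages are hypothetical, and each prediction is assumed to be a root-first list of taxon names.

```python
def last_common_ancestor(lineages):
    """Return the longest shared prefix of the given lineages (root first)."""
    lineages = [list(lineage) for lineage in lineages if lineage]
    if not lineages:
        return []
    shared = []
    # zip stops at the shortest lineage, so a coarse prediction caps the result.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical host predictions for one contig from three tools.
common = ["Bacteria", "Pseudomonadota", "Gammaproteobacteria",
          "Enterobacterales", "Enterobacteriaceae"]
predictions = {
    "tool_a": common + ["Escherichia"],
    "tool_b": common + ["Salmonella"],
    "tool_c": common,  # this tool only resolves the family
}
print(last_common_ancestor(predictions.values())[-1])
```

When the tools disagree at the genus level, as here, the consensus falls back to the deepest rank they all share.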
343 | 2023-11-13T11:12:18.883Z | PHAROKKA | | phrokka is a fast phage annotation pipeline.
phrokka uses Phanotate (McNair et al 2019, doi:10.1093/bioinformatics/btz265) to conduct gene calling and tRNAscan-SE 2 (Chan et al 2021, https://doi.org/10.1093/nar/gkab688) to call tRNAs.
phrokka then uses the lightweight PHROGS database (https://phrogs.lmge.uca.fr; Terzian et al 2021, https://doi.org/10.1093/nargab/lqab067) to conduct annotation. Specifically, each gene is compared against the entire PHROGS database using mmseqs2.
---
phrokka creates a number of output files in different formats.
The two main files are phrokka.gff, a GFF3 file that includes the FASTA sequence after the annotation table, and phrokka.tbl, a flat-file table suitable for upload to NCBI's BankIt. | | https://github.com/gbouras13/phrokka | https://doi.org/10.1093/nar/gkab688
https://phrogs.lmge.uca.fr
https://doi.org/10.1093/nargab/lqab067 | Phage Kitchen | | | | | | | | |
344 | 2023-11-13T11:12:18.883Z | PHASTER DB | | | | | | | | | | | | | | |
345 | 2023-11-13T11:12:18.883Z | PHERI - Phage Host Exploration pipeline | | The solution to this problem may be to use a bioinformatic approach in the form of prediction software capable of determining a bacterial host based on the phage whole-genome sequence. The result of our research is a machine-learning-based tool called PHERI. PHERI predicts a suitable bacterial host genus for the purification of individual viruses from different samples. In addition, it can identify and highlight protein sequences that are important for host selection. PHERI is available at https://hub.docker.com/repository/docker/andynet/pheri.
The source code for the model training is available at https://github.com/andynet/pheri_preprocessing, and the source code for the tool is available at https://github.com/andynet/pheri.
| | https://github.com/andynet/pheri_preprocessing
https://github.com/andynet/pheri | https://hub.docker.com/repository/docker/andynet/pheri
https://www.biorxiv.org/content/10.1101/2020.05.13.093773v3.full | Phage Kitchen | | | | | | | | |
346 | 2023-11-13T11:12:18.883Z | Phigaro: high throughput prophage sequence annotation | | Summary: Phigaro is a standalone command-line application that detects prophage regions, taking raw genome and metagenome assemblies as input. It also produces dynamic annotated “prophage genome maps” and marks possible transposon insertion spots inside prophages. It provides putative taxonomic annotations that can distinguish tailed from non-tailed phages. It is applicable for mining prophage regions from large metagenomic datasets.
Availability: Source code for Phigaro is freely available for download at https://github.com/bobeobibo/phigaro along with test data. The code is written in Python. | | https://github.com/bobeobibo/phigaro | https://www.biorxiv.org/content/10.1101/598243v1 | Phage Kitchen | | | | | | | | |
347 | 2023-11-13T11:12:18.883Z | Philympics 2021: Prophage Predictions Perplex Programs | | testing:
* Phage Finder (2006)
* PhiSpy (2012)
* VirSorter (2015)
* Phigaro (2020)
* DBSCAN-SWA (2020)
* VIBRANT (2020)
* PhageBoost (2021)
* VirSorter2 (2021) | | | https://f1000research.com/articles/10-758/v1 | Phage Kitchen | | | | | | | | |
348 | 2023-11-13T11:12:18.883Z | Phirbo - A tool to predict prokaryotic hosts for phage (meta)genomic sequences | | Phirbo links phage to host sequences through other intermediate sequences that are potentially homologous to both phage and host sequences.
To link phage (P) to host (H) sequence through intermediate sequences, phage and host sequences need to be used as queries in two separate sequence similarity searches (e.g., BLAST) against the same reference database of prokaryotic genomes (D). One BLAST search is performed for phage query (P) and the other for host query (H). The two lists of BLAST results, P → D and H → D, contain prokaryotic genomes ordered by decreasing score. To avoid a taxonomic bias due to multiple genomes of the same prokaryote species (e.g., Escherichia coli), prokaryotic species can be ranked according to their first appearance in the BLAST list. In this way, both ranked lists represent phage and host profiles consisting of the ranks of top-score prokaryotic species.
Phirbo estimates the phage-host relationship by comparing the content and order between phage and host ranked lists using Rank-Biased Overlap (RBO) measure. Briefly, RBO fosters comparison of ranked lists of different lengths with heavier weights for matching the higher-ranking items. RBO ranges between 0 and 1, where 0 means that the lists are disjoint (have no items in common) and 1 means that the lists are identical in content and order. | | https://github.com/aziele/phirbo | | Phage Kitchen | https://trello.com/1/cards/61833982e8cf5f4298a3ad20/attachments/618339d66ac1af2038ec3dcc/download/figure.png | | | | | | | |
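The ranking and comparison steps described for Phirbo can be sketched as follows. This is a minimal illustration and not Phirbo's own code: it implements the extrapolated form of Rank-Biased Overlap for tie-free lists, the function names and example species lists are hypothetical, and `p=0.75` is an assumed value for the weighting parameter.

```python
def rank_by_first_appearance(hits):
    """Collapse a score-ordered hit list to unique species, keeping first-seen order."""
    return list(dict.fromkeys(hits))


def rbo_ext(list_a, list_b, p=0.75):
    """Extrapolated Rank-Biased Overlap of two ranked lists without ties.

    Returns 0.0 for disjoint lists and 1.0 for lists that are identical
    in content and order; smaller p puts more weight on the top ranks.
    """
    if not list_a or not list_b:
        return 0.0
    short, long_ = sorted((list_a, list_b), key=len)
    s, n = len(short), len(long_)
    seen_short, seen_long = set(), set()
    overlap = {}  # overlap[d] = number of items shared by the two top-d prefixes
    agreement_sum = 0.0
    for d in range(1, n + 1):
        seen_long.add(long_[d - 1])
        if d <= s:
            seen_short.add(short[d - 1])
        overlap[d] = len(seen_short & seen_long)
        agreement_sum += overlap[d] / d * p ** d
    x_s, x_n = overlap[s], overlap[n]
    # Extra terms that extend the shorter list's agreement past its own length.
    tail_sum = sum(x_s * (d - s) / (s * d) * p ** d for d in range(s + 1, n + 1))
    extrapolated = ((x_n - x_s) / n + x_s / s) * p ** n
    return (1 - p) / p * (agreement_sum + tail_sum) + extrapolated


# Hypothetical BLAST hit lists, ordered by decreasing score.
phage_hits = ["Escherichia coli", "Escherichia coli",
              "Shigella flexneri", "Salmonella enterica"]
host_hits = ["Escherichia coli", "Shigella flexneri",
             "Shigella flexneri", "Klebsiella pneumoniae"]
phage_profile = rank_by_first_appearance(phage_hits)
host_profile = rank_by_first_appearance(host_hits)
print(round(rbo_ext(phage_profile, host_profile), 4))
```

Because agreement at depth 1 carries the largest weight, two profiles that share their top species score higher than two that share only lower-ranked ones.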
349 | 2023-11-13T11:12:18.883Z | PHIST - Phage-Host Interaction Search Tool | | A tool to predict prokaryotic hosts for phage (meta)genomic sequences. PHIST links viruses to hosts based on the number of k-mers shared between their sequences.
| | https://github.com/refresh-bio/PHIST | https://www.biorxiv.org/content/10.1101/2021.09.06.459169v1 | Phage Kitchen | | | | | | | | |
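The idea behind PHIST, linking a virus to the host with which it shares the most k-mers, can be sketched in a few lines. This is an illustration only and not PHIST's implementation: the function names are hypothetical, the default `k=25` is an assumption, and the sketch ignores reverse complements and any statistical scoring.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmer_count(phage_seq, host_seq, k=25):
    """Count distinct k-mers present in both sequences."""
    return len(kmers(phage_seq, k) & kmers(host_seq, k))


def best_host(phage_seq, hosts, k=25):
    """Return the host name with the most shared k-mers, plus all scores."""
    scores = {name: shared_kmer_count(phage_seq, seq, k)
              for name, seq in hosts.items()}
    return max(scores, key=scores.get), scores


# Toy sequences; a small k is used only because the examples are short.
phage = "ATGCGTACGTTAGC"
hosts = {
    "host_A": "GGGATGCGTACGAAA",
    "host_B": "TTTTTTTTTTTTTT",
}
name, scores = best_host(phage, hosts, k=5)
print(name, scores)
```

Storing k-mers in Python sets is fine for toy inputs; whole genomes would need a more memory-efficient index.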
350 | 2023-11-13T11:12:18.883Z | Phmmer | | | | | | | | | | | | | | |
351 | 2023-11-13T11:12:18.883Z | PHROGS | | | | | | | | | | | | | | |
352 | 2023-11-13T11:12:18.883Z | PILERCR (crispr) | | | | | | | | | | | | | | |
353 | 2023-11-13T11:12:18.883Z | Plaque Size Tool | | Plaque Size Tool is an open-source application written in Python 3 that is able to detect and measure bacteriophage plaques on a Petri dish image.
The source files are located at https://github.com/ellinium/plaque_size_tool.
To cite Plaque Size Tool, please use https://doi.org/10.1016/j.virol.2021.05.011
| | https://github.com/ellinium/plaque_size_tool | https://doi.org/10.1016/j.virol.2021.05.011 | Phage Kitchen | https://trello.com/1/cards/618336f140bb9911840c1069/attachments/6183376932281c5d19e63319/download/image13.jpg | | | | | | | |
354 | 2023-11-13T11:12:18.883Z | Plasmid/ICE contamination check | | | | | | | | | | | | | | |
355 | 2023-11-13T11:12:18.883Z | POG - Orthologous Gene Clusters and Taxon Signature Genes for Viruses of Prokaryotes | | Here, we present an update of the phage orthologous groups (POGs), a collection of 4,542 clusters of orthologous genes from bacteriophages that now also includes viruses infecting archaea and encompasses more than 1,000 distinct virus genomes. Analysis of this expanded data set shows that the number of POGs keeps growing without saturation and that a substantial majority of the POGs remain specific to viruses, lacking homologues in prokaryotic cells, outside known proviruses. Thus, the great majority of virus genes apparently remains to be discovered. | | | https://journals.asm.org/doi/10.1128/JB.01801-12?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed | Phage Kitchen | https://trello.com/1/cards/61a713f023021804c22bb858/attachments/61a7143b11d3f363954f2291/download/image.png | | | | | | | |
356 | 2023-11-13T11:12:18.883Z | POGs | | | | | | | | | | | | | | |
357 | 2023-11-13T11:12:18.883Z | PPHMM | | | | | | | | | | | | | | |
358 | 2023-11-13T11:12:18.883Z | PPR-Meta: a tool for identifying phages and plasmids from metagenomic fragments using deep learning | | We present PPR-Meta, a 3-class classifier that allows simultaneous identification of both phage and plasmid fragments from metagenomic assemblies. PPR-Meta consists of several modules for predicting sequences of different lengths. Using deep learning, a novel network architecture, referred to as the Bi-path Convolutional Neural Network, is designed to improve the performance for short fragments. PPR-Meta demonstrates much better performance than currently available similar tools individually for phage or plasmid identification, while testing on both artificial contigs and real metagenomic data. PPR-Meta is freely available via http://cqb.pku.edu.cn/ZhuLab/PPR_Meta or https://github.com/zhenchengfang/PPR-Meta. | | https://github.com/zhenchengfang/PPR-Meta | http://cqb.pku.edu.cn/ZhuLab/PPR_Meta
https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC6586199/ | Phage Kitchen | | | | | | | | |
359 | 2023-11-13T11:12:18.883Z | pprmeta | | | | | | | | | | | | | | |
360 | 2023-11-13T11:12:18.883Z | Presentation - phage genome annotation and classification - how to get started | | Contains a lot of information on genome circularisation issues (repeats, etc.) | | | https://quadram.ac.uk/wp-content/uploads/2021/02/APF_phage_annotation_EAdriaenssens-red.pdf | Phage Kitchen | | | | | | | | |
361 | 2023-11-13T11:12:18.883Z | prodigal | | Prodigal: prokaryotic gene recognition and translation initiation site identification | | | | | | | | | | | | |
362 | 2023-11-13T11:12:18.883Z | progressiveMauve | | | | | | | | | | | | | | |
363 | 2023-11-13T11:12:18.883Z | prokka | | | | | | | | | | | | | | |
364 | 2023-11-13T11:12:18.883Z | Prophage Hunter - an integrative hunting tool for active prophages | | We present Prophage Hunter, a tool aimed at hunting for active prophages from whole genome assembly of bacteria. Combining sequence similarity-based matching and genetic features-based machine learning classification, we developed a novel scoring system that exhibits higher accuracy than current tools in predicting active prophages on the validation datasets. The option of skipping similarity matching is also available so that there is a higher chance for novel phages to be discovered. Prophage Hunter provides a one-stop web service to extract prophage genomes from bacterial genomes, evaluate the activity of the prophages, identify phylogenetically related phages, and annotate the function of phage proteins. Prophage Hunter is freely available at https://pro-hunter.bgi.com/.
| | | https://pro-hunter.bgi.com/
https://academic.oup.com/nar/article/47/W1/W74/5494712 | Phage Kitchen | https://trello.com/1/cards/6201ebef6118d03d98ea20d7/attachments/6201ec21c1d0366b647235ea/download/image.png | | | | | | | |
365 | 2023-11-13T11:12:18.883Z | ProxiMeta and ProxiPhage (PhaseGenomics - Commercial) | | We developed an end-to-end bioinformatics platform for viral genome reconstruction and host attribution from metagenomic data using proximity-ligation sequencing (i.e., Hi-C). We demonstrate the capabilities of the platform by recovering and characterizing the metavirome of a variety of metagenomes, including a fecal microbiome that has also been sequenced with accurate long reads, allowing for the assessment and benchmarking of the new methods. The platform can accurately extract numerous near-complete viral genomes even from highly fragmented short-read assemblies and can reliably predict their cellular hosts with minimal false positives. To our knowledge, this is the first software for performing these tasks. Being significantly cheaper than long-read sequencing of comparable depth, the incorporation of proximity-ligation sequencing in microbiome research shows promise to greatly accelerate future advancements in the field. | | | https://phasegenomics.com/wp-content/uploads/2021/06/ProxiMeta_Phage-Analysis-App-Note_June-2021.pdf
https://www.biorxiv.org/content/10.1101/2021.06.14.448389v1.full
https://phasegenomics.com/wp-content/uploads/2021/06/PhaseGenomics_ASM_IHMC_Poster_2021-2.pdf | Phage Kitchen | https://trello.com/1/cards/6202faad25d55618bd5fe06d/attachments/6202fb224d6b23742207c6b1/download/image.png | | | | | | | |
366 | 2023-11-13T11:12:18.883Z | pVOG-DB | | | | | | | | | | | | | | |
367 | 2023-11-13T11:12:18.883Z | Python | | | | | | | | | | | | | | |
368 | 2023-11-13T11:12:18.883Z | Queen Astrid Military Hospital (Belgium) | | | | | https://phage.directory/capsid/phage-futures-jean-paul-pirnay | Phage Kitchen | | | | | | | | |
369 | 2023-11-13T11:12:18.883Z | R | | | | | | | | | | | | | | |
370 | 2023-11-13T11:12:18.883Z | rafah | | | | | | | | | | | | | | |
371 | 2023-11-13T11:12:18.883Z | rafah - Random Forest Assignment of Hosts | | One fundamental question when trying to describe viruses of Bacteria and Archaea is: Which host do they infect? To tackle this issue we developed a machine-learning approach named Random Forest Assignment of Hosts (RaFAH), which outperformed other methods for virus-host prediction. Our rationale was that the machine could learn the associations between genes and hosts much more efficiently than a human, while also using the information contained in the hypothetical proteins. Random forest models were built using the Ranger package in R. | | | https://sourceforge.net/projects/rafah/
https://www.sciencedirect.com/science/article/pii/S2666389921001008 | Phage Kitchen | https://trello.com/1/cards/61ba781d22aac541652633b2/attachments/61ba78770c46cf1034d494f3/download/image.png | | | | | | | |
372 | 2023-11-13T11:12:18.883Z | RAST | | | | | https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC3965101/
https://link.springer.com/protocol/10.1007/978-1-4939-7343-9_17 | Phage Kitchen | https://trello.com/1/cards/6189db69330c2189dcd5f848/attachments/6189db857b333e773d5d673a/download/image.png | | | | | | | |
373 | 2023-11-13T11:12:18.883Z | RFAM | | | | | | | | | | | | | | |
374 | 2023-11-13T11:12:18.883Z | RIME bioinformatics | | | | | https://www.rime-bioinformatics.com/en/home/ | Phage Kitchen | https://trello.com/1/cards/6178a0631a700007711b5399/attachments/618af56cad3dfb26df27564e/download/image.png | | | | | | | |
375 | 2023-11-13T11:12:18.883Z | Rocha lab - A simple, reproducible and cost-effective procedure to analyse gut phageome: from phage isolation to bioinformatic approach | | (Camille d’Humières, Marie Touchon, Sara Dion, Jean Cury, Amine Ghozlane, Marc Garcia-Garcera, Christiane Bouchier, Laurence Ma, Erick Denamur & Eduardo P.C. Rocha)
We analysed five different techniques to isolate phages from human adult faeces and developed an approach to analyse their genomes in order to quantify contamination and classify phage contigs in terms of taxonomy and lifestyle. We chose the polyethylene glycol concentration method to isolate phages because of its simplicity, low cost, reproducibility, and of the high number and diversity of phage sequences that we obtained. We also tested the reproducibility of this method with multiple displacement amplification (MDA) and showed that MDA severely decreases the phage genetic diversity of the samples and the reproducibility of the method. Lastly, we studied the influence of sequencing depth on the analysis of phage diversity and observed the beginning of a plateau for phage contigs at 20,000,000 reads. This work contributes to the development of methods for the isolation of phages in faeces and for their comparative analysis. | | | https://www.nature.com/articles/s41598-019-47656-w | Phage Kitchen | https://trello.com/1/cards/6178bcd30d3c1535418e3a84/attachments/6181bed5236a0a2e711d79ef/download/Table1.png | | | | | | | |
376 | 2023-11-13T11:12:18.883Z | RPSBLAST | | | | | | | | | | | | | | |
377 | 2023-11-13T11:12:18.883Z | RTMg | | | | | | | | | | | | | | |
378 | 2023-11-13T11:12:18.883Z | Ruby | | | | | | | | | | | | | | |
379 | 2023-11-13T11:12:18.883Z | RVDB | | | | | | | | | | | | | | |
380 | 2023-11-13T11:12:18.883Z | samtools | | The Sequence Alignment/Map format and SAMtools | | | | | | | | | | | | |
381 | 2023-11-13T11:12:18.883Z | sankey | | | | | | | | | | | | | | |
382 | 2023-11-13T11:12:18.883Z | SEA-PHAGES bioinformatics guide | | | | | https://seaphagesbioinformatics.helpdocsonline.com/interpreting-data | Phage Kitchen | https://trello.com/1/cards/6183877a22f1ab60f5e0eb2f/attachments/6183879e1c947738a174289c/download/image.png | | | | | | | |
383 | 2023-11-13T11:12:18.883Z | SEA-PHAGES University of Pittsburgh (US) | | | | | https://seaphages.org/institution/PITT/ | Phage Kitchen | | | | | | | | |
384 | 2023-11-13T11:12:18.883Z | Seaphages decision trees for refining annotations | | | | | https://seaphagesbioinformatics.helpdocsonline.com/article-25 | Phage Kitchen | https://trello.com/1/cards/61e752c9f2a14e3b977d0d8c/attachments/61e752d5c3c821689d80ec25/download/image.png | | | | | | | |
385 | 2023-11-13T11:12:18.883Z | Seeker: alignment-free identification of bacteriophage genomes by deep learning | | Recent advances in metagenomic sequencing have enabled discovery of diverse, distinct microbes and viruses. Bacteriophages, the most abundant biological entity on Earth, evolve rapidly, and therefore, detection of unknown bacteriophages in sequence datasets is a challenge. Most of the existing detection methods rely on sequence similarity to known bacteriophage sequences, impeding the identification and characterization of distinct, highly divergent bacteriophage families. Here we present Seeker, a deep-learning tool for alignment-free identification of phage sequences. Seeker allows rapid detection of phages in sequence datasets and differentiation of phage sequences from bacterial ones, even when those phages exhibit little sequence similarity to established phage families. We comprehensively validate Seeker's ability to identify previously unidentified phages, and employ this method to detect unknown phages, some of which are highly divergent from the known phage families. We provide a web portal (seeker.pythonanywhere.com) and a user-friendly Python package (github.com/gussow/seeker) allowing researchers to easily apply Seeker in metagenomic studies, for the detection of diverse unknown bacteriophages. | | | https://academic.oup.com/nar/article/48/21/e121/5921300 | Phage Kitchen | | | | | | | | |
386 | 2023-11-13T11:12:18.883Z | seqkit | | SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation | | | | | | | | | | | | |
387 | 2023-11-13T11:12:18.883Z | Sequencing QC | | | | | | | | | | | | | | |
388 | 2023-11-13T11:12:18.883Z | ShineFind | | | | | | | | | | | | | | |
389 | 2023-11-13T11:12:18.883Z | Sourmash | | sourmash: a library for MinHash sketching of DNA | | | | | | | | | | | | |
390 | 2023-11-13T11:12:18.883Z | sourmash | | | | | | | | | | | | | | |
391 | 2023-11-13T11:12:18.883Z | SpacePharer (CRISPR) | | | | | | | | | | | | | | |
392 | 2023-11-13T11:12:18.883Z | SPAdes | | | | | | | | | | | | | | |
393 | 2023-11-13T11:12:18.883Z | STEP3 | | **A machine-learning approach to define the component parts of bacteriophage virions**
Bacteriophages (phages) are currently under consideration as a means to treat a wide range of bacterial infections, including those caused by drug-resistant “superbugs”. Successful phage therapy protocols require diverse phage in a phage cocktail, with the prospective need to recognize features of diverse phage from under-sampled environments. The effective use of these viruses for therapy depends on a number of factors, not least of which is the sequence-based choices that must be made to identify new phages for development into phage therapy.
Phage virions, i.e. the physical form of the phage that would be delivered to the site of infection, conform to a blue-print that consists of a protein capsid housing the viral genome, and a multicomponent tail. We view these virions as molecular machines, and the tail machinery is complex. First and foremost, elements within the tail function to engage a species-specific component on the surface of the host bacterium, thereby initiating the infection cascade. The tail machinery is also responsible for penetrating through the bacterial cell wall, in order that the tip of the tail can enter the bacterial cytoplasm. Then, and only then, is a signal transmitted to the portal at the proximal end of the tail, enabling release of the phage DNA into the tail lumen to permit DNA translocation into the bacterial cell cytoplasm, resulting in bacterial death.
We have developed an ensemble predictor called STEP3 that uses machine-learning algorithms to characterize the components of the machinery in phage virions. STEP3 can be used to understand the universal features of the machinery in phage tails, by accurately classifying proteins with conserved features together into groupings that are not dependent on the ill-considered annotations that currently confuse phage genome data. In the development of STEP3, various types of evolutionary features were sampled, features that were extracted from Position-Specific Scoring Matrix (PSSM), to draw on relationships underpinning the evolutionary history of the various proteins making up the phage virions. Considering the high evolution rates of phage proteins, these features are particularly suitable to detect virion proteins with only distantly related homologies. STEP3 integrated these features into an ensemble framework to achieve a stable and robust prediction performance. The final ensemble model showed a significant improvement in terms of prediction accuracy over current state-of-the-art phage virion protein predictors on extensive 5-fold cross-validation and independent tests. | | | https://journals.asm.org/doi/10.1128/mSystems.00242-21
https://step3.erc.monash.edu/ | Phage Kitchen | https://trello.com/1/cards/61837d0fa5f17a4e76aed78b/attachments/61a7139734f4b01927c4888b/download/image.png | | | | | | | |
394 | 2023-11-13T11:12:18.883Z | StringTie | | | | | | | | | | | | | | |
395 | 2023-11-13T11:12:18.883Z | SumTrees (bootstrapping) | | | | | | | | | | | | | | |
396 | 2023-11-13T11:12:18.883Z | SWISSPROT | | | | | | | | | | | | | | |
397 | 2023-11-13T11:12:18.883Z | Taxonomic assignment | | | | | | | | | | | | | | |
398 | 2023-11-13T11:12:18.883Z | tblastx | | | | | | | | | | | | | | |
399 | 2023-11-13T11:12:18.883Z | The Bacteriophage Bank of Korea | | | | | http://www.phagebank.or.kr/intro/eng_intro.jsp | Phage Kitchen | | | | | | | | |
400 | 2023-11-13T11:12:18.883Z | TIGRfam | | | | | | | | | | | | | | |
401 | 2023-11-13T11:12:18.883Z | TMHMM | | | | | | | | | | | | | | |
402 | 2023-11-13T11:12:18.883Z | TnT Genome | | | | | | | | | | | | | | |
403 | 2023-11-13T11:12:18.883Z | TransTermHP | | | | | | | | | | | | | | |
404 | 2023-11-13T11:12:18.883Z | TrEMBL | | | | | | | | | | | | | | |
405 | 2023-11-13T11:12:18.883Z | TRIBE-MCL | | | | | | | | | | | | | | |
406 | 2023-11-13T11:12:18.883Z | tRNAscan-SE | | | | | | | | | | | | | | |
407 | 2023-11-13T11:12:18.883Z | Unavailable | | | | | | | | | | | | | | |
408 | 2023-11-13T11:12:18.883Z | Uniprot99 | | | | | | | | | | | | | | |
409 | 2023-11-13T11:12:18.883Z | UniRef90 | | | | | | | | | | | | | | |
410 | 2023-11-13T11:12:18.883Z | UPGMA | | | | | | | | | | | | | | |
411 | 2023-11-13T11:12:18.883Z | UpSetR | | UpSetR: an R package for the visualization of intersecting sets and their properties | | | | | | | | | | | | |
412 | 2023-11-13T11:12:18.883Z | Usearch | | | | | | | | | | | | | | |
413 | 2023-11-13T11:12:18.883Z | vConTACT2 | | | | | | | | | | | | | | |
414 | 2023-11-13T11:12:18.883Z | vCONTACT2 - Taxonomic assignment of uncultivated prokaryotic virus genomes is enabled by gene-sharing networks | | We present vConTACT v.2.0, a network-based application utilizing whole genome gene-sharing profiles for virus taxonomy that integrates distance-based hierarchical clustering and confidence scores for all taxonomic predictions. We report near-identical (96%) replication of existing genus-level viral taxonomy assignments from the International Committee on Taxonomy of Viruses for National Center for Biotechnology Information virus RefSeq. Application of vConTACT v.2.0 to 1,364 previously unclassified viruses deposited in virus RefSeq as reference genomes produced automatic, high-confidence genus assignments for 820 of the 1,364. We applied vConTACT v.2.0 to analyze 15,280 Global Ocean Virome genome fragments and were able to provide taxonomic assignments for 31% of these data, which shows that our algorithm is scalable to very large metagenomic datasets. Our taxonomy tool can be automated and applied to metagenomes from any environment for virus classification.
---
Version 1
vConTACT: an iVirus tool to classify double-stranded DNA viruses that infect Archaea and Bacteria | | https://bitbucket.org/MAVERICLab/vcontact2/wiki/Home | https://www.nature.com/articles/s41587-019-0100-8
https://peerj.com/articles/3243/ | Phage Kitchen | | | | | | | | |
415 | 2023-11-13T11:12:18.883Z | vConTACT2 SOP | | | | | https://www.protocols.io/view/applying-vcontact-to-viral-sequences-and-visualizi-x5xfq7n | Phage Kitchen | | | | | | | | |
416 | 2023-11-13T11:12:18.883Z | vHULK | | **Phage Host Prediction using high level features and neural networks**
Metagenomics and sequencing techniques have greatly improved in the last five years and, as a consequence, the amount of data from microbial communities is astronomical. An important part of the microbial community is phages, which have their own ecological roles in the environment. They have also been given a possible clinical role in humans as terminators of multidrug-resistant bacterial infections. A lot of basic research still needs to be done in the phage therapy field, and part of this research involves gathering knowledge about new phages present in the environment as well as about their relationships with clinically relevant bacterial pathogens.
With this scenario in mind, we have developed vHULK, a user-friendly tool for predicting phage hosts given their complete or partial genome in FASTA format. Our tool outputs an ensemble prediction at the genus or species level based on the scores of four different neural network models. Each model was trained with more than 4,000 genomes whose phage-host relationship was known. vHULK also outputs a measure of entropy for each final prediction, which we have demonstrated to be correlated with the prediction's accuracy. The user may read this value as additional information on how certain vHULK is about a particular prediction. We also suspect that phages with higher entropy values may have a broad host range, but that hypothesis remains to be tested. Accuracy results in test datasets were >99% for predictions at the genus level and >98% at the species level. vHULK currently supports predictions for 52 different prokaryotic host species and 61 different genera. | | https://github.com/LaboratorioBioinformatica/vHULK | https://www.biorxiv.org/content/10.1101/2020.12.06.413476v1.full | Phage Kitchen | https://trello.com/1/cards/61ba7607a28975244b0a6027/attachments/61ba76d4f7735a810ed5d1f1/download/image.png | | | | | | | |
417 | 2023-11-13T11:12:18.883Z | vibrant | | | | | | | | | | | | | | |
418 | 2023-11-13T11:12:18.883Z | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences | | Background
Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes.
Design
Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of viral community function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a newly developed v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating viral community function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data.
Results
VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter, VirFinder, and MARVEL. When applied to 120,834 metagenome-derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94% of the viruses, whereas VirFinder, VirSorter, and MARVEL achieved less powerful performance, averaging 48%, 87%, and 71%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. When compared to PHASTER, Prophage Hunter, and VirSorter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states.
Conclusions
The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions, and ecosystem dynamics. | | https://github.com/AnantharamanLab/VIBRANT | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00867-0 | Phage Kitchen | https://trello.com/1/cards/6178a75108823f42f72f4f48/attachments/6178a81d23075487b78f97c7/download/image.png | | | | | | | |
419 | 2023-11-13T11:12:18.883Z | VICTOR: genome-based phylogeny and classification of prokaryotic viruses - online only? | | We here present a novel in silico framework for phylogeny and classification of prokaryotic viruses, in line with the principles of phylogenetic systematics, and using a large reference dataset of officially classified viruses. The resulting trees revealed a high agreement with the classification. Except for low resolution at the family level, the majority of taxa was well supported as monophyletic. Clusters obtained with distance thresholds chosen for maximizing taxonomic agreement appeared phylogenetically reasonable, too. Analysis of an expanded dataset, containing >4000 genomes from public databases, revealed a large number of novel species, genera, subfamilies and families. | | | https://ggdc.dsmz.de/victor.php
https://doi.org/10.1093/bioinformatics/btx440 | Phage Kitchen | | | | | | | | |
420 | 2023-11-13T11:12:18.883Z | VIGA - De novo Viral Genome Annotator | | VIGA is a script written in Python 3 that annotates viral genomes automatically (using a de novo algorithm) and predicts the function of their proteins using BLAST and HMMER. This script works on UNIX-based operating systems, including MacOSX and the Windows Subsystem for Linux.
Programs:
* LASTZ (Harris 2007): it is used to predict the circularity of the contigs. The program is publicly available at https://github.com/lastz/lastz under the MIT licence.
* INFERNAL (Nawrocki and Eddy 2013): it is used to predict ribosomal RNA in the contigs when using the RFAM database (Nawrocki et al. 2015). This program is publicly available at http://eddylab.org/infernal/ under the BSD licence and the RFAM database is available at ftp://ftp.ebi.ac.uk/pub/databases/Rfam/
* ARAGORN (Laslett and Canback 2004): it is used to predict tRNA sequences in the contig. This program is publicly available at http://mbio-serv2.mbioekol.lu.se/ARAGORN/ under the GPLv2 licence.
* PILERCR (Edgar 2007): it is used to predict CRISPR repeats in your contig. This program is freely available at http://drive5.com/pilercr/ under a public licence.
* Prodigal (Hyatt et al. 2010): it is used to predict the ORFs. When the contig is smaller than 100,000 bp, MetaProdigal (Hyatt et al. 2012) is automatically activated instead of normal Prodigal. This program is publicly available at https://github.com/hyattpd/prodigal/releases/ under the GPLv3 licence.
* DIAMOND (Buchfink et al. 2015): it is used to predict the function of proteins according to homology. This program is publicly available at https://github.com/bbuchfink/diamond under the GPLv3 licence. Databases must be created from FASTA files according to their instructions before running.
* BLAST+ (Camacho et al. 2008): it is used to predict the function of the predicted proteins according to homology when DIAMOND is not able to retrieve any hit or when the hit is a 'hypothetical protein'. This suite is publicly available at ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/ under the GPLv2 licence. Databases are available at ftp://ftp.ncbi.nlm.nih.gov/blast/db/ or can be created using the makeblastdb command.
* HMMER (Finn et al. 2011): it is used to add more information about the predicted proteins according to Hidden Markov Models. This suite is publicly available at http://hmmer.org/ under the GPLv3 licence. Databases must be in HMM format; one example of a suitable database is PVOGs (http://dmk-brain.ecn.uiowa.edu/VOG/downloads/All/AllvogHMMprofiles.tar.gz). | | https://github.com/EGTortuero/viga/tree/developer
https://github.com/lastz/lastz
https://github.com/hyattpd/prodigal/releases/
https://github.com/bbuchfink/diamond | http://eddylab.org/infernal/
http://mbio-serv2.mbioekol.lu.se/ARAGORN/
http://drive5.com/pilercr/
http://hmmer.org/
http://dmk-brain.ecn.uiowa.edu/VOG/downloads/All/AllvogHMMprofiles.tar.gz | Phage Kitchen | | | | | | | | |
421 | 2023-11-13T11:12:18.883Z | ViPhOG | | | | | | | | | | | | | | |
422 | 2023-11-13T11:12:18.883Z | ViPhOGs - Informative Regions In Viral Genomes | | In order to shed some light on this genetic dark matter we expanded the search for orthologous groups as potential markers for viral taxonomy from bacteriophages to eukaryotic viruses, establishing a set of 31,150 ViPhOGs (Eukaryotic Viruses and Phages Orthologous Groups).
To do this, we examined the non-redundant viral diversity stored in public databases, predicted proteins in genomes lacking such information, and used all annotated and predicted proteins to identify potential protein domains. The clustering of domains and unannotated regions into orthologous groups was done using cogSoft. Finally, we employed a random forest implementation to classify genomes into their taxonomy and found that the presence or absence of ViPhOGs is significantly associated with their taxonomy. Furthermore, we established a set of 1457 ViPhOGs that, given their importance for the classification, could be considered as markers or signatures for the different taxonomic groups defined by the ICTV at the order, family, and genus levels. | | | https://www.mdpi.com/1999-4915/13/6/1164/htm | Phage Kitchen | https://trello.com/1/cards/6201e751f1931e3e6fdcc1a1/attachments/6201e79af135207d425ea7d6/download/image.png | | | | | | | |
423 | 2023-11-13T11:12:18.883Z | ViPTree : the Viral Proteomic Tree server | | The ViPTree server generates a "proteomic tree" of viral genome sequences based on genome-wide sequence similarities computed by tBLASTx. The original proteomic tree concept (i.e., "the Phage Proteomic Tree") was developed by Rohwer and Edwards, 2002. A proteomic tree is a dendrogram that reveals global genomic similarity relationships between tens, hundreds, and thousands of viruses. It has been shown that viral groups identified in a proteomic tree correspond well to established viral taxonomies. The proteomic tree approach is effective for investigating genomes of newly sequenced viruses as well as those identified in metagenomes.
2021-10-04 version 1.9.1
Version of Virus-Host DB: RefSeq release 207
ViPTreeGen is a tool for automated generation of viral "proteomic tree" by computing genome-wide sequence similarities based on tBLASTx results. | | https://github.com/yosuken/ViPTreeGen | https://www.genome.jp/viptree/ | Phage Kitchen | | | | | | | | |
424 | 2023-11-13T11:12:18.883Z | VIRALPRO | | VIRALpro is a predictor capable of identifying capsid and tail protein sequences using support vector machines (SVM) with an accuracy estimated to be between 90% and 97%. Predictions are based on the protein amino acid composition, on the protein predicted secondary structure, as predicted by SSpro, and on a boosted linear combination of HMM e-values obtained from 3,380 HMMs built from multiple sequence alignments of specific fragments - called contact fragments - of both capsid and tail sequences. | | | http://download.igb.uci.edu/
http://scratch.proteomics.ics.uci.edu/explanation.html#VIRALpro
http://scratch.proteomics.ics.uci.edu/explanation.html#VIRALpro
http://download.igb.uci.edu/ | Phage Kitchen | | | | | | | | |
425 | 2023-11-13T11:12:18.883Z | ViralZone | | | | | | | | | | | | | | |
426 | 2023-11-13T11:12:18.883Z | VirClust – a tool for hierarchical clustering, core gene detection and annotation of (prokaryotic) viruses | | Here, VirClust is presented – a novel tool capable of performing
* hierarchical clustering of viruses based on intergenomic distances calculated from their protein cluster content,
* identification of core proteins and
* annotation of viral proteins.
VirClust groups proteins into clusters based both on BLASTP sequence similarity, which identifies more closely related proteins, and on hidden Markov models (HMMs), which identify more distantly related proteins.
Furthermore, VirClust provides an integrated visualization of the hierarchical clustering tree and of the distribution of the protein content, which allows the identification of the genomic features responsible for the respective clustering. By using different intergenomic distances, the hierarchical trees produced by VirClust can be split into viral genome clusters of different taxonomic ranks. VirClust is freely available as a web service (virclust.icbm.de) and as a stand-alone tool. | | | https://doi.org/10.1101/2021.06.14.448304 | Phage Kitchen | https://trello.com/1/cards/6178afc68303fa1f4a81b80f/attachments/6178b0154cb281263e8f9b56/download/F1.large.jpg | | | | | | | |
427 | 2023-11-13T11:12:18.883Z | VirFinder | | VirFinder: R package for identifying viral sequences from metagenomic data using sequence signatures | | | | | | | | | | | | |
428 | 2023-11-13T11:12:18.883Z | VirFinder: a novel k-mer based tool for identifying viral sequences from assembled metagenomic data | | **Background:** Identifying viral sequences in mixed metagenomes containing both viral and host contigs is a critical first step in analyzing the viral component of samples. Current tools for distinguishing prokaryotic virus and host contigs primarily use gene-based similarity approaches. Such approaches can significantly limit results especially for short contigs that have few predicted proteins or lack proteins with similarity to previously known viruses.
**Methods:** We have developed VirFinder, the first k-mer frequency based, machine learning method for virus contig identification that entirely avoids gene-based similarity searches. VirFinder instead identifies viral sequences based on our empirical observation that viruses and hosts have discernibly different k-mer signatures. VirFinder’s performance in correctly identifying viral sequences was tested by training its machine learning model on sequences from host and viral genomes sequenced before 1 January 2014 and evaluating on sequences obtained after 1 January 2014.
**Results:** VirFinder had significantly better rates of identifying true viral contigs (true positive rates (TPRs)) than VirSorter, the current state-of-the-art gene-based virus classification tool, when evaluated with either contigs subsampled from complete genomes or assembled from a simulated human gut metagenome. For example, for contigs subsampled from complete genomes, VirFinder had 78-, 2.4-, and 1.8-fold higher TPRs than VirSorter for 1, 3, and 5 kb contigs, respectively, at the same false positive rates as VirSorter (0, 0.003, and 0.006, respectively), thus VirFinder works considerably better for small contigs than VirSorter. VirFinder furthermore identified several recently sequenced virus genomes (after 1 January 2014) that VirSorter did not and that have no nucleotide similarity to previously sequenced viruses, demonstrating VirFinder’s potential advantage in identifying novel viral sequences. Application of VirFinder to a set of human gut metagenomes from healthy and liver cirrhosis patients reveals higher viral diversity in healthy individuals than cirrhosis patients. We also identified contig bins containing crAssphage-like contigs with higher abundance in healthy patients and a putative Veillonella genus prophage associated with cirrhosis patients.
**Conclusions:** This innovative k-mer based tool complements gene-based approaches and will significantly improve prokaryotic viral sequence identification, especially for metagenomic-based studies of viral ecology.
**Keywords:** Metagenome, Virus, k-mer, Human gut, Liver cirrhosis | | https://github.com/jessieren/VirFinder | https://link.springer.com/epdf/10.1186/s40168-017-0283-5? | Phage Kitchen | | | | | | | | |
429 | 2023-11-13T11:12:18.883Z | VIRIDIC (Virus Intergenomic Distance Calculator) computes pairwise intergenomic distances/similarities amongst viral genomes. | | VIRIDIC stand-alone is available now (see download tab). You can use it for jobs with high computational demand and/or for implementing it in your own pipelines. It is very easy to install on your own servers (it is wrapped as a Singularity container). You can continue to use the VIRIDIC web service for small to medium projects (e.g. up to 200 phages per project; no viromes please, as they will crash our resources and the analysis will fail). | | | https://doi.org/10.3390/v12111268
http://rhea.icbm.uni-oldenburg.de/VIRIDIC/ | Phage Kitchen | | | | | | | | |
430 | 2023-11-13T11:12:18.883Z | virnet | | | | | | | | | | | | | | |
431 | 2023-11-13T11:12:18.883Z | VirNET - A deep attention model for viral reads identification | | VirNet: A deep attention model for viral reads identification
This tool can identify viral sequences in a mixture of viral and bacterial sequences. It can also purify viral metagenomic data from bacterial contamination. | | https://github.com/alyosama/virnet | | Phage Kitchen | | | | | | | | |
432 | 2023-11-13T11:12:18.883Z | Virsorter2 beta | | | | | | | | | | | | | | |
433 | 2023-11-13T11:12:18.883Z | VirSorter2 Sullivan lab SOP | | | | | https://www.protocols.io/view/viral-sequence-identification-sop-with-virsorter2-bwm5pc86 | Phage Kitchen | | | | | | | | |
434 | 2023-11-13T11:12:18.883Z | Virsorter2 vs VirSorter, VirFinder, DeepVirFinder, MARVEL, and VIBRANT | | | | | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00990-y | Phage Kitchen | | | | | | | | |
435 | 2023-11-13T11:12:18.883Z | VirSorter2: a multi-classifier, expert-guided approach to detect diverse DNA and RNA viruses | | VirSorter2 applies a multi-classifier, expert-guided approach to detect diverse DNA and RNA virus genomes. It includes major updates over the previous version:
* works with more viral groups, including dsDNA phages, ssDNA viruses, RNA viruses, NCLDV (Nucleocytoviricota), and Lavidaviridae (virophages);
* applies machine learning to estimate viralness using genomic features, including structural/functional/taxonomic annotation and viral hallmark genes;
* is trained with high-quality virus genomes from metagenomes or other sources.
A tutorial/SOP on how to quality control VirSorter2 results is available.
Source code of VirSorter2 is freely available, and VirSorter2 is also available both on bioconda and as an iVirus app on CyVerse. | | https://github.com/jiarong/VirSorter2
https://bitbucket.org/MAVERICLab/virsorter2 | https://microbiomejournal.biomedcentral.com/articles/10.1186/s40168-020-00990-y
https://www.protocols.io/view/viral-sequence-identification-sop-with-virsorter2-btv8nn9w
https://de.cyverse.org/de | Phage Kitchen | https://trello.com/1/cards/6178aa86d796dd19c663c746/attachments/6178ad86b2b7bd7413e685f4/download/image.png | | | | | | | |
436 | 2023-11-13T11:12:18.883Z | Virus-Host DB | | | | | | | | | | | | | | |
437 | 2023-11-13T11:12:18.883Z | VOGDB | | | | | | | | | | | | | | |
438 | 2023-11-13T11:12:18.883Z | VPF | | | | | | | | | | | | | | |
439 | 2023-11-13T11:12:18.883Z | VPF-Class: taxonomic assignment and host prediction of uncultivated viruses based on viral protein families | | Viral Protein Families (VPFs) can be used for the robust identification of new viral sequences in large metagenomics datasets. Despite the importance of VPF information for viral discovery, VPFs have not yet been explored for determining viral taxonomy and host targets. | | https://github.com/biocom-uib/vpf-tools | http://bioinfo.uib.es/~recerca/VPF-Class/
https://academic.oup.com/bioinformatics/article-abstract/37/13/1805/6104829 | Phage Kitchen | https://trello.com/1/cards/6178ade1cc3ee83046d70aba/attachments/619341538a0dbc7b547eaa21/download/image.png | | | | | | | |
440 | 2023-11-13T11:12:18.883Z | What the Phage: A scalable workflow for the identification and analysis of phage sequences | | * WtP is a scalable and easy-to-use workflow for phage identification and analysis. Our tool currently combines 10 established phage identification tools.
* An attempt to streamline the usage of various phage identification and prediction tools
* The main focus is stability and data filtering/analysis for the user
* The tool takes FASTA and FASTQ input to identify phages in contigs/reads
* Proper prophage detection is not implemented (yet), but a handful of the combined tools report prophages, so most are still identified | | https://github.com/replikation/What_the_Phage | https://doi.org/10.1101/2020.07.24.219899 | Phage Kitchen | https://trello.com/1/cards/618325564f862049e2f9e7e7/attachments/6183259926d4af2200999bd4/download/wtp-flowchart-simple.png | | | | | | | |
441 | 2023-11-13T11:12:18.883Z | WIsH - who is the host? Predicting prokaryotic hosts from metagenomic phage contigs | | WIsH can identify bacterial hosts from metagenomic data, keeping good accuracy even on smaller contigs.
WIsH predicts prokaryotic hosts of phages from their genomic sequences. It achieves 63% mean accuracy when predicting the host genus among 20 genera for 3 kbp-long phage contigs. Over the best current tool, WIsH shows much improved accuracy on phage sequences of a few kbp length and runs hundreds of times faster, making it suited for metagenomics studies. | | https://github.com/soedinglab/WIsH | https://doi.org/10.1093/bioinformatics/btx383 | Phage Kitchen | | | | | | | | |
442 | 2023-11-13T11:12:18.883Z | Xfams | | | | | | | | | | | | | | |
456 | 2024-01-23T14:08:16.566Z | | | | | | | | | | | | | | | |