Phage genomes decoded

Automated phage genome analysis tool

Focus on science, not syntax
We streamline the bioinformatics of phage genome assembly and annotation, so you can focus on the science.

Using a combination of proprietary and 3rd party algorithms, we produce optimised assemblies and annotations in a fully automated and user-friendly online service.
Data-driven phage research
Phagenomics suits academics, biotechs and the pharmaceutical industry.

We place special emphasis on the therapeutic potential of phages and support clinicians in making informed, data-based decisions in the emerging field of phage therapy.
Compare multiple phage genomes
Our comparative genomics module enables you to construct smart phage cocktails or understand the phylogeny of your phage collection. All within your browser.
Pricing
Free
$0/month

  • 3 genomes / month
  • Phage assembly
  • Phage analysis
  • Download files (.genbank and more)
PRO
from 39€/month

  • 20 genomes / month
  • Phage assembly
  • Phage analysis
  • Download files (.genbank and more)
  • Comparative genomics
  • Detailed protein annotations
  • Export annotations to Excel or .csv
  • Your own host strain database
  • Secure storage for raw data

Frequently Asked Questions (FAQ)

Usually between 10 and 20 minutes, depending on the size and complexity of your phage genome.

We can certainly process multiple genomes at once using our batch processing system. However, this system is not yet available on the web service. Please contact us for a custom solution.

No. You will only be charged once you sign up for a subscription.

In Phagenomics, "analysis" is what follows "assembly". The analysis pipeline is a complex set of rules that scrutinises the assembled genome to extract as much data from it as possible. We begin by polishing your phage genome with raw reads (if available) to remove residual sequencing errors. We then determine the physical genome ends (see the separate FAQ on this). After obtaining the optimal continuous contig, we predict ORFs using Prodigal and annotate each ORF using several HMM profiles and custom scripts. We also use AI-based 3rd party software to predict genes that code for structural proteins. All annotations are combined into one master annotation through hierarchical concatenation of the individual annotation inputs. In addition to gene predictions, we run multiple proprietary and 3rd party tools to characterise the genome in terms of completeness, virulence, antibiotic resistance and relatedness to existing phage genomes.
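As a rough illustration of how several annotation inputs can be combined hierarchically into one master annotation, consider the minimal Python sketch below. The source names, their priority order and the `merge_annotations` helper are assumptions made for this example and do not describe the proprietary Phagenomics implementation.

```python
from typing import Dict, List

# Hypothetical priority order: earlier sources override later ones.
SOURCE_PRIORITY: List[str] = ["hmm_profile", "structural_predictor", "orf_caller"]

# Labels that carry no functional information and should not win.
UNINFORMATIVE = {"", "hypothetical protein"}


def merge_annotations(per_source: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """For each ORF, keep the label from the highest-priority informative source."""
    orf_ids = sorted({orf for labels in per_source.values() for orf in labels})
    merged: Dict[str, str] = {}
    for orf in orf_ids:
        merged[orf] = "hypothetical protein"  # fallback when nothing informative exists
        for source in SOURCE_PRIORITY:
            label = per_source.get(source, {}).get(orf, "")
            if label.strip().lower() not in UNINFORMATIVE:
                merged[orf] = label
                break
    return merged


# Illustrative input only; these labels are made up for the example.
example = {
    "orf_caller": {"orf_1": "hypothetical protein", "orf_2": "hypothetical protein"},
    "hmm_profile": {"orf_1": "terminase large subunit"},
    "structural_predictor": {"orf_1": "tail fiber", "orf_2": "major capsid protein"},
}
master = merge_annotations(example)
print(master)
```

In this sketch a more specific source simply overrides a less specific one, and ORFs without any informative label remain "hypothetical protein".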

Yes, you will retain ownership of your data after upload. However, do note that we may use your phage data for training AI models in the future. Please read our policies for further information.

Phagenomics is the only comprehensive and user-friendly phage genome assembly, analysis and comparison web portal. While many of the tools we use in our pipelines are open source and can be used by experts with command line and bioinformatics know-how, we bundle them all into a single streamlined package. On top of this, we've developed our own tools and quality control checkpoints, which result in higher quality genomic assemblies and annotations. We are currently benchmarking the system against published genomes, so stay tuned.

Your data is securely stored using DigitalOcean's Object Storage, which is designed for durability and security. Importantly, all files stored in our system are set to 'Private' by default. This means files are not publicly accessible and can only be accessed through secure authentication protocols established by our system. Please remember that while we implement comprehensive security measures, no method of transmission or storage is 100% secure. We highly encourage users to employ security best practices on their end too, such as using strong and unique passwords.

After creating a new phage, you will be redirected to a page where you can upload your raw reads or a preassembled genome. Everything is done using the browser. If you wish to upload a large batch of genomic data at once (for multiple samples), please contact us at [email protected]

The "Materials and methods" tab on an analysed phage's page lists all 3rd party tools used in the analysis (and assembly, if applicable) along with their version and citation information. We also report the version of Phagenomics used in the analysis. We encourage researchers to cite Phagenomics and the 3rd party tools whenever they are used in a study. For Phagenomics, cite "Phagenomics phage genome analysis portal version X.X."

This is a tricky one. Phage genomes vary in genome type and replication strategy, and we've spent a long time optimising our system to accommodate common genomes as well as edge cases. We begin by finding common assembly artefacts in the initial assembled genome. After fixing these with proprietary algorithms, we examine the genome using PhageTerm to determine the replication type. We also look for common signature genes from which genomes are conventionally set to start. Combining these data, we then use custom algorithms to reorient the genome so that it starts from an optimal position. For terminally redundant genomes, the redundant ends are placed at the 5' and 3' ends of the genome. For genomes with cohesive ends, we reorient the genome to start from the cohesive end (for a 5' end) or to end with the cohesive end (for a 3' end). We do not use PhageTerm outputs as is, as they have been found to produce artefacts in our results. Note that for full certainty about physical genome ends, experimental verification, for example by Sanger sequencing, is the gold standard. You can always assemble your genome using Phagenomics and then design primers for genome end verification. After experimental verification, you could rearrange the genome using your software of choice and then upload the modified genome as a "preassembled" genome for analysis in Phagenomics (we do not reorient preassembled genomes). Or you can outsource all this to us.
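If you rearrange a genome yourself after experimental verification, the operation is essentially a rotation of the sequence, optionally combined with a reverse complement. The Python sketch below shows one way to do this for a sequence treated as circular. The function names and the toy sequence are assumptions for illustration and are not part of Phagenomics.

```python
# Translation table for complementing DNA bases (other symbols such as N are left unchanged).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def rotate_genome(sequence: str, start: int) -> str:
    """Rotate a circularly treated sequence so that it begins at 0-based index `start`."""
    if not sequence:
        return sequence
    start %= len(sequence)  # tolerate indices outside 0..len-1
    return sequence[start:] + sequence[:start]


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement, e.g. to put a start gene on the forward strand."""
    return sequence.translate(COMPLEMENT)[::-1]


# Toy example: make the genome start at the position of a verified genome end.
genome = "AAACCCGGGTTT"
reoriented = rotate_genome(genome, 3)
print(reoriented)
```

A rotation preserves the length and base composition of the sequence, which makes for an easy sanity check before uploading the result as a preassembled genome.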

For raw reads, we currently accept only Illumina paired-end reads in FASTQ format. The files may be gzipped or uncompressed. Alternatively, you may upload a preassembled genome (.fasta file) produced with any sequencing technology.
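If you want to check your read files before uploading, a basic structural check can be done locally. The Python sketch below reads a FASTQ file that is either gzipped or uncompressed and counts well-formed records. The helper names and the demo reads are assumptions for this example, and the checks are not necessarily the ones Phagenomics performs.

```python
import gzip
import os
import tempfile
from typing import IO


def open_fastq(path: str) -> IO[str]:
    """Open a FASTQ file as text, detecting gzip by its magic bytes, not its extension."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_valid_records(path: str) -> int:
    """Count four-line FASTQ records, raising ValueError on basic structural problems."""
    count = 0
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            if not header.strip():
                continue  # tolerate blank lines between or after records
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed record {count + 1}")
            if len(seq) != len(qual):
                raise ValueError(f"sequence/quality length mismatch in record {count + 1}")
            count += 1
    return count


# Demo with two made-up reads, written once as plain text and once gzipped.
demo = "@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n"
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "reads_R1.fastq")
    zipped = os.path.join(tmp, "reads_R1.fastq.gz")
    with open(plain, "w") as fh:
        fh.write(demo)
    with gzip.open(zipped, "wt") as fh:
        fh.write(demo)
    plain_count = count_valid_records(plain)
    gz_count = count_valid_records(zipped)
print(plain_count, gz_count)
```

For paired-end data, running the same count on the forward and reverse files and comparing the two numbers is a quick way to spot truncated uploads.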

After verifying FASTQ file integrity, we trim the reads to remove low-quality data and discard unpaired reads. We then subsample the reads to optimise assembly performance. The genomes are assembled with SPAdes. Each contig is then searched with BLAST against a custom phage/bacterium database, and coverage for each contig is determined by mapping the raw reads back to the contigs. A set of algorithms then calculates scores for each contig and assesses whether it is advisable to continue to analysis with the best contig. At this point you are presented with a choice to continue or not. If you decide to continue, the pipeline will proceed to the analysis part.
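When subsampling paired-end reads, the forward and reverse mates have to stay together. The Python sketch below illustrates the idea on in-memory lists. The `subsample_pairs` helper, the target size and the fixed seed are assumptions for this example and do not reflect the actual subsampling strategy used by Phagenomics.

```python
import random
from typing import List, Sequence, Tuple


def subsample_pairs(
    forward: Sequence[str],
    reverse: Sequence[str],
    target: int,
    seed: int = 42,
) -> Tuple[List[str], List[str]]:
    """Randomly keep `target` read pairs, selecting the same positions in both files."""
    if len(forward) != len(reverse):
        raise ValueError("paired files must contain the same number of reads")
    if target >= len(forward):
        return list(forward), list(reverse)  # nothing to discard
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    keep = sorted(rng.sample(range(len(forward)), target))
    return [forward[i] for i in keep], [reverse[i] for i in keep]


# Made-up read names standing in for full FASTQ records.
fwd = [f"read{i}/1" for i in range(10)]
rev = [f"read{i}/2" for i in range(10)]
sub_fwd, sub_rev = subsample_pairs(fwd, rev, target=4)
print(list(zip(sub_fwd, sub_rev)))
```

Selecting indices once and applying them to both files keeps every retained read next to its mate, which paired-end assemblers rely on.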

It is true that genome assembly and analysis generally consist of automated steps (e.g. assembling a genome) combined with a lot of manual annotation and manual quality control. We used to do everything by hand too, but quickly noticed the redundancy in our actions – and it is from this lack of automation that Phagenomics was born.

In automating the assembly and annotation processes, we aim to capture the manual actions of a researcher on an algorithmic level while drastically cutting the time spent on phage analysis. This approach also enables a standardised and reproducible set of methods for phage genomics – something we often see missing in published phage genomes. For example, one crucial step in assembly is to fish out the main phage contig from all the contigs output by an assembler. A researcher would do this manually by mapping raw reads back onto the contigs and observing the read coverage distribution as well as contig sizes and GC%. Finally, they would choose the high coverage contig that represents the desired amplified phage genome (and if several contigs were of high coverage, they would likely abandon the problematic sequencing project and resequence the phage). With Phagenomics we mimic this behaviour, automatically picking the high coverage phage contig from the sea of mostly bacterial contigs (we do give you the option of manually choosing another contig too). In case of multiple competing contigs or an apparent lack of viral sequences, we warn the user not to continue further. We use similar steps in other parts of the pipeline too, for example in phage genome end detection (see the separate FAQ on that).
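The contig-picking logic described above can be sketched in a few lines of Python. The `Contig` fields, the `dominance_ratio` and `min_length` thresholds and the warning texts are all hypothetical values chosen for illustration. The real scoring used by Phagenomics is proprietary and may differ substantially.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Contig:
    name: str
    length: int           # contig length in bp
    mean_coverage: float  # mean read depth after mapping reads back
    best_hit: str         # hypothetical labels: "phage", "bacterium" or "none"


def pick_phage_contig(
    contigs: List[Contig],
    dominance_ratio: float = 5.0,  # assumed: the winner must be 5x deeper than the runner-up
    min_length: int = 5000,        # assumed: ignore very short fragments
) -> Tuple[Optional[Contig], Optional[str]]:
    """Return the best phage-like contig plus an optional warning message."""
    candidates = [c for c in contigs if c.best_hit == "phage" and c.length >= min_length]
    if not candidates:
        return None, "no viral contig found"
    ranked = sorted(candidates, key=lambda c: c.mean_coverage, reverse=True)
    best = ranked[0]
    if len(ranked) > 1 and ranked[1].mean_coverage * dominance_ratio > best.mean_coverage:
        return best, "multiple competing high-coverage contigs"
    return best, None


# Made-up assembly: one dominant phage contig among low-coverage contigs.
assembly = [
    Contig("contig_1", 41000, 800.0, "phage"),
    Contig("contig_2", 210000, 12.0, "bacterium"),
    Contig("contig_3", 6000, 10.0, "phage"),
]
chosen, warning = pick_phage_contig(assembly)
print(chosen.name if chosen else None, warning)
```

Returning a warning alongside the choice, instead of refusing outright, mirrors the idea that the user can still decide whether to continue or to pick another contig.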

However, if you are hesitant about a fully automated approach, you can always use Phagenomics as a starting point and make any manual adjustments later using the provided fasta, genbank (.gbk) and general features file (.gff) of an analysed phage.

Phagenomics was created by academics and we fully appreciate the need to understand the methods used for assembling and analysing your phage genome. Our tools work through a pipeline of interlinked scripts. These scripts are either our own proprietary code or "wrappers" for 3rd party open source programs (used within the bounds of their respective licenses). When you finish a phage analysis, a list of citations for these 3rd party programs is generated. This list contains version numbers, citation information and an explanation of how each tool was used in the analysis of your phage, so you will understand what was actually done. As a commercial actor, we cannot disclose the details of our own proprietary algorithms, but we still aim to provide you with an overview of what was done. We are currently adding details of the procedure to the Materials and methods section of a finished analysis, but please contact us if you have any questions.

Unfortunately, we currently only support assembly from Illumina paired-end raw reads. However, you can always upload and analyse a preassembled genome that was based on Oxford Nanopore or PacBio sequencing.

All payments made through Phagenomics are processed using Stripe, a leading global payment processor trusted by millions of businesses worldwide. Stripe is fully PCI compliant and uses advanced encryption methods to ensure your financial data is always securely transmitted and stored. Your credit card information is never stored on our servers and is completely handled by Stripe.
Built by PrecisionPhage
Phagenomics was designed and built by PrecisionPhage – a team dedicated to unlocking bacteriophage potential

Need comprehensive phage services?
PrecisionPhage offers all-inclusive phage services ranging from phage production to DNA isolation, library preparation, sequencing and genome analysis.
Read more at precisionphage.com