Method: Protein sequences for eukaryotic viruses present in RefSeq52 were collected.
Step 1 - coevolution MSA: MSA generated with MMseqs2 using the RefSeq virus protein database as target.
Step 2 - modeling: Models generated using ColabFold v1.3.0 (b9c8505) with AlphaFold, producing 3 models with 3 recycles each, without model relaxation, without templates, ranked by pLDDT, starting from a custom MSA.
Structure data are available for 67,715 eukaryotic virus proteins.
Birth of new protein folds and functions in the virome. Jason Nomburg, Nathan Price, Jennifer A. Doudna. bioRxiv.
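The "ranked by pLDDT" criterion in the method can be illustrated with a minimal sketch. It assumes that each model is available as PDB-format text and that the per-residue pLDDT is stored in the B-factor column, as AlphaFold-style outputs do. The function names (make_ca_record, mean_plddt, rank_by_plddt) are hypothetical and chosen for illustration only; this is not the code used to build the dataset, and ColabFold performs its own ranking internally.

```python
from statistics import mean
from typing import Dict, List, Tuple


def make_ca_record(resseq: int, plddt: float) -> str:
    """Build a fixed-width PDB ATOM line for a C-alpha atom (demo helper only)."""
    return "ATOM  {:5d}  CA  ALA A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}".format(
        resseq, resseq, 0.0, 0.0, 0.0, 1.0, plddt
    )


def mean_plddt(pdb_text: str) -> float:
    """Average the B-factor column (assumed to hold pLDDT) over C-alpha atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB columns 13-16 hold the atom name, columns 61-66 the B-factor.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return mean(scores)


def rank_by_plddt(models: Dict[str, str]) -> List[Tuple[str, float]]:
    """Return (model name, mean pLDDT) pairs, best-scoring model first."""
    scored = [(name, mean_plddt(text)) for name, text in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Three toy "models" of a two-residue protein with made-up pLDDT values.
    demo = {
        "model_1": "\n".join([make_ca_record(1, 80.0), make_ca_record(2, 90.0)]),
        "model_2": "\n".join([make_ca_record(1, 95.0), make_ca_record(2, 97.0)]),
        "model_3": "\n".join([make_ca_record(1, 50.0), make_ca_record(2, 60.0)]),
    }
    for name, score in rank_by_plddt(demo):
        print(name, round(score, 2))
```

Averaging over C-alpha atoms gives one value per residue, so long side chains do not weigh more than short ones. The toy values in the demo are invented and carry no biological meaning.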
47,841 predicted structures are available in the ViralZone datasheets via the "3D Models" tab. All available structures are sorted by virus. For each structure, a link to the corresponding UniProt entry shows the name of the protein, and a link to ModelArchive provides access to the structure data. Hovering over a link opens a pop-up window that previews either the structure (ModelArchive) or the UniProt entry, which displays a PDB structure if one is available.