SARS coronavirus 2 / COVID-19 genome expression

Genome: single-stranded, messenger-sense RNA, 29.9 kb long, encoding 13 ORFs.
Coronaviruses have some of the longest RNA virus genomes known. Their RNA-dependent RNA polymerase complex is also unusual among RNA viruses in having a proofreading function, possibly to stabilize this long RNA sequence.
The proteins are expressed in two ways: primary translation of the polyprotein, which initiates the infection, and, after some replication, expression of subgenomic mRNAs, which produce all structural proteins.


Protein naming: Non-structural proteins can be produced from the polyprotein or from subgenomic mRNAs. In the first case they are numbered relative to the polyprotein, from the N to the C terminus, and called NSP (NSP1 to NSP16). In the second case they are numbered after the sgRNA that encodes them and called ORF (ORF3a to ORF9b). When two ORFs are expressed from the same sgRNA, they are distinguished as a and b, for example N (ORF9a) and ORF9b.
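The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function names `nsp_names` and `orf_names` are made up for this example, and the choice to omit the letter suffix when an sgRNA carries a single ORF is an assumption, not something stated in the text.

```python
from string import ascii_lowercase


def nsp_names(count=16):
    """Names of polyprotein products, numbered from the N to the C terminus."""
    return [f"NSP{i}" for i in range(1, count + 1)]


def orf_names(sgrna_number, orf_count):
    """Names of the ORFs expressed from one sgRNA.

    Assumption: a letter suffix (a, b, ...) is added only when the
    sgRNA carries more than one ORF.
    """
    if orf_count == 1:
        return [f"ORF{sgrna_number}"]
    return [f"ORF{sgrna_number}{ascii_lowercase[i]}" for i in range(orf_count)]


polyprotein_products = nsp_names()   # NSP1 ... NSP16
sgrna9_products = orf_names(9, 2)    # the two ORFs carried by sgRNA 9
```

With these inputs, `sgrna9_products` holds the two names used in the example above, ORF9a (N) and ORF9b.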

Polyprotein products expression


Subgenomic mRNA products

Subgenomic RNAs (sgRNAs) are created by discontinuous transcription. During synthesis of the minus-strand RNA, the polymerase can pause at a transcription-regulating sequence (TRS) and jump to the leader TRS, thereby creating a large deletion. This produces a set of 9 sgRNAs that are subsequently replicated and transcribed. The sgRNAs allow translation of all the structural proteins.
The figure below illustrates the discontinuous transcription that leads to 10 different RNAs: the full-length genomic RNA (mRNA1) and the 9 sgRNAs. Only mRNA1 is encapsidated and assembled into virions.
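As a rough illustration of the leader-to-body jump, the toy model below works in plus-sense coordinates: each RNA is the leader (up to and including the leader TRS) joined to everything downstream of one body TRS. It is a simplified sketch, not a model of the polymerase. The genome string is invented, the function name `nested_rnas` is hypothetical, and the motif `ACGAAC` is used here as an assumed TRS core, since the text does not give the sequence.

```python
TRS = "ACGAAC"  # assumed TRS core motif; not given in the text above


def nested_rnas(genome, trs=TRS):
    """Return the genomic RNA followed by one sgRNA per body TRS.

    Simplification: the jump is modelled on the plus strand as
    leader + sequence downstream of a body TRS.
    """
    first = genome.find(trs)
    if first == -1:
        return [genome]  # no TRS at all: only the full-length RNA
    leader_end = first + len(trs)
    leader = genome[:leader_end]  # 5' leader, ending with the leader TRS

    rnas = [genome]  # mRNA1, the full-length genomic RNA
    pos = genome.find(trs, leader_end)
    while pos != -1:
        body_start = pos + len(trs)
        # The "deletion": everything between leader TRS and this body TRS is skipped.
        rnas.append(leader + genome[body_start:])
        pos = genome.find(trs, body_start)
    return rnas


# Invented toy genome: a leader, then three segments separated by TRS motifs.
toy_genome = "AUUA" + TRS + "GGGG" + TRS + "CCCC" + TRS + "UUUU"
rnas = nested_rnas(toy_genome)
```

The toy genome has one leader TRS and two body TRSs, so `rnas` contains three molecules that share the same 5' leader and the same 3' end. By the same logic, nine body TRSs would give the ten RNAs described above.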