Adeno-associated virus (AAV) is one of the most actively investigated gene therapy vehicles. It was initially discovered as a contaminant of adenovirus preparations, hence its name. Simply put, AAV is a protein shell surrounding and protecting a small, single-stranded DNA genome of approximately 4.8 kilobases (kb). AAV belongs to the parvovirus family and depends on co-infection with other viruses, mainly adenoviruses, in order to replicate. AAV strains were initially distinguished serologically; molecular cloning of AAV genes has since identified hundreds of unique strains in numerous species.
Eleven AAV serotypes have been described to date. AAV is highly prevalent in humans and other primates, and several serotypes have been isolated from various tissue samples. Serotypes 2, 3, 5, and 6 were discovered in human cells, and serotypes 1, 4, and 7–11 in nonhuman primate samples. Two species of AAV were recognised by the International Committee on Taxonomy of Viruses in 2013: adeno-associated dependoparvovirus A (formerly AAV-1, -2, -3 and -4) and adeno-associated dependoparvovirus B (formerly AAV-5).
The single-stranded genome contains three genes: rep (replication), cap (capsid), and aap (assembly). These three genes give rise to at least nine gene products through the use of three promoters, alternative translation start sites, and differential splicing. The coding sequences are flanked by inverted terminal repeats (ITRs) that are required for genome replication and packaging. The rep gene encodes four proteins (Rep78, Rep68, Rep52, and Rep40), which are required for viral genome replication and packaging. The cap gene gives rise to the viral capsid proteins (VP1, VP2, and VP3), which form the outer capsid shell that protects the viral genome and are actively involved in cell binding and internalization. The aap gene encodes the assembly-activating protein (AAP) in an alternate reading frame overlapping the cap gene. This nuclear protein is thought to provide a scaffolding function for capsid assembly.
- Capsid proteins: The cap gene encodes the three structural proteins of the AAV capsid, VP1 (87 kDa), VP2 (72 kDa), and VP3 (63 kDa), all translated from mRNA transcribed from the p40 promoter. Differential splicing yields a major and a minor spliced product. VP1 is translated from the minor spliced mRNA, so less VP1 protein is made. VP2 and VP3 are both translated from the more abundant major spliced mRNA; however, VP2 is translated less efficiently because it initiates at an ACG codon, while VP3 is translated very efficiently because of a favorable Kozak context. The viral coat is estimated to be composed of 60 protein subunits arranged into an icosahedral structure, with the capsid proteins in a molar ratio of 1:1:10 (VP1:VP2:VP3).
- Nonstructural proteins: The rep gene encodes four nonstructural proteins, Rep78, Rep68, Rep52, and Rep40, which play a role in viral genome replication and transcription, as well as packaging. Rep78 and Rep68 are translated from mRNAs transcribed from the p5 promoter, while Rep52 and Rep40 are derived from mRNAs transcribed from the p19 promoter. Alternative splicing replaces a 92-amino-acid C-terminal element in Rep78 and Rep52 with a 9-amino-acid element in Rep68 and Rep40.
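The figures in the two items above can be turned into a short worked example. The sketch below is illustrative only: it assumes the nominal 1:1:10 ratio holds exactly for a 60-subunit capsid (the text gives it as an estimate), and the names `expected_copies` and `rep_product` are hypothetical helpers introduced here, not part of any library. It computes the expected copy number of each capsid protein and an approximate protein-only capsid mass, and it records which promoter and splicing combination yields which Rep protein.

```python
from fractions import Fraction

# Values quoted in the text above.
SUBUNITS_PER_CAPSID = 60
MOLAR_RATIO = {"VP1": 1, "VP2": 1, "VP3": 10}
MASS_KDA = {"VP1": 87, "VP2": 72, "VP3": 63}

# (promoter, mRNA spliced?) -> Rep protein, as described in the text.
REP_PRODUCTS = {
    ("p5", False): "Rep78",
    ("p5", True): "Rep68",
    ("p19", False): "Rep52",
    ("p19", True): "Rep40",
}


def expected_copies(ratio, total):
    """Split `total` subunits according to a molar ratio (exact fractions)."""
    parts = sum(ratio.values())
    return {name: Fraction(total * share, parts) for name, share in ratio.items()}


def rep_product(promoter, spliced):
    """Look up which Rep protein a promoter/splicing combination yields."""
    return REP_PRODUCTS[(promoter, spliced)]


# Assumption: the nominal ratio holds exactly for every particle.
copies = expected_copies(MOLAR_RATIO, SUBUNITS_PER_CAPSID)

# Protein-only mass of the shell; the packaged genome is not included.
capsid_mass_kda = sum(copies[name] * MASS_KDA[name] for name in copies)

if __name__ == "__main__":
    for name, n in copies.items():
        print(f"{name}: {n} copies per capsid")
    print(f"Approximate capsid protein mass: {capsid_mass_kda} kDa")
    print(rep_product("p19", True))
```

Under these assumptions the ratio works out to 5 copies of VP1, 5 of VP2, and 50 of VP3 per capsid, for a protein mass of about 3,945 kDa. Because the ratio is an estimate, these are best read as averages rather than a fixed composition for every particle.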