Monday, November 6, 2017

A glycosylated HIV microbicide based on N6

Pegylated Griffithsin -- an "early model"  (5/10/16) 
Lectins like griffithsin have received some interest as potential HIV microbicides: by binding the glycan coat of the HIV particle, a lectin might prevent infection. Griffithsin binds mannose moieties. All female reproductive tract epithelia express the mucin MUC1, and MUC1 has N-linked glycosylation, which results in a glycan coat that may have mannose moieties. So mannose-binding lectins like griffithsin may just as easily bind mucus proteins as HIV and be prevented from reaching their target. We might want something that binds HIV more specifically.

One possible design is a glycosylated neutralizer based on the N6 antibody, which has been shown to neutralize 98% of the 181 HIV isolates it was tested against. The N6 antibody binds the CD4 binding site on HIV's gp120 protein mostly with one chain -- the heavy chain. The light chain has evolved to stay out of the way. This allows the majority of the antibody to be cut away, leaving a much smaller molecule that might be expected to still bind the CD4 binding site just fine.

Some of the strongest phenotypic properties of HIV's Env structure linked to sexual transmission are shorter hypervariable domains and fewer potential N-linked glycosylation sites. A reduction in glycosylation sites could be supporting viral entry for a variety of reasons, so to preserve the presentation of a dense glycan coat, it might be desirable for a protein-based neutralizer, such as the design proposed above, to have a dense glycan coat of its own. This should also improve the solubility of the neutralizer.

Here is an idea of what these molecules might be like.

Gly -- a protein glycosylation designer

This software finds locations on a protein surface where, if we mutate the site to carry the N-linked glycosylation consensus sequence, the mutant has a good chance of being efficiently expressed and glycosylated.

The algorithm implements a simple heuristic. If we want to mutate a protein residue, there is a good chance the mutation will be successful as long as it doesn't disturb the molecule's hydrophobic core and the new residue is hydrophilic. The N-linked glycosylation "sequon", which is the attachment point for a glycan, involves a pair of hydrophilic residues in the mutant. So the algorithm looks for pairs of residues that have solvent-exposed sidechains as potential mutation sites. Prolines are known to be "deal killers" for N-linked glycosylation, so a proline or an unknown residue in the neighborhood of a potential glycosylation site causes that mutation candidate to be excluded.
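
The heuristic above can be sketched in a few lines of Python. This is only an illustration, not the code from the gly repository. It assumes the standard N-X-S/T sequon (asparagine, then any residue except proline, then serine or threonine), so the pair of mutated positions is i and i+2. The function names, the `exposed` list and the `window` neighborhood size are hypothetical, and the hydrophobic core check is reduced to requiring that both sidechains are solvent exposed.

```python
# Hypothetical sketch of the site-selection heuristic; all names and the
# neighborhood size are assumptions, not taken from the gly source code.
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def candidate_sequon_sites(sequence, exposed, window=1):
    """Return 0-based indices i where residues i and i+2 could become N and S/T.

    sequence -- one-letter amino acid string
    exposed  -- list of booleans, True if that residue's sidechain is solvent exposed
    window   -- how many residues on each side of the sequon to check (assumed)
    """
    sites = []
    for i in range(len(sequence) - 2):
        # Both mutated positions must have solvent-exposed sidechains.
        if not (exposed[i] and exposed[i + 2]):
            continue
        lo = max(0, i - window)
        hi = min(len(sequence), i + 3 + window)
        neighborhood = sequence[lo:hi]
        # A proline or an unknown residue nearby excludes the candidate.
        if any(res == "P" or res not in STANDARD for res in neighborhood):
            continue
        sites.append(i)
    return sites


def make_sequon(sequence, i, hydroxy="T"):
    """Return the mutant sequence with N at position i and S or T at i+2."""
    return sequence[:i] + "N" + sequence[i + 1] + hydroxy + sequence[i + 3:]
```

Each returned index is only a candidate. As with the real program, the predictions would still need to be checked experimentally.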

The program takes PDB files as inputs: a file with the complete structure, a file with the solvent-exposed atoms, a file with the beta sheet residues and a file with the alpha helix residues. I am currently using the PyMOL plugin findSurfaceResidues to get the solvent-exposed atoms and PyMOL itself to get the secondary structure files.
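
As an illustration of how the solvent-exposed atom file might be used, the sketch below reduces a list of PDB lines to the set of residues that have at least one exposed sidechain atom. This is an assumption about how such input could be processed, not the program's actual parser. It relies only on the fixed-column layout of PDB ATOM records (atom name in columns 13-16, chain ID in column 22, residue number in columns 23-26). The function name and the choice to ignore the backbone atoms N, CA, C and O are hypothetical.

```python
# Hypothetical helper; not taken from the gly source code.
BACKBONE = {"N", "CA", "C", "O"}  # atoms treated as backbone (assumption)


def exposed_sidechain_residues(pdb_lines):
    """Return a set of (chain, residue_number) with an exposed sidechain atom.

    pdb_lines -- iterable of lines from a PDB file that contains only the
                 solvent-exposed atoms
    """
    residues = set()
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, REMARK and other records
        atom_name = line[12:16].strip()  # columns 13-16
        if atom_name in BACKBONE:
            continue
        chain = line[21]                 # column 22
        resseq = int(line[22:26])        # columns 23-26
        residues.add((chain, resseq))
    return residues
```

With a file on disk, this could be called as `exposed_sidechain_residues(open(path))`, where `path` points to the exposed-atom PDB file.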

Since predictions are not 100% accurate in all cases, the best glycosylation sites should be worked out experimentally. Look for the sites that are efficiently expressed and efficiently glycosylated. In other words, when you express a singly glycosylated version of your protein and run a western blot or SDS-PAGE, look for "bright" bands where the majority of the protein is in the glycosylated form, with only small amounts of unglycosylated protein. When expressing protein with two or more glycans, look for the combinations that give the maximum number of glycans. The appearance of unglycosylated protein when expressing multiple glycans is undesirable and should be avoided.

This software has been tested against a number of data sets. See for example:

Erythropoietin -- 83% accurate.
Interferon alpha -- 71% accurate.
YFP -- 100% accurate.

An example with the analysis of a western blot may be found in the glycosylating interferon alpha data set.

The current version of the software may be found on github. (https://github.com/aequorea/gly)