I BLASTed all four 2019-nCoV insert sequences and agree with that user. The sequences are short and are found in many other proteins. It is appropriate to trim out the gaps (relative to the HIV sequences) in inserts 3 and 4, which reduces the length of the query. In other words, we have four sequences of lengths 6, 6, 8, and 12 amino acids, drawn from an alphabet of N=20 naturally occurring amino acids. Amino acid frequency in proteins is non-uniform, so some short sequences are more likely to turn up by chance than a uniform model would predict.
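To put rough numbers on "short and found in many other proteins", here is a minimal back-of-the-envelope sketch using the lengths and alphabet size quoted above. It assumes a uniform amino acid distribution (which, as noted, is not true of real proteins) and counts exact matches only, which is stricter than BLAST's scoring. The database size is a hypothetical placeholder, not a figure from any real database, and the function names are my own.

```python
# Rough estimate of chance exact matches for short peptide queries.
# Assumptions (not from the original comment): uniform amino acid
# frequencies, exact matching only, and a made-up database size.

ALPHABET_SIZE = 20              # naturally occurring amino acids (N=20)
INSERT_LENGTHS = [6, 6, 8, 12]  # lengths of the four inserts after trimming gaps

# Hypothetical total number of residues in the searched database.
HYPOTHETICAL_DB_RESIDUES = 10**9


def match_probability(length, alphabet_size=ALPHABET_SIZE):
    """Probability that a random peptide of this length equals the query."""
    return alphabet_size ** -length


def expected_chance_matches(length, database_residues, alphabet_size=ALPHABET_SIZE):
    """Expected exact matches, treating each residue as a possible start position."""
    return database_residues * match_probability(length, alphabet_size)


if __name__ == "__main__":
    for k in INSERT_LENGTHS:
        p = match_probability(k)
        e = expected_chance_matches(k, HYPOTHETICAL_DB_RESIDUES)
        print(f"length {k:2d}: p = {p:.3e}, expected chance matches = {e:.3e}")
```

Under these assumptions a 6-mer has a per-position match probability of 20^-6, about 1.6e-8, so exact chance hits are expected in a database of a billion residues, while a 12-mer is far less likely to match exactly by chance. Allowing for non-uniform residue frequencies and for the partial matches that BLAST reports would raise the expected number of hits for all four lengths.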