Peter Y. Chou and Gerald D. Fasman developed the Chou–Fasman method in 1974 for predicting the secondary structure of proteins. The method uses information obtained from X-ray crystallography experiments, e.g. the relative frequencies of each amino acid in turns, alpha helices and beta sheets. From these frequencies the algorithm derives, for each amino acid, the probability of appearing in each type of secondary structure and at each position of a turn. These parameters are then combined to predict the structure (turn, helix or beta sheet) adopted by any sequence of amino acids in a protein. The accuracy of the algorithm is approximately 50-60%, which is much lower than that of modern techniques based on machine learning and artificial intelligence.
Amino Acid Propensities
The original parameters of the algorithm were not reliable, because at that time the data came from a very small and non-representative sample of protein structures. The original parameters showed a strong tendency of individual amino acids to prefer one type of secondary structure over the others. For example, some amino acids are helix formers (alanine, glutamate, leucine, and methionine), while others have unique conformational propensities that can end a helix (proline and glycine). Over time the parameters were improved and the original algorithm was modified. The algorithm considers only the properties of each individual amino acid, not the properties of the whole sequence including its neighbors. The properties of a single amino acid are not sufficient to accurately predict a definite secondary structure, but this simplification makes the method computationally efficient. Newer methods, such as the GOR method, also consider the propensities of neighboring amino acids for more accurate prediction.
Table of Amino Acid Propensities
Name | P(a) | P(b) | P(turn) | f(i) | f(i+1) | f(i+2) | f(i+3) |
Alanine | 142 | 83 | 66 | 0.06 | 0.076 | 0.035 | 0.058 |
Arginine | 98 | 93 | 95 | 0.07 | 0.106 | 0.099 | 0.085 |
Aspartic Acid | 101 | 54 | 146 | 0.147 | 0.110 | 0.179 | 0.081 |
Asparagine | 67 | 89 | 156 | 0.161 | 0.083 | 0.191 | 0.091 |
Cysteine | 70 | 119 | 119 | 0.149 | 0.050 | 0.117 | 0.128 |
Glutamic Acid | 151 | 37 | 74 | 0.056 | 0.060 | 0.077 | 0.064 |
Glutamine | 111 | 110 | 98 | 0.074 | 0.098 | 0.037 | 0.098 |
Glycine | 57 | 75 | 156 | 0.102 | 0.085 | 0.190 | 0.152 |
Histidine | 100 | 87 | 95 | 0.140 | 0.047 | 0.093 | 0.054 |
Isoleucine | 108 | 160 | 47 | 0.043 | 0.034 | 0.013 | 0.056 |
Leucine | 121 | 130 | 59 | 0.061 | 0.025 | 0.036 | 0.070 |
Lysine | 114 | 74 | 101 | 0.055 | 0.115 | 0.072 | 0.095 |
Methionine | 145 | 105 | 60 | 0.068 | 0.082 | 0.014 | 0.055 |
Phenylalanine | 113 | 138 | 60 | 0.059 | 0.041 | 0.065 | 0.065 |
Proline | 57 | 55 | 152 | 0.102 | 0.301 | 0.034 | 0.068 |
Serine | 77 | 75 | 143 | 0.120 | 0.139 | 0.125 | 0.106 |
Threonine | 83 | 119 | 96 | 0.086 | 0.108 | 0.065 | 0.079 |
Tryptophan | 108 | 137 | 96 | 0.077 | 0.013 | 0.064 | 0.167 |
Tyrosine | 69 | 147 | 114 | 0.082 | 0.065 | 0.114 | 0.125 |
Valine | 106 | 170 | 50 | 0.062 | 0.048 | 0.028 | 0.053 |
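As a small illustration of how the table is read, the sketch below averages the P(a), P(b) and P(turn) columns over a short peptide. It copies only six rows from the table, and the one-letter residue codes and the names PROPENSITY and mean_propensities are illustrative choices, not part of the original method description.

```python
# Six rows copied from the table above: one-letter code -> (P(a), P(b), P(turn)).
# The one-letter codes and the name PROPENSITY are illustrative assumptions.
PROPENSITY = {
    "A": (142, 83, 66),    # Alanine
    "E": (151, 37, 74),    # Glutamic acid
    "L": (121, 130, 59),   # Leucine
    "M": (145, 105, 60),   # Methionine
    "I": (108, 160, 47),   # Isoleucine
    "V": (106, 170, 50),   # Valine
}


def mean_propensities(peptide):
    """Return the mean (P(a), P(b), P(turn)) over the residues of a peptide."""
    rows = [PROPENSITY[res] for res in peptide]
    n = len(rows)
    # zip(*rows) regroups the per-residue tuples into one tuple per column.
    return tuple(sum(col) / n for col in zip(*rows))


print(mean_propensities("AELM"))  # four helix formers
print(mean_propensities("VIVI"))  # four strand formers
```

For the helix formers the mean P(a) is well above 100 and above the mean P(b), while for the valine/isoleucine peptide the mean P(b) dominates. Values above 100 indicate a preference for that structure.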
Algorithm in simple steps
- Assign the set of appropriate parameters to every amino acid in the sequence.
- Identify helical segments in the given sequence. Any region where 4 of 6 contiguous amino acids have P(a) > 100 is considered an alpha-helix. The helix is extended in both directions until 4 contiguous amino acids with an average P(a) < 100 are reached. If the final segment is longer than 5 amino acids and its average P(a) > P(b), the segment is considered a helix.
- Identify beta-sheet segments in the same way. Any region where 3 of 5 contiguous amino acids have P(b) > 100 is considered a beta-sheet. The sheet is extended in both directions until 4 contiguous amino acids with an average P(b) < 100 are reached. If the final segment has an average P(b) > 105 and its average P(b) > P(a), the segment is considered a beta-sheet.
- Segments with overlapping alpha-helical and beta-sheet assignments are considered helical if the average P(a) > P(b) and beta-sheet if the average P(b) > P(a).
- The probability of a turn starting at any residue (j) is calculated by the following formula:
- p(t) = f(j) × f(j+1) × f(j+2) × f(j+3), where f(j) is the f(i) value of residue j, f(j+1) is the f(i+1) value of residue j+1, f(j+2) is the f(i+2) value of residue j+2 and f(j+3) is the f(i+3) value of residue j+3.
- A beta-turn is predicted if the following 3 criteria are fulfilled:
- p(t) > 0.000075
- The average P(turn) over the tetrapeptide is > 100 (that is, 1.00 before the scaling by 100 used in the table)
- The averages for the tetrapeptide obey the inequality P(a) < P(turn) > P(b)
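The nucleation rules and the three beta-turn criteria above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a full implementation: it copies only seven rows of the table, it omits the extension and overlap-resolution steps, and it assumes that the 1.00 threshold for the average turn propensity corresponds to 100 on the table's scale. The one-letter codes and the names PARAMS, nucleation_sites and is_turn are hypothetical.

```python
# Seven rows copied from the table above:
# (P(a), P(b), P(turn), f(i), f(i+1), f(i+2), f(i+3)).
# One-letter codes and all names below are illustrative assumptions.
PARAMS = {
    "A": (142, 83, 66, 0.06, 0.076, 0.035, 0.058),     # Alanine
    "E": (151, 37, 74, 0.056, 0.060, 0.077, 0.064),    # Glutamic acid
    "L": (121, 130, 59, 0.061, 0.025, 0.036, 0.070),   # Leucine
    "M": (145, 105, 60, 0.068, 0.082, 0.014, 0.055),   # Methionine
    "N": (67, 89, 156, 0.161, 0.083, 0.191, 0.091),    # Asparagine
    "P": (57, 55, 152, 0.102, 0.301, 0.034, 0.068),    # Proline
    "S": (77, 75, 143, 0.120, 0.139, 0.125, 0.106),    # Serine
}

TURN_CUTOFF = 0.000075  # threshold on p(t) from the criteria above


def nucleation_sites(seq, column, window, needed):
    """Start indices of windows with at least `needed` residues scoring > 100.

    column 0 selects P(a) (helix), column 1 selects P(b) (sheet).
    """
    sites = []
    for start in range(len(seq) - window + 1):
        hits = sum(PARAMS[res][column] > 100 for res in seq[start:start + window])
        if hits >= needed:
            sites.append(start)
    return sites


def is_turn(seq, j):
    """Apply the three beta-turn criteria to the tetrapeptide starting at j."""
    tetra = seq[j:j + 4]
    if len(tetra) < 4:
        return False
    # p(t) is the product of f(i), f(i+1), f(i+2), f(i+3) taken from
    # residues j, j+1, j+2, j+3 respectively.
    p_t = 1.0
    for offset, res in enumerate(tetra):
        p_t *= PARAMS[res][3 + offset]
    mean_a, mean_b, mean_turn = (
        sum(PARAMS[res][col] for res in tetra) / 4 for col in range(3)
    )
    # Assumption: table values are scaled by 100, so 1.00 becomes 100.
    return p_t > TURN_CUTOFF and mean_turn > 100 and mean_a < mean_turn > mean_b


seq = "AELMNPNS"
print(nucleation_sites(seq, 0, 6, 4))  # helix rule: 4 of 6 with P(a) > 100
print(nucleation_sites(seq, 1, 5, 3))  # sheet rule: 3 of 5 with P(b) > 100
print([j for j in range(len(seq) - 3) if is_turn(seq, j)])
```

In this toy sequence the only helix nucleation window starts at the first residue (A, E, L and M all have P(a) > 100), no window satisfies the sheet rule, and the tetrapeptide NPNS passes all three turn criteria.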