Frequently Asked Questions:

Q.  I’m not sure which protein to look up. Where should I start?

A. A number of case studies of CoC analysis are already published. Click here to read more and explore these proteins' pages.

 

Q. I know which PDB entry I'm interested in, but could you explain what everything on the protein's page means?

A. Here is an annotated version of a protein's main page.

 

Q.  Why can’t I find my protein?

A.  Since we use the FSSP as our source of representative structures, only those structures have entries.  If no “related structures” are returned by your search, you may wish to visit the FSSP and find that PDB entry’s representative.

 

Q.  Why doesn’t my protein have any CoC residues?

A.  Because we apply strict standards of statistical significance, not every protein will have significant predictions.

 

Q. Why do we need to use both entropy and p-value cutoffs in predicting important residues?

A. Two cutoffs are needed because a small number of residues have a low p-value but relatively high entropy. We noticed this some time ago and have long wanted to study these residues: they can be variable (higher-entropy) surface residues that are nevertheless less variable than expected, which suggests some involvement in binding. These residues are not truly CoC, however, so we filter them out with the entropy cutoff.
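
As a rough illustration of why both cutoffs matter, the sketch below keeps a residue only when it passes the entropy cutoff and the p-value cutoff together. The function name, the cutoff values, and the example residues are all hypothetical placeholders, and whether the comparisons are strict or inclusive is an assumption; none of this is taken from the server's actual code.

```python
def passes_both_cutoffs(entropy, p_value, entropy_cutoff, p_cutoff):
    """Return True only if the residue is both low-entropy and low-p-value.

    Inclusive comparisons are an assumption made for this sketch.
    """
    return entropy <= entropy_cutoff and p_value <= p_cutoff


# Hypothetical cutoffs and residues, chosen only to show the behavior.
ENTROPY_CUTOFF = 0.5
P_CUTOFF = 1e-8

residues = {
    "conserved core residue": (0.2, 1e-12),   # low entropy, low p-value
    "variable surface residue": (1.4, 1e-10), # low p-value, high entropy
    "unremarkable residue": (1.6, 0.3),       # fails both
}

for name, (entropy, p_value) in residues.items():
    kept = passes_both_cutoffs(entropy, p_value, ENTROPY_CUTOFF, P_CUTOFF)
    print(f"{name}: {'kept' if kept else 'filtered out'}")
```

With a p-value cutoff alone, the second residue would be reported; the entropy cutoff is what removes it.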

 

Q.  What is a "significant" result and why do some residues have p-values that are off the chart?

A. Any p-value < 10^-8 is extremely significant.  Although we report values as low as 10^-17, the plots show p-values only down to 10^-9, since extending the axis further would make the graphs more difficult to read without adding information.
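
The effect on the plots can be pictured as a simple floor applied before plotting. This is only a sketch of the idea: the function name is hypothetical, and the assumption that values below the floor are drawn at the floor (rather than omitted) is ours.

```python
PLOT_FLOOR = 1e-9  # smallest p-value shown on the plots, per the answer above


def clip_for_plot(p_value, floor=PLOT_FLOOR):
    """Return the value to plot: the p-value itself, or the floor if smaller."""
    return max(p_value, floor)


# The reported value is unchanged; only the plotted value is clipped.
for reported in (1e-5, 1e-9, 1e-17):
    print(f"reported {reported:g} -> plotted {clip_for_plot(reported):g}")
```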

 

Q. What are "six types" and "twenty types" in the context of amino acids?

A. In "twenty types" the statistics are calculated considering each of the twenty naturally occurring amino acids as distinct.  "Six types" is based on the observation that amino acids may be grouped by their physical properties into aliphatic (AVLIMC), aromatic (FWYH), polar (STNQ), positive (KR), negative (DE), and "special" (GP) types. Such grouping of amino acids avoids penalizing residue positions with conservative substitutions such as Val to Leu and results in better statistics for calculation of CoC (see methods section).
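
The grouping can be written down as a lookup table. The sketch below uses the six groups exactly as listed above; the dictionary and function names are hypothetical and chosen only for illustration.

```python
# Six-type grouping, using the one-letter codes listed in the answer above.
SIX_TYPES = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}

# Invert the table: one-letter amino acid code -> type name.
RESIDUE_TO_TYPE = {
    aa: type_name for type_name, letters in SIX_TYPES.items() for aa in letters
}


def reduce_column(column):
    """Map an alignment column of one-letter codes to six-type labels."""
    return [RESIDUE_TO_TYPE[aa] for aa in column]


# A position with only Val/Leu substitutions collapses to a single type,
# so it is treated as fully conserved under the six-type alphabet.
print(reduce_column("VLVVL"))
```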

 

Q. Why are the cutoffs different when six vs. twenty amino acid types are used?

A. The entropy cutoff must be different because the range of entropy values is different: from 0 to log(6) with six types and from 0 to log(20) with twenty types.
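
The two ranges follow from the usual Shannon entropy of a column, which is 0 when one type fills the column and log(K) when K types are equally frequent. The sketch below assumes the natural logarithm and a plain frequency-based entropy; the server's exact formula is described in the methods section and may differ in detail.

```python
import math
from collections import Counter


def column_entropy(labels):
    """Shannon entropy (natural log) of the labels in one alignment column."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Written as p * log(1/p) so a fully conserved column gives exactly 0.0.
    return sum((n / total) * math.log(total / n) for n in counts.values())


# Upper bounds: a column with every type equally represented.
max_six = column_entropy(range(6))
max_twenty = column_entropy(range(20))

print(f"six types:    0 to {max_six:.3f}  (log 6  = {math.log(6):.3f})")
print(f"twenty types: 0 to {max_twenty:.3f}  (log 20 = {math.log(20):.3f})")
```

Because the twenty-type scale extends further, a cutoff chosen on one scale does not carry over to the other unchanged.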

 

Q.  Can I choose my own entropy and p-value cutoffs?  How should I do this?

A.  Yes.  Below are plots of p-value distributions and of the number of CoC predictions at various cutoffs, which may be helpful in selecting values appropriate to your situation.

 

 

Copyright © 2004 Hubner, Donald, Shakhnovich, and Mirny
