Ach! Vae mihi! I was commissioned in Q3 of last year to construct a paper model of G. gallus catalase, and I forgot to work on it until now! It is time to launch the almighty UGene and align the sequence from the cow catalase PDB entry against the plain (i.e. non-graphical) sequence of the chicken catalase.
A brief look at the PDB entries reveals that the vertebrate with the most entries is the cow, B. taurus. There are some human models as well, but the ones available are erythrocyte catalase while the cow models are liver catalases, just like the ones used during the first laboratory assignment in AP Biology.
[If only Chimera were a bit more stable and had a better alignment interface. Ah well, UGene has a perfectly good one. Also, keep in mind that I am using only the B. taurus, G. gallus, and H. sapiens sequences when I say anything is conserved.]
The initial 41 or so residues seem a bit less conserved. 1TGU reflects this with a somewhat chaotic arrangement and an unusual amount of coil. This is just chain A, so the impression may be incorrect. Enabling chain D and taking a cursory look at only the ribbons (without atoms) reveals that the initial helix is probably necessary for proper assembly of the enzyme. The loosest interval at the moment (to be verified against conservation later) seems to be (16,22), although 16 and 17 would have to have fairly high helix propensities. What I previously thought was a practically non-essential coil seems to be the place where chain C binds to chain A, and hydrogen bond inference indicates that the two chains are bound relatively tightly through the interval (24,38). Perhaps this simply indicates that the choice of amino acids for this component is less strict and easily adaptable. I doubt the heme-binding sites will be as loose.
Next, I compare these observations to the alignment. I thought I must have deleted something, because the initial alanine of the PDB sequence, as seen in the Chimera sequence viewer, was absent from the alignment. It probably matters little, because the residue itself is not visible in the structure. In any case, the crazy mix of colors that provided the bases [sic, pun not intended] for the conjecture above stops as early as about 21 (by now shifted to match the Chimera numbering). In addition, the relatively conserved initial sequence seems to terminate at 14, which is quite close to 16 and probably explains the necessity of that particular sequence. For this purpose, similar amino acids were also counted as conserved, i.e. D-E, N-Q, G-A, and the others. I see few differences until 40, where there is serine (hydrophilic) in B. taurus but isoleucine and valine in the chicken and human catalases respectively. I will note this oddity when I reach the 156th residue. Perhaps the serine is only a “spacer” of sorts.
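The column-by-column comparison described above can be sketched in a few lines of Python. This is only a rough illustration, not what UGene does internally: the function names are hypothetical, the similarity groups are limited to the three pairs named in the text (any further groups would have to be added by hand), and the input sequences are made-up toy strings rather than real catalase data.

```python
# Similarity groups named in the text; anything beyond these is an assumption
# and would have to be added explicitly.
SIMILAR_GROUPS = [set("DE"), set("NQ"), set("GA")]


def classify_column(residues):
    """Label one alignment column as 'identical', 'similar', or 'variable'."""
    present = set(residues)
    if "-" in present:
        return "variable"  # a gap in any sequence counts as not conserved
    if len(present) == 1:
        return "identical"
    if any(present <= group for group in SIMILAR_GROUPS):
        return "similar"
    return "variable"


def classify_alignment(sequences):
    """Classify every column of a set of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [classify_column(column) for column in zip(*sequences)]


def variable_runs(labels):
    """Return 1-based inclusive (start, end) intervals of variable columns."""
    runs, start = [], None
    for i, label in enumerate(labels, start=1):
        if label == "variable" and start is None:
            start = i
        elif label != "variable" and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(labels)))
    return runs


# Toy aligned sequences (hypothetical, NOT catalase).
toy = ["MADSR", "MGESK", "MAD-R"]
labels = classify_alignment(toy)
print(labels)
print(variable_runs(labels))
```

The intervals come out in alignment-column numbering, so they still need the same shift as above before they can be compared with the residue numbers Chimera shows.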
Euh, 156 is aspartate in the chicken, cow, and human sequences. This will be a good excuse to take a break from dull “high” school for a nice 0240 of Berg, Tymoczko, and Stryer.
The next major difference seems to occur at 290, with isoleucine, arginine, and threonine for the cow, chicken, and human sequences respectively. Most coincidentally, it is on the outer surface of the protein, and selecting everything within a nanometer of it yields only the stretches of chain leading to and from the rest of the protein. In this case, anything without too much reactivity should work. Replacing this residue with histidine would make an interesting study.
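The "everything within a nanometer" selection can also be approximated outside Chimera. The following is a minimal sketch under a few assumptions: it reads only ATOM records by their fixed PDB columns (so HETATM entries such as the heme are skipped), it treats 1 nm as 10 Å because PDB coordinates are in ångströms, and the function names are hypothetical.

```python
import math


def parse_atoms(pdb_lines):
    """Yield (chain, residue_number, (x, y, z)) for every ATOM record.

    Uses the fixed-column PDB layout: chain ID in column 22, residue
    number in columns 23-26, coordinates in columns 31-54.
    """
    for line in pdb_lines:
        if line.startswith("ATOM"):
            chain = line[21]
            resseq = int(line[22:26])
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            yield chain, resseq, xyz


def residues_within(atoms, chain, resseq, cutoff=10.0):
    """Return sorted (chain, residue_number) pairs that have at least one
    atom within `cutoff` angstroms of any atom of the target residue."""
    atoms = list(atoms)
    target = [xyz for c, r, xyz in atoms if (c, r) == (chain, resseq)]
    if not target:
        raise ValueError("target residue not found")
    nearby = set()
    for c, r, xyz in atoms:
        if (c, r) == (chain, resseq):
            continue  # do not report the target itself
        if any(math.dist(xyz, point) <= cutoff for point in target):
            nearby.add((c, r))
    return sorted(nearby)
```

With a local copy of the structure (the file name here is hypothetical), usage would look like `residues_within(parse_atoms(open("1tgu.pdb")), "A", 290)`. If the result contains only residues close to 290 in the chain, that agrees with the residue sitting on an exposed part of the surface.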
The next irregularity is an extra TK immediately before the TK found in the mammals of the group, yielding TKTK on the interval (299,302) relative to the corrected G. gallus numbering. The mammalian TK is located next to the barrel. I will have to make sure later that this is not a human or computer error. Burying a duplicate inside an already-tight location seems more than a little harmful.
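One way to double-check where such an insertion falls is to walk the gapped alignment and report every run that is present in one sequence but gapped in the other, using the inserted sequence's own numbering. The sketch below assumes the alignment is available as two equal-length strings with `-` for gaps. The function names and the toy strings are hypothetical, not the real catalase alignment.

```python
def column_to_residue_numbers(aligned, first_residue=1):
    """Map each alignment column to a residue number, or None at a gap."""
    numbers = []
    current = first_residue - 1
    for symbol in aligned:
        if symbol == "-":
            numbers.append(None)
        else:
            current += 1
            numbers.append(current)
    return numbers


def insertions(reference, query, first_residue=1):
    """Return (start, end, residues) for each run of query residues that is
    aligned against gaps in the reference, in the query's own numbering."""
    if len(reference) != len(query):
        raise ValueError("aligned sequences must have equal length")
    numbers = column_to_residue_numbers(query, first_residue)
    runs, start = [], None
    for i, (ref_sym, qry_sym) in enumerate(zip(reference, query)):
        inserted = ref_sym == "-" and qry_sym != "-"
        if inserted and start is None:
            start = i
        elif not inserted and start is not None:
            runs.append((numbers[start], numbers[i - 1], query[start:i]))
            start = None
    if start is not None:
        runs.append((numbers[start], numbers[-1], query[start:]))
    return runs


# Toy example (hypothetical strings): the query carries an extra "TK".
print(insertions("AC--TKD", "ACTKTKD"))
```

For a tandem repeat like this one, the aligner is free to put the gap against either copy of TK, so the reported interval says only that one copy is extra, not which one.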
Following that, there is a rather sudden appearance of less conserved residues on the interval (411,426). It seems to be another one of the surface regions, because showing everything within a radius of 2 nm still leaves the segment with clear access to the surface. My random conjecture is that it forms a hinge of some kind, but that is probably wrong because of its placement.
With that, the initial examination of catalase is finished. Tomorrow (or rather, later today), I will be preparing a general view of the critical components and focusing on the most conserved intervals, because those are the most likely “moving parts.”
Many thanks to another member (a real-life doctor with similar interests and models of proteins!) who revived my interest in this specific component of biochemistry. Sic tyrannidem nvmerorvm terminatvr.