I would like to use the NIST website (https://www.ncnr.nist.gov/resources/activation/) to calculate the SLD of a protein containing phosphoserine. I tried entering the protein's amino acid sequence on the website, using "J" for phosphoserine. However, it did not recognize "J" as phosphoserine: no phosphorus appeared in the chemical composition of the sample.
So I was wondering if there is another way to include phosphoserine on the website.
According to Wikipedia, FASTA uses J to represent either L or I,[1] so I average them 50:50.[2]
I see that there are a number of post-translational modifications that may occur,[3] but I don't know which formats can represent them. I can imagine extending FASTA with an optional lower case translation code after each sequence element. For example, phosphoserine could be Sp rather than S. This would be easy enough to parse, but I would rather not invent a new format if one already exists.
Once the format is defined and the parser[4] updated, the residue table[5] will need to be extended with the new codes, volumes, chemical formulae (including labile hydrogen and charge), and names.
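To make the lowercase-suffix idea concrete, here is a minimal sketch of how such a sequence could be split into tokens. The "Sp" notation is only the hypothetical extension floated above, not an existing format, and `tokenize` is an illustrative name, not part of periodictable.

```python
import re
from collections import Counter

# Hypothetical extended-FASTA token: one uppercase residue code followed by
# zero or more lowercase modification letters (e.g. "Sp" for phosphoserine).
TOKEN = re.compile(r"[A-Z][a-z]*")

def tokenize(sequence):
    """Split a sequence string into residue tokens, rejecting stray characters."""
    sequence = "".join(sequence.split())  # drop whitespace and line breaks
    tokens = TOKEN.findall(sequence)
    if "".join(tokens) != sequence:
        # Something did not fit the pattern, e.g. a leading lowercase letter.
        raise ValueError("unexpected character in sequence")
    return tokens

tokens = tokenize("MKSpLSAG")
counts = Counter(tokens)  # per-token counts, ready for a residue-table lookup
print(tokens)  # ['M', 'K', 'Sp', 'L', 'S', 'A', 'G']
```

Each distinct token ("S", "Sp", ...) would then need its own entry in the residue table. This sketch also rejects the "*" and "-" symbols that plain FASTA allows, so a real parser would have to handle those separately.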
Meanwhile, you can do this in stages. Enter the FASTA sequence and press calculate, then type in
nHPO3 + sample formula @ density
where n is the number of phosphorylated serines, and the sample formula and density are those printed by the first calculation. (Phosphorylating a serine adds a net HPO3 to the residue.) The density will be wrong, but probably within uncertainty, since (a) the number of SEP (phosphoserine) residues will be small relative to the total sequence and (b) the computed density is already a poor approximation, given that it assumes perfectly packed residue volumes regardless of protein conformation.
[1] FASTA: https://en.wikipedia.org/wiki/FASTA_format#Sequence_representation
[2] periodictable fasta 'J': https://github.com/pkienzle/periodictable/blob/master/periodictable/fasta.py#L351
[3] PTMs by residue: https://en.wikipedia.org/wiki/Posttranslational_modification#Common_PTMs_by_residue
[4] FASTA parser: periodictable/periodictable/fasta.py, lines 198 to 208 in 4fb8068
[5] residue table: periodictable/periodictable/fasta.py, lines 320 to 354 in 4fb8068