You have several options for continuing. Since you said that you have identified a "novel" protein, we will assume that you have already searched the sequence against the known protein and gene databases, such as GenBank, SwissProt, and EMBL.
As a practical matter, you could use the following scheme, but it is by no means the only approach.
1) Generate the possible codons for the entire amino acid sequence by "reverse translation".
2) Search this sequence against the known databases to make certain that it is truly "unique".
3) Generate synthetic DNA primers (oligonucleotides) from regions of your sequence that have "low redundancy", meaning few alternative codons (not an absolute requirement, though).
4) Prepare complementary DNA (cDNA) copies from the messenger RNA isolated from several tissues, or from the specific cells from which the protein was purified.
5) Using the polymerase chain reaction (PCR), amplify a DNA fragment that encodes the protein.
6) Sequence this new piece of DNA.
7) Prepare more of the fragment, label it by one of several different means, and use it to screen a "genomic DNA library" prepared from the DNA of the starting organism.
8) Sequence the isolated clones from the library.
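Steps 1 and 3 can be sketched in a few lines of Python. This is only a rough illustration, assuming the standard genetic code; the function names and the peptide "LSRMWCHLL" are made up for the example and do not come from any real protein or library. It counts how many DNA sequences could encode each window of the protein and writes the least redundant window as a degenerate oligo in IUPAC notation.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Map each amino acid to the list of codons that encode it.
CODONS_FOR = {}
for codon, aa in CODON_TABLE.items():
    CODONS_FOR.setdefault(aa, []).append(codon)

# IUPAC ambiguity codes, keyed by the alphabetically sorted set of bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
         "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N"}

def degeneracy(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS_FOR[aa]) for aa in peptide)

def best_window(protein, width):
    """Return (start, peptide, degeneracy) for the least redundant window."""
    candidates = ((degeneracy(protein[i:i + width]), i)
                  for i in range(len(protein) - width + 1))
    score, start = min(candidates)
    return start, protein[start:start + width], score

def degenerate_oligo(peptide):
    """Reverse-translate a peptide into one IUPAC-coded degenerate oligo.

    Note: for Leu, Ser and Arg, collapsing each codon position separately
    also matches a few codons for other amino acids, so the oligo can be
    more degenerate than degeneracy() reports.
    """
    oligo = []
    for aa in peptide:
        for pos in range(3):
            bases = "".join(sorted({codon[pos] for codon in CODONS_FOR[aa]}))
            oligo.append(IUPAC[bases])
    return "".join(oligo)

if __name__ == "__main__":
    start, peptide, score = best_window("LSRMWCHLL", 3)  # hypothetical protein
    print(start, peptide, score, degenerate_oligo(peptide))
```

Real primers are much longer than three codons (a window of six or seven residues is more typical), and a real design would also weigh melting temperature and codon usage in the organism, which this sketch ignores.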
Keep in mind that there are several different ways to do the task that you have undertaken and that the availability of different tools and reagents will determine the one that you can most readily use.
2006-07-24 01:48:13
·
answer #1
·
answered by Gene Guy 5
·
1⤊
0⤋
Since the genetic code is a degenerate code, there are many possible sequences which could code for a given protein. However, you can look for portions of the protein that have specific amino acids that use only one or two codon triplets in a series and write the possible sequences for that and use it as your search criterion.
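For example, if your novel protein contained the sequence methionine-tryptophan-cysteine, only two DNA sequences could code for it: methionine and tryptophan each have a single codon (ATG and TGG), while cysteine has two (TGT and TGC). Proteins are translated starting at the N-terminus, which corresponds to the 5' end of the coding strand, so if you know which end of your peptide is the N-terminus you also know how the DNA sequence reads from 5' to 3'.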
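To make that count concrete, here is a small sketch that lists every DNA sequence that could encode a short peptide, assuming the standard genetic code. The helper names are invented for this example.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def codons_for(aa):
    """All codons that encode one amino acid (one-letter code)."""
    return sorted(codon for codon, x in CODON_TABLE.items() if x == aa)

def coding_sequences(peptide):
    """Every DNA sequence (5' to 3', coding strand) that encodes the peptide."""
    return ["".join(combo)
            for combo in product(*(codons_for(aa) for aa in peptide))]

if __name__ == "__main__":
    # Met-Trp-Cys: 1 x 1 x 2 = 2 possible sequences.
    print(coding_sequences("MWC"))  # ['ATGTGGTGC', 'ATGTGGTGT']
    # A peptide with leucine, serine, and arginine is far more redundant.
    print(len(coding_sequences("LSR")))  # 216
```

The second call shows why you want to avoid leucine, serine, and arginine when picking a stretch to search with: each has six codons, so three of them in a row already give 216 possibilities.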
So look for a stretch of amino acids with few possible codings, write out the corresponding DNA sequences, prepare a labeled oligonucleotide probe from them, and let it hybridize to the genomic DNA. Where the probe binds will help to indicate which genes you must target.
Use PCR to amplify the resulting genes and try again, targeting another sequence. Eventually you will have narrowed your search down to a small handful of candidate genes, which you can then sequence fairly easily.
2006-07-23 21:07:27
·
answer #2
·
answered by aichip_mark2 3
·
0⤊
0⤋
2006-07-23 20:50:26
·
answer #3
·
answered by anonymous 1
·
0⤊
0⤋
If you have the correct amino acid sequence, you could use bioinformatics to help you do this.
First you'd have to BLAST your amino acid sequence against all known proteins, using a program called blastp.
From the results, you can either select the protein based on the organism (if you know it) or look for the highest homology and take the one or two proteins that most closely match your sequence.
Then, using the links on the site, you can get the gene sequence.
Once that is done, you can do some cloning experiments to check whether the two are indeed the same.
2006-07-23 22:57:44
·
answer #4
·
answered by v_navneet 2
·
0⤊
0⤋
If this is from a genome that has been sequenced, or a gene that has been sequenced, you should be able to find its exact match with a BLAST search at www.ncbi.nih.gov. Remember to run a "protein sequence against a translated database" search, also known as "tblastn", which compares your protein against nucleotide sequences translated in all six reading frames. If you cannot find an exact match, then the gene has probably not been sequenced yet. Since you already know the protein sequence, you'll be able to design degenerate oligonucleotides to isolate the gene from your organism's cDNA using PCR. You can find computer programs to do this on the web.
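If it helps to see what "against a translated database" means, here is a toy sketch of the idea: translate a DNA sequence in all six reading frames and look for the peptide in each. This is not tblastn itself, which scores approximate matches against whole databases; it only finds exact matches, and the function names and the short DNA string are invented for the example.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(dna):
    """Sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(dna):
    """Translate from the first base, ignoring any trailing partial codon.
    Codons containing letters other than A, C, G, T become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))

def six_frame_hits(peptide, dna):
    """Exact matches of peptide in all six frames.

    Returns (strand, frame, position) tuples, where position is the index
    of the match within that frame's translation.
    """
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(peptide)
            while pos != -1:
                hits.append((strand, frame, pos))
                pos = protein.find(peptide, pos + 1)
    return hits

if __name__ == "__main__":
    dna = "CCATGTGGTGTAA"  # made-up fragment containing ATG TGG TGT
    print(six_frame_hits("MWC", dna))
```

Searching the translated frames rather than the DNA directly is what lets the protein find its gene no matter which of the possible codons the organism actually uses.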
2006-07-23 21:01:12
·
answer #5
·
answered by ♪ ♫ ☮ NYbron ☮ ♪ ♫ 6
·
0⤊
0⤋