Jennie K.

asked • 03/19/23

Write a function called get_genes(dna_string) that takes in a string of uppercase letters similar to the above, and returns a list of strings representing the genes in that DNA sequence.


The rules for how genes work are complex, but we’re going to make some simplifying assumptions: 

  1. Individual genes are substrings that occur between three-character start and stop codons (the genes do not include the codons themselves).
  2. Genes always begin with the start codon: ATG 
  3. Genes end with one of the following 3 stop codons: TAG, TAA or TGA.  
  4. Start codons can appear anywhere in the string, followed by a series of nucleotides and ending with a stop codon.  
  5. Every start codon will have a corresponding stop codon that occurs before the next start codon.  
  6. The substrings ATG, TAG, TAA, and TGA will only occur at the start and end of genes.
  7. No characters other than A, T, C, and G will be present in the string given.
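One way the rules above could be turned into code is a left-to-right scan: find the next ATG, then take everything up to the nearest stop codon. This is only a minimal sketch under the stated simplifying assumptions (in particular rules 5 and 6); the constant names START_CODON and STOP_CODONS are made up for illustration, and the early exit for a missing stop codon is a defensive choice that the rules say should never be needed.

```python
START_CODON = "ATG"                   # rule 2 (constant name is illustrative)
STOP_CODONS = ("TAG", "TAA", "TGA")   # rule 3 (constant name is illustrative)


def get_genes(dna_string):
    """Return the substrings found between each start codon and the
    nearest stop codon that follows it, excluding the codons themselves."""
    genes = []
    pos = 0
    while True:
        start = dna_string.find(START_CODON, pos)
        if start == -1:
            break  # no more start codons
        gene_start = start + len(START_CODON)

        # Collect the position of every stop codon that appears after the
        # start codon, then keep the nearest one.
        stop_positions = [
            idx
            for idx in (dna_string.find(stop, gene_start) for stop in STOP_CODONS)
            if idx != -1
        ]
        if not stop_positions:
            break  # rule 5 says this should not happen; stop quietly if it does
        gene_end = min(stop_positions)

        genes.append(dna_string[gene_start:gene_end])
        pos = gene_end + 3  # resume scanning just past the stop codon
    return genes


print(get_genes("ATGCCTAA"))  # ['CC']
```

The scan is not restricted to a reading frame (it does not step three letters at a time), because rule 6 guarantees the four codons never show up anywhere except at gene boundaries.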


For example, consider the following DNA sequence:


"TCATGTGCCCAATTCTGACCTACGATGGCCCAATAGCG"


The sequence above contains two genes. In the copy below, each gene is enclosed in square brackets and the start and stop codons are set off by spaces:


"TC ATG [TGCCCAATTC] TGA CCTACG ATG [GCCCAA] TAG CG"


So get_genes("TCATGTGCCCAATTCTGACCTACGATGGCCCAATAGCG") would return the list: ["TGCCCAATTC", "GCCCAA"].
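For comparison, the same result can probably be had with a regular expression instead of an explicit loop. The sketch below is an alternative technique, not something the question asks for, and the name get_genes_regex is hypothetical. It relies on a non-greedy match so that each gene stops at the first stop codon after its ATG, which again leans on rules 5 and 6.

```python
import re

# ATG, then as few characters as possible (captured), then any stop codon.
# Only the captured group is returned by findall, so the codons are excluded.
GENE_PATTERN = re.compile(r"ATG(.*?)(?:TAG|TAA|TGA)")


def get_genes_regex(dna_string):
    """Regex-based variant: return the text between each ATG and the
    first stop codon that follows it."""
    return GENE_PATTERN.findall(dna_string)


example = "TCATGTGCCCAATTCTGACCTACGATGGCCCAATAGCG"
print(get_genes_regex(example))  # ['TGCCCAATTC', 'GCCCAA']
```

Because findall moves on from the end of each match, the search for the next ATG always begins after the previous stop codon, which mirrors how the genes are laid out in the example.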




