R for custom codon optimization

Posted on Mar 28, 2023

Optimizing codon sequences for various organisms can be challenging, especially if one chooses to use more than one codon per amino acid. I wrote a short R script that shows how to obtain the “optimized” coding sequence for a given protein sequence based on a codon frequency table. Tools for this task are available, but it is rarely clear exactly how they work.

The code is probably not very pretty and could be simplified, but as far as I have tested it, it works.

Figure: schematic of using a codon table and a protein sequence to generate a coding sequence.
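One way to read the idea in the schematic is as weighted random sampling: for each amino acid, pick one of its codons with a probability proportional to its frequency in the table. The snippet below is only a minimal sketch under that assumption, and the actual script may work differently. The toy `codon_table` values are made up for illustration, and the function name `back_translate` is hypothetical.

```r
# Toy codon frequency table. The frequencies are invented for illustration
# and do not come from any real organism.
codon_table <- data.frame(
  aa    = c("M",   "K",   "K",   "F",   "F",   "*"),
  codon = c("ATG", "AAA", "AAG", "TTT", "TTC", "TAA"),
  freq  = c(1.00,  0.74,  0.26,  0.58,  0.42,  1.00),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build a coding sequence by sampling, for each residue,
# one codon weighted by its relative frequency among synonymous codons.
back_translate <- function(protein, table) {
  residues <- strsplit(protein, "")[[1]]
  codons <- vapply(residues, function(aa) {
    rows <- table[table$aa == aa, ]
    if (nrow(rows) == 0) stop("No codon found for amino acid: ", aa)
    # sample.int() draws a row index; prob does not need to sum to 1
    rows$codon[sample.int(nrow(rows), 1, prob = rows$freq)]
  }, FUN.VALUE = character(1))
  paste(codons, collapse = "")
}

set.seed(42)  # make the random codon choice reproducible
back_translate("MKKF*", codon_table)
```

Because codons are sampled, two runs without a fixed seed can return different coding sequences for the same protein. Always taking the most frequent codon instead would give a deterministic result but only ever use a single codon per amino acid.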

The full output of the R Markdown script can be viewed on that page, where the .Rmd file can also be downloaded. This was the easiest way to include R Markdown output in this Hugo blog.