The problem of choosing an optimal codon sequence arises when synthetic protein-coding genes are added to cloning vectors for expression within a non-native host organism: to maximize yield, the chosen codons should have a high frequency in the host genome, but particular nucleotide bases sequences (called “motifs”) should be avoided or, instead, included. Dynamic programming (DP) has successfully been used in previous approaches to this problem. However, DP has a computational limit, especially when long motifs are forbidden, and does not allow control of motif positioning and combination. We reformulate the problem as an integer linear program (IP) and show that, with the same computational resources, one can easily solve problems with much more nucleotide bases and much longer forbidden/desired motifs than with DP. Moreover, IP (i) offers more flexibility than DP to treat constraints/objectives of different nature, and (ii) can efficiently deal with newly discovered critical motifs by dynamically re-optimizing additional variables and mathematical constraints.

Codon optimization by 0-1 linear programming

Arbib C.;Rossi F.;Tessitore A.
2020

Abstract

The problem of choosing an optimal codon sequence arises when synthetic protein-coding genes are added to cloning vectors for expression within a non-native host organism: to maximize yield, the chosen codons should have a high frequency in the host genome, but particular nucleotide bases sequences (called “motifs”) should be avoided or, instead, included. Dynamic programming (DP) has successfully been used in previous approaches to this problem. However, DP has a computational limit, especially when long motifs are forbidden, and does not allow control of motif positioning and combination. We reformulate the problem as an integer linear program (IP) and show that, with the same computational resources, one can easily solve problems with much more nucleotide bases and much longer forbidden/desired motifs than with DP. Moreover, IP (i) offers more flexibility than DP to treat constraints/objectives of different nature, and (ii) can efficiently deal with newly discovered critical motifs by dynamically re-optimizing additional variables and mathematical constraints.
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11697/148252
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 2
  • ???jsp.display-item.citation.isi??? 0
social impact