Documentation for SLiM function nucleotidesToCodons, which is a method of the SLiM class SLiMBuiltin. Note that the R function is a stub: it does not do anything in R (except bring up this documentation). It is only useful when used inside a slim_block function call nested within a slim_script function call, where it is translated into valid SLiM code as part of a full SLiM script.

nucleotidesToCodons(sequence)

Arguments

sequence

An object of type integer or string. See Details for a description.

Value

An object of type integer.

Details

Documentation for this function can be found in the official SLiM manual: page 750.

Returns the codon sequence corresponding to the nucleotide sequence in sequence. The codon sequence is an integer vector with values from 0 to 63, based upon successive nucleotide triplets in the nucleotide sequence. The codon value for a given nucleotide triplet XYZ is 16X + 4Y + Z, where X, Y, and Z have the usual values A=0, C=1, G=2, T=3. For example, the triplet AAA has a codon value of 0, AAC is 1, AAG is 2, AAT is 3, ACA is 4, and on upward to TTT which is 63. If the nucleotide sequence AACACATTT is passed in, the codon vector 1 4 63 will therefore be returned. These codon values can be useful in themselves; they can also be passed to codonsToAminoAcids() to translate them into the corresponding amino acid sequence if desired.

The nucleotide sequence in sequence may be supplied in any of three formats: a string vector with single-letter nucleotides (e.g., "T", "A", "T", "A"), a singleton string of nucleotide letters (e.g., "TATA"), or an integer vector of nucleotide values (e.g., 3, 0, 3, 0) using SLiM's standard code of A=0, C=1, G=2, T=3. If the choice of format is not driven by other considerations, such as ease of manipulation, then the singleton string format will certainly be the most memory-efficient for long sequences, and will probably also be the fastest. The nucleotide sequence provided must be a multiple of three in length, so that it translates to an integral number of codons.

(is)randomNucleotides(integer$ length, [Nif basis = NULL], [string$ format = "string"])

Generates a new random nucleotide sequence with length bases. The four nucleotides ACGT are equally probable if basis is NULL (the default); otherwise, basis may be a 4-element integer or float vector providing relative fractions for A, C, G, and T respectively (these need not sum to 1.0, as they will be normalized). More complex generative models such as Markov processes are not supported intrinsically in SLiM at this time, but arbitrary generated sequences may always be loaded from files on disk.

The format parameter controls the format of the returned sequence. It may be "string" to obtain the generated sequence as a singleton string (e.g., "TATA"), "char" to obtain it as a string vector of single characters (e.g., "T", "A", "T", "A"), or "integer" to obtain it as an integer vector (e.g., 3, 0, 3, 0), using SLiM's standard code of A=0, C=1, G=2, T=3. For passing directly to initializeAncestralNucleotides(), format "string" (a singleton string) will certainly be the most memory-efficient, and probably also the fastest. Memory efficiency can be a significant consideration; the nucleotide sequence for a chromosome of length 10^9 will occupy approximately 1 GB of memory when stored as a singleton string (with one byte per nucleotide), and much more if stored in the other formats. However, the other formats can be easier to work with in Eidos, and so may be preferable for relatively short chromosomes if you are manipulating the generated sequence.
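
The codon arithmetic described above for nucleotidesToCodons() can be checked outside SLiM with a small sketch. The following Python function is only an illustration of the 16X + 4Y + Z mapping and of the three accepted input formats; it is not SLiM's implementation, and the name nucleotides_to_codons and its error behavior are assumptions made for this example.

```python
# Illustrative re-implementation of the mapping described in Details.
# Not SLiM's code: the function name and error handling are hypothetical.

# SLiM's standard nucleotide code, as stated in the documentation.
NUCLEOTIDE_VALUES = {"A": 0, "C": 1, "G": 2, "T": 3}


def nucleotides_to_codons(sequence):
    """Return codon values (0..63) for a nucleotide sequence.

    `sequence` may be a single string such as "TATAAC", a list of
    single-letter strings, or a list of integers in 0..3.
    """
    if isinstance(sequence, str):
        # Singleton-string format: one letter per nucleotide.
        values = [NUCLEOTIDE_VALUES[base] for base in sequence]
    else:
        # Vector formats: single-letter strings or integer codes.
        values = [
            NUCLEOTIDE_VALUES[item] if isinstance(item, str) else int(item)
            for item in sequence
        ]

    if any(value not in (0, 1, 2, 3) for value in values):
        raise ValueError("nucleotide values must be in the range 0..3")
    if len(values) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")

    # Codon value for triplet XYZ is 16X + 4Y + Z.
    return [
        16 * x + 4 * y + z
        for x, y, z in zip(values[0::3], values[1::3], values[2::3])
    ]


print(nucleotides_to_codons("AACACATTT"))  # [1, 4, 63], matching the example above
```

All three input formats give the same result in this sketch; for instance, "TAT", ["T", "A", "T"], and [3, 0, 3] each map to the single codon value 16*3 + 4*0 + 3 = 51.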

Author

Benjamin C Haller (bhaller@benhaller.com) and Philipp W Messer (messer@cornell.edu)

Examples

## This just brings up the documentation:
nucleotidesToCodons()