Documentation for the Eidos function readCSV, which is a method of Eidos. Note that the R function is a stub: it does not do anything in R (except bring up this documentation). It is only useful when used inside a slim_block function call that is itself nested in a slim_script function call, where it will be translated into valid SLiM code as part of a full SLiM script.

eidos_readCSV(filePath, colNames, colTypes, sep, quote, dec, comment)

Arguments

filePath

An object of type string. Must be of length 1 (a singleton). See details for description.

colNames

An object of type logical or string. The default value is T. See details for description.

colTypes

An object of type null or string. Must be of length 1 (a singleton). The default value is NULL. See details for description.

sep

An object of type string. Must be of length 1 (a singleton). The default value is ",". See details for description.

quote

An object of type string. Must be of length 1 (a singleton). The default value is '"'. See details for description.

dec

An object of type string. Must be of length 1 (a singleton). The default value is ".". See details for description.

comment

An object of type string. Must be of length 1 (a singleton). The default value is "". See details for description.

Value

An object of type DataFrame object. Return will be of length 1 (a singleton).

Details

Documentation for this function can be found in the official SLiM manual: page NA.

Reads data from a CSV or other delimited file specified by filePath and returns a DataFrame object containing the data in tabular form. CSV (comma-separated value) files use a somewhat standard file format in which a table of data is provided, with values within a row separated by commas and rows separated by newlines. Software from R to Excel (and Eidos; see the serialize() method of Dictionary) can export data in CSV format. This function can also read files that use a delimiter other than commas; TSV (tab-separated value) files are a popular alternative. Since there is substantial variation in the exact file format for CSV files, this documentation will try to specify the precise format expected by this function. Note that CSV files represent values differently than Eidos usually does, and some of the format options allowed by readCSV(), such as decimal commas, are not otherwise available in Eidos.

If colNames is T (the default), the first row of data is taken to be a header, containing the string names of the columns in the data table; those names will be used by the resulting DataFrame. If colNames is F, a header row is not expected and column names are auto-generated as X1, X2, etc. If colNames is a string vector, a header row is not expected and colNames will be used as the column names; if additional columns exist beyond the length of colNames, their names will be auto-generated. Duplicate column names will generate a warning and be made unique.

If colTypes is NULL (the default), the value type for each column will be guessed from the values it contains, as described below. If colTypes is a singleton string, it should contain single-letter codes indicating the desired type for each column, from left to right. The letters l, i, f, and s have the same meaning as in Eidos signatures (logical, integer, float, and string); in addition, ? may be used to indicate that the type for that column should be guessed as by default, and _ or - may be used to indicate that the column should be skipped (omitted from the returned DataFrame). Other characters in colTypes will result in an error. If additional columns exist beyond the end of the colTypes string, their types will be guessed as by default.

The separator between values is supplied by sep; it is a comma by default, but a tab can be used instead by supplying tab ("\t" in Eidos), or another character may be used. If sep is the empty string "", the separator between values is "whitespace", meaning one or more spaces or tabs. When the separator is whitespace, whitespace at the beginning or the end of a line will be ignored.

Similarly, the character used to quote string values is a double quote ('"' in Eidos) by default, but another character may be supplied in quote. When the string delimiter is encountered, all following characters are considered to be part of the string until another string delimiter is encountered, terminating the string; this includes spaces, comment characters, newlines, and everything else. Within a string value, the string delimiter itself is used twice in a row to indicate that the delimiter itself is present within the string; for example, if the string value (shown without the usual surrounding quotes to try to avoid confusion) is she said "hello", and the string delimiter is the double quote as it is by default, then in the CSV file the value would be given as "she said ""hello""". The usual Eidos style of escaping characters using a backslash is not part of the CSV standard followed here. (When a string value is provided without using the string delimiter, all following characters are considered part of the string except a newline, the value separator sep, the quote separator quote, and the comment separator comment; if none of those characters are present in the string value, the quote delimiter may be omitted.)
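
The quote-doubling rule described above can be illustrated outside Eidos. The following is a minimal Python sketch of that rule only; it is not the readCSV() implementation, and the helper names quote_field and unquote_field are hypothetical names introduced for this illustration.

```python
def quote_field(value, quote='"'):
    """Write a string value as a quoted CSV field.

    Every occurrence of the quote character inside the value is doubled,
    then the whole value is wrapped in the quote character.
    (Hypothetical helper; illustrates the rule, not Eidos code.)
    """
    return quote + value.replace(quote, quote + quote) + quote


def unquote_field(text, quote='"'):
    """Recover the string value from a well-formed quoted CSV field.

    Assumes the field starts and ends with the quote character and that
    every quote character inside it appears doubled.
    """
    if len(text) < 2 or not (text.startswith(quote) and text.endswith(quote)):
        raise ValueError("field is not wrapped in the quote character")
    # Strip the surrounding quotes, then collapse doubled quotes.
    return text[1:-1].replace(quote + quote, quote)
```

With the default double quote, quote_field applied to the value she said "hello" yields the field "she said ""hello""", matching the example in the text, and unquote_field reverses the transformation.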

The character used to indicate a decimal delimiter in numbers may be supplied with dec; by default this is "." (and so 10.0 would be ten, written with a decimal point), but "," is common in European data files (and so 10,0 would be ten, written with a decimal comma). Note that dec and sep may not be the same, so that it is unambiguous whether 10,0 is two numbers (10 and 0) or one number (10.0). For this reason, European CSV files that use a decimal comma typically use a semicolon as the value separator, which may be supplied with sep=";" to readCSV().

Finally, the remainder of a line following a comment character will be ignored when the file is read; by default comment is the empty string, "", indicating that comments do not exist at all, but "#" is a popular comment prefix.

To translate the CSV data into a DataFrame, it is necessary for Eidos to guess what value type each column is unless a column type is specified by colTypes. Quotes surrounding a value are irrelevant to this guess; for example, 1997 and "1997" are both candidates to be integer values (because some programs generate CSV output in which every value is quoted regardless of type). If every value in a column is either true, false, TRUE, FALSE, T, or F, the column will be taken to be logical. Otherwise, if every value in a column is an integer (here defined as an optional + or -, followed by nothing but decimal digits 0123456789), the column will be taken to be integer. Otherwise, if every value in a column is a floating-point number (here defined as an optional + or -, followed by decimal digits 0123456789, optionally a decimal separator and then optionally more decimal digits, and ending with an optional exponent like e7, E+05, or e-2), the column will be taken to be float; the special values NAN, INF, INFINITY, -INF, and -INFINITY (not case-sensitive) are also candidates to be float (if the rest of the column is also convertible to float), representing the corresponding float constants. Otherwise, the column will be taken to be string. NULL and NA are not recognized by readCSV() in CSV files and will be read as strings.

Every line in a CSV file must contain the same number of values (forming a rectangular data table); missing values are not allowed by readCSV() because there is no way to represent them in a DataFrame (Eidos has no equivalent of R's NA value). Spaces are considered part of a data field and are not trimmed, following the RFC 4180 standard. These choices are an attempt to provide optimal behavior for most clients, but given the lack of any universal standard for CSV files, and the lack of any type information in the CSV format, they will not always work as desired; in such cases, it should be reasonably straightforward to preprocess input files using standard Unix text-processing tools like sed and awk.
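
The type-guessing order described above (logical, then integer, then float, then string) can be modelled in a few lines. The following Python sketch is only an illustration of the rules as stated in this documentation, not the code Eidos actually runs; the function name guess_column_type is hypothetical, and it assumes the values of a column have already been split and unquoted and that the column is not empty.

```python
import re

# Spellings accepted for a logical column, as listed in the text.
LOGICAL_VALUES = {"true", "false", "TRUE", "FALSE", "T", "F"}

# Special float spellings from the text; matched case-insensitively below.
SPECIAL_FLOATS = {"NAN", "INF", "INFINITY", "-INF", "-INFINITY"}


def guess_column_type(values, dec="."):
    """Guess 'logical', 'integer', 'float', or 'string' for one column.

    values: non-empty list of already-unquoted field strings.
    dec:    the decimal separator, as in the dec argument.
    (Hypothetical helper; a model of the documented rules only.)
    """
    # Optional sign followed by nothing but decimal digits.
    integer_re = re.compile(r"[+-]?[0-9]+")
    # Optional sign, digits, optional separator plus optional digits,
    # optional exponent such as e7, E+05, or e-2.
    float_re = re.compile(
        r"[+-]?[0-9]+(?:" + re.escape(dec) + r"[0-9]*)?(?:[eE][+-]?[0-9]+)?"
    )

    if all(v in LOGICAL_VALUES for v in values):
        return "logical"
    if all(integer_re.fullmatch(v) for v in values):
        return "integer"
    if all(float_re.fullmatch(v) or v.upper() in SPECIAL_FLOATS for v in values):
        return "float"
    return "string"
```

In this model a column containing 10,0 is guessed as float only when dec is ",", and a column containing NA falls through to string, consistent with the statement that NA is read as a string.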

See also

Other Eidos: Eidos, eidos_abs(), eidos_acos(), eidos_all(), eidos_any(), eidos_apply(), eidos_array(), eidos_asFloat(), eidos_asInteger(), eidos_asLogical(), eidos_asString(), eidos_asin(), eidos_assert(), eidos_atan2(), eidos_atan(), eidos_beep(), eidos_catn(), eidos_cat(), eidos_cbind(), eidos_ceil(), eidos_citation(), eidos_clock(), eidos_cmColors(), eidos_color2rgb(), eidos_colors(), eidos_cor(), eidos_cos(), eidos_cov(), eidos_createDirectory(), eidos_cumProduct(), eidos_cumSum(), eidos_c(), eidos_date(), eidos_dbeta(), eidos_debugIndent(), eidos_defineConstant(), eidos_defineGlobal(), eidos_deleteFile(), eidos_dexp(), eidos_dgamma(), eidos_diag(), eidos_dim(), eidos_dmvnorm(), eidos_dnorm(), eidos_drop(), eidos_elementType(), eidos_exists(), eidos_exp(), eidos_fileExists(), eidos_filesAtPath(), eidos_findInterval(), eidos_float(), eidos_floor(), eidos_flushFile(), eidos_format(), eidos_functionSignature(), eidos_functionSource(), eidos_getSeed(), eidos_getwd(), eidos_heatColors(), eidos_hsv2rgb(), eidos_identical(), eidos_ifelse(), eidos_integerDiv(), eidos_integerMod(), eidos_integer(), eidos_isFinite(), eidos_isFloat(), eidos_isInfinite(), eidos_isInteger(), eidos_isLogical(), eidos_isNAN(), eidos_isNULL(), eidos_isObject(), eidos_isString(), eidos_length(), eidos_license(), eidos_log10(), eidos_log2(), eidos_logical(), eidos_log(), eidos_lowerTri(), eidos_ls(), eidos_match(), eidos_matrixMult(), eidos_matrix(), eidos_max(), eidos_mean(), eidos_min(), eidos_nchar(), eidos_ncol(), eidos_nrow(), eidos_object(), eidos_order(), eidos_paste0(), eidos_paste(), eidos_pmax(), eidos_pmin(), eidos_pnorm(), eidos_print(), eidos_product(), eidos_qnorm(), eidos_quantile(), eidos_rainbow(), eidos_range(), eidos_rank(), eidos_rbeta(), eidos_rbind(), eidos_rbinom(), eidos_rcauchy(), eidos_rdunif(), eidos_readFile(), eidos_repEach(), eidos_rep(), eidos_rev(), eidos_rexp(), eidos_rf(), eidos_rgamma(), eidos_rgb2color(), eidos_rgb2hsv(), eidos_rgeom(), eidos_rlnorm(), 
eidos_rmvnorm(), eidos_rm(), eidos_rnbinom(), eidos_rnorm(), eidos_round(), eidos_rpois(), eidos_runif(), eidos_rweibull(), eidos_sample(), eidos_sapply(), eidos_sd(), eidos_seqAlong(), eidos_seqLen(), eidos_seq(), eidos_setDifference(), eidos_setIntersection(), eidos_setSeed(), eidos_setSymmetricDifference(), eidos_setUnion(), eidos_setwd(), eidos_sin(), eidos_size(), eidos_sortBy(), eidos_sort(), eidos_source(), eidos_sqrt(), eidos_stop(), eidos_strcontains(), eidos_strfind(), eidos_string(), eidos_strprefix(), eidos_strsplit(), eidos_strsuffix(), eidos_str(), eidos_substr(), eidos_sumExact(), eidos_sum(), eidos_suppressWarnings(), eidos_sysinfo(), eidos_system(), eidos_tabulate(), eidos_tan(), eidos_tempdir(), eidos_terrainColors(), eidos_time(), eidos_trunc(), eidos_ttest(), eidos_type(), eidos_t(), eidos_unique(), eidos_upperTri(), eidos_usage(), eidos_var(), eidos_version(), eidos_whichMax(), eidos_whichMin(), eidos_which(), eidos_writeFile(), eidos_writeTempFile()

Author

Benjamin C Haller (bhaller@benhaller.com) and Philipp W Messer (messer@cornell.edu)