diff --git a/src/keyworddocs.jl b/src/keyworddocs.jl index 658d7a7c..55310f08 100644 --- a/src/keyworddocs.jl +++ b/src/keyworddocs.jl @@ -12,7 +12,7 @@ const KEYWORD_DOCS = """ * `ignoreemptyrows::Bool=true`: whether empty rows in a file should be ignored (if `false`, each column will be assigned `missing` for that empty row) * `select`: an `AbstractVector` of `Integer`, `Symbol`, `String`, or `Bool`, or a "selector" function of the form `(i, name) -> keep::Bool`; only columns in the collection or for which the selector function returns `true` will be parsed and accessible in the resulting `CSV.File`. Invalid values in `select` are ignored. * `drop`: inverse of `select`; an `AbstractVector` of `Integer`, `Symbol`, `String`, or `Bool`, or a "drop" function of the form `(i, name) -> drop::Bool`; columns in the collection or for which the drop function returns `true` will ignored in the resulting `CSV.File`. Invalid values in `drop` are ignored. - * `limit`: an `Integer` to indicate a limited number of rows to parse in a csv file; use in combination with `skipto` to read a specific, contiguous chunk within a file; note for large files when multiple threads are used for parsing, the `limit` argument may not result in an exact # of rows parsed; use `threaded=false` to ensure an exact limit if necessary + * `limit`: an `Integer` to indicate a limited number of rows to parse in a csv file; use in combination with `skipto` to read a specific, contiguous chunk within a file; note for large files when multiple threads are used for parsing, the `limit` argument may not result in an exact # of rows parsed; use `ntasks=1` to ensure an exact limit if necessary * `buffer_in_memory`: a `Bool`, default `false`, which controls whether a `Cmd`, `IO`, or gzipped source will be read/decompressed in memory vs. using a temporary file. * `ntasks::Integer=Threads.nthreads()`: [not applicable to `CSV.Rows`] for multithreaded parsed files, this controls the number of tasks spawned to read a file in concurrent chunks; defaults to the # of threads Julia was started with (i.e. `JULIA_NUM_THREADS` environment variable or `julia -t N`); setting `ntasks=1` will avoid any calls to `Threads.@spawn` and just read the file serially on the main thread; a single thread will also be used for smaller files by default (< 5_000 cells) * `rows_to_check::Integer=$DEFAULT_ROWS_TO_CHECK`: [not applicable to `CSV.Rows`] a multithreaded parsed file will be split up into `ntasks` # of equal chunks; `rows_to_check` controls the # of rows are checked to ensure parsing correctly found valid rows; for certain files with very large quoted text fields, `lines_to_check` may need to be higher (10, 30, etc.) to ensure parsing correctly finds these rows