R/readColumns.R
readColumns.Rd
Read a contiguous set of columns from sheet into an R data.frame. Uses the
RInterface
for speed.
readColumns(
sheet,
startColumn,
endColumn,
startRow,
endRow = NULL,
as.data.frame = TRUE,
header = TRUE,
colClasses = NA,
...
)
a Worksheet
object.
a numeric value for the starting column.
a numeric value for the ending column.
a numeric value for the starting row.
a numeric value for the ending row. If NULL
it reads
all the rows in the sheet. If you request more than the existing rows in
the sheet, the result will be truncated by the actual row number.
a logical value indicating if the result should be
coerced into a data.frame
. If FALSE
, the result is a list
with one element for each column.
a logical value indicating whether the first row corresponding
to the first element of the rowIndex
vector contains the names of the
variables.
a character vector that represents the class of each
column. Recycled as necessary, or if NA
an attempt is made to guess
the type of each column by reading the first row of data. Only
numeric
, character
, Date
, POSIXct
, column types
are accepted. Anything else will be coverted to a character
type.
If the length is less than the number of columns requested, replicate it.
other arguments to data.frame
, for example
stringsAsFactors
A data.frame or a list, depending on the as.data.frame
argument.
Use the readColumns
function when you want to read a rectangular
block of data from an Excel worksheet. If you request columns which are
blank, these will be read in as empty character "" columns. Internally, the
loop over columns is done in R, the loop over rows is done in Java, so this
function achieves good performance when number of rows >> number of columns.
Excel internally stores dates and datetimes as numeric values, and does not
keep track of time zones and DST. When a numeric column is formatted as a
datetime, it will be converted into POSIXct
class with a GMT
timezone. If you need a Date
column, you need to specify explicitly
using colClasses
argument.
For a numeric column Excels's errors and blank cells will be returned as NaN
values. Excel's #N/A
will be returned as NA. Formulas will be
evaluated. For a chracter column, blank cells will be returned as "".
read.xlsx2
for reading entire sheets. See also
addDataFrame
for writing a data.frame
to a sheet.
if (FALSE) {
file <- system.file("tests", "test_import.xlsx", package = "xlsx")
wb <- loadWorkbook(file)
sheets <- getSheets(wb)
sheet <- sheets[["all"]]
res <- readColumns(sheet, startColumn=3, endColumn=10, startRow=3,
endRow=7)
sheet <- sheets[["NAs"]]
res <- readColumns(sheet, 1, 6, 1, colClasses=c("Date", "character",
"integer", rep("numeric", 2), "POSIXct"))
}