[R] Loading large .xpt and .asc datasets causes issues.

Torvon torvon at gmail.com
Tue Feb 23 20:13:01 CET 2016


Hi,

I want to load a dataset into R. This dataset is available in two formats:
.XPT and .ASC. The dataset is available at
http://www.cdc.gov/brfss/annual_data/annual_2006.htm.

Each file is about 40 MB zipped and about 500 MB unzipped.

I can get the .xpt data to load, using:

> library(Hmisc)
> data <- sasxport.get("CDBRFS06.XPT")

The data load without error messages and look fine. However, the resulting
data frame has only 302 columns, which is fewer than the documentation says
it should have, and it does not contain my variables of interest. So either
the documentation or the data file is wrong, and I want to make sure it's
not the data file.
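
For reference, this is roughly how the column check can be done. It is only a
sketch: the small data frame stands in for the object returned by
sasxport.get(), and the names in `wanted` are hypothetical placeholders, not
actual BRFSS variable names. As far as I understand, sasxport.get() by default
lowercases variable names and turns underscores into periods, so codebook
names may need the same treatment before comparing.

```r
# Stand-in for the data frame returned by sasxport.get(); the real one is much larger.
data <- data.frame(x.state = c(1, 2), age = c(34, 51))

# Number of rows and columns actually loaded.
dim(data)

# Hypothetical variable names taken from a codebook (placeholders only).
wanted <- c("age", "some.missing.var")

# Names listed in the codebook that are absent from the loaded data.
missing_vars <- setdiff(wanted, names(data))
print(missing_vars)

# Case-insensitive search of the loaded names for a fragment of a variable name.
hits <- grep("sta", names(data), value = TRUE, ignore.case = TRUE)
print(hits)
```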

Hence I wanted to see whether I get the same result when loading the .ASC
file. However, every approach I have tried so far has failed.

> library(adehabitat)
> import.asc("CDBRFS06.asc")

Results in:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got '1191.8808943.38209868648.960119'

> library(SDMTools)
> read.asc("CDBRFS06.asc")

Results in:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  scan() expected 'a real', got '1191.8808943.38209868648.960119'
In addition: Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  number of items read is not a multiple of the number of columns
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  number of items read is not a multiple of the number of columns
3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  number of items read is not a multiple of the number of columns
4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
  number of items read is not a multiple of the number of columns
5: In scan(file, nmax = nl * nc, skip = 6, quiet = TRUE) :
  NAs introduced by coercion to integer range

Thank you for your help.
   Eiko
