[R] Help with how to process multiple column variable in a read.table
arun
smartpink111 at yahoo.com
Thu May 16 17:58:24 CEST 2013
Hi,
Try this:
unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=TRUE, sep="\t",stringsAsFactors=FALSE,na.strings="")
dim(unemp.wy)
#[1] 46692 5
head(unemp.wy)
# series_id year period value footnote_codes
#1 LASST56000003 1976 M01 4.2 <NA>
#2 LASST56000003 1976 M02 4.1 <NA>
#3 LASST56000003 1976 M03 4.0 <NA>
#4 LASST56000003 1976 M04 3.9 <NA>
#5 LASST56000003 1976 M05 3.9 <NA>
#6 LASST56000003 1976 M06 3.9 <NA>
str(unemp.wy)
#'data.frame': 46692 obs. of 5 variables:
# $ series_id : chr "LASST56000003 " "LASST56000003 " "LASST56000003 " "LASST56000003 " ...
# $ year : int 1976 1976 1976 1976 1976 1976 1976 1976 1976 1976 ...
# $ period : chr "M01" "M02" "M03" "M04" ...
# $ value : num 4.2 4.1 4 3.9 3.9 3.9 4 4.1 4.1 4 ...
# $ footnote_codes: chr NA NA NA NA ...
tail(unemp.wy)
# series_id year period value footnote_codes
#46687 LAUST56000006 2012 M11 305820 D
#46688 LAUST56000006 2012 M12 304293 D
#46689 LAUST56000006 2012 M13 306064 D
#46690 LAUST56000006 2013 M01 305150 <NA>
#46691 LAUST56000006 2013 M02 304918 <NA>
#46692 LAUST56000006 2013 M03 305556 P
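
A likely reason the original call failed: with sep="" any run of whitespace counts as one separator, so an empty footnote field disappears and those rows appear to have 4 fields, while rows that carry a footnote have 5. With sep="\t" the empty trailing field is kept, and na.strings="" turns it into NA. The sketch below reproduces this on a small inline sample. The sample rows, the assumption that each data row ends in a tab, and the object names (txt, demo, n_ws, n_tab) are illustrative and not taken from the real file; strip.white=TRUE is an optional extra that drops the trailing blank visible in series_id in the str() output above.

```r
# Illustrative sample in the BLS layout: tab-separated, footnote sometimes empty.
# These rows are made up for the demonstration, not copied from the FTP file.
txt <- paste(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003 \t1976\tM01\t4.2\t",
  "LASST56000003 \t1976\tM02\t4.1\t",
  "LAUST56000006 \t2012\tM12\t304293\tD",
  sep = "\n"
)

# Count fields per line when any whitespace is treated as a separator.
con <- textConnection(txt)
n_ws <- count.fields(con, sep = "")
close(con)

# Count fields per line when only a tab is treated as a separator.
con <- textConnection(txt)
n_tab <- count.fields(con, sep = "\t")
close(con)

# Read with an explicit tab separator; empty fields become NA and
# surrounding blanks in character fields are stripped.
demo <- read.table(text = txt, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE, na.strings = "",
                   strip.white = TRUE)

print(n_ws)   # field counts differ between rows with and without a footnote
print(n_tab)  # field counts are the same for every row
str(demo)
```

If the real file turns out to have rows with genuinely fewer fields, read.table(..., fill=TRUE) is another option worth trying.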
A.K.
>I am new to R. I am trying to read a table from the BLS FTP site. The
>file has 5 columns, but data in the 5th column is not always
>present, so it is throwing an error. Here is my code:
>
>unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=FALSE, sep="", skip=2 )
>
>Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> line 384 did not have 4 elements
>
>Here is the structure of the text. Around row 384 the footnote
>column gets added as well. This seems to throw off read.table. Is it
>possible to just read each line as a text string and then parse it, or is
>there a better way to approach this problem?
>series_id year period value footnote_codes
>LASST56000003 1976 M01 4.2
>LASST56000003 1976 M02 4.1
>LASST56000003 1976 M03 4.0
>LASST56000003 1976 M04 3.9
>LASST56000003 1976 M05 3.9
>
>Thanks. I am using R after having used SAS for years, so I am
>unsure of the best way to overcome a Program vector approach to data
>cleansing.
>
>Thanks