[R] [Rd] Scan data from a .txt file
Marc Schwartz (via MN)
mschwartz at mn.rr.com
Thu Nov 17 17:24:47 CET 2005
I have a feeling that Vasu wants (mistakenly) this:
> dat <- read.table("clipboard", header = FALSE)
> dat
V1 V2 V3 V4
1 Name Weight Height Gender
2 Anne 150 65 F
3 Rob 160 68 M
4 George 180 65 M
5 Greg 205 69 M
> str(dat)
`data.frame': 5 obs. of 4 variables:
$ V1: Factor w/ 5 levels "Anne","George",..: 4 1 5 2 3
$ V2: Factor w/ 5 levels "150","160","180",..: 5 1 2 3 4
$ V3: Factor w/ 4 levels "65","68","69",..: 4 1 2 1 3
$ V4: Factor w/ 3 levels "F","Gender","M": 2 1 3 3 3
> dat$V1
[1] Name Anne Rob George Greg
Levels: Anne George Greg Name Rob
> dat$V2
[1] Weight 150 160 180 205
Levels: 150 160 180 205 Weight
> dat$V3
[1] Height 65 68 65 69
Levels: 65 68 69 Height
> dat$V4
[1] Gender F M M M
Levels: F Gender M
Here the column names are actually part of the data frame columns.
Vasu, note however that all of the values then become factors, which you
can convert to character, for example:
> as.character(dat$V1)
[1] "Name" "Anne" "Rob" "George" "Greg"
neither of which I suspect is what you really want.
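To see why, here is a minimal sketch of what happens when one of those columns is converted back to numbers. It assumes the table is supplied inline through the text= argument of read.table() rather than from the clipboard, purely so that it can be run anywhere. stringsAsFactors = TRUE is set explicitly to reproduce the factor behaviour shown above, since R 4.0.0 and later default to character columns. The object names txt, raw and w are only illustrative.

```r
# Same table as in the post, supplied inline (assumption: text= is used here
# instead of "clipboard" so the example is reproducible).
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

# header = FALSE treats the header line as data, so no column can be numeric.
raw <- read.table(text = txt, header = FALSE, stringsAsFactors = TRUE)
sapply(raw, class)

# Converting the second column to numeric exposes the problem: the header
# entry "Weight" cannot be parsed as a number and becomes NA.
w <- suppressWarnings(as.numeric(as.character(raw$V2)))
w
```

The first element of w is NA, so the label and the weights cannot usefully share one numeric vector.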
You can access the column names of the data frame using colnames():
> dat <- read.table("clipboard", header = TRUE)
> dat
Name Weight Height Gender
1 Anne 150 65 F
2 Rob 160 68 M
3 George 180 65 M
4 Greg 205 69 M
> colnames(dat)
[1] "Name" "Weight" "Height" "Gender"
This keeps the column names separate from the actual data which, unless
we are missing something here, is the proper way to do this. Think of a
data frame as a rectangular data set, which can contain more than one
data type across the columns, much like a spreadsheet. The difference
(unlike a spreadsheet) is that the first row does not contain the
column names/labels. These are kept separate from the data itself, which
in a typical spreadsheet would start on row 2.
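As a rough illustration of getting at both a column's label and its values with header = TRUE, something along these lines should work. Again the data are supplied inline via text= as an assumption for reproducibility, and the names txt, dat and nm are only illustrative.

```r
# Same table as in the post, supplied inline (assumption: text= replaces
# "clipboard" so the example runs anywhere).
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

dat <- read.table(text = txt, header = TRUE)

# The label and the values of the second column are retrieved separately.
colnames(dat)[2]   # the column name
dat$Weight         # the values, kept as integers
dat[["Weight"]]    # the same column, selected by a name held in a string

# Loop over the columns, pairing each name with its values.
for (nm in colnames(dat)) {
  cat(nm, ":", format(dat[[nm]]), "\n")
}
```

This gives access to "Weight" and to the individual weights without forcing them into the same vector, so the weights stay numeric.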
Note, as Andy pointed out, that in this case you should use
read.table(), not scan().
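For completeness, the type mismatch in the original scan() call comes from the header line being matched against the numeric fields in 'what'. If scan() had to be used, one possible sketch is to read the header line separately and skip it when reading the data. This is only an illustration, not a recommendation over read.table(); the text= argument and the names txt, hdr and vals are assumptions made for the example.

```r
# Same table as in the post, supplied inline (assumption: text= replaces
# the file "table.txt" so the example runs anywhere).
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

# The header line is all character, so read it on its own.
hdr <- scan(text = txt, what = "", nlines = 1, quiet = TRUE)

# Skip the header so that the numeric fields in 'what' match the data lines.
vals <- scan(text = txt, what = list("", 0, 0, 0), skip = 1, quiet = TRUE)
names(vals) <- hdr

vals$Weight
```

The result is a named list rather than a data frame, which is one reason read.table() is the more convenient choice here.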
Review "An Introduction To R" and the "R Data Import/Export" manuals for
more information. Both are available with your installation and/or from
the main R web site under Documentation.
HTH,
Marc Schwartz
On Thu, 2005-11-17 at 10:41 -0500, Liaw, Andy wrote:
> [Re-directing to R-help, as this is more appropriate there.]
>
> I tried copying the snippet of data into the windows clipboard and tried it:
>
> > dat <- read.table("clipboard", header=T)
> > dat
> Name Weight Height Gender
> 1 Anne 150 65 F
> 2 Rob 160 68 M
> 3 George 180 65 M
> 4 Greg 205 69 M
> > str(dat)
> `data.frame': 4 obs. of 4 variables:
> $ Name : Factor w/ 4 levels "Anne","George",..: 1 4 2 3
> $ Weight: int 150 160 180 205
> $ Height: int 65 68 65 69
> $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat <- read.table("clipboard", header=T, row=1)
> > str(dat)
> `data.frame': 4 obs. of 3 variables:
> $ Weight: int 150 160 180 205
> $ Height: int 65 68 65 69
> $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat
> Weight Height Gender
> Anne 150 65 F
> Rob 160 68 M
> George 180 65 M
> Greg 205 69 M
>
> Don't see how it "doesn't work". Please give more detail on what "doesn't
> work" means.
>
> Andy
>
> > From: Vasundhara Akkineni
> >
> > Hi all,
> > Am trying to read data from a .txt file in such a way that i
> > can access the
> > column names too. For example, the data in the table.txt file
> > is as below:
> > Name Weight Height Gender
> > Anne 150 65 F
> > Rob 160 68 M
> > George 180 65 M
> > Greg 205 69 M
> > i used the following commands:
> > data<-scan("table.txt",list("",0,0,0),sep="")
> > a<-data[[1]]
> > b<-data[[2]]
> > c<-data[[3]]
> > d<-data[[4]]
> > But this doesn't work because of type mismatch. I want to
> > pull the col
> > names also into the respective lists. For example i want 'b' to have
> > (weight,150,160,180,205) so that i can access the col name
> > and also the
> > individual weights. I tried using the read.table method too,
> > but couldn't
> > get this working. Can someone suggest a way to do this.
> > Thanks,
> > Vasu.
> >