[R] how to address last and all but last column in dataframe

drflxms drflxms at googlemail.com
Sat Sep 6 21:00:15 CEST 2008


Dear R-colleagues,

another question from a newbie: I am creating a lot of simple
pivot tables from my raw data using the reshape package. In these tables,
the medical doctors who judge the videos are in the columns and the videos
they judge are in the rows. A simple example of such a table (data.frame
"input") with two categories, 1/0:

   video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
1      1 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
2      2 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  1
3      3 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
4      4 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
5      5 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  1  0
6      6 0 0 0 0 0 0 0 0 0  0  0  0  0  1  0  0  0  0  0  0  0
7      7 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0
8      8 0 0 0 0 0 0 0 0 0  0  0  0  0  0  1  0  0  0  0  0  0
9      9 0 0 0 0 0 0 0 0 0  1  0  1  1  0  1  1  0  0  0  1  0
10    10 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0

I recently learned that I can easily create a confusion matrix from
this data using the following commands:

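library(caret)
# stack the rater columns (2 to 21) into one vector; the reference column (22) is recycled
pairs <- data.frame(pred = factor(unlist(input[2:21])), ref = factor(input[, 22]))
pred <- pairs$pred
ref <- pairs$ref
# current versions of caret expect "positive" as a character string naming a factor level
confusionMatrix(pred, ref, positive = "1")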

- where the column named "21" (the 22nd and last column of the data.frame)
is the reference/gold standard.
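
To make the pairing explicit, here is a minimal sketch on made-up toy data (3 videos, 2 raters and a reference column; the values and the name "toy" are hypothetical, not taken from my real tables). It uses only base R's table() instead of caret, and writes out with rep() the recycling of the reference column that data.frame() otherwise does silently:

```r
# Toy data (hypothetical): video id, raters "1" and "2", reference in the last column "3"
toy <- data.frame(video = 1:3,
                  `1` = c(0, 1, 1),
                  `2` = c(0, 0, 1),
                  `3` = c(0, 1, 1),
                  check.names = FALSE)

# unlist() stacks the rater columns one after another:
# all videos for rater "1", then all videos for rater "2"
pred <- factor(unlist(toy[2:3], use.names = FALSE), levels = c(0, 1))

# repeat the reference once per rater so every rating has its reference value
ref <- factor(rep(toy[[4]], times = 2), levels = c(0, 1))

# cross-tabulate predictions against the reference
tab <- table(pred = pred, ref = ref)
print(tab)
```

Each rating is thus compared with the reference value for the same video, which is the input confusionMatrix() works from.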

My problem now is that I analyse data.frames with an unknown number of
columns. So to drop the first and last columns for the "pred" variable
and to select the last column for the "ref" variable, I have to look at
each data.frame before running the above commands to find the right
column numbers.

It would be very convenient if I could address the last column not by
number (which I have to count beforehand) but by something like a
variable "last column".
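
One possible way to get such a "last column" variable is ncol(), which returns the number of columns and therefore the position of the last one. The following is only a sketch on a small stand-in for "input" (hypothetical values); the names "last", "pred_cols" and "ref_col" are made up for illustration:

```r
# Toy stand-in for "input" (hypothetical values): video id, two raters, reference last
input <- data.frame(video = 1:3,
                    `1` = c(0, 1, 1),
                    `2` = c(0, 0, 1),
                    `3` = c(0, 1, 1),
                    check.names = FALSE)

# position of the last column, whatever the width of the table
last <- ncol(input)

# negative indices drop columns: everything except the first and the last
pred_cols <- input[, -c(1, last), drop = FALSE]

# [[ ]] extracts the last column as a plain vector
ref_col <- input[[last]]

# the shorter "ref" is recycled once per rater column
pairs <- data.frame(pred = factor(unlist(pred_cols, use.names = FALSE)),
                    ref  = factor(ref_col))
```

With this, the same lines should work unchanged for tables of any width, as long as the first column is the video id and the last column is the reference.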

There is probably an easier solution to this problem using the names
of the columns as well: the reference is always named "21" and the first
column is always called "video". So I tried:

attach(input)
pairs<-data.frame(pred=factor(unlist(input[[,-c(video,21)]])),ref=factor(input[[21]]))

which unfortunately does not work :-(.
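
A few likely reasons for the failure: "[[" extracts a single element and does not accept an empty row index or a vector of negative indices; after attach(), the bare name video refers to the values of that column rather than to its name; and the number 21 means position 21, not the column named "21". A sketch of selecting by name instead, on hypothetical toy data where the reference column is called "3" rather than "21" (the names "ref_name" and "pred_names" are made up):

```r
# Toy stand-in for "input" (hypothetical values); the reference column is named "3" here
input <- data.frame(video = 1:3,
                    `1` = c(0, 1, 1),
                    `2` = c(0, 0, 1),
                    `3` = c(0, 1, 1),
                    check.names = FALSE)

# name of the reference column as a character string ("21" in the real tables)
ref_name <- "3"

# all column names except the video id and the reference
pred_names <- setdiff(names(input), c("video", ref_name))

# single brackets with a character vector select several columns by name;
# double brackets with one string extract a single column
pairs <- data.frame(pred = factor(unlist(input[pred_names], use.names = FALSE)),
                    ref  = factor(input[[ref_name]]))
```

Selecting by name does not depend on where the reference column sits, and it needs no attach().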

I'd be very happy if someone could help me out, because I am really
tired of counting - there are a lot of tables to analyse...

Cheers and greetings from Munich,
Felix


