[R] how to address last and all but last column in dataframe
David Winsemius
dwinsemius at comcast.net
Sat Sep 6 21:52:17 CEST 2008
Not sure where your "input" came from. It is not in a format I would
have expected of an R object, and the first line is not in a form that
would be particularly easy to read into a valid R object. Numbers are
not legitimate object names. It is also not clear what you want to do
with the duplicated row numbers at the beginning. Your question
implies that you do not consider them part of the data.
In the future a worked example along the lines of that constructed by
Jorge Ivan Velez in a recent answer to another question might increase
chances of a prompt reply with tested code:
# Data set
DF=read.table(textConnection("V1 V2 V3
a b 0:1:12
d f 1:2:1
c d 1:0:9
b e 2:2:6
f c 5:5:0"),header=TRUE)
closeAllConnections()
The "length" of a dataframe is the number of columns.
?length
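As a quick check of that statement, here is a minimal sketch. It builds
a small stand-in for DF directly with data.frame() (an assumption made
only so the snippet runs on its own) and compares length() with ncol():

```r
# Small stand-in data frame with the same shape as DF above:
# 5 rows, 3 columns.
DF <- data.frame(V1 = c("a", "d", "c", "b", "f"),
                 V2 = c("b", "f", "d", "e", "c"),
                 V3 = c("0:1:12", "1:2:1", "1:0:9", "2:2:6", "5:5:0"))

n_cols_by_length <- length(DF)  # a data frame is a list of columns
n_cols_by_ncol   <- ncol(DF)    # same number, arguably clearer to read
n_rows           <- nrow(DF)    # number of rows, for comparison
```

Either length() or ncol() can therefore serve as the index of the last
column.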
Dataframes can be referenced using the extract operation, e.g.
df[<row>, <col>]
?Extract # for additional information on indexing using column vectors.
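A few common forms of that indexing, as a minimal sketch on a made-up
three-column data frame (the names DF, V1, V2 and V3 are only
illustrative):

```r
DF <- data.frame(V1 = c("a", "d", "c"),
                 V2 = c("b", "f", "d"),
                 V3 = 1:3)

row2     <- DF[2, ]                 # second row, all columns
col3_vec <- DF[ , 3]                # third column as a plain vector
col3_df  <- DF[ , 3, drop = FALSE]  # third column kept as a data frame
by_name  <- DF[ , c("V1", "V3")]    # columns selected by name
no_first <- DF[ , -1]               # negative index drops a column
```

Note that selecting a single column with [ , j] drops the data frame
structure unless drop = FALSE is given.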
So, with your data frame "input":
input[ , length(input)] # should return the last column as a vector,
                        # although it will no longer be named.
The rest of the dataframe, with intact column names, could be obtained
with:
input[ , -length(input)]
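Applied to the question, the same idea removes the need to count
columns. The following is an untested-against-your-data sketch: the
small data frame is a hypothetical stand-in for "input" (a "video"
column first, rater columns in between, the reference column last), and
the names last, ref_col and pred_cols are introduced only for
illustration.

```r
# Hypothetical stand-in for 'input': 'video' first, raters "1" and "2",
# reference column "3" in the last position.
input <- data.frame(video = 1:4,
                    "1" = c(0, 1, 0, 0),
                    "2" = c(0, 1, 1, 0),
                    "3" = c(0, 1, 0, 0),
                    check.names = FALSE)

last <- length(input)                 # position of the last column

ref_col   <- input[ , last]           # reference ratings as a vector
# Drop the first ('video') and the last column; drop = FALSE keeps a
# data frame even if only one rater column remains.
pred_cols <- input[ , -c(1, last), drop = FALSE]

# Stack the rater columns into one vector and repeat the reference
# once per rater so both columns have the same length.
pairs <- data.frame(
  pred = factor(unlist(pred_cols, use.names = FALSE)),
  ref  = factor(rep(ref_col, times = ncol(pred_cols)))
)
```

pairs$pred and pairs$ref can then be passed on as in the original
commands, whatever the number of rater columns.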
--
David Winsemius
On Sep 6, 2008, at 3:00 PM, drflxms wrote:
> Dear R-colleagues,
>
> another question from a newbie: I am creating a lot of simple
> pivot-charts from my raw data using the reshape-package. In these
> charts we have medical doctors judging videos in the columns and the
> videos they judge in the rows. Simple example of chart/data.frame
> "input" with two categories 1/0:
>
> video 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
>
> 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 2 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
> 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 5 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
> 6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
> 7 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
> 8 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
> 9 9 0 0 0 0 0 0 0 0 0 1 0 1 1 0 1 1 0 0 0 1 0
> 10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
>
> I recently learned, that I can easily create a confusion matrix out of
> this data using the following commands:
>
> pairs<-data.frame(pred=factor(unlist(input[2:21])),ref=factor(input[,22]))
> pred<-pairs$pred
> ref <- pairs$ref
> library (caret)
> confusionMatrix(pred, ref, positive=1)
>
> - where column 21 is the reference/goldstandard.
>
> My problem is now, that I analyse data.frames with an unknown count of
> columns. So to get rid of the first and last column for the "pred"
> variable and to select the last column for the "ref" variable, I have
> to look at the data.frame before doing the above commands to set the
> proper column numbers.
>
> It would be very comfortable, if I could address the last column not
> by number (where I have to count beforehand) but by a variable "last
> column".
>
> Probably there is a more easy solution for this problem using the
> names of the columns as well: the reference is always number "21" the
> first column is always called "video". So I tried:
>
> attach(input)
> pairs<-data.frame(pred=factor(unlist(input[[,-c(video,21)]])),ref=factor(input[[21]]))
>
> which does not work unfortunately :-(.
>
> I'd be very happy in case someone could help me out, cause I am really
> tired of counting - there are a lot of tables to analyse...
>
> Cheers and greetings from Munich,
> Felix