[R] data frame is killing me! help
bbslover
dluthm at yeah.net
Mon Oct 26 17:44:53 CET 2009
Thank you, Petr.
That is a good, clear answer.
Thanks!
Petr Pikal wrote:
>
> Hi
>
>> data(gasoline)
>> str(gasoline)
> 'data.frame': 60 obs. of 2 variables:
> $ octane: num 85.3 85.2 88.5 83.4 87.9 ...
> $ NIR : AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 -0.050859 ...
> ..- attr(*, "dimnames")=List of 2
> .. ..$ : chr "1" "2" "3" "4" ...
> .. ..$ : chr "900 nm" "902 nm" "904 nm" "906 nm" ...
>> str(gasoline$NIR)
> AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 -0.050859 ...
> - attr(*, "dimnames")=List of 2
> ..$ : chr [1:60] "1" "2" "3" "4" ...
> ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...
>> is.matrix(gasoline$NIR)
> [1] TRUE
>
> So the second element of the gasoline data frame is a matrix.
>
>> ?AsIs
>
>> df<-data.frame(x=1:5, I(matrix(rnorm(10), 5,2)))
>> df
> x matrix.rnorm.10...5..2..1 matrix.rnorm.10...5..2..2
> 1 1 0.187703.... 0.213312....
> 2 2 -0.66264.... -0.47941....
> 3 3 -0.82334.... -0.04324....
> 4 4 -0.37255.... 0.883027....
> 5 5 -0.28700.... -1.03431....
>> str(df)
> 'data.frame': 5 obs. of 2 variables:
> $ x : int 1 2 3 4 5
> $ matrix.rnorm.10...5..2.: AsIs [1:5, 1:2] 0.187703.... -0.66264.... -0.82334.... -0.37255.... -0.28700.... ...
>>
>
> Regards
> Petr
>
> r-help-bounces at r-project.org wrote on 23.10.2009 18:43:56:
>
>>
>> I have read that one. I want to use this method on my data, but I do not
>> know how to put my data into R.
>>
>> James W. MacDonald wrote:
>> >
>> >
>> >
>> > bbslover wrote:
>> >>
>> >>
>> >> Steve Lianoglou-6 wrote:
>> >>> Hi,
>> >>>
>> >>> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>> >>>
>> >>>> Usage
>> >>>> data(gasoline)
>> >>>> Format
>> >>>> A data frame with 60 observations on the following 2 variables.
>> >>>> octane
>> >>>> a numeric vector. The octane number.
>> >>>> NIR
>> >>>> a matrix with 401 columns. The NIR spectrum
>> >>>>
>> >>>> and when I look at the gasoline data I see the following:
>> >>>>    NIR.1686 nm  NIR.1688 nm  NIR.1690 nm  NIR.1692 nm  NIR.1694 nm  NIR.1696 nm  NIR.1698 nm  NIR.1700 nm
>> >>>> 1     1.242645     1.250789     1.246626     1.250985     1.264189     1.244678     1.245913     1.221135
>> >>>> 2     1.189116     1.223242     1.253306     1.282889     1.215065     1.225211     1.227985     1.198851
>> >>>> 3     1.198287     1.237383     1.260979     1.276677     1.218871     1.223132     1.230321     1.208742
>> >>>> 4     1.201066     1.233299     1.262966     1.272709     1.211068     1.215044     1.232655     1.206696
>> >>>> 5     1.259616     1.273713     1.296524     1.299507     1.226448     1.230718     1.232864     1.202926
>> >>>> 6      1.24109     1.262138     1.288401     1.291118     1.229769     1.227615      1.22763     1.207576
>> >>>> 7     1.245143     1.265648     1.274731     1.292441     1.218317     1.218147     1.222273     1.200446
>> >>>> 8     1.222581     1.245782      1.26002     1.290305     1.221264     1.220265     1.227947     1.188174
>> >>>> 9     1.234969     1.251559     1.272416     1.287405     1.211995     1.213263     1.215883     1.196102
>> >>>>
>> >>>> Look at the column names: NIR.1686 nm, NIR.1688 nm, NIR.1690 nm,
>> >>>> NIR.1692 nm, NIR.1694 nm, NIR.1696 nm, NIR.1698 nm, NIR.1700 nm.
>> >>>>
>> >>>> How can I add the letters NIR to my variables? My 600 independent
>> >>>> variables do not have NIR as a prefix, but it is needed to fit the
>> >>>> plsr model. For example, in aa=plsr(y~NIR, data=data ,....), the
>> >>>> prefix NIR is necessary. How can I do this?
>> >>> I'm not really sure that I'm getting you, but if your problem is that
>> >>> the column names of your data.frame don't match the variable names
>> >>> you'd like to use in your formula, just change the colnames of your
>> >>> data.frame to match your formula.
>> >>>
>> >>> BTW - I have no idea where to get this gasoline data set, so I'm just
>> >>> imagining:
>> >>>
>> >>> eg.
>> >>> colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that',
>> >>> 'you', 'want', 'here')
>> >>>
>> >>> -steve
>> >>>
>> >>> --
>> >>> Steve Lianoglou
>> >>> Graduate Student: Computational Systems Biology
>> >>> | Memorial Sloan-Kettering Cancer Center
>> >>> | Weill Medical College of Cornell University
>> >>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>> >>>
>> >>> ______________________________________________
>> >>> R-help at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >>
>> >> Thank you. But there are so many independent variables that it is not
>> >> easy to identify them one by one. Is there a better way?
>> >
>> > You don't need to identify anything. What you need to do is read the
>> > help page for the function you want to use, so you (at the very least)
>> > know how to use the function.
>> >
>> > > library(pls)
>> > > data(gasoline)
>> > > fit <- plsr(octane~NIR, data=gasoline, validation = "CV")
>> > > summary(fit)
>> > Data: X dimension: 60 401
>> > Y dimension: 60 1
>> > Fit method: kernelpls
>> > Number of components considered: 53
>> >
>> > VALIDATION: RMSEP
>> > Cross-validated using 10 random segments.
>> >        (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
>> > CV           1.543    1.372   0.3827   0.2522   0.2347   0.2455   0.2281
>> > adjCV        1.543    1.367   0.3740   0.2497   0.2360   0.2407   0.2243
>> >        7 comps  8 comps  9 comps  10 comps  11 comps  12 comps  13 comps
>> > CV      0.2311   0.2352   0.2455    0.2534    0.2737    0.2814    0.2832
>> > adjCV   0.2257   0.2303   0.2395    0.2473    0.2646    0.2705    0.2726
>> >        14 comps  15 comps  16 comps  17 comps  18 comps  19 comps  20 comps
>> > CV       0.2913    0.2932    0.2985    0.3137    0.3289    0.3323    0.3391
>> > adjCV    0.2808    0.2821    0.2863    0.3008    0.3141    0.3172    0.3228
>> >        21 comps  22 comps  23 comps  24 comps  25 comps  26 comps  27 comps
>> > CV       0.3476    0.3384    0.3316    0.3213    0.3155    0.3118    0.3062
>> > adjCV    0.3307    0.3217    0.3154    0.3057    0.3002    0.2964    0.2908
>> >        28 comps  29 comps  30 comps  31 comps  32 comps  33 comps  34 comps
>> > CV       0.3033    0.3034    0.3074    0.3083    0.3094    0.3087    0.3105
>> > adjCV    0.2881    0.2881    0.2917    0.2926    0.2936    0.2929    0.2946
>> >        35 comps  36 comps  37 comps  38 comps  39 comps  40 comps  41 comps
>> > CV       0.3108    0.3106    0.3105    0.3104    0.3104    0.3105    0.3105
>> > adjCV    0.2949    0.2947    0.2946    0.2945    0.2945    0.2945    0.2946
>> >        42 comps  43 comps  44 comps  45 comps  46 comps  47 comps  48 comps
>> > CV       0.3105    0.3105    0.3105    0.3105    0.3105    0.3105    0.3105
>> > adjCV    0.2946    0.2946    0.2946    0.2946    0.2946    0.2946    0.2946
>> >        49 comps  50 comps  51 comps  52 comps  53 comps
>> > CV       0.3105    0.3105    0.3105    0.3105    0.3105
>> > adjCV    0.2946    0.2946    0.2946    0.2946    0.2946
>> >
>> > TRAINING: % variance explained
>> >         1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps  8 comps
>> > X         70.97    78.56    86.15     95.4    96.12    96.97    97.32     98.1
>> > octane    31.90    94.66    97.71     98.0    98.68    98.93    99.06     99.1
>> >         9 comps  10 comps  11 comps  12 comps  13 comps  14 comps  15 comps
>> > X         98.32     98.71     98.84     99.00     99.21     99.46     99.52
>> > octane    99.20     99.24     99.36     99.44     99.49     99.51     99.58
>> >         16 comps  17 comps  18 comps  19 comps  20 comps  21 comps  22 comps
>> > X          99.57     99.64     99.68     99.76     99.78     99.82     99.84
>> > octane     99.65     99.69     99.78     99.81     99.86     99.89     99.92
>> >         23 comps  24 comps  25 comps  26 comps  27 comps  28 comps  29 comps
>> > X          99.88     99.91     99.92     99.93     99.94     99.95     99.96
>> > octane     99.93     99.94     99.95     99.97     99.98     99.99     99.99
>> >         30 comps  31 comps  32 comps  33 comps  34 comps  35 comps  36 comps
>> > X          99.96     99.97     99.97     99.98     99.98     99.98     99.98
>> > octane     99.99    100.00    100.00    100.00    100.00    100.00    100.00
>> >         37 comps  38 comps  39 comps  40 comps  41 comps  42 comps  43 comps
>> > X          99.99     99.99     99.99     99.99       100       100       100
>> > octane    100.00    100.00    100.00    100.00       100       100       100
>> >         44 comps  45 comps  46 comps  47 comps  48 comps  49 comps  50 comps
>> > X            100       100       100       100       100       100       100
>> > octane       100       100       100       100       100       100       100
>> >         51 comps  52 comps  53 comps
>> > X            100       100       100
>> > octane       100       100       100
>> >
>> >
>> >>
>> >>
>> >
>> > --
>> > James W. MacDonald, M.S.
>> > Biostatistician
>> > Douglas Lab
>> > University of Michigan
>> > Department of Human Genetics
>> > 5912 Buhl
>> > 1241 E. Catherine St.
>> > Ann Arbor MI 48109-5618
>> > 734-615-7826
>> >
>> >
>> >
>>
>> --
>> View this message in context: http://www.nabble.com/data-frame-is-killing-me%21-help-tp26015079p26029667.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>
>
>
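For anyone following the thread with their own data, here is a minimal sketch of how the I() approach quoted above might be applied so that a formula such as y ~ NIR works without renaming 600 columns. The names X, y and mydata, the simulated values, and the choice of ncomp = 5 are assumptions for illustration only and do not come from the thread. The plsr() call is skipped if the pls package is not installed.

```r
# Hypothetical stand-in for real data: 20 samples, 600 predictors.
# With real data, X might instead come from something like
# as.matrix(raw[, -1]) after reading a file into a data frame `raw`.
set.seed(1)
n <- 20
p <- 600
X <- matrix(rnorm(n * p), nrow = n, ncol = p)  # predictor matrix
colnames(X) <- paste0("V", seq_len(p))         # column names need no "NIR" prefix
y <- rnorm(n)                                  # hypothetical response

# Wrap the matrix in I() so it is stored as ONE data frame column named NIR.
mydata <- data.frame(y = y, NIR = I(X))

dim(mydata)             # 20 rows, 2 columns
is.matrix(mydata$NIR)   # TRUE: the second column is still a matrix
dim(mydata$NIR)         # 20 rows, 600 columns

# The formula can now refer to the whole matrix by its column name.
if (requireNamespace("pls", quietly = TRUE)) {
  fit <- pls::plsr(y ~ NIR, data = mydata, ncomp = 5, validation = "CV")
  summary(fit)
}
```

The "NIR." prefix seen when the gasoline data is printed comes from how a data frame displays a matrix column, so the individual columns of X never need to be renamed.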
--
View this message in context: http://www.nabble.com/data-frame-is-killing-me%21-help-tp26015079p26063206.html
Sent from the R help mailing list archive at Nabble.com.