[R] survival
Paulo Brando
pmbrando at ipam.org.br
Wed Mar 8 16:21:37 CET 2006
Dear R-helpers,
We marked 6000 leaves from 5 SPECIES (10 individuals per species) in two
different TREATMENTs: a control plot and a dry plot from which 50% of
incoming precipitation was excluded. We followed those leaves for 42
months and noted their presence or absence at each visit. I then fitted
a Cox proportional hazards model to test for differences in leaf
mortality between plots and among species over time:
leaves.cox <- coxph(Surv(time, censo) ~ treatment + species, data= wsuv)
When I plot 'survfit(leaves.cox)', I get a survival curve that
starts at 1 and ends at 0.4. The problem is that by month 42 almost all
leaves are dead. Shouldn't the survfit curve also be close to zero at
time 42?
I followed the examples in Venables and Ripley's book. (These analyses
are quite new to me.)
> summary(leaves.cox)
Call:
coxph(formula = Surv(time, censo) ~ (treatment) + species, data = wsuv)
n= 140840
coef exp(coef) se(coef) z p
treatment -0.0209 0.98 0.00847 -2.47 0.014
species 0.0712 1.07 0.00296 24.07 0.000
exp(coef) exp(-coef) lower .95 upper .95
treatment 0.98 1.021 0.963 0.996
species 1.07 0.931 1.068 1.080
Rsquare= 0.004 (max possible= 1 )
Likelihood ratio test= 590 on 2 df, p=0
Wald test = 587 on 2 df, p=0
Score (logrank) test = 588 on 2 df, p=0
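A note on the question above. Three features of the call and output may explain the gap, though none can be confirmed without the data. First, survfit() applied to a coxph fit returns a single curve for a hypothetical subject at the mean covariate values, not the observed overall survival. Second, treatment and species each have one coefficient, which suggests they entered the model as numeric codes rather than factors. Third, n = 140840 far exceeds the 6000 leaves, which would be consistent with one row per leaf per visit; if most of those rows are coded as censored, the estimated survival would stay high. The sketch below illustrates the first two points on the lung data set shipped with the survival package (chosen only for illustration; it is not the leaf data, and lung2, nd and the sf_* names are hypothetical).

```r
library(survival)

# Illustrative data: 'lung' from the survival package, with sex as a factor
lung2 <- lung
lung2$sex <- factor(lung2$sex, levels = 1:2, labels = c("male", "female"))

# Categorical predictors enter as factors, giving one coefficient per
# non-reference level instead of a single slope
fit <- coxph(Surv(time, status) ~ age + sex, data = lung2)

# One curve only: predicted survival at the mean covariate values
sf_cox <- survfit(fit)

# Kaplan-Meier estimate of the observed overall survival, for comparison
sf_km <- survfit(Surv(time, status) ~ 1, data = lung2)

# Predicted curves for chosen covariate values, one per row of newdata
nd <- data.frame(age = 60,
                 sex = factor(c("male", "female"), levels = levels(lung2$sex)))
sf_new <- survfit(fit, newdata = nd)
```

Plotting sf_km next to sf_new (for example with plot() and lines()) shows how far the model-based curves sit from the raw Kaplan-Meier curve; a large gap would point back at the data layout rather than at survfit().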
My best regards and thanks in advance!
Paulo
________________________________________
Paulo M. Brando
Instituto de Pesquisa Ambiental da Amazonia (IPAM)
Santarem, PA, Brasil.
Av. Rui Barbosa, 136.
Fone: + 55 93 3522 55 38
www.ipam.org.br
E-mail: pmbrando at ipam.org.br
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
r-help-request at stat.math.ethz.ch
Sent: Wednesday, March 08, 2006 3:00 AM
To: r-help at stat.math.ethz.ch
Subject: R-help Digest, Vol 37, Issue 8
Send R-help mailing list submissions to
r-help at stat.math.ethz.ch
To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-help
or, via email, send a message with subject or body 'help' to
r-help-request at stat.math.ethz.ch
You can reach the person managing the list at
r-help-owner at stat.math.ethz.ch
When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-help digest..."
Today's Topics:
1. Constrained linear least squares (Domenico Vistocco)
2. Building tkentry dynamicly (a.menicacci at fr.fournierpharma.com)
3. (newbie) Accessing the pieces of a 'by' object (Vivek Satsangi)
4. POSIX time zone codes (Jason Horn)
5. Re: (newbie) Accessing the pieces of a 'by' object
(Gabor Grothendieck)
6. Re: Building tkentry dynamicly (Peter Dalgaard)
7. Re: POSIX time zone codes (Jason Horn)
8. Re: (newbie) Accessing the pieces of a 'by' object
(Gabor Grothendieck)
9. Re: Building tkentry dynamicly (John Fox)
10. Re: Interleaving elements of two vectors? (bogdan romocea)
11. lme and gls : accessing values from correlation structure and
variance functions (Pryseley Assam)
12. Re: Building tkentry dynamicly (Sean Davis)
13. Re: returning the largest element in an array/matrix?
(Henrik Bengtsson)
14. Re: POSIX time zone codes (Gabor Grothendieck)
15. Re: Building tkentry dynamicly (John Fox)
16. breslow estimator for cumulative hazard function (singyee ling)
17. Re: POSIX time zone codes (Jason Horn)
18. Re: breslow estimator for cumulative hazard function
(Christos Hatzis)
19. Re: POSIX time zone codes (Gabor Grothendieck)
20. Re: (newbie) Accessing the pieces of a 'by' object
(Vivek Satsangi)
21. Re: (newbie) Accessing the pieces of a 'by' object
(Gabor Grothendieck)
22. How to change time zones? (Jason Horn)
23. Re: How to change time zones? (Gabor Grothendieck)
24. coding problems ((s) Richard Nuttall)
25. Re: QCA adn Fuzzy (Adrian DUSA)
26. Regarding categorization or grouping of data (Andrew Athan)
27. Re: Regarding categorization or grouping of data (Sean Davis)
28. glm automation (A Mani)
29. Re: returning the largest element in an array/matrix? (Michael)
30. Re: returning the largest element in an array/matrix?
(Gabor Grothendieck)
31. Making an S3 object act like a data.frame (hadley wickham)
32. Re: Making an S3 object act like a data.frame (Gabor Grothendieck)
33. reading in only one column from text file (mark salsburg)
34. reading in only one column from text file (mark salsburg)
35. how to use the rpart function? (Michael)
36. Re: Making an S3 object act like a data.frame (hadley wickham)
37. Re: reading in only one column from text file (Seth Falcon)
38. Fwd: reading in only one column from text file (mark salsburg)
39. Fwd: reading in only one column from text file (mark salsburg)
40. Re: reading in only one column from text file (Peter Dalgaard)
41. Re: reading in only one column from text file
(Kjetil Brinchmann Halvorsen)
42. Re: reading in only one column from text file (Berton Gunter)
43. Re: reading in only one column from text file (Liaw, Andy)
44. Writing out complex text file (Sean Davis)
45. Re: reading in only one column from text file (Jean Eid)
46. Re: glm automation (Jean Eid)
47. Re: Making an S3 object act like a data.frame (Henrik Bengtsson)
48. Re: Making an S3 object act like a data.frame (hadley wickham)
49. Re: Making an S3 object act like a data.frame (Gabor Grothendieck)
50. Re: Writing out complex text file (Sean Davis)
51. Re: Making an S3 object act like a data.frame (hadley wickham)
52. Re: Making an S3 object act like a data.frame (Gabor Grothendieck)
53. Re: Making an S3 object act like a data.frame (hadley wickham)
54. how to use the randomForest and rpart function? (Michael)
55. Re: how to use the randomForest and rpart function? (Michael)
56. Re: Making an S3 object act like a data.frame (Gabor Grothendieck)
57. Re: how to use the randomForest and rpart function? (Liaw, Andy)
58. Re: glm automation (ronggui)
59. Re: how to use the randomForest and rpart function? (Michael)
60. Re: glm automation (Liaw, Andy)
61. Re: how to use the randomForest and rpart function? (Liaw, Andy)
62. problem installing RNetCDF (Zepu Zhang)
63. Degrees of freedom using Box.test() (Nestor Arguea)
64. info() function? (Robert Lundqvist)
65. Re: info() function? (Henrik Bengtsson)
66. package installation on Mac OS X 10.3.9 (Patrick Giraudoux)
67. removing of memory - optim()? (Marcel Prokopczuk)
68. Re: Degrees of freedom using Box.test() (Patrick Burns)
69. Read.table (Matias Mayor Fernandez)
70. Re: Read.table (Uwe Ligges)
71. Re: predicted values in mgcv gam (Simon Wood)
----------------------------------------------------------------------
Message: 1
Date: Tue, 07 Mar 2006 12:59:26 +0100
From: Domenico Vistocco <vistocco at unicas.it>
Subject: [R] Constrained linear least squares
To: r-help at stat.math.ethz.ch
Message-ID: <7.0.0.16.0.20060307125446.00ea12e0 at unicas.it>
Content-Type: text/plain; charset="us-ascii"; format=flowed
Is there a function in R for constrained linear least squares?
I used the matlab function LSQLIN: my aim is to obtain
non-negative regression coefficients which sum 1.
Thanks in advance,
domenico vistocco
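A minimal sketch of one base-R approach to the question above, offered as an illustration rather than a replacement for LSQLIN: the last coefficient is eliminated as one minus the sum of the others, which leaves only linear inequality constraints that constrOptim() can handle. The simulated data and the helper names full_coef and rss are assumptions made for the example.

```r
# Least squares with coefficients >= 0 that sum to 1.
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)
true_b <- c(0.2, 0.5, 0.3)
y <- drop(X %*% true_b) + rnorm(n, sd = 0.1)

p <- ncol(X)

# Rebuild the full coefficient vector from the p - 1 free parameters
full_coef <- function(theta) c(theta, 1 - sum(theta))

# Residual sum of squares as a function of the free parameters
rss <- function(theta) sum((y - X %*% full_coef(theta))^2)

# constrOptim() enforces ui %*% theta - ci >= 0, here:
# theta_i >= 0 for each free parameter, and sum(theta) <= 1
ui <- rbind(diag(p - 1), rep(-1, p - 1))
ci <- c(rep(0, p - 1), -1)

# The starting value must lie strictly inside the feasible region
fit <- constrOptim(theta = rep(1 / p, p - 1), f = rss, grad = NULL,
                   ui = ui, ci = ci)
b_hat <- full_coef(fit$par)
b_hat
```

With grad = NULL the search uses Nelder-Mead inside a log-barrier, which is adequate for a handful of coefficients; for larger problems a quadratic programming solver would likely be a better fit.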
------------------------------
Message: 2
Date: Tue, 7 Mar 2006 13:06:08 +0100
From: a.menicacci at fr.fournierpharma.com
Subject: [R] Building tkentry dynamicly
To: r-help at stat.math.ethz.ch
Message-ID:
<OF8BE5F9CA.A9B686F2-ONC125712A.003AC100-C125712A.00427B36 at fr.fournierph
arma.com>
Content-Type: text/plain; charset="iso-8859-1"
Dear R-users,
I would like to build N "tkentry" widgets in the same window, with
default text for each. As N is variable, I need to construct them
iteratively:
library(tcltk)
main <- tktoplevel()
tktitle(main) <- "My Tool"
filenames <- c("toto", "tata", "titi")
N <- length(filenames)
for (i in 1:N) {
  text <- tclVar(filenames[i])                     # get a filename (string value)
  textField <- tkentry(main, textvariable = text)  # build a text field
  tkgrid(textField)
}
The problem is: how can I keep a reference to each tclVar created, so
that I can read any changes the user makes to each field?
Example :
(Embedded image moved to file: pic14771.jpg) to
(Embedded image moved to file: pic11538.jpg)
Regards.
Alexandre MENICACCI
Bioinformatics - FOURNIER PHARMA
50, rue de Dijon - 21121 Daix - FRANCE
a.menicacci at fr.fournierpharma.com
tél : 03.80.44.76.17
------------------------------
Message: 3
Date: Tue, 7 Mar 2006 08:05:00 -0500
From: "Vivek Satsangi" <vivek.satsangi at gmail.com>
Subject: [R] (newbie) Accessing the pieces of a 'by' object
To: r-help at stat.math.ethz.ch
Message-ID:
<bcb171920603070505x1989380fvb3a3347e03af9409 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Folks,
I know that I can do the following using a loop. That's been a lot
easier for me to write and understand. But I am trying to force myself
to use more vectorized / matrixed code so that eventually I will
become a better R programmer.
I have a dataframe that has some values by Year, Quarter and Ranking.
The variable of interest is the return (F3MRet), to be weighted
averaged within the year, quarter and ranking. At the end, we want to
end up with a table like this:
year quarter ranking1 ranking2 ... ranking10
1987 1 1.33 1.45 ... 1.99
1987 2 6.45 3.22 ... 8.33
.
.
2005 4 2.22 3.33 ... 1.22
The dataset is too large to post and I can't come up with a small
working example very easily.
I tried the Reshape() package and also the aggregate and reshape
functions. Those don't work too well because of the need to pass
weighted.mean a weights vector. I tried the by() function, but now I
don't know how to coerce the returned object into a matrix so that I
can reshape it.
> fvs_weighted.mean <- function(y) weighted.mean(y$F3MRet,
+     y$IndexWeight, na.rm=TRUE)
> tmp_byRet <- by(dfReturns,
+     list(dfReturns$Quarter, dfReturns$Year, dfReturns$Ranking),
+     fvs_weighted.mean)
I also tried various other ways to get the tmp_byRet object into a
matrix, e.g. unlist(), or a loop like this:
dfRet <- data.frame(tmp_byRet)
for (i in 1:dim(dfRet)[2]) {
  dfRet[, i] <- as.vector(dfRet[, i])
}
In each case, I got one error or another.
So, please help me get unstuck. How can I get the tmp_byRet object
into a matrix or a dataframe?
--
-- Vivek Satsangi
Rochester, NY USA
"No amount of sophistication is going to allay the fact that all your
knowledge is about the past and all your decisions are about the
future." -- Ian Wilson
------------------------------
Message: 4
Date: Tue, 7 Mar 2006 08:05:49 -0500
From: Jason Horn <jason at 109valentine.com>
Subject: [R] POSIX time zone codes
To: R-help at stat.math.ethz.ch
Message-ID: <5A31514C-F08E-4DC3-8637-D5E9C857F730 at 109valentine.com>
Content-Type: text/plain
The manual entry for as.POSIX says this about time zone codes...
Usage
as.POSIXct(x, tz = "")
tz
A timezone specification to be used for the conversion...
but it fails to mention what these "specifications" are. So far, I
have tried...
as.POSIX(x, tz="UTC") ... works, gives UTC times
as.POSIX(x, tz="UTC") ... works, gives EST times
as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
How do I get Central Standard Time for the US? Does anybody know what
the code is?
Thanks.
[[alternative HTML version deleted]]
------------------------------
Message: 5
Date: Tue, 7 Mar 2006 08:12:09 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] (newbie) Accessing the pieces of a 'by' object
To: "Vivek Satsangi" <vivek.satsangi at gmail.com>
Cc: r-help at stat.math.ethz.ch
Message-ID:
<971536df0603070512y38d1f33dga24173babfd55536 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Try this:
iris.by <- do.call("rbind", by(iris[,-5], iris[,5,drop=FALSE],colSums))
do.call("rbind", iris.by)
On 3/7/06, Vivek Satsangi <vivek.satsangi at gmail.com> wrote:
> Folks,
> I know that I can do the following using a loop. That's been a lot
> easier for me to write and understand. But I am trying to force myself
> to use more vectorized / matrixed code so that eventually I will
> become a better R programmer.
>
> I have a dataframe that has some values by Year, Quarter and Ranking.
> The variable of interest is the return (F3MRet), to be weighted
> averaged within the year, quarter and ranking. At the end, we want to
> end up with a table like this:
> year quarter ranking1 ranking2 ... ranking10
> 1987 1 1.33 1.45 ... 1.99
> 1987 2 6.45 3.22 ... 8.33
> .
> .
> 2005 4 2.22 3.33 ... 1.22
>
> The dataset is too large to post and I can't come up with a small
> working example very easily.
>
> I tried the Reshape() package and also the aggregate and reshape
> functions. Those don't work too well becuase of the need to pass
> weighted.mean a weights vector. I tried the by() function, but now I
> don't know how to coerce the returned object into a matrix so that I
> can reshape it.
>
> > fvs_weighted.mean <- function(y) weighted.mean(y$F3MRet,
> y$IndexWeight, na.rm=T);
> > tmp_byRet <- by(dfReturns,
> list(dfReturns$Quarter,dfReturns$Year,dfReturns$Ranking),
> fvs_weighted.mean);
>
> And various other ways to get the tmp_byRet object into a matrix were
> tried, eg. unlist(), a loop like this:
> dfRet <- data.frame(tmp_byRet);
> for(i in 1:dim(dfRet)[2]){
> dfRet[ ,i] <- as.vector(dfRet[ ,i]);
> }
> In each case, I got some error or the other.
>
> So, please help me get unstuck. How can I get the tmp_byRet() object
> into a matrix or a dataframe?
>
> --
> -- Vivek Satsangi
> Rochester, NY USA
> "No amount of sophistication is going to allay the fact that all your
> knowledge is about the past and all your decisions are about the
> future." -- Ian Wilson
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
>
------------------------------
Message: 6
Date: 07 Mar 2006 14:18:04 +0100
From: Peter Dalgaard <p.dalgaard at biostat.ku.dk>
Subject: Re: [R] Building tkentry dynamicly
To: a.menicacci at fr.fournierpharma.com
Cc: r-help at stat.math.ethz.ch
Message-ID: <x2y7zmmm0j.fsf at viggo.kubism.ku.dk>
Content-Type: text/plain; charset=iso-8859-1
a.menicacci at fr.fournierpharma.com writes:
> Dear R-users,
>
> I would like to build N "tkentry" compounds in the same window, with
> default text for each. As N is variable I need to construct them in an
> iterative way :
>
>
> library(tcltk)
>
> main<-tktoplevel()
>
> tktitle(main)<-"My Tool"
>
> filenames<-c("toto","tata","titi")
> N<-length(filenames)
>
> for (i in 1: N) {
> text<-tclVar(filenames[i]) # get a filename (string value)
> textField<-tkentry(main,textvariable=text) # build a text field
> tkgrid(textField)
> }
>
> The problem is : How to keep references for each tclVar created, in order
> to acces to the text modifications eventually done for each field ?
Can't you just use lists?
library(tcltk)
main <- tktoplevel()
tktitle(main) <- "My Tool"
filenames <- c("toto", "tata", "titi")
N <- length(filenames)
text <- vector("list", N)
textField <- vector("list", N)
for (i in 1:N) {
  text[[i]] <- tclVar(filenames[i])  # get a filename (string value)
  textField[[i]] <- tkentry(main, textvariable=text[[i]])  # build a text field
  tkgrid(textField[[i]])
}
tclvalue(text[[2]]) <- "tutu"
>
> Example :
>
> (Embedded image moved to file: pic14771.jpg) to
> (Embedded image moved to file: pic11538.jpg)
??!
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907
------------------------------
Message: 7
Date: Tue, 7 Mar 2006 08:21:59 -0500
From: Jason Horn <jhorn at bu.edu>
Subject: Re: [R] POSIX time zone codes
To: R-help at stat.math.ethz.ch
Message-ID: <A20DCC00-B80B-42D9-83FD-ACF00453DD2C at bu.edu>
Content-Type: text/plain
Whoops,
[EDIT]
as.POSIX(x, tz="UTC") ... works, gives UTC times
as.POSIX(x, tz="EST") ... works, gives EST times
as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
[/EDIT]
On Mar 7, 2006, at 8:05 AM, Jason Horn wrote:
> as.POSIX(x, tz="UTC") ... works, gives UTC times
> as.POSIX(x, tz="UTC") ... works, gives EST times
>
> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
[[alternative HTML version deleted]]
------------------------------
Message: 8
Date: Tue, 7 Mar 2006 08:28:26 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] (newbie) Accessing the pieces of a 'by' object
To: "Vivek Satsangi" <vivek.satsangi at gmail.com>
Cc: r-help at stat.math.ethz.ch
Message-ID:
<971536df0603070528ld475230gc80a2249ed9ee766 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Sorry, that should be:
iris.by <- by(iris[,-5], iris[,5,drop=FALSE],colSums)
do.call("rbind", iris.by)
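For the weighted-mean case in the original question, a sketch along the following lines may help. It relies on the fact that by() over several factors with a scalar-valued function returns an array with dim and dimnames, which can be turned into a long data frame and then reshaped. The toy data frame df and the names wmean, arr, long and wide are assumptions for illustration; only the column names F3MRet and IndexWeight come from the post.

```r
# Toy data standing in for dfReturns (hypothetical values)
set.seed(42)
df <- expand.grid(Year = 2004:2005, Quarter = 1:2, Ranking = 1:3, rep = 1:4)
df$F3MRet <- rnorm(nrow(df))
df$IndexWeight <- runif(nrow(df))

# Weighted mean of the return within one group
wmean <- function(d) weighted.mean(d$F3MRet, d$IndexWeight, na.rm = TRUE)

res <- by(df, list(Year = df$Year, Quarter = df$Quarter, Ranking = df$Ranking),
          wmean)

# Drop the 'by' class but keep the array shape and labels
arr <- array(as.vector(res), dim = dim(res), dimnames = dimnames(res))

# Long format: one row per Year x Quarter x Ranking cell
long <- as.data.frame(as.table(arr), responseName = "ret")

# Wide format: one row per Year/Quarter, one column per Ranking
wide <- reshape(long, idvar = c("Year", "Quarter"), timevar = "Ranking",
                direction = "wide")
wide
```

The resulting columns are named ret.1, ret.2 and so on, one per ranking, which matches the layout requested in the question.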
On 3/7/06, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
> Try this:
>
> iris.by <- do.call("rbind", by(iris[,-5], iris[,5,drop=FALSE],colSums))
> do.call("rbind", iris.by)
>
------------------------------
Message: 9
Date: Tue, 7 Mar 2006 08:29:07 -0500
From: "John Fox" <jfox at mcmaster.ca>
Subject: Re: [R] Building tkentry dynamicly
To: <a.menicacci at fr.fournierpharma.com>
Cc: r-help at stat.math.ethz.ch
Message-ID:
<20060307132904.HHDE16051.tomts20-srv.bellnexxia.net at JohnDesktop8300>
Content-Type: text/plain; charset="iso-8859-1"
Dear Alexandre,
It is possible to do what you want. Take a look, for example, at the
dialog box produced by "Statistics -> Contingency tables -> Enter and
analyze two-way table" in the Rcmdr package. That dialog box is able to
modify itself and to keep variables for an arbitrary number of tkentry()
boxes. It does this by constructing names for the variables as text
strings, and then using assign() and eval() to set and retrieve values.
(Perhaps there's a more elegant way to do this.) The code for the
function enterTable(), which constructs this dialog, is in the file
statistics-tables-menu.R in the Rcmdr source package.
I hope this helps,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of
> a.menicacci at fr.fournierpharma.com
> Sent: Tuesday, March 07, 2006 7:06 AM
> To: r-help at stat.math.ethz.ch
> Subject: [R] Building tkentry dynamicly
>
>
>
>
>
>
> Dear R-users,
>
> I would like to build N "tkentry" compounds in the same
> window, with default text for each. As N is variable I need
> to construct them in an iterative way :
>
>
> library(tcltk)
>
> main<-tktoplevel()
>
> tktitle(main)<-"My Tool"
>
> filenames<-c("toto","tata","titi")
> N<-length(filenames)
>
> for (i in 1: N) {
> text<-tclVar(filenames[i]) # get a filename (string value)
> textField<-tkentry(main,textvariable=text) # build a text field
> tkgrid(textField)
> }
>
> The problem is : How to keep references for each tclVar
> created, in order to acces to the text modifications
> eventually done for each field ?
>
>
> Example :
>
> (Embedded image moved to file: pic14771.jpg) to
> (Embedded image moved to file: pic11538.jpg)
>
>
> Regards.
>
>
>
>
>
>
>
> Alexandre MENICACCI
> Bioinformatics - FOURNIER PHARMA
> 50, rue de Dijon - 21121 Daix - FRANCE
> a.menicacci at fr.fournierpharma.com
> tél : 03.80.44.76.17
------------------------------
Message: 10
Date: Tue, 7 Mar 2006 08:38:38 -0500
From: "bogdan romocea" <br44114 at gmail.com>
Subject: Re: [R] Interleaving elements of two vectors?
To: ajayshah at mayin.org
Cc: r-help <R-help at stat.math.ethz.ch>
Message-ID:
<8d5a36350603070538k42884d53k8c4ff23242c1d51f at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
For a general solution without warnings try
interleave <- function(v1, v2)
{
  ord1 <- 2*(1:length(v1)) - 1
  ord2 <- 2*(1:length(v2))
  c(v1, v2)[order(c(ord1, ord2))]
}
interleave(rep(1,5),rep(3,8))
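Applied to the vectors from the original question, the same idea can be sketched as follows. The variant name interleave2 is hypothetical; it swaps 1:length(v) for seq_along(v) so that an empty vector is handled, which assumes a version of R that provides seq_along().

```r
# Same ordering trick: odd slots for v1, even slots for v2; leftover
# elements of the longer vector end up at the tail.
interleave2 <- function(v1, v2) {
  ord1 <- 2 * seq_along(v1) - 1
  ord2 <- 2 * seq_along(v2)
  c(v1, v2)[order(c(ord1, ord2))]
}

x <- c(1, 2, 7, 9, 14)
y <- c(71, 72, 77)
z <- interleave2(x, y)
z                           # 1 71 2 72 7 77 9 14
interleave2(x, numeric(0))  # x unchanged
```

Unlike c(rbind(x, y)), nothing is recycled, so the shorter vector is not repeated and no warning is raised.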
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Gabor
> Grothendieck
> Sent: Monday, March 06, 2006 12:12 AM
> To: Ajay Narottam Shah
> Cc: R-help
> Subject: Re: [R] Interleaving elements of two vectors?
>
> Try this (note that your x and y do not have the same length
> and in this case the expression will recycle the shorter one
> and give a warning):
>
> z <- c(rbind(x, y))
>
>
> On 3/5/06, Ajay Narottam Shah <ajayshah at mayin.org> wrote:
> > Suppose one has
> >
> > x <- c(1, 2, 7, 9, 14)
> > y <- c(71, 72, 77)
> >
> > How would one write an R function which alternates between elements of
> > one vector and the next? In other words, one wants
> >
> > z <- c(x[1], y[1], x[2], y[2], x[3], y[3], x[4], y[4], x[5], y[5])
> >
> > I couldn't think of a clever and general way to write this. I am aware
> > of gdata::interleave() but it deals with interleaving rows of a data
> > frame, not elems of vectors.
> >
> > --
> > Ajay Shah                            http://www.mayin.org/ajayshah
> > ajayshah at mayin.org                 http://ajayshahblog.blogspot.com
> > <*(:-? - wizard who doesn't know the answer.
> >
------------------------------
Message: 11
Date: Tue, 7 Mar 2006 05:45:19 -0800 (PST)
From: Pryseley Assam <assampryseley at yahoo.com>
Subject: [R] lme and gls : accessing values from correlation structure
and variance functions
To: r-help at stat.math.ethz.ch
Message-ID: <20060307134519.57246.qmail at web37103.mail.mud.yahoo.com>
Content-Type: text/plain
Dear R-users
I am relatively new to R; I hope my many novice questions are welcome.
I have problems accessing some components (specifically the random
effects, correlation structure and variance function) of objects of
class gls and lme.
I used the following models:
yah <- gls (outcome~ -1 + as.factor(Trial):as.factor(endpoint)+
as.factor(Trial):as.factor(endpoint):trt, data=datt[datt$Trial<4,],
correlation = corSymm(form=~1|as.factor(Trial)/as.factor(subject)),
weights=varIdent(form=~1|endpoint))
bm <- lme (outcome~ -1 + as.factor(endpoint)+ as.factor(endpoint):trt,
data=datt[datt$Trial<4,],
random=~-1 + as.factor(endpoint) +
as.factor(endpoint):trt|as.factor(Trial),
correlation = corSymm(form=~1|as.factor(Trial)/as.factor(subject)),
weights=varIdent(form=~1|endpoint))
When I print the object "bm" I get the following output:
----------------------------------------------------------------------
> bm
Linear mixed-effects model fit by REML
  Data: datt[datt$Trial < 4, ]
  Log-restricted-likelihood: -52.23147
  Fixed: outcome ~ -1 + as.factor(endpoint) + as.factor(endpoint):trt
    as.factor(endpoint)-1      as.factor(endpoint)1 as.factor(endpoint)-1:trt
                -3.663087                 -1.772427                 -3.661823
 as.factor(endpoint)1:trt
                -3.209671
Random effects:
 Formula: ~-1 + as.factor(endpoint) + as.factor(endpoint):trt | as.factor(Trial)
 Structure: General positive-definite, Log-Cholesky parametrization
                          StdDev     Corr
as.factor(endpoint)-1     2.05744327 as.()-1 as.()1 a.()-1:
as.factor(endpoint)1      0.08400874 -0.976
as.factor(endpoint)-1:trt 1.90318009  0.975  -0.967
as.factor(endpoint)1:trt  3.25432832 -0.992   0.982 -0.972
Residual                  6.48819860
Correlation Structure: General
 Formula: ~1 | as.factor(Trial)/as.factor(subject)
 Parameter estimate(s):
 Correlation:
  1
2 0.812
Variance function:
 Structure: Different standard deviations per stratum
 Formula: ~1 | endpoint
 Parameter estimates:
       1       -1
1.000000 1.764878
Number of Observations: 18
Number of Groups: 3
----------------------------------------------------------------------
Can somebody tell me how to access the values in blue above?
Also, when I tried accessing these values I obtained the following:
> bm$modelStruct
corStruct parameters:
[1] -1.394879
varStruct parameters:
[1] 0.5680815
What do these values represent?
Thanks in advance..
Pryseley
sample data:
RowNames Trial subject VISUAL0 TRT VISUAL24 VISUAL52 TREAT outcome endpoint trt
       4     1    1003      65   4       65       55     2       0        1   1
       8     1    1007      67   1       64       68     2      -3        1  -1
      12     2    1110      59   4       53       42     2      -6        1   1
      14     2    1111      64   1       72       65     2       8        1  -1
      16     2    1112      39   1       37       37     2      -2        1  -1
      18     2    1115      59   4       54       58     2      -5        1   1
      24     3    1806      46   4       27       24     2     -19        1   1
      26     3    1813      31   4       33       48     2       2        1   1
      28     3    1815      64   1       67       64     2       3        1  -1
       4     1    1003      65   4       65       55     2     -10       -1   1
       8     1    1007      67   1       64       68     2       1       -1  -1
      12     2    1110      59   4       53       42     2     -17       -1   1
      14     2    1111      64   1       72       65     2       1       -1  -1
      16     2    1112      39   1       37       37     2      -2       -1  -1
      18     2    1115      59   4       54       58     2      -1       -1   1
      24     3    1806      46   4       27       24     2     -22       -1   1
      26     3    1813      31   4       33       48     2      17       -1   1
      28     3    1815      64   1       67       64     2       0       -1  -1
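On the last question in this message: the numbers printed under bm$modelStruct are the parameters on the unconstrained scale used internally by the optimizer. For varIdent this is the log of the standard deviation ratio, and indeed exp(0.5680815) is approximately 1.764878, the value printed under "Variance function"; the corStruct value likewise appears to be the correlation 0.812 on its internal scale. The sketch below shows the accessors on the Orthodont data shipped with nlme, an illustrative fit and not the model from the post; the object names fit, cs, vs and fit_lme are assumptions.

```r
library(nlme)  # assumes the recommended nlme package is installed

# Illustrative gls fit with a correlation structure and a variance function
fit <- gls(distance ~ age, data = Orthodont,
           correlation = corCompSymm(form = ~ 1 | Subject),
           weights = varIdent(form = ~ 1 | Sex))

cs <- fit$modelStruct$corStruct
vs <- fit$modelStruct$varStruct

coef(cs)                         # unconstrained (internal) scale
coef(cs, unconstrained = FALSE)  # correlation as shown in the printout
coef(vs)                         # log of the standard deviation ratio
coef(vs, unconstrained = FALSE)  # ratio as shown in the printout
fit$sigma                        # residual standard deviation

# For an lme fit, VarCorr() returns the random-effects standard
# deviations (and correlations, when there are several effects)
fit_lme <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)
VarCorr(fit_lme)
```

The same coef(..., unconstrained = FALSE) calls should work on the components of bm$modelStruct, and ranef(bm) returns the predicted random effects themselves.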
---------------------------------
[[alternative HTML version deleted]]
------------------------------
Message: 12
Date: Tue, 07 Mar 2006 08:50:50 -0500
From: Sean Davis <sdavis2 at mail.nih.gov>
Subject: Re: [R] Building tkentry dynamicly
To: John Fox <jfox at mcmaster.ca>, <a.menicacci at fr.fournierpharma.com>
Cc: R-Help <r-help at stat.math.ethz.ch>
Message-ID: <C032F9EA.9F8%sdavis2 at mail.nih.gov>
Content-Type: text/plain; charset="US-ASCII"
On 3/7/06 8:29, "John Fox" <jfox at mcmaster.ca> wrote:
> Dear Alexandre,
>
> It is possible to do what you want. Take a look, for example, at the dialog
> box produced by "Statistics -> Contingency tables -> Enter and analyze
> two-way table" in the Rcmdr package. That dialog box is able to modify
> itself and to keep variables for an arbitrary number of tkentry() boxes. It
> does this by constructing names for the variables as text strings, and then
> using assign() and eval() to set and retrieve values. (Perhaps there's a
> more elegant way to do this.) The code for the function enterTable(), which
> constructs this dialog, is in the file statistics-tables-menu.R in the Rcmdr
> source package.
I haven't used tcltk enough to comment in any major way, but I will ask a
question. Couldn't these variables be maintained as a list?
Sean
------------------------------
Message: 13
Date: Tue, 7 Mar 2006 14:53:36 +0100
From: "Henrik Bengtsson" <hb at maths.lth.se>
Subject: Re: [R] returning the largest element in an array/matrix?
To: "Petr Pikal" <petr.pikal at precheza.cz>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<59d7961d0603070553o7a0a4efatcd6e0cedfe5aa0f6 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
This is a problem of converting vector indices to array indices. Here
is a general solution utilizing the fact that matrices are stored
column by column in R (this extends to arrays of any dimension):
arrayIndex <- function(i, dim) {
  ndim <- length(dim);      # number of dimensions
  v <- cumprod(c(1,dim));   # base
  # Allocate return matrix
  j <- matrix(NA, nrow=length(i), ncol=ndim);
  i <- (i-1);               # zero-based indices
  for (kk in 1:ndim)
    j[,kk] <- (i %% v[kk+1])/v[kk];
  1 + floor(j);             # one-based indices
}
# Now we can use the optimized which.max() function:
m <- matrix(1:14, nrow=7, ncol=4)
arrayIndex(which.max(m), dim=dim(m))
Gives:
     [,1] [,2]
[1,]    7    2
# The less efficient:
arrayIndex(which(m==max(m)), dim=dim(m))
Gives:
     [,1] [,2]
[1,]    7    2
[2,]    7    4
BTW, isn't there such a function in R already? I tried to find it,
but I couldn't.
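For readers on more recent versions of R, base R provides arrayInd() for exactly this conversion, and which() accepts arr.ind = TRUE. A short sketch using the same matrix as above (the names first_max and all_max are hypothetical):

```r
m <- matrix(1:14, nrow = 7, ncol = 4)  # 1:14 is recycled to fill 28 cells

# Row/column of the first maximum, via the linear index from which.max()
first_max <- arrayInd(which.max(m), dim(m))
first_max

# Row/column of every cell equal to the maximum
all_max <- which(m == max(m), arr.ind = TRUE)
all_max
```

The first call returns row 7, column 2; the second returns rows for (7, 2) and (7, 4), matching the hand-written arrayIndex() results.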
Cheers
Henrik
On 3/7/06, Petr Pikal <petr.pikal at precheza.cz> wrote:
> Hi
>
> If you do not insist on which.max() you can use
>
> which(mat==max(mat), arr.ind=T)
>
> HTH
> Petr
>
>
>
>
> On 6 Mar 2006 at 20:55, Michael wrote:
>
> Date sent: Mon, 6 Mar 2006 20:55:20 -0800
> From: Michael <comtech.usa at gmail.com>
> To: R-help at stat.math.ethz.ch
> Subject: [R] returning the largest element in an array/matrix?
>
> > Hi all,
> >
> > I want to use "which.max" to identify the maximum in a 2D array/matrix
> > and I want "argmin" and return the row and column indices.
> >
> > But "which.max" only works for vector...
> >
> > Is there any convinient way to solve this problem?
> >
> > Thanks a lot!
> >
> > [[alternative HTML version deleted]]
> >
>
> Petr Pikal
> petr.pikal at precheza.cz
>
------------------------------
Message: 14
Date: Tue, 7 Mar 2006 09:00:11 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] POSIX time zone codes
To: "Jason Horn" <jhorn at bu.edu>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<971536df0603070600y670f9b48y252742852774a523 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Only "" and "GMT" are really guaranteed to work on all systems
since the time zones are system dependent but try: "CST6CDT"
and see if that works on your system.
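On systems that provide the Olson/IANA time zone database, location-style names are another option. The sketch below assumes "America/Chicago" is available on the platform; the variable names x and central are hypothetical.

```r
# A fixed instant, stored in UTC
x <- as.POSIXct("2006-03-07 12:00:00", tz = "UTC")

# Display the same instant in US Central time; the name is looked up
# in the system's time zone database (an assumption about the platform)
central <- format(x, tz = "America/Chicago", usetz = TRUE)
central
```

Where the database is present this prints 06:00:00 CST for the instant above, since daylight saving time had not yet started on that date. A bare abbreviation such as "CST" is generally not a valid zone name, which would explain why it silently fell back to UTC.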
On 3/7/06, Jason Horn <jhorn at bu.edu> wrote:
> Whoops,
>
> [EDIT]
>
> as.POSIX(x, tz="UTC") ... works, gives UTC times
> as.POSIX(x, tz="EST") ... works, gives EST times
>
> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
>
> [/EDIT]
>
> On Mar 7, 2006, at 8:05 AM, Jason Horn wrote:
>
> > as.POSIX(x, tz="UTC") ... works, gives UTC times
> > as.POSIX(x, tz="UTC") ... works, gives EST times
> >
> > as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
------------------------------
Message: 15
Date: Tue, 7 Mar 2006 09:17:38 -0500
From: "John Fox" <jfox at mcmaster.ca>
Subject: Re: [R] Building tkentry dynamicly
To: "'Sean Davis'" <sdavis2 at mail.nih.gov>
Cc: a.menicacci at fr.fournierpharma.com, 'R-Help'
<r-help at stat.math.ethz.ch>
Message-ID:
<20060307141734.TOBH29052.tomts13-srv.bellnexxia.net at JohnDesktop8300>
Content-Type: text/plain; charset="us-ascii"
Dear Sean,
That was Peter Dalgaard's suggestion, posted a little while ago. It's a much
better solution to Alexandre's problem than what I suggested. I'm not sure
that it would work for the Rcmdr dialog that I mentioned, but I'll take a
look when I have a chance.
Regards,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Sean Davis
> Sent: Tuesday, March 07, 2006 8:51 AM
> To: John Fox; a.menicacci at fr.fournierpharma.com
> Cc: R-Help
> Subject: Re: [R] Building tkentry dynamicly
>
>
>
>
> On 3/7/06 8:29, "John Fox" <jfox at mcmaster.ca> wrote:
>
> > Dear Alexandre,
> >
> > It is possible to do what you want. Take a look, for example, at the
> > dialog box produced by "Statistics -> Contingency tables -> Enter and
> > analyze two-way table" in the Rcmdr package. That dialog box is able
> > to modify itself and to keep variables for an arbitrary number of
> > tkentry() boxes. It does this by constructing names for the variables
> > as text strings, and then using assign() and eval() to set and
> > retrieve values. (Perhaps there's a more elegant way to do this.) The
> > code for the function enterTable(), which constructs this dialog, is
> > in the file statistics-tables-menu.R in the Rcmdr source package.
>
> I haven't used tcltk enough to comment in any major way, but
> I will ask a question. Couldn't these variables be
> maintained as a list?
>
> Sean
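A minimal sketch of the difference between the two bookkeeping styles discussed above, in plain R without any tcltk calls. The names cell1, cell2, cell3 and the stored values are made up for illustration; in a real dialog each element would presumably hold a Tcl variable bound to a tkentry() box.

```r
n <- 3

# Style 1: build variable names as strings, then assign() and get()
for (i in seq_len(n)) {
  assign(paste0("cell", i), i * 10)
}
from_names <- sapply(seq_len(n), function(i) get(paste0("cell", i)))

# Style 2: keep the values in a list indexed by position
cells <- lapply(seq_len(n), function(i) i * 10)
from_list <- unlist(cells)

# Both styles hold the same values; the list needs no name construction
print(from_names)
print(from_list)
```

The list version grows or shrinks with the number of boxes without creating or removing variables in the workspace.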
------------------------------
Message: 16
Date: Tue, 7 Mar 2006 14:17:43 +0000
From: "singyee ling" <singyee.ling at googlemail.com>
Subject: [R] breslow estimator for cumulative hazard function
To: r-help at stat.math.ethz.ch
Message-ID: <ca33a9890603070617n138bb6e9p at mail.gmail.com>
Content-Type: text/plain
Dear R-users,
I am checking the proportional hazards assumption of a Cox model for a
given covariate, say Z1, after adjusting for other relevant covariates
in the model. To this end, I fitted a Cox model stratified on the discrete
values of Z1 and tried to get the Breslow estimator for the baseline
cumulative hazard function H(t) in each stratum. As far as I know, if the
proportionality assumption holds, the plots of ln[H(t)] against time for
the strata should be approximately parallel.
i.e.
fit <- coxph(Surv(start, end, status) ~ sx + rated + AGLEVEL + strata(Z1),
             data = ALLDPinfectionandbronchitis)
ss <- survfit(fit)
plot(ss, fun = "cumhaz")
My question is whether the cumulative hazard given by the above commands
is actually a Breslow estimator of the baseline cumulative hazard, i.e.
estimator = sum(number of deaths / sum(risk scores in risk set)), or a
Nelson-Aalen estimator. If the above commands do not give me the Breslow
estimator, please advise on how I can get it.
Thanks for any help given.
kind regards,
sing yee
------------------------------
Message: 17
Date: Tue, 7 Mar 2006 09:22:09 -0500
From: Jason Horn <jason at 109valentine.com>
Subject: Re: [R] POSIX time zone codes
To: R-help at stat.math.ethz.ch
Message-ID: <56E098FF-B959-4FDE-AFB1-6FBF2F6893B8 at 109valentine.com>
Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
Thank you again Gabor, that did the trick. Any thoughts on where I
can go for a reference for these time codes? Where did you get
"CDT6CST" from? Or is this just one of those things that is "common
knowledge" in UNIX circles?
To the R developers: I recommend a sentence be added to the manual
for as.POSIXxx such as "values for tz are system dependent, examples
for common systems are....."
Thanks!
On Mar 7, 2006, at 9:00 AM, Gabor Grothendieck wrote:
> Only "" and "GMT" are really guaranteed to work on all systems
> since the time zones are system dependent but try: "CDT6CST"
> and see if that works on your system.
>
>
> On 3/7/06, Jason Horn <jhorn at bu.edu> wrote:
>> Whoops,
>>
>> [EDIT]
>>
>> as.POSIX(x, tz="UTC") ... works, gives UTC times
>> as.POSIX(x, tz="EST") ... works, gives EST times
>>
>> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
>>
>> [/EDIT]
>>
>> On Mar 7, 2006, at 8:05 AM, Jason Horn wrote:
>>
>>> as.POSIX(x, tz="UTC") ... works, gives UTC times
>>> as.POSIX(x, tz="UTC") ... works, gives EST times
>>>
>>> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
------------------------------
Message: 18
Date: Tue, 7 Mar 2006 10:12:34 -0500
From: "Christos Hatzis" <christos at silicoinsights.com>
Subject: Re: [R] breslow estimator for cumulative hazard function
To: "'singyee ling'" <singyee.ling at googlemail.com>,
<r-help at stat.math.ethz.ch>
Message-ID:
<003001c641f9$93613000$0e010a0a at headquarters.silicoinsights>
Content-Type: text/plain; charset="us-ascii"
The function 'basehaz' computes the predicted survivor function for a Cox
proportional hazards model, either centered (at the average values of the
covariates) or non-centered (at zero level of covariates).
As for the estimator used, basehaz defaults to the Nelson-Aalen estimate
of the cumulative hazard with a Breslow-type estimate of survival.
Alternatively, a K-M estimator for the cumulative hazard can be selected.
See ?basehaz for details.
In your example, you could do something like this:
bh <- basehaz(fit, centered=TRUE)
require("lattice")
# basehaz() returns the columns hazard, time and strata, so plot from bh
xyplot(hazard ~ time | strata, data=bh)
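As a rough sketch of the log cumulative hazard check, assuming the 'lung' data shipped with the survival package stand in for the poster's data and 'sex' plays the role of Z1 (both are illustrative substitutions, not the original analysis):

```r
library(survival)

# Assumption: 'lung' replaces the poster's data set and 'sex' replaces Z1
fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)

# Baseline cumulative hazard, one curve per stratum;
# the result has the columns hazard, time and strata
bh <- basehaz(fit, centered = TRUE)

# Log cumulative hazard against time, one colour per stratum.
# Roughly parallel curves are consistent with proportional hazards
# for the stratifying variable.
plot(bh$time, log(bh$hazard), col = as.integer(factor(bh$strata)),
     xlab = "time", ylab = "log cumulative hazard")
```

Plotting on the log scale matters here: proportional hazards implies a constant vertical gap between the log curves, not between the raw cumulative hazards.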
-Christos
Christos Hatzis, Ph.D.
Vice President, Technology
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of singyee ling
Sent: Tuesday, March 07, 2006 9:18 AM
To: r-help at stat.math.ethz.ch
Subject: [R] breslow estimator for cumulative hazard function
Dear R-users,
I am checking the proportional hazards assumption of a Cox model for a
given covariate, say Z1, after adjusting for other relevant covariates
in the model. To this end, I fitted a Cox model stratified on the discrete
values of Z1 and tried to get the Breslow estimator for the baseline
cumulative hazard function H(t) in each stratum. As far as I know, if the
proportionality assumption holds, the plots of ln[H(t)] against time for
the strata should be approximately parallel.
i.e.
fit <- coxph(Surv(start, end, status) ~ sx + rated + AGLEVEL + strata(Z1),
             data = ALLDPinfectionandbronchitis)
ss <- survfit(fit)
plot(ss, fun = "cumhaz")
My question is whether the cumulative hazard given by the above commands
is actually a Breslow estimator of the baseline cumulative hazard, i.e.
estimator = sum(number of deaths / sum(risk scores in risk set)), or a
Nelson-Aalen estimator. If the above commands do not give me the Breslow
estimator, please advise on how I can get it.
Thanks for any help given.
kind regards,
sing yee
------------------------------
Message: 19
Date: Tue, 7 Mar 2006 10:13:28 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] POSIX time zone codes
To: "Jason Horn" <jason at 109valentine.com>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<971536df0603070713q25d8b1c3w927c6333bddd38a6 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Each OS would presumably have to document it. I personally
use Windows and in that case there is some info here:
http://msdn2.microsoft.com/en-us/library/90s5c885.aspx
On 3/7/06, Jason Horn <jason at 109valentine.com> wrote:
> Thank you again Gabor, that did the trick. Any thoughts on where I
> can go for a reference for these time codes? Where did you get
> "CDT6CST" from? Or is this just one of those things that is "common
> knowledge" in UNIX circles?
>
> To the R developers: I recommend a sentence be added to the manual
> for as.POSIXxx such as "values for tz are system dependent, examples
> for common systems are....."
>
>
>
>
> Thanks!
>
>
> On Mar 7, 2006, at 9:00 AM, Gabor Grothendieck wrote:
>
> > Only "" and "GMT" are really guaranteed to work on all systems
> > since the time zones are system dependent but try: "CDT6CST"
> > and see if that works on your system.
> >
> >
> > On 3/7/06, Jason Horn <jhorn at bu.edu> wrote:
> >> Whoops,
> >>
> >> [EDIT]
> >>
> >> as.POSIX(x, tz="UTC") ... works, gives UTC times
> >> as.POSIX(x, tz="EST") ... works, gives EST times
> >>
> >> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
> >>
> >> [/EDIT]
> >>
> >> On Mar 7, 2006, at 8:05 AM, Jason Horn wrote:
> >>
> >>> as.POSIX(x, tz="UTC") ... works, gives UTC times
> >>> as.POSIX(x, tz="UTC") ... works, gives EST times
> >>>
> >>> as.POSIX(x, tz="CST") ... does NOT work, gives UTC times
------------------------------
Message: 20
Date: Tue, 7 Mar 2006 10:33:28 -0500
From: "Vivek Satsangi" <vivek.satsangi at gmail.com>
Subject: Re: [R] (newbie) Accessing the pieces of a 'by' object
To: r-help at stat.math.ethz.ch
Message-ID:
<bcb171920603070733k5c05d19dg4b8aa36ae232164c at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
I am writing to document the answer for the next poor sod who comes
along.
To get tmp_byRet into a multi-dimensional matrix, copy the object
using as.vector(), then copy the dim and dimnames from tmp_byRet into
the new object. However, this may not be what you want, since you
probably want the values of the factors within the object (i.e. it
should be a dataframe, not a matrix).
To get tmp_byRet into a dataframe, use unique() to create a dataframe
with just the unique values of your factors. Add a new column to the
dataframe, where you will store the summary stats. Use a loop to
populate this column. Then use reshape() on the dataframe to get it into
the shape you want. It is difficult at best to vectorize this
and avoid the loop, and trying to do so will probably lead to less
transparent code.
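The two conversions described above can be sketched as follows. The toy dfReturns below is invented for illustration, and the data frame step uses as.data.frame(as.table()) followed by reshape() in place of the unique()-plus-loop approach, so treat it as one possible alternative and not as the method used in this post.

```r
# Invented stand-in for the real data: 2 years x 2 quarters x 2 rankings,
# two rows per cell
dfReturns <- data.frame(
  Year        = rep(c(1987, 1988), each = 8),
  Quarter     = rep(rep(1:2, each = 4), times = 2),
  Ranking     = rep(rep(1:2, each = 2), times = 4),
  F3MRet      = 1:16,
  IndexWeight = rep(c(1, 3), times = 8)
)

fvs_weighted.mean <- function(y)
  weighted.mean(y$F3MRet, y$IndexWeight, na.rm = TRUE)

tmp_byRet <- by(dfReturns,
                list(Quarter = dfReturns$Quarter,
                     Year    = dfReturns$Year,
                     Ranking = dfReturns$Ranking),
                fvs_weighted.mean)

# Matrix route: strip the 'by' class with as.vector(), then restore
# dim and dimnames as described above
arr <- array(as.vector(tmp_byRet),
             dim = dim(tmp_byRet), dimnames = dimnames(tmp_byRet))

# Data frame route (alternative to the loop): one row per cell,
# then one column per ranking
long <- as.data.frame(as.table(arr), responseName = "wmean")
wide <- reshape(long, idvar = c("Year", "Quarter"),
                timevar = "Ranking", direction = "wide")
print(wide)
```

The wide result has one row per year and quarter and one wmean column per ranking, which matches the table layout asked for in the original question.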
Vivek
On 3/7/06, Vivek Satsangi <vivek.satsangi at gmail.com> wrote:
> Folks,
> I know that I can do the following using a loop. That's been a lot
> easier for me to write and understand. But I am trying to force myself
> to use more vectorized / matrixed code so that eventually I will
> become a better R programmer.
>
> I have a dataframe that has some values by Year, Quarter and Ranking.
> The variable of interest is the return (F3MRet), to be weighted
> averaged within the year, quarter and ranking. At the end, we want to
> end up with a table like this:
> year quarter ranking1 ranking2 ... ranking10
> 1987 1 1.33 1.45 ... 1.99
> 1987 2 6.45 3.22 ... 8.33
> .
> .
> 2005 4 2.22 3.33 ... 1.22
>
> The dataset is too large to post and I can't come up with a small
> working example very easily.
>
> I tried the Reshape() package and also the aggregate and reshape
> functions. Those don't work too well because of the need to pass
> weighted.mean a weights vector. I tried the by() function, but now I
> don't know how to coerce the returned object into a matrix so that I
> can reshape it.
>
> > fvs_weighted.mean <- function(y) weighted.mean(y$F3MRet, y$IndexWeight, na.rm=T);
> > tmp_byRet <- by(dfReturns,
> list(dfReturns$Quarter,dfReturns$Year,dfReturns$Ranking),
> fvs_weighted.mean);
>
> And various other ways to get the tmp_byRet object into a matrix were
> tried, eg. unlist(), a loop like this:
> dfRet <- data.frame(tmp_byRet);
> for(i in 1:dim(dfRet)[2]){
> dfRet[ ,i] <- as.vector(dfRet[ ,i]);
> }
> In each case, I got some error or the other.
>
> So, please help me get unstuck. How can I get the tmp_byRet() object
> into a matrix or a dataframe?
>
> --
------------------------------
Message: 21
Date: Tue, 7 Mar 2006 10:48:57 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] (newbie) Accessing the pieces of a 'by' object
To: "Vivek Satsangi" <vivek.satsangi at gmail.com>
Cc: r-help at stat.math.ethz.ch
Message-ID:
<971536df0603070748qab58361hf121c5e74e3742b9 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Another solution (ex #3) is to return a data frame with at least two
columns and then remove the dummy one using drop = FALSE:
# ex #1. error
iris.by <- by(iris, iris[,5, drop = FALSE], length)
do.call("rbind", iris.by)
# ex #2. ok but no column heading
iris.by <- by(iris, iris[,5, drop=FALSE],
              function(x) data.frame(length = length(x)))
do.call("rbind", iris.by)
# ex #3. ok
iris.by <- by(iris, iris[,5, drop=FALSE],
              function(x) data.frame(1, length = length(x)))
do.call("rbind", iris.by)[,-1,drop=FALSE]
On 3/7/06, Vivek Satsangi <vivek.satsangi at gmail.com> wrote:
> I am writing to document the answer for the next poor sod who comes
> along.
>
> To get tmp_byRet into a multi-dimensional matrix, copy the object
> using as.vector(), then copy the dim and dimnames from tmp_byRet into
> the new object. However, this may not be what you want, since you
> probably want the values of the factors within the object (i.e. it
> should be a dataframe, not a matrix).
>
> To get tmp_byRet into a dataframe, use unique() to create a dataframe
> with just the unique values of your factors. Add a new column to the
> dataframe, where you will store the summary stats. Use a loop to
> populate this column. Then use reshape() on the dataframe to get it into
> the shape you want. It is difficult at best to vectorize this
> and avoid the loop, and trying to do so will probably lead to less
> transparent code.
>
> Vivek
>
>
> On 3/7/06, Vivek Satsangi <vivek.satsangi at gmail.com> wrote:
> > Folks,
> > I know that I can do the following using a loop. That's been a lot
> > easier for me to write and understand. But I am trying to force myself
> > to use more vectorized / matrixed code so that eventually I will
> > become a better R programmer.
> >
> > I have a dataframe that has some values by Year, Quarter and Ranking.
> > The variable of interest is the return (F3MRet), to be weighted
> > averaged within the year, quarter and ranking. At the end, we want to
> > end up with a table like this:
> > year quarter ranking1 ranking2 ... ranking10
> > 1987 1 1.33 1.45 ... 1.99
> > 1987 2 6.45 3.22 ... 8.33
> > .
> > .
> > 2005 4 2.22 3.33 ... 1.22
> >
> > The dataset is too large to post and I can't come up with a small
> > working example very easily.
> >
> > I tried the Reshape() package and also the aggregate and reshape
> > functions. Those don't work too well because of the need to pass
> > weighted.mean a weights vector. I tried the by() function, but now I
> > don't know how to coerce the returned object into a matrix so that I
> > can reshape it.
> >
> > > fvs_weighted.mean <- function(y) weighted.mean(y$F3MRet, y$IndexWeight, na.rm=T);
> > > tmp_byRet <- by(dfReturns,
> > list(dfReturns$Quarter,dfReturns$Year,dfReturns$Ranking),
> > fvs_weighted.mean);
> >
> > And various other ways to get the tmp_byRet object into a matrix were
> > tried, e.g. unlist(), a loop like this:
> > dfRet <- data.frame(tmp_byRet);
> > for(i in 1:dim(dfRet)[2]){
> > dfRet[ ,i] <- as.vector(dfRet[ ,i]);
> > }
> > In each case, I got some error or the other.
> >
> > So, please help me get unstuck. How can I get the tmp_byRet() object
> > into a matrix or a dataframe?
> >
> > --
------------------------------
Message: 22
Date: Tue, 7 Mar 2006 11:22:09 -0500
From: Jason Horn <jhorn at bu.edu>
Subject: [R] How to change time zones?
To: R-help at stat.math.ethz.ch
Message-ID: <5CB24417-5F1B-49AC-AACF-4FD5479BC7D6 at bu.edu>
Content-Type: text/plain
Say you have a POSIX object that is in UTC. How do you change the
values to another time zone?
If I do this:
times <- strptime(times, "%H:%M:%S")
times1 <- as.POSIXct(times, tz="UTC")
times2 <- as.POSIXct(times, tz="CDT6CST")
times1 is UTC, but times2 is still UTC, not CDT. Why? Is the only
way to change time zones to add seconds to POSIX objects?
------------------------------
Message: 23
Date: Tue, 7 Mar 2006 11:30:20 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] How to change time zones?
To: "Jason Horn" <jhorn at bu.edu>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<971536df0603070830n37e3540fk275edae5a0c0482a at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
POSIX objects are UTC internally but can be displayed
with respect to other time zones.
By the way, do you really need time zones in the first
place? Read the R News 4/1 Help Desk article, which will
help you determine which date/time class is
suitable for your application. It also has a table
of idioms and other discussion which may be helpful.
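A small sketch of the point about display versus storage: the instant held in a POSIXct object does not change, only the zone used to print it. The zone name "America/Chicago" and the example timestamp are assumptions; as noted earlier in this thread, the accepted tz names are system dependent.

```r
# One instant, stored as seconds since the epoch in UTC
x <- as.POSIXct("2006-03-07 16:22:09", tz = "UTC")

# Display the same instant in another zone without altering x
format(x, tz = "America/Chicago", usetz = TRUE)

# Or attach a different display zone to a copy of the object
y <- x
attr(y, "tzone") <- "America/Chicago"
print(y)

# The underlying values are identical
as.numeric(x) == as.numeric(y)
```

No seconds need to be added or subtracted; adding an offset by hand would move the instant itself.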
On 3/7/06, Jason Horn <jhorn at bu.edu> wrote:
> Say you have a POSIX object that is in UTC. How do you change the
> values to another time zone?
>
> If I do this:
>
> times <- strptime(times, "%H:%M:%S")
> times1 <- as.POSIXct(times, tz="UTC")
> times2 <- as.POSIXct(times, tz="CDT6CST")
>
> times1 is UTC, but times2 is still UTC, not CDT. Why? Is the only
> way to change time zones to add seconds to POSIX objects?
------------------------------
Message: 24
Date: Tue, 7 Mar 2006 17:48:56 -0000
From: "\(s\) Richard Nuttall"
<richard.nuttall at students.plymouth.ac.uk>
Subject: [R] coding problems
To: <r-help at stat.math.ethz.ch>
Message-ID:
<84661D5136F2FA4880B9FF16DAE46921265598 at 03-CSEXCH.uopnet.plymouth.ac.uk>
Content-Type: text/plain; charset="iso-8859-1"
Hi,
I am trying to fit an ARIMA model to some time series data. I have used
differencing to make the data stationary.
dailyibm is the data I am using. Could someone please help identify
the problem in the code below? I can't seem to find it.
Many thanks
> fit.ma <- arima.sim(dailyibm - mean(dailyibm),
model=list(order=c(0,1,0)),n = 3333)
Error in inherits(x, "data.frame") : couldn't find function "rand.gen"
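One possible reading of the error, offered as a sketch and not a definitive diagnosis: arima.sim() simulates from a model and has no data argument, so the unnamed first argument in the call above ends up matched to rand.gen, while arima() is the function that fits a model to a series. The series x below is simulated as a stand-in for dailyibm, which is not available here.

```r
set.seed(1)

# Stand-in for dailyibm: a random walk, i.e. a series that needs
# one round of differencing
x <- cumsum(rnorm(200))

# Fitting: arima() takes the data and the (p, d, q) order
fit <- arima(x, order = c(0, 1, 0))
print(fit)

# Simulating: arima.sim() takes a model description and a length, no data
sim <- arima.sim(model = list(order = c(0, 1, 0)), n = 200)
```

With d = 1 in the order, arima() differences the series itself, so there is no need to difference or centre the data beforehand.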
------------------------------
Message: 25
Date: Tue, 7 Mar 2006 20:13:11 +0200
From: "Adrian DUSA" <dusa.adrian at gmail.com>
Subject: Re: [R] QCA adn Fuzzy
To: "R Gott" <richard.gott at dur.ac.uk>
Cc: r-help at r-project.org
Message-ID:
<be5487e70603071013v2109f634h85ab13cb289a1d77 at mail.gmail.com>
Content-Type: text/plain; charset=UTF-8
Dear Prof. Gott,
On Monday 06 March 2006 14:37, R Gott wrote:
> Does anybody know of anything that will help me do Qualitative
> Comparative Analysis (QCA) and/or Fuzzy set analysis?? Or failing that
> Quine?
> ta
> rg
> Prof R Gott
> Durham University
> UK
There is a package called QCA which (in its first release) performs only
crisp set analysis. I am currently adapting a Graphical User Interface, but
the functions are nevertheless useful.
For fuzzy set analysis, please consider Charles Ragin's web site
http://www.u.arizona.edu/%7Ecragin/fsQCA/index.shtml
which offers software (still not complete, though). Also worth considering
is a good program called Tosmana (http://www.tosmana.org/), which does
multi-value QCA.
I am considering writing the inclusion algorithms in the next releases
of my package, but it is going to take a little while. Any contributions
and/or feedback are more than welcome.
I hope this helps you,
Adrian
--
Adrian DUSA
Romanian Social Data Archive
1, Schitu Magureanu Bd
050025 Bucharest sector 5
Romania
Tel./Fax: +40 21 3126618 \
+40 21 3120210 / int.101
------------------------------
Message: 26
Date: Tue, 07 Mar 2006 14:20:28 -0500
From: Andrew Athan <aathan_R_1542 at memeplex.com>
Subject: [R] Regarding categorization or grouping of data
To: r-help at stat.math.ethz.ch
Message-ID: <440DDCFC.5080707 at memeplex.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Could someone please provide some examples regarding "best practice" for
grouping and/or categorization of data?
E.g., if I have a data page with columns X,CAT where CAT is an integer
1-10, and I want to create a new data page where for each value of CAT I
have the sum(X)?
E.g., I have the same X,Y,CAT and I want to split this up into several sets
according to CAT,
E.g., I want to plot each set of X,Y per CAT using a different color for
each CAT ... perhaps on the same graph?
I have generally solved these problems by massaging the data, e.g., for the
X,Y,CAT case .... data[data$CAT==1,c(1,2)] gives me X,Y for CAT=1, etc.
but this seems like a kludge, especially since I have to loop through
each unique value of CAT ...
Thanks,
A.
------------------------------
Message: 27
Date: Tue, 07 Mar 2006 14:25:54 -0500
From: Sean Davis <sdavis2 at mail.nih.gov>
Subject: Re: [R] Regarding categorization or grouping of data
To: Andrew Athan <aathan_R_1542 at memeplex.com>, r-help
<r-help at stat.math.ethz.ch>
Message-ID: <C0334872.79E6%sdavis2 at mail.nih.gov>
Content-Type: text/plain; charset="US-ASCII"
On 3/7/06 2:20 PM, "Andrew Athan" <aathan_R_1542 at memeplex.com> wrote:
>
> Could someone please provide some examples regarding "best practice" for
> grouping and/or categorization of data?
>
> E.g., if I have a data page with columns X,CAT where CAT is an integer
> 1-10, and I want to create a new data page where for each value of CAT I
> have the sum(X)?
See ?aggregate or ?by.
> E.g., I have the same X,Y,CAT and I want to split this up into several sets
> according to CAT,
See ?split
> E.g., I want to plot each set of X,Y per CAT using a different color for
> each CAT ... perhaps on the same graph?
You might want to look at the lattice package for some techniques for
dealing with grouped data. If you want a simpler solution, look at using
lines() and points() to add lines or points to a graph.
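The three pointers above, sketched on a tiny made-up data frame (the values in d are invented for illustration; the base-graphics plot is just one simple way to get one colour per category):

```r
d <- data.frame(X   = 1:6,
                Y   = c(2, 4, 6, 8, 10, 12),
                CAT = c(1, 1, 2, 2, 3, 3))

# Sum of X for each value of CAT
sums <- aggregate(X ~ CAT, data = d, FUN = sum)
print(sums)

# One data frame of X,Y per value of CAT, returned as a named list
parts <- split(d[, c("X", "Y")], d$CAT)
print(names(parts))

# All groups on one graph, coloured by CAT
plot(d$X, d$Y, col = d$CAT, pch = 19, xlab = "X", ylab = "Y")
```

None of these steps needs an explicit loop over the unique values of CAT.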
Sean
------------------------------
Message: 28
Date: Wed, 8 Mar 2006 01:55:39 +0530
From: "A Mani" <a.manigs at gmail.com>
Subject: [R] glm automation
To: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<a6821d990603071225q3c17c769mbec37b69c09fae58 at mail.gmail.com>
Content-Type: text/plain
Hello,
I have two problems in automating multiple glm() operations.
The data file is a tab-delimited file with headers and two columns, like
"ABC" "EFG"
1 2
2 3
3 4
dat <- read.table("FILENAME", header=TRUE, sep="\t", na.strings="NA",
dec=".", strip.white=TRUE)
dataf <- read.table("FILENAME", header=FALSE, sep="\t", na.strings="NA",
dec=".", strip.white=TRUE)
norm1 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat)
norm2 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(identity),
data=dat)
and so on.
But glm does not work on the data unless I write ABC and EFG there... I want
to automate the script for multiple files.
The other problem is to write the plot(GLM) to a file without displaying it
on screen.
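A hedged sketch of one way to approach both problems: build the formula from the column names of whatever file was read, and open a file device before plotting. The helper name fit_first_two, the example values and the output file name are all made up for illustration; gaussian() is used because it is the family R provides for normal errors.

```r
# Stand-in for one tab-delimited file read with read.table(..., header = TRUE)
dat <- data.frame(ABC = c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 20.2),
                  EFG = 1:10)

# Hypothetical helper: regress the first column on the second,
# whatever the columns happen to be called
fit_first_two <- function(dat, link = "identity") {
  vars <- names(dat)
  fml <- reformulate(vars[2], response = vars[1])  # e.g. ABC ~ EFG
  glm(fml, family = gaussian(link = link), data = dat)
}

fit <- fit_first_two(dat)

# Write the diagnostic plots to a file instead of the screen
out <- file.path(tempdir(), "glm-diagnostics.pdf")
pdf(out)
plot(fit)
dev.off()
```

Calling fit_first_two(dat, link = "log") would give the log-link variant, and the same function can be applied in a loop over file names.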
Thanks,
A. Mani
Member, Cal. Math. Soc
[[alternative HTML version deleted]]
------------------------------
Message: 29
Date: Tue, 7 Mar 2006 12:50:15 -0800
From: Michael <comtech.usa at gmail.com>
Subject: Re: [R] returning the largest element in an array/matrix?
To: "Petr Pikal" <petr.pikal at precheza.cz>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<b1f16d9d0603071250o39b51f0crd3a112e622196a3c at mail.gmail.com>
Content-Type: text/plain
I think this is the best solution! Thank you!
On 3/7/06, Petr Pikal <petr.pikal at precheza.cz> wrote:
>
> Hi
>
> If you do not insist on which.max() you can use
>
> which(mat==max(mat), arr.ind=T)
>
> HTH
> Petr
>
>
>
>
> On 6 Mar 2006 at 20:55, Michael wrote:
>
> Date sent: Mon, 6 Mar 2006 20:55:20 -0800
> From: Michael <comtech.usa at gmail.com>
> To: R-help at stat.math.ethz.ch
> Subject: [R] returning the largest element in an
> array/matrix?
>
> > Hi all,
> >
> > I want to use "which.max" to identify the maximum in a 2D array/matrix
> > and I want it to act like "argmax" and return the row and column indices.
> >
> > But "which.max" only works for vectors...
> >
> > Is there any convenient way to solve this problem?
> >
> > Thanks a lot!
> Petr Pikal
> petr.pikal at precheza.cz
------------------------------
Message: 30
Date: Tue, 7 Mar 2006 15:55:07 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] returning the largest element in an array/matrix?
To: Michael <comtech.usa at gmail.com>
Cc: Petr Pikal <petr.pikal at precheza.cz>, R-help at stat.math.ethz.ch
Message-ID:
<971536df0603071255v22afbfe9i6b1b4f7e85c3f61b at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
I agree that that is the best solution although not to the
question as stated which said: "I want to use which.max ...".
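For completeness, a short sketch showing both routes on a made-up matrix: the which(..., arr.ind = TRUE) idiom quoted below, and a version that does use which.max() by converting its linear index with arrayInd(). The matrix values are invented for illustration.

```r
mat <- matrix(c(3, 9, 4, 1, 7, 2), nrow = 2)

# All positions equal to the maximum, as row/column pairs
idx_all <- which(mat == max(mat), arr.ind = TRUE)
print(idx_all)

# First maximum only: which.max() gives a linear index,
# arrayInd() turns it into row and column
idx_first <- arrayInd(which.max(mat), dim(mat))
print(idx_first)
```

The two differ when the maximum is tied: the first returns every tied position, the second only the first one in column-major order.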
On 3/7/06, Michael <comtech.usa at gmail.com> wrote:
> I think this is the best solution! Thank you!
>
> On 3/7/06, Petr Pikal <petr.pikal at precheza.cz> wrote:
> >
> > Hi
> >
> > If you do not insist on which.max() you can use
> >
> > which(mat==max(mat), arr.ind=T)
> >
> > HTH
> > Petr
> >
> >
> >
> >
> > On 6 Mar 2006 at 20:55, Michael wrote:
> >
> > Date sent: Mon, 6 Mar 2006 20:55:20 -0800
> > From: Michael <comtech.usa at gmail.com>
> > To: R-help at stat.math.ethz.ch
> > Subject: [R] returning the largest element in an
> > array/matrix?
> >
> > > Hi all,
> > >
> > > I want to use "which.max" to identify the maximum in a 2D array/matrix
> > > and I want it to act like "argmax" and return the row and column indices.
> > >
> > > But "which.max" only works for vectors...
> > >
> > > Is there any convenient way to solve this problem?
> > >
> > > Thanks a lot!
> > Petr Pikal
> > petr.pikal at precheza.cz
------------------------------
Message: 31
Date: Tue, 7 Mar 2006 15:10:20 -0600
From: "hadley wickham" <h.wickham at gmail.com>
Subject: [R] Making an S3 object act like a data.frame
To: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<f8e6ff050603071310v7f105416ma3cef4735a3fc4b4 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
"[.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("[", x)
}
"[[.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("[[", x)
}
"$.ggobiDataset" <- function(x, ..., drop=FALSE) {
  x <- as.data.frame(x)
  NextMethod("$", x)
}
> class(x)
[1] "ggobiDataset" "data.frame"
> x[1:2,1:2]
total_bill tip
1 16.99 1.01
2 10.34 1.66
> x[[1]]
[1] 16.99 10.34 21.01 23.68 24.59 25.29 8.77 26.88 15.04 14.78 10.27 35.26
[13] 15.42 18.43 14.83 21.58 10.33 16.29 16.97 20.65 17.92 20.29 15.77 39.42
...
> x$total_bill
Error in "$.default"(x, "total_bill") : invalid subscript type
What do I need to do to get "$.ggobiDataset" to imitate a data.frame
like [ and [[ do?
Thanks,
Hadley
------------------------------
Message: 32
Date: Tue, 7 Mar 2006 16:27:19 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "hadley wickham" <h.wickham at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<971536df0603071327g6f6c2860h9722204b9f33a9e1 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
If your class is a subclass of data.frame, then I think
it ought to be a special sort of data frame. In that case all you
need to do is the following, and you get subscripting
for free by inheritance:
# constructor
myobj <- function(...)
  structure(data.frame(...), class = c("myobj", "data.frame"))
# test
x <- myobj(x = 1:3, y = 4:6)
class(x)
x[[1]]
x$x
On 3/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> "[.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("[", x)
> }
>
> "[[.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("[[", x)
> }
>
> "$.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("$", x)
> }
>
> > class(x)
> [1] "ggobiDataset" "data.frame"
>
> > x[1:2,1:2]
> total_bill tip
> 1 16.99 1.01
> 2 10.34 1.66
>
> > x[[1]]
> [1] 16.99 10.34 21.01 23.68 24.59 25.29 8.77 26.88 15.04 14.78 10.27
35.26
> [13] 15.42 18.43 14.83 21.58 10.33 16.29 16.97 20.65 17.92 20.29
15.77 39.42
> ...
>
> > x$total_bill
> Error in "$.default"(x, "total_bill") : invalid subscript type
>
>
> What do I need to do to get "$.ggobiDataset" to imitate a data.frame
> like [ and [[ do?
>
> Thanks,
>
> Hadley
------------------------------
Message: 33
Date: Tue, 7 Mar 2006 16:29:58 -0500
From: "mark salsburg" <mark.salsburg at gmail.com>
Subject: [R] reading in only one column from text file
To: R-help at stat.math.ethz.ch
Cc: r-help at stat.math.ethz.ch
Message-ID:
<dd48e20f0603071329s463907a5h63ab6f6621f6afde at mail.gmail.com>
Content-Type: text/plain
How do I manipulate the read.table function to read in only the 2nd
column???
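One possible approach, sketched on an inline three-column example: read.table() skips any column whose entry in colClasses is the string "NULL", and an NA entry lets R guess the type as usual. The column names a, b, c and the values are made up; with a real file, replace the text argument with the file name.

```r
# Inline stand-in for a text file with three columns
txt <- "a b c
1 2 3
4 5 6"

# "NULL" drops a column, NA keeps it with automatic type detection
second <- read.table(text = txt, header = TRUE,
                     colClasses = c("NULL", NA, "NULL"))
print(second)
```

This requires knowing how many columns the file has, since colClasses needs one entry per column.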
[[alternative HTML version deleted]]
------------------------------
Message: 35
Date: Tue, 7 Mar 2006 13:45:40 -0800
From: Michael <comtech.usa at gmail.com>
Subject: [R] how to use the rpart function?
To: R-help at stat.math.ethz.ch
Message-ID:
<b1f16d9d0603071345m2860b1cbx612eb553f1a6cae0 at mail.gmail.com>
Content-Type: text/plain
Hi all,
Which parameters do I normally change in the rpart function? How do I set
the "cp" option?
Is there a way to read off the error rate for the training data directly
from the "rpart" function? I imagine that for testing data I have to apply
"predict", but for training data I guess the error count already exists
somewhere once the "rpart" function has finished. It looks like it is
related to expressions such as "expected loss=0.8362365" in the output of
the "summary" function.
At the moment I have to do this manually, and comparing the correct vs.
wrong predictions and counting the errors is always very tedious...
Thanks a lot!
M.
------------------------------
Message: 36
Date: Tue, 7 Mar 2006 15:49:03 -0600
From: "hadley wickham" <h.wickham at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<f8e6ff050603071349l60ad229dl6dd58fd51736384b at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
> If your class is a subclass of data.frame then I think
> it ought to be a special sort of data frame so all you
> need to do is the following in which case you get subscripting
> for free by inheritance:
My class is actually an external pointer to a data set stored in
ggobi, so I don't think this will work. as.data.frame.ggobiDataset
retrieves the whole dataset out of ggobi and in to R.
Hadley
------------------------------
Message: 37
Date: Tue, 07 Mar 2006 13:56:53 -0800
From: Seth Falcon <sfalcon at fhcrc.org>
Subject: Re: [R] reading in only one column from text file
To: r-help at stat.math.ethz.ch
Message-ID: <m2hd69ex5m.fsf at ziti.local>
Content-Type: text/plain; charset=us-ascii
"mark salsburg" <mark.salsburg at gmail.com> writes:
> How do I manipulate the read.table function to read in only the 2nd
> column???
If your data is small, you can read in all columns and then subset the
resulting data frame. Try that first.
Perhaps there is a nicer way to do this that I don't know about, but
recently I coded up the following to allow for a "streamy" read.table.
I've adjusted a few things, but haven't tested. May not work as is,
but it should give you an idea.
+ seth
readBatch <- function(con, batch.size) {
    colClasses <- rep("character", 20)  ## fix for your data
    ## adjust to pick out the columns that you want
    read.csv(con, colClasses=colClasses, as.is=TRUE,
             nrows=batch.size, header=FALSE)[, 1:2]
}
readTableStreamily <- function(filePath) {
    ## no idea what a good value is; depends on file and RAM
    BATCH_SIZE <- 5000
    con <- file(filePath, 'r')
    on.exit(close(con))
    ## the first line holds the column names
    colNames <- unlist(readBatch(con, batch.size=1))
    chunks <- list()
    i <- 1
    done <- FALSE
    while (!done) {
        done <- tryCatch({
            cat(".")
            chunks[[i]] <- readBatch(con, batch.size=BATCH_SIZE)
            i <- i + 1
            FALSE
        }, error=function(e) TRUE)  ## read.csv errors once no lines are left
    }
    cat("\n")
    df <- do.call("rbind", chunks)
    names(df) <- colNames
    df
}
------------------------------
Message: 38
Date: Tue, 7 Mar 2006 16:58:01 -0500
From: "mark salsburg" <mark.salsburg at gmail.com>
Subject: [R] Fwd: reading in only one column from text file
To: R-help at stat.math.ethz.ch
Cc: r-help at stat.math.ethz.ch
Message-ID:
<dd48e20f0603071358p293f9686r790c30f8d3391e4a at mail.gmail.com>
Content-Type: text/plain
---------- Forwarded message ----------
From: mark salsburg <mark.salsburg at gmail.com>
Date: Mar 7, 2006 4:57 PM
Subject: Re: [R] reading in only one column from text file
To: Berton Gunter <gunter.berton at gene.com>
I've tried that:
read.table(myData, colClasses = NULL)
colClasses doesn't seem to do anything when I put in NULL.
How do I tell R to skip the 2nd column I'm reading in?
thank you,
On 3/7/06, Berton Gunter <gunter.berton at gene.com> wrote:
>
> See the "NULL" value for argument colClasses of read.table().
>
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>
> "The business of the statistician is to catalyze the scientific learning
> process." - George E. P. Box
>
>
>
> > -----Original Message-----
> > From: r-help-bounces at stat.math.ethz.ch
> > [mailto: r-help-bounces at stat.math.ethz.ch] On Behalf Of mark salsburg
> > Sent: Tuesday, March 07, 2006 1:30 PM
> > To: R-help at stat.math.ethz.ch
> > Cc: r-help at stat.math.ethz.ch
> > Subject: [R] reading in only one column from text file
> >
> > How do I manipulate the read.table function to read in only the 2nd
> > column???
------------------------------
Message: 40
Date: 07 Mar 2006 23:01:33 +0100
From: Peter Dalgaard <p.dalgaard at biostat.ku.dk>
Subject: Re: [R] reading in only one column from text file
To: "mark salsburg" <mark.salsburg at gmail.com>
Cc: R-help at stat.math.ethz.ch, r-help at stat.math.ethz.ch
Message-ID: <x2acc1ykw2.fsf at turmalin.kubism.ku.dk>
Content-Type: text/plain; charset=iso-8859-1
"mark salsburg" <mark.salsburg at gmail.com> writes:
> How do I manipulate the read.table function to read in only the 2nd
> column???
Something along the lines of
cc <- rep("NULL", ncols)
cc[2] <- NA # use type.convert
read.table(.... colClasses=cc ....)
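A self-contained sketch of the colClasses approach outlined above. The temporary file, its contents, and the names path, ncols and second are made up for illustration; using count.fields() to find the number of columns is an added assumption, not part of the original reply.

```r
# Illustrative data only: three tab-separated columns with a header line.
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t10\tx", "2\t20\ty", "3\t30\tz"), path)

# The number of fields on the first line gives the number of columns.
ncols <- count.fields(path, sep = "\t")[1]

# "NULL" (a character string, with quotes) skips a column;
# NA lets read.table choose the type via type.convert.
cc <- rep("NULL", ncols)
cc[2] <- NA

second <- read.table(path, header = TRUE, sep = "\t", colClasses = cc)
print(second)
```

The result is still a data frame, here with the single column b; use second[[1]] if a plain vector is wanted.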
--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
------------------------------
Message: 41
Date: Tue, 07 Mar 2006 18:06:51 -0400
From: Kjetil Brinchmann Halvorsen
<kjetilbrinchmannhalvorsen at gmail.com>
Subject: Re: [R] reading in only one column from text file
To: mark salsburg <mark.salsburg at gmail.com>
Cc: R-help at stat.math.ethz.ch
Message-ID: <440E03FB.2090906 at gmail.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
mark salsburg wrote:
> How do I manipulate the read.table function to read in only the 2nd
> column???
See the colClasses argument of read.table().
Kjetil
------------------------------
Message: 42
Date: Tue, 7 Mar 2006 14:07:22 -0800
From: Berton Gunter <gunter.berton at gene.com>
Subject: Re: [R] reading in only one column from text file
To: "'mark salsburg'" <mark.salsburg at gmail.com>
Cc: 'R-Help' <r-help at stat.math.ethz.ch>
Message-ID: <004401c64233$82d6d340$1a83fea9 at gne.windows.gene.com>
Content-Type: text/plain; charset="US-ASCII"
You are not reading the Help file correctly. It says:
"character. A vector of classes to be assumed for the columns. Recycled as
necessary."
Note "character": the argument must be a character vector.
colClasses=NULL means that you have supplied no colClasses argument. It does
exactly what you tell it to (use whatever defaults it has, in other words).
What skips a column is the quoted string "NULL":
colClasses=c("numeric","NULL",...) ## however many more columns you have for ...
Please note the **quotes**. It behaves as documented.
-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
> -----Original Message-----
> From: mark salsburg [mailto:mark.salsburg at gmail.com]
> Sent: Tuesday, March 07, 2006 1:57 PM
> To: Berton Gunter
> Subject: Re: [R] reading in only one column from text file
>
> I've tried that:
>
> read.table(myData, colClasses = NULL)
>
> colClasses doesn't seem to do anything when I put in NULL.
>
> How do I tell R to skip the 2nd column i'm reading in???
>
> thank you,
>
>
>
>
> On 3/7/06, Berton Gunter <gunter.berton at gene.com> wrote:
>
> See the "NULL" value for argument colClasses of read.table().
>
> -- Bert Gunter
> Genentech Non-Clinical Statistics
> South San Francisco, CA
>
> "The business of the statistician is to catalyze the
> scientific learning
> process." - George E. P. Box
>
>
>
> > -----Original Message-----
> > From: r-help-bounces at stat.math.ethz.ch
> > [mailto: r-help-bounces at stat.math.ethz.ch] On Behalf Of mark salsburg
> > Sent: Tuesday, March 07, 2006 1:30 PM
> > To: R-help at stat.math.ethz.ch
> > Cc: r-help at stat.math.ethz.ch
> > Subject: [R] reading in only one column from text file
> >
> > How do I manipulate the read.table function to read in only the 2nd
> > column???
------------------------------
Message: 43
Date: Tue, 7 Mar 2006 17:08:48 -0500
From: "Liaw, Andy" <andy_liaw at merck.com>
Subject: Re: [R] reading in only one column from text file
To: "'mark salsburg'" <mark.salsburg at gmail.com>,
R-help at stat.math.ethz.ch
Cc: r-help at stat.math.ethz.ch
Message-ID:
<39B6DDB9048D0F4DAD42CB26AAFF0AFAFED8DA at usctmx1106.merck.com>
Content-Type: text/plain
read.table("datafile", colClasses=c("NULL", "numeric"), ...)
or something like that.
Andy
From: mark salsburg
>
> How do I manipulate the read.table function to read in only
> the 2nd column???
------------------------------
Message: 44
Date: Tue, 07 Mar 2006 17:10:33 -0500
From: Sean Davis <sdavis2 at mail.nih.gov>
Subject: [R] Writing out complex text file
To: r-help <r-help at stat.math.ethz.ch>
Message-ID: <C0336F09.7A0A%sdavis2 at mail.nih.gov>
Content-Type: text/plain; charset="US-ASCII"
I have a group of data.frames that I need to write out interspersed with
some complex header information at the top of each. The file needs to
include many such data.frame/header pairs. Is it as simple as using a
connection and printing everything to it (including tabs, end-of-line
characters, etc.), or is there some other way to go about this?
Thanks,
Sean
------------------------------
Message: 45
Date: Tue, 07 Mar 2006 17:15:27 -0500
From: Jean Eid <jeaneid at chass.utoronto.ca>
Subject: Re: [R] reading in only one column from text file
To: mark salsburg <mark.salsburg at gmail.com>
Cc: R-help at stat.math.ethz.ch
Message-ID: <440E05FF.8080403 at chass.utoronto.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
You might want to read ?scan and pay attention to the what= argument.
Jean
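A minimal sketch of the scan() route, assuming a tab-delimited file with one header line; the file contents and the names path, fields and second_col are invented for illustration. As documented in ?scan, a NULL component in the what= list means the corresponding field is skipped.

```r
# Illustrative data only: three tab-separated columns with a header line.
path <- tempfile(fileext = ".txt")
writeLines(c("a\tb\tc", "1\t10\tx", "2\t20\ty", "3\t30\tz"), path)

# One list component per field: NULL skips the field, numeric() reads numbers.
# skip = 1 jumps over the header line.
fields <- scan(path, what = list(NULL, numeric(), NULL),
               sep = "\t", skip = 1, quiet = TRUE)

second_col <- fields[[2]]
print(second_col)
```

Unlike read.table(), this returns a plain list, so the second column comes back as an ordinary numeric vector.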
mark salsburg wrote:
>How do I manipulate the read.table function to read in only the 2nd
>column???
------------------------------
Message: 46
Date: Tue, 07 Mar 2006 17:21:31 -0500
From: Jean Eid <jeaneid at chass.utoronto.ca>
Subject: Re: [R] glm automation
To: A Mani <a.manigs at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID: <440E076B.5030804 at chass.utoronto.ca>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
If you know what the dependent variable is called, and it is the same for
all regressions, see ?formula.
Or you can do glm(ABC ~ ., ..., data=dat)
PS. The way you are writing the formula, you are taking the first element
of the first column and the first element of the second column, which is
not what you want.
For plots, see ?postscript.
Jean
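One way the advice above might be put together, as a sketch rather than a tested recipe. The helper name fit_two_columns, the simulated data and the temporary file names are all hypothetical. The formula is built from the header names with reformulate(), and the family is gaussian() from the stats package, since base R does not appear to provide a family called normal().

```r
# Hypothetical helper: regress the first column on the second,
# whatever the two columns happen to be called in the file.
fit_two_columns <- function(path, family = gaussian(link = "identity")) {
  dat <- read.table(path, header = TRUE, sep = "\t")
  # Build e.g. ABC ~ EFG from the header instead of indexing into the data.
  f <- reformulate(names(dat)[2], response = names(dat)[1])
  glm(f, family = family, data = dat)
}

# Illustrative file; the numbers are simulated.
set.seed(1)
efg <- 1:20
path <- tempfile(fileext = ".txt")
write.table(data.frame(ABC = 2 + 0.5 * efg + rnorm(20, sd = 0.2), EFG = efg),
            path, sep = "\t", row.names = FALSE)

fit <- fit_two_columns(path)

# Send a diagnostic plot to a file; nothing is drawn on screen.
plot_file <- tempfile(fileext = ".ps")
postscript(plot_file)
plot(fit, which = 1)
dev.off()
```

For the log-link fit, pass family = gaussian(link = "log"). To process several files, the helper can be applied with lapply() over a vector of paths.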
A Mani wrote:
>Hello,
> I have two problems in automating multiple glm(s) operations.
>The data file is tab delimited file with headers and two columns. like
>
>"ABC" "EFG"
>1 2
>2 3
>3 4
>dat <- read.table("FILENAME", header=TRUE, sep="\t", na.strings="NA",
>dec=".", strip.white=TRUE)
>dataf <- read.table("FILENAME", header=FALSE, sep="\t", na.strings="NA",
>dec=".", strip.white=TRUE)
>norm1 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat)
>norm2 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(identity), data=dat)
>and so on.
>
>But glm does not work on the data unless I write ABC and EFG there... I want
>to automate the script for multiple files.
>
>The other problem is to write the plot(GLM) to a file without displaying it
>at stdout.
>
>Thanks,
>
>
>A. Mani
>Member, Cal. Math. Soc
------------------------------
Message: 47
Date: Tue, 7 Mar 2006 23:35:29 +0100
From: "Henrik Bengtsson" <hb at maths.lth.se>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "hadley wickham" <h.wickham at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<59d7961d0603071435n4049725eq83c613bc6561f07f at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Hi.
On 3/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> "[.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("[", x)
> }
>
> "[[.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("[[", x)
> }
>
> "$.ggobiDataset" <- function(x, ..., drop=FALSE) {
> x <- as.data.frame(x)
> NextMethod("$", x)
> }
>
> > class(x)
> [1] "ggobiDataset" "data.frame"
>
> > x[1:2,1:2]
> total_bill tip
> 1 16.99 1.01
> 2 10.34 1.66
>
> > x[[1]]
> [1] 16.99 10.34 21.01 23.68 24.59 25.29 8.77 26.88 15.04 14.78 10.27 35.26
> [13] 15.42 18.43 14.83 21.58 10.33 16.29 16.97 20.65 17.92 20.29 15.77 39.42
> ...
>
> > x$total_bill
> Error in "$.default"(x, "total_bill") : invalid subscript type
Where/how is "$.default"() defined? I don't have one:
> getAnywhere("$.default")
no object named '$.default' was found
/Henrik
>
> What do I need to do to get "$.ggobiDataset" to imitate a data.frame
> like [ and [[ do?
>
> Thanks,
>
> Hadley
------------------------------
Message: 48
Date: Tue, 7 Mar 2006 16:37:52 -0600
From: "hadley wickham" <h.wickham at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "Henrik Bengtsson" <hb at maths.lth.se>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<f8e6ff050603071437w1f412fan21246a5b0c12a69d at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
> > > x$total_bill
> > Error in "$.default"(x, "total_bill") : invalid subscript type
>
> where/how is "$.default"() defined? I don't have one;
>
> > getAnywhere("$.default")
> no object named '$.default' was found
Neither do I. I guess there is some internal magic going on for $.
Hadley
------------------------------
Message: 49
Date: Tue, 7 Mar 2006 17:44:05 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "hadley wickham" <h.wickham at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<971536df0603071444k4097053evc7eaab2f21cf918a at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Just to get something reproducible, let's assume the
objects of class "myobj" each consist of a one-element
list that contains a data frame. Then try this:
# constructor
myobj <- function(...)
    structure(list(value = data.frame(...)), class = "myobj")
"$.myobj" <- function(obj, x) .subset2(obj, 1)[[x]]
"[[.myobj" <- function(obj, ...) .subset2(obj, 1)[[...]]
"[.myobj" <- function(obj, ...) .subset2(obj, 1)[...]
# test
x <- myobj(x = 1:3, y = 4:6)
class(x)
x[[1]]
x[1,]
x$y
On 3/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> > If your class is a subclass of data.frame then I think
> > it ought to be a special sort of data frame so all you
> > need to do is the following in which case you get subscripting
> > for free by inheritance:
>
> My class is actually an external pointer to a data set stored in
> ggobi, so I don't think this will work. as.data.frame.ggobiDataset
> retrieves the whole dataset out of ggobi and in to R.
>
> Hadley
>
------------------------------
Message: 50
Date: Tue, 07 Mar 2006 17:47:03 -0500
From: Sean Davis <sdavis2 at mail.nih.gov>
Subject: Re: [R] Writing out complex text file
To: Sean Davis <sdavis2 at mail.nih.gov>, r-help
<r-help at stat.math.ethz.ch>
Message-ID: <C0337797.7A1C%sdavis2 at mail.nih.gov>
Content-Type: text/plain; charset="US-ASCII"
On 3/7/06 5:10 PM, "Sean Davis" <sdavis2 at mail.nih.gov> wrote:
> I have a group of data.frames that I need to write out interspersed with
> some complex header information at the top of each. The file needs include
> many such data.frame/header pairs. Is it as simple as using a connection
> and printing to it everything (including tabs and endlines, etc.), or is
> there some other way to go about this?
After some experimentation, I answered my own question. Just open a
connection using file() and then use writeLines and write.table, passing
the connection object to each. Works like a charm.
Sean
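The approach described above could look roughly like this. The helper name write_blocks and the structure of the blocks list (a header character vector plus a data frame per block) are assumptions made for the example, not something taken from the original post.

```r
# Hypothetical helper: `blocks` is a list of
# list(header = <character vector>, data = <data.frame>).
write_blocks <- function(blocks, path) {
  con <- file(path, open = "w")
  on.exit(close(con))
  for (b in blocks) {
    writeLines(b$header, con)             # free-form header lines
    write.table(b$data, con, sep = "\t",  # table goes to the same open connection
                quote = FALSE, row.names = FALSE)
  }
  invisible(path)
}

# Made-up example data: two header/data.frame pairs.
blocks <- list(
  list(header = c("# block 1", "# units: cm"),
       data = data.frame(id = 1:2, len = c(3.5, 4.1))),
  list(header = "# block 2",
       data = data.frame(id = 3L, len = 5))
)

out <- write_blocks(blocks, tempfile(fileext = ".txt"))
writeLines(readLines(out))
```

Because the connection is opened once and stays open, each writeLines() and write.table() call continues where the previous one stopped, so no append= argument is needed.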
------------------------------
Message: 51
Date: Tue, 7 Mar 2006 17:00:29 -0600
From: "hadley wickham" <h.wickham at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<f8e6ff050603071500n2b4c082akfc17923cace49ef2 at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
> Just to get something reproducible lets assume the
> objects of class "myobj" each consists of a one-element
> list that contains a data frame. Then try this:
Thanks for that - it makes sense. Every time I try to use inheritance
in R, I always seem to end up using a different method.
Hadley
------------------------------
Message: 52
Date: Tue, 7 Mar 2006 18:18:53 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "hadley wickham" <h.wickham at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<971536df0603071518q62cd47d9r807605f4e775008f at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
On 3/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> > Just to get something reproducible lets assume the
> > objects of class "myobj" each consists of a one-element
> > list that contains a data frame. Then try this:
>
> Thanks for that - it makes sense. Every time I try to use inheritance
> in R, I always seem to end up using a different method.
>
> Hadley
>
I tend to have to use trial and error myself. Here is another
possibility.
myobj <- function(...)
    structure(list(value = data.frame(...)), class = "myobj")
"$.myobj" <- function(obj, name) obj[[name]]
"[[.myobj" <- function(obj, ...) {
    obj <- .subset2(obj, 1)
    .Class <- "data.frame"
    NextMethod("[[", obj)
}
"[.myobj" <- function(obj, ...) {
    obj <- .subset2(obj, 1)
    .Class <- "data.frame"
    NextMethod("[", obj)
}
# test
z <- myobj(x = 1:3, y = 4:6)
class(z)
z[[1]]
z[1,]
z$y
z[["x"]]
------------------------------
Message: 53
Date: Tue, 7 Mar 2006 17:28:47 -0600
From: "hadley wickham" <h.wickham at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<f8e6ff050603071528p339310aaqa04dcf206cd21d6d at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
> I tend to have to use trial and error myself. Here is another
> possibility.
That's got the subsetting solved, so here's the next challenge:
> lm(x ~ y, z)
Error in as.data.frame.default(data) : cannot coerce class "myobj" into a data.frame
> as.data.frame.myobj <- function(x) x[[1]]
> lm(x ~ y, z)
Error in eval(expr, envir, enclos) : numeric 'envir' arg not of length one
I'm guessing this is pretty much impossible to get around, because
there is no way to tell eval how to deal with myobj type objects, and
lm only dispatches based on the type of the first argument.
Hadley
------------------------------
Message: 54
Date: Tue, 7 Mar 2006 17:39:28 -0800
From: Michael <comtech.usa at gmail.com>
Subject: [R] how to use the randomForest and rpart function?
To: R-help at stat.math.ethz.ch
Message-ID:
<b1f16d9d0603071739v3b9da963jf3157f1b90954ee9 at mail.gmail.com>
Content-Type: text/plain
Hi all,
I am trying to play around with the randomForest function for
classification. I know its performance is great.
I am currently using the default options.
It has many options.
How do I further tweak the options so that I can make its performance
even better?
Which options are most commonly used?
Thanks a lot!
M
------------------------------
Message: 55
Date: Tue, 7 Mar 2006 17:44:25 -0800
From: Michael <comtech.usa at gmail.com>
Subject: Re: [R] how to use the randomForest and rpart function?
To: R-help at stat.math.ethz.ch
Message-ID:
<b1f16d9d0603071744n49f37cb6ve87c738809f2f8fd at mail.gmail.com>
Content-Type: text/plain
When I plot the randomForest object, it shows a graph with 3 lines (green,
red and black). What do these three lines mean?
On 3/7/06, Michael <comtech.usa at gmail.com> wrote:
>
> Hi all,
>
> I am trying to play around with the randomForest function for
> classification. I know its performance is great.
>
> I am currently using the default options.
>
> It has many options.
>
> How do I further tweak the options so that I can make its performance
even
> better?
>
> What are the options that are mostly used?
>
> Thanks a lot!
>
> M
>
------------------------------
Message: 56
Date: Tue, 7 Mar 2006 20:49:55 -0500
From: "Gabor Grothendieck" <ggrothendieck at gmail.com>
Subject: Re: [R] Making an S3 object act like a data.frame
To: "hadley wickham" <h.wickham at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<971536df0603071749w779a7631t46671b3b7cb0064f at mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
The problem is that x[[1]] in the definition of as.data.frame.myobj
invokes [[.myobj whereas we want to extract the first element of the
list in the internal representation of x -- which is not the same.
Try this instead:
> as.data.frame.myobj <- function(x) .subset2(x, 1)
> lm(y ~ x, z)
Call:
lm(formula = y ~ x, data = z)
Coefficients:
(Intercept) x
3 1
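For reference, the pieces of this thread can be collected into one script. This is a sketch built on the same toy assumption used above (a "myobj" is a one-element list wrapping a data frame); the extra ... argument in as.data.frame.myobj is an addition so that the method matches the generic's signature.

```r
# Constructor: a "myobj" is a one-element list holding a data frame.
myobj <- function(...)
  structure(list(value = data.frame(...)), class = "myobj")

# Subsetting methods delegate to the wrapped data frame.
# .subset2() bypasses dispatch, so it reaches the internal list element.
"$.myobj"  <- function(obj, name) .subset2(obj, 1)[[name]]
"[[.myobj" <- function(obj, ...) .subset2(obj, 1)[[...]]
"[.myobj"  <- function(obj, ...) .subset2(obj, 1)[...]

# Coercion must also bypass "[[.myobj", otherwise x[[1]] would return
# the first *column* of the data frame instead of the data frame itself.
as.data.frame.myobj <- function(x, ...) .subset2(x, 1)

z <- myobj(x = 1:3, y = 4:6)
fit <- lm(y ~ x, data = z)
print(coef(fit))
```

With this toy data y is exactly x + 3, so the fitted intercept and slope match the output shown above.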
On 3/7/06, hadley wickham <h.wickham at gmail.com> wrote:
> > I tend to have to use trial and error myself. Here is another
> > possibility.
>
> That's got the subsetting solved, so here's the next challenge
>
> > lm(x ~ y, z)
> Error in as.data.frame.default(data) : cannot coerce class "myobj"
> into a data.frame
> > as.data.frame.myobj <- function(x) x[[1]]
> > lm(x ~ y, z)
> Error in eval(expr, envir, enclos) : numeric 'envir' arg not of length
one
>
> I'm guessing this is pretty much impossible to get around, because
> there is no way to tell eval how to deal with myobj type objects, and
> lm only dispatches based on the type of the first argument.
>
> Hadley
>
------------------------------
Message: 57
Date: Tue, 7 Mar 2006 21:00:57 -0500
From: "Liaw, Andy" <andy_liaw at merck.com>
Subject: Re: [R] how to use the randomForest and rpart function?
To: "'Michael'" <comtech.usa at gmail.com>, R-help at stat.math.ethz.ch
Message-ID:
<39B6DDB9048D0F4DAD42CB26AAFF0AFAFED8DD at usctmx1106.merck.com>
Content-Type: text/plain
As ?plot.randomForest says, it plots error rates. In addition to the
overall error rate, it also plots the error rate for each class.
As to the options in randomForest, read about them in the help page
and in the reference linked from the help page.
Andy
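To tell the lines apart, one option is to redraw the curves with an explicit legend. The sketch below assumes the fitted classification forest stores its error rates in the err.rate matrix, with one column for the overall out-of-bag rate and one per class, as described in ?randomForest. The helper name plot_rf_errors is hypothetical, and it uses base graphics rather than plot.randomForest itself.

```r
# Hypothetical helper: plot every column of rf$err.rate and label it.
# `rf` is assumed to be a classification forest fitted with randomForest().
plot_rf_errors <- function(rf) {
  err <- rf$err.rate            # rows: number of trees; columns: OOB, then classes
  k <- ncol(err)
  matplot(seq_len(nrow(err)), err, type = "l",
          col = seq_len(k), lty = seq_len(k),
          xlab = "trees", ylab = "Error")
  legend("topright", legend = colnames(err),
         col = seq_len(k), lty = seq_len(k))
  invisible(colnames(err))
}

# Usage, assuming the randomForest package is installed:
# rf <- randomForest::randomForest(Species ~ ., data = iris)
# plot_rf_errors(rf)
```

The colours and line types are set explicitly, so the legend entries are guaranteed to match the curves drawn by this helper.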
From: Michael
>
> When I plot the randomForest object, it shows a graph with 3
> lines, green, red and black, what's the meaning of these three lines?
>
> On 3/7/06, Michael <comtech.usa at gmail.com> wrote:
> >
> > Hi all,
> >
> > I am trying to play around with the randomForest function for
> > classification. I know its performance is great.
> >
> > I am currently using the default options.
> >
> > It has many options.
> >
> > How do I further tweak the options so that I can make its
> performance
> > even better?
> >
> > What are the options that are mostly used?
> >
> > Thanks a lot!
> >
> > M
> >
------------------------------
Message: 58
Date: Wed, 8 Mar 2006 10:12:21 +0800
From: ronggui <ronggui.huang at gmail.com>
Subject: Re: [R] glm automation
To: "A Mani" <a.manigs at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID: <38b9f0350603071812h78aa338dh at mail.gmail.com>
Content-Type: text/plain; charset=GB2312
2006/3/8, A Mani <a.manigs at gmail.com>:
> Hello,
> I have two problems in automating multiple glm(s) operations.
> The data file is tab delimited file with headers and two columns. like
>
> "ABC" "EFG"
> 1 2
> 2 3
> 3 4
> dat <- read.table("FILENAME", header=TRUE, sep="\t", na.strings="NA",
> dec=".", strip.white=TRUE)
> dataf <- read.table("FILENAME", header=FALSE, sep="\t",
na.strings="NA",
> dec=".", strip.white=TRUE)
> norm1 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat)
> norm2 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(identity),
data=dat)
> and so on.
It should be
norm1 <- glm(dataf[,1] ~ dataf[,2], family= normal(log), data=dat)
norm2 <- glm(dataf[,1] ~ dataf[,2], family= normal(identity), data=dat)
You should read the documentation of "[" (see ?"[") to learn how indexing works.
> But glm does not work on the data unless I write ABC and EFG there...
I want
> to automate the script for multiple files.
>
> The other problem is to write the plot(GLM) to a file without
displaying it
> at stdout.
>
> Thanks,
>
>
> A. Mani
> Member, Cal. Math. Soc
--
Department of Sociology
Fudan University
------------------------------
Message: 59
Date: Tue, 7 Mar 2006 18:27:27 -0800
From: Michael <comtech.usa at gmail.com>
Subject: Re: [R] how to use the randomForest and rpart function?
To: "Liaw, Andy" <andy_liaw at merck.com>
Cc: R-help at stat.math.ethz.ch
Message-ID:
<b1f16d9d0603071827n456f1cb1gb55e9fa699e25a4c at mail.gmail.com>
Content-Type: text/plain
It did not have a legend showing which color is for class 1, which color
is for class 2, etc.
I've read the R help page.
It lists a lot of options, but it does not say which ones are the key
parameters that people use most for improving performance.
Do you know?
On 3/7/06, Liaw, Andy <andy_liaw at merck.com> wrote:
>
> As ?plot.randomForest says, it plots error rates. In addition to
overall
> error rates, it also plots error rates for each class.
>
> As to the options in randomForest, read about the options in the help
page
> and the reference linked from the help page.
>
> Andy
>
> From: Michael
> >
> > When I plot the randomForest object, it shows a graph with 3
> > lines, green, red and black, what's the meaning of these three
lines?
> >
> > On 3/7/06, Michael <comtech.usa at gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > I am trying to play around with the randomForest function for
> > > classification. I know its performance is great.
> > >
> > > I am currently using the default options.
> > >
> > > It has many options.
> > >
> > > How do I further tweak the options so that I can make its
> > performance
> > > even better?
> > >
> > > What are the options that are mostly used?
> > >
> > > Thanks a lot!
> > >
> > > M
> > >
------------------------------
Message: 60
Date: Tue, 7 Mar 2006 21:29:07 -0500
From: "Liaw, Andy" <andy_liaw at merck.com>
Subject: Re: [R] glm automation
To: "'ronggui'" <ronggui.huang at gmail.com>, "A Mani"
<a.manigs at gmail.com>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID:
<39B6DDB9048D0F4DAD42CB26AAFF0AFAFED8E0 at usctmx1106.merck.com>
Content-Type: text/plain; charset=gb2312
From: ronggui
>
> 2006/3/8, A Mani <a.manigs at gmail.com>:
> > Hello,
> > I have two problems in automating multiple glm(s) operations.
> > The data file is a tab-delimited file with headers and two columns, like
> >
> > "ABC" "EFG"
> > 1 2
> > 2 3
> > 3 4
> > dat <- read.table("FILENAME", header=TRUE, sep="\t", na.strings="NA",
> >                   dec=".", strip.white=TRUE)
> > dataf <- read.table("FILENAME", header=FALSE, sep="\t", na.strings="NA",
> >                     dec=".", strip.white=TRUE)
> > norm1 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(log), data=dat)
> > norm2 <- glm(dataf[1,1] ~ dataf[1,2], family= normal(identity), data=dat)
> > and so on.
> It should be
> norm1 <- glm(dataf[,1] ~ dataf[,2], family= normal(log), data=dat)
> norm2 <- glm(dataf[,1] ~ dataf[,2], family= normal(identity), data=dat)
>
> You should read the documentation of "[" to see how indexing works.
I wish people would just give up on (ab)using model formulae like that.
IMHO it's really asking for trouble down the road. For example:
> n <- 5
> d1 <- data.frame(x=1:n, y=rnorm(n))
> d2 <- data.frame(u=n:1, v=rnorm(n))
> d3 <- data.frame(y=rnorm(n), x=n:1)
> f1 <- lm(d1[,2] ~ d1[,1], data=d1)
> f2 <- lm(y ~ x, data=d1)
> predict(f1)
1 2 3 4 5
-1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
> predict(f2)
1 2 3 4 5
-1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
> predict(f1, d2)
1 2 3 4 5
-1.767697694 -1.326691900 -0.885686106 -0.444680312 -0.003674518
Notice anything odd above?
> predict(f2, d3)
1 2 3 4 5
-0.003674518 -0.444680312 -0.885686106 -1.326691900 -1.767697694
Now that's more like it...
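The odd thing is that predict(f1, d2) silently returns the fitted values: the terms d1[,2] and d1[,1] do not name any column, so newdata is never consulted. For the original question of automating the fit over several files, here is a minimal sketch, assuming each file is tab-delimited with a header row and the response in the first column. The helper name fit_two_column_file and the demo data are hypothetical, and gaussian() is used in place of the normal() written above, which is not a family object in base R.

```r
# Hypothetical helper: fit a Gaussian GLM of column 1 on column 2,
# building the formula from the file's own header names.
fit_two_column_file <- function(path, link = "identity") {
  dat <- read.table(path, header = TRUE, sep = "\t",
                    na.strings = "NA", dec = ".", strip.white = TRUE)
  vars <- names(dat)
  form <- reformulate(vars[2], response = vars[1])  # e.g. ABC ~ EFG
  glm(form, family = gaussian(link = link), data = dat)
}

# Demo on a temporary file (made-up data, for illustration only).
set.seed(1)
demo <- data.frame(ABC = rnorm(20), EFG = 1:20)
path <- tempfile(fileext = ".txt")
write.table(demo, path, sep = "\t", row.names = FALSE)

fit <- fit_two_column_file(path)

# Because the model was fitted by name, predict() honours newdata.
p_new <- predict(fit, newdata = data.frame(EFG = c(21, 22)))

# Write the diagnostic plots to a file instead of the screen.
out <- tempfile(fileext = ".pdf")
pdf(out)
plot(fit)
dev.off()
```

Looping over files is then a matter of lapply(files, fit_two_column_file), where files is whatever character vector of paths you have.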
Andy
> > But glm does not work on the data unless I write ABC and EFG there...
> > I want to automate the script for multiple files.
> >
> > The other problem is to write the plot(GLM) to a file without
> > displaying it at stdout.
> >
> > Thanks,
> >
> >
> > A. Mani
> > Member, Cal. Math. Soc
> >
> >
>
>
> --
> ??????
> Department of Sociology
> Fudan University
>
>
Message: 61
Date: Tue, 7 Mar 2006 21:31:15 -0500
From: "Liaw, Andy" <andy_liaw at merck.com>
Subject: Re: [R] how to use the randomForest and rpart function?
To: "'Michael'" <comtech.usa at gmail.com>
Cc: R-help at stat.math.ethz.ch
Message-ID:
Content-Type: text/plain
Yes, I do know. That's why I pointed you to the reference linked from
the help page.
BTW, there's also an R News article describing the initial version of
the package. Have you perused that?
Andy
-----Original Message-----
From: Michael [mailto:comtech.usa at gmail.com]
Sent: Tuesday, March 07, 2006 9:27 PM
To: Liaw, Andy
Cc: R-help at stat.math.ethz.ch
Subject: Re: [R] how to use the randomForest and rpart function?
It did not have a legend showing which color is for class1, which
color is for class2, etc...
I've read the R-help page.
It lists a lot of options, but it did not say which ones are the key
parameters that people use most for improving performance...
Do you know?
On 3/7/06, Liaw, Andy <andy_liaw at merck.com> wrote:
As ?plot.randomForest says, it plots error rates. In addition to
overall error rates, it also plots error rates for each class.
As to the options in randomForest, read about the options in the help
page and the reference linked from the help page.
Andy
From: Michael
>
> When I plot the randomForest object, it shows a graph with 3
> lines, green, red and black, what's the meaning of these three lines?
>
> On 3/7/06, Michael <comtech.usa at gmail.com> wrote:
> >
> > Hi all,
> >
> > I am trying to play around with the randomForest function for
> > classification. I know its performance is great.
> >
> > I am currently using the default options.
> >
> > It has many options.
> >
> > How do I further tweak the options so that I can make its
> > performance even better?
> >
> > What are the options that are mostly used?
> >
> > Thanks a lot!
> >
> > M
> >
>
Message: 62
Date: Wed, 8 Mar 2006 11:51:29 +0600
From: Zepu Zhang <zpzhang at uchicago.edu>
Subject: [R] problem installing RNetCDF
To: r-help at stat.math.ethz.ch
Message-ID: <4d95c89d.95a3c736.81eab00 at m4500-03.uchicago.edu>
Content-Type: text/plain; charset=us-ascii
Hello all,
I set 'UDUNITS_PATH' and 'NETCDF_PATH' successfully to my custom places
and then
% R CMD INSTALL RNetCDF_1.1-3.tar.gz
and got this:
...
checking for executable suffix...
checking for object suffix... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for main in -lnetcdf... yes
checking for main in -ludunits... yes
checking for /Users/zpzhang/research/library/include/netcdf.h... yes
checking for /Users/zpzhang/research/library/include/udunits.h... yes
configure: creating ./config.status
config.status: creating R/load.R
config.status: creating src/Makevars
** libs
gcc -no-cpp-precomp -I/sw/Library/Frameworks/R.framework/Resources/include -I/Users/zpzhang/research/library/include -I/Users/zpzhang/research/library/include -I/sw/include -fno-common -g -O2 -c RNetCDF.c -o RNetCDF.o
gcc: unrecognized option '-no-cpp-precomp'
gcc -bundle -flat_namespace -undefined suppress -lcc_dynamic -L/sw/lib -o RNetCDF.so RNetCDF.o -L/Users/zpzhang/research/library/lib -lnetcdf -L/Users/zpzhang/research/library/lib -ludunits -lcc_dynamic -F/sw/Library/Frameworks -framework R
gcc: couldn't run 'undle-gcc-4.0.2': No such file or directory
make: *** [RNetCDF.so] Error 1
ERROR: compilation failed for package 'RNetCDF'
....
Does anyone have any insight into this 'undle-gcc-4.0.2' problem?
By the way, the package contains only one C source file, but the
installation seems quite complicated. I understand it uses 'standard'
procedures (configure, etc.), but wouldn't a simple Makefile be much
easier to cope with? It would be pretty straightforward to see which
variables and paths to change, etc.
By the way again, any comments on ncdf vs RNetCDF? I just couldn't like
the inconsistent naming in ncdf (put.var.ncdf, att.put.ncdf, dim.get.ncdf,
get.var.ncdf, ...), otherwise its very recent update would have
attracted me to it.
Anyway, these are not big packages and both should be fine.
Thanks for the help!
Message: 63
Date: Wed, 8 Mar 2006 00:36:27 -0600
From: Nestor Arguea <narguea at uwf.edu>
Subject: [R] Degrees of freedom using Box.test()
To: r-help <r-help at stat.math.ethz.ch>
Message-ID: <200603080036.28416.narguea at uwf.edu>
Content-Type: text/plain; charset="us-ascii"
After an RSiteSearch("Box.test") I found some discussion regarding the
degrees of freedom in the computation of the Ljung-Box test using
Box.test(), but did not find any posting about the proper degrees of
freedom.
Box.test() uses "lag=number" as the degrees of freedom. However, I
believe the correct degrees of freedom should be "number-p-q" where p
and q are the number of estimated parameters (for instance, in a
Box-Jenkins family of models). This is according to the main source in
the documentation of Box.test:
G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series
Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303.
One can still compute the correct p-value with
>1-pchisq(value,correctdf)
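A minimal sketch of that correction, assuming an ARMA(1,1) fitted to simulated data (the series, seed and lag are made up for illustration). Recent versions of R also give Box.test() a fitdf argument that subtracts the number of fitted parameters from the degrees of freedom; I would not assume it exists in R 2.2.1.

```r
set.seed(42)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)  # simulated series
fit <- arima(x, order = c(1, 0, 1))
res <- residuals(fit)

lag <- 10
p <- 1
q <- 1

# Default behaviour: degrees of freedom equal to the lag.
bt <- Box.test(res, lag = lag, type = "Ljung-Box")

# Manual correction: same statistic, referred to chi-square on lag - p - q df.
correctdf <- lag - p - q
p_manual <- pchisq(unname(bt$statistic), df = correctdf, lower.tail = FALSE)

# Same correction through the fitdf argument (recent R only).
bt_fitdf <- Box.test(res, lag = lag, type = "Ljung-Box", fitdf = p + q)
```

The correction only makes sense when lag is larger than p + q; otherwise the degrees of freedom are zero or negative.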
Nestor
(R 2.2.1 on Linux, Suse 9.3)
--
Nestor M. Arguea, Chair
Department of Marketing and Economics
University of West Florida
11000 University Parkway
Pensacola, FL 32514
Phone: (850)474-3071
Fax: (850)474-3069
Message: 64
Date: Wed, 8 Mar 2006 09:09:49 +0100
From: Robert Lundqvist <Robert.Lundqvist at ltu.se>
Subject: [R] info() function?
To: r-help at stat.math.ethz.ch
Message-ID: <Pine.GSO.4.58.0603080906360.11827 at delta8.math.ltu.se>
Content-Type: TEXT/PLAIN; charset=US-ASCII
I would like to have some function for getting an overview of the
variables in a worksheet. Class, dimensions, length, number of missing
values,... I guess it wouldn't be that hard to set up such a function,
but I also guess there are others who have made it already. Or is it
already a standard feature in the base package? Any suggestions?
Robert
Message: 65
Date: Wed, 8 Mar 2006 09:38:08 +0100
From: "Henrik Bengtsson" <hb at maths.lth.se>
Subject: Re: [R] info() function?
To: Robert.Lundqvist at ltu.se
Cc: r-help at stat.math.ethz.ch
Message-ID:
Content-Type: text/plain; charset=ISO-8859-1
>library(R.oo)
>ll()
member data.class dimension object.size
1 a numeric 1000 4028
2 author character 1 112
3 exp numeric 1 36
4 last.warning list 2 488
5 object function NULL 864
6 row character 1 72
7 USArrests data.frame c(50,4) 4076
8 VADeaths matrix c(5,4) 824
9 value character 1 72
with NA counts:
>naCount <- function(x, ...) ifelse(is.vector(x), sum(is.na(x)), NA)
>properties <- c("data.class", "dimension", "object.size", "naCount")
>ll(properties=properties)
member data.class dimension object.size
1 a numeric 1000 4028
2 author character 1 112
3 exp numeric 1 36
4 last.warning list 2 488
5 naCount function NULL 864
6 object function NULL 864
7 properties character 4 212
8 row character 1 72
9 USArrests data.frame c(50,4) 4076
10 VADeaths matrix c(5,4) 824
11 value character 1 72
FYI: In the next version of R.oo, there will probably be some kind of
option to set the default 'properties' argument so that it need not
be given explicitly.
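For those without R.oo, a base-R sketch along the same lines. The function name info is hypothetical (borrowed from the subject line); it reports only class, dimension, length and NA count for the columns of a data frame or the objects in an environment, and is not meant to reproduce ll().

```r
# Hypothetical overview function using base R only.
info <- function(x) {
  items <- as.list(x)  # columns of a data frame, or objects in an environment
  data.frame(
    name   = names(items),
    class  = vapply(items, function(v) class(v)[1], character(1)),
    dim    = vapply(items, function(v) {
               d <- dim(v)
               if (is.null(d)) "" else paste(d, collapse = "x")
             }, character(1)),
    length = vapply(items, length, integer(1)),
    n.na   = vapply(items, function(v) {
               # NA counts are only computed for atomic vectors
               if (is.atomic(v)) sum(is.na(v)) else NA_integer_
             }, integer(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Overview of the columns of a built-in data set.
aq <- info(airquality)
```

Calling info(globalenv()) gives the same kind of table for the objects in the workspace, provided it is not empty.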
Cheers
Henrik
On 3/8/06, Robert Lundqvist <Robert.Lundqvist at ltu.se> wrote:
> I would like to have some function for getting an overview of the
> variables in a worksheet. Class, dimensions, length, number of missing
> values,... I guess it wouldn't be that hard to set up such a function,
> but I also guess there are others who have made it already. Or is it
> already a standard feature in the base package? Any suggestions?
>
> Robert
--
Henrik Bengtsson
Mobile: +46 708 909208 (+1h UTC)
Message: 66
Date: Wed, 08 Mar 2006 09:58:46 +0100
From: Patrick Giraudoux <patrick.giraudoux at univ-fcomte.fr>
Subject: [R] package installation on Mac OS X 10.3.9
To: r-help <r-help at stat.math.ethz.ch>
Message-ID: <440E9CC6.1000508 at univ-fcomte.fr>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Dear listers,
I am trying to install a package in a student training room, without
success. I get this message:
> install.packages("pgirmess")
trying URL `http://cran.r-project.org/src/contrib/PACKAGES'
Content type `text/plain; charset=iso-8859-1' length 77400 bytes
opened URL
downloaded 75Kb
trying URL `http://cran.r-project.org/src/contrib/pgirmess_1.2.5.tar.gz'
Content type `application/x-tar' length 49962 bytes
opened URL
downloaded 48Kb
ERROR: failed to lock directory
'/Library/Frameworks/R.framework/Versions/2.0.1/Resources/library' for
modifying
Try removing
'
Delete downloaded files (y/N)? N
Can anybody explain the origin of this failure a bit more clearly
and suggest a way to work around it?
Cheers,
Patrick
Message: 67
Date: Wed, 8 Mar 2006 10:47:48 +0100
From: "Marcel Prokopczuk" <prokopczuk at uni-mannheim.de>
Subject: [R] removing of memory - optim()?
To: <r-help at stat.math.ethz.ch>
Message-ID: <000201c64295$655530c0$b43a9b86 at FIN36>
Content-Type: text/plain; charset="us-ascii"
Dear all,
I have the following problem: I am using Windows XP as OS and have an R
program which uses optim() in a loop. Although I overwrite every variable in
the loop, the memory R is using keeps increasing until my system breaks down
because the swap file gets too big. I suspect that optim() is creating some
variables which are not deleted automatically. So I tried the following:
every 10th iteration or so, I save the variables I want to keep, clear the
workspace with rm(list=ls(all=TRUE)), and load back the data I saved before.
But this does not work either. rm(list=ls(all=TRUE)) does not free
everything (I can see in the Windows Task Manager that the memory occupied
by the Rgui process stays huge).
Has anybody had similar problems and/or knows how to solve this? Thanks in
advance for your help.
Marcel
Message: 68
Date: Wed, 08 Mar 2006 10:00:30 +0000
From: Patrick Burns <pburns at pburns.seanet.com>
Subject: Re: [R] Degrees of freedom using Box.test()
To: Nestor Arguea <narguea at uwf.edu>
Cc: r-help <r-help at stat.math.ethz.ch>
Message-ID: <440EAB3E.4080502 at pburns.seanet.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
You are saying that the penalty on the degrees of freedom
should be the same whether the model was fit with 100
observations or 1 million observations. You are also saying
that some tests should have negative degrees of freedom.
So I don't think your proposal is the right answer, though
presumably there should be some penalty.
There is a working paper on the Burns Statistics website
about robustness in Ljung-Box tests, but this issue is not one
that is covered.
Patrick Burns
patrick at burns-stat.com
+44 (0)20 8525 0696
http://www.burns-stat.com
(home of S Poetry and "A Guide for the Unwilling S User")
Nestor Arguea wrote:
>After an RSiteSearch("Box.test") I found some discussion regarding the
>degrees of freedom in the computation of the Ljung-Box test using
>Box.test(), but did not find any posting about the proper degrees of
>freedom.
>
>Box.test() uses "lag=number" as the degrees of freedom. However, I
>believe the correct degrees of freedom should be "number-p-q" where p
>and q are the number of estimated parameters (for instance, in a
>Box-Jenkins family of models). This is according to the main source in
>the documentation of Box.test:
>
>G. M. Ljung and G. E. P. Box, On a measure of Lack of Fit in Time Series
>Models, Biometrika, Vol. 65, No. 2 (August, 1978), pp. 297-303.
>
>One can still compute the correct p-value with
>
>>1-pchisq(value,correctdf)
>
>Nestor
>(R 2.2.1 on Linux, Suse 9.3)
Message: 69
Date: Wed, 08 Mar 2006 11:10:30 +0100
From: Matias Mayor Fernandez <mmayorf at uniovi.es>
Subject: [R] Read.table
To: R-help at stat.math.ethz.ch
Message-ID: <000c01c64298$883673d0$1219239c at hispa17>
Content-Type: text/plain
Hi,
I have some column vectors in txt or xls files and I need to load them
into R as numeric vectors.
I use the read.table (X=read.table(123.txt”) command but the program
says that “X is not a numeric vector”
Where is the problem?
Matías
University of Oviedo,
Spain
Message: 70
Date: Wed, 08 Mar 2006 11:22:20 +0100
From: Uwe Ligges <ligges at statistik.uni-dortmund.de>
Subject: Re: [R] Read.table
To: Matias Mayor Fernandez <mmayorf at uniovi.es>
Cc: R-help at stat.math.ethz.ch
Message-ID: <440EB05C.3010103 at statistik.uni-dortmund.de>
Content-Type: text/plain; charset=windows-1252; format=flowed
Matias Mayor Fernandez wrote:
> Hi,
>
> I have some column vectors in txt or xls files and I need to load them
> into R as numeric vectors.
>
> I use the read.table (X=read.table(123.txt”) command but the program
> says that “X is not a numeric vector”
No, I think you got:
Error: syntax error in "(X=read.table(123.txt"
or you have used another call without the syntax error in it.
In any case, please be more specific what you did. You might also want
to copy the first few lines of file 123.txt in your mail.
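A sketch of what a working call could look like, assuming 123.txt holds one number per line with no header. The file is written to a temporary directory here only so that the example runs; its contents are made up.

```r
# Create a small example file (illustrative contents only).
path <- file.path(tempdir(), "123.txt")
writeLines(c("1.5", "2", "3.25"), path)

X <- read.table(path)   # note the quoted file name; X is a data frame
is.numeric(X)           # FALSE: a data frame is not a numeric vector

x1 <- X[[1]]            # extract the single column as a numeric vector
x2 <- scan(path)        # or read the numbers directly into a vector
```

If the file has a header line, read.table(path, header = TRUE) is needed; otherwise the column is not read as numeric.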
Uwe Ligges
>
> Where is the problem?
>
> Matías
>
> University of Oviedo,
> Spain
Message: 71
Date: Wed, 8 Mar 2006 10:49:28 +0000 (GMT)
From: Simon Wood <sw283 at maths.bath.ac.uk>
Subject: Re: [R] predicted values in mgcv gam
To: Denis Chabot <chabotd at globetrotter.net>
Cc: R list <r-help at stat.math.ethz.ch>
Message-ID:
Content-Type: TEXT/PLAIN; charset=US-ASCII
Hi Denis,
Your first plot is of f(x) against x where \sum_i f(x_i) = 0 (x_i are
observed x's).
Your second plot is of \exp(\alpha + f(x)) against x where
\sum_i f(x_i)=0, and \alpha is an intercept parameter.
So the zero line on the first plot, corresponds to the \exp(\alpha) line
on the second plot (which is not the same as the mean of the response
data).
If you replace
> abline(h=mean.y,lty=5,col=grey(0.35))
by
> abline(h=exp(coef(gam2)[1]),lty=5,col=grey(0.35))
then everything should work.
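The relation between the two scales can be checked numerically. This is a sketch under simplifying assumptions: the data are simulated from the x2 term only (not the full example from the gam help page), and the names alpha, f_hat and newd are mine. The band computed on the link scale and then back-transformed is a common alternative, not what Denis's code does.

```r
library(mgcv)
set.seed(0)
n <- 400
x2 <- runif(n)
f2 <- function(x) 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y <- rpois(n, exp(f2(x2) / 4))   # simplified: only the x2 effect is simulated
gam2 <- gam(y ~ s(x2), family = poisson)

newd  <- data.frame(x2 = seq(min(x2), max(x2), length.out = 100))
alpha <- coef(gam2)[1]                                   # intercept
f_hat <- predict(gam2, newd, type = "terms")[, "s(x2)"]  # centred smooth f(x)
resp  <- predict(gam2, newd, type = "response")

# Response-scale prediction equals exp(alpha + f(x)) under the log link.
max_diff <- max(abs(resp - exp(alpha + f_hat)))

# Alternative band: +/- 2 SE on the link scale, then back-transform.
lk <- predict(gam2, newd, type = "link", se.fit = TRUE)
lower.band <- exp(lk$fit - 2 * lk$se.fit)
upper.band <- exp(lk$fit + 2 * lk$se.fit)
```

The zero line of the term plot therefore maps to exp(alpha) on the response scale, and the back-transformed band stays positive but is no longer symmetric about the fit.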
best,
Simon
On Sun, 5 Mar 2006, Denis Chabot wrote:
> Hi,
>
> In fitting GAMs to assess environmental preferences, I use the part
> of the fit where the lower confidence interval is above zero as my
> criterion for positive association between the environmental variable
> and species abundance. However I like to plot this on the original
> scale of species abundance. To do so I extract the fit and SE using
> predict.gam.
>
> Lately I compared more carefully the plots I obtain in this way and
> those obtained with plot.gam and noticed differences which I do not
> understand.
>
> To avoid sending a large dataset I took an example from gam Help to
> illustrate this.
>
> Was I wrong to believe that the fit and its confidence band should
> behave the same way on both scales?
>
> Thanks in advance,
>
> Denis Chabot
> #######################
> library(mgcv)
> set.seed(0)
> n<-400
> sig<-2
> x0 <- runif(n, 0, 1)
> x1 <- runif(n, 0, 1)
> x2 <- runif(n, 0, 1)
> x3 <- runif(n, 0, 1)
> f0 <- function(x) 2 * sin(pi * x)
> f1 <- function(x) exp(2 * x)
> f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
> f <- f0(x0) + f1(x1) + f2(x2)
> g<-exp(f/4)
> y<-rpois(rep(1,n),g)
> mean.y <- mean(y)
>
> gam2 <- gam(y~ s(x2), poisson)
>
> # to plot on the response scale
> val.for.pred <- data.frame(x2=seq(min(x2), max(x2), length.out=100))
> pred.2.resp <- predict.gam(gam2, val.for.pred ,type="response",
> se.fit=TRUE)
> lower.band <- pred.2.resp$fit - 2*pred.2.resp$se.fit
> upper.band <- pred.2.resp$fit + 2*pred.2.resp$se.fit
> pred.2.resp <- data.frame(val.for.pred, pred.2.resp, lower.band,
> upper.band)
>
> # same thing on term scale
> pred.2.term <- predict.gam(gam2, val.for.pred ,type="terms",
> se.fit=TRUE)
> lower.band <- pred.2.term$fit - 2*pred.2.term$se.fit
> upper.band <- pred.2.term$fit + 2*pred.2.term$se.fit
> pred.2.term <- data.frame(val.for.pred, pred.2.term, lower.band,
> upper.band)
>
> # it is easier to compare two plots instead of looking at these two
> # data.frames
>
> plot(gam2, residuals=T, pch=1, cex=0.7)
> abline(h=0)
>
> plot(y~x2, col=grey(0.5))
> lines(fit~x2, col="blue", data=pred.2.resp)
> lines(lower.band~x2, col="red", lty=2, data=pred.2.resp)
> lines(upper.band~x2, col="red", lty=2, data=pred.2.resp)
> abline(h=mean.y,lty=5,col=grey(0.35))
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE read the posting guide!
End of R-help Digest, Vol 37, Issue 8