[R] Ordering problem
Dimitris Rizopoulos
dimitris.rizopoulos at med.kuleuven.be
Fri Nov 25 14:01:41 CET 2005
Another possibility would be to use something like:
v1 <- c(1, 2, 3); v2 <- c("a", "b", "c"); v3 <- c("1", "2", "3")
dat <- data.frame(v1, v2, v3)
# convert every column to character, then let type.convert() choose each type
dat <- lapply(dat, as.character)
dat <- as.data.frame(lapply(dat, type.convert))
dat
sapply(dat, data.class)
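
For the character-matrix case John describes, the same idea might look roughly like the sketch below. The matrix `m` and its contents are made up for illustration, and `as.is = TRUE` is passed explicitly so that non-numeric columns stay character rather than becoming factors.

```r
# Hypothetical character matrix standing in for the real 15-column data
m <- cbind(v1 = c("1", "2", "3"),
           v2 = c("a", "b", "c"),
           v3 = c("1.5", "2.5", "3.5"))

# Matrix -> data frame of character columns (no factors)
dat <- as.data.frame(m, stringsAsFactors = FALSE)

# Let type.convert() guess the type of each column;
# as.is = TRUE keeps non-numeric columns as character
dat[] <- lapply(dat, type.convert, as.is = TRUE)

sapply(dat, class)
```

Assigning into `dat[]` keeps the data frame and its column names while replacing every column in one step, so no column-by-column code is needed.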
I hope it helps.
Best,
Dimitris
----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://www.med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm
----- Original Message -----
From: "John Logsdon" <j.logsdon at quantex-research.com>
To: <r-help at stat.math.ethz.ch>
Sent: Friday, November 25, 2005 1:25 PM
Subject: Re: [R] Ordering problem
> Thanks to Florence but it needs a little modification. However, as I have
> now discovered the str() command, things are looking up. :))
>
> I have a character matrix so I() just leaves it as characters whereas I
> want the various columns to be integers or whatever they contain.
>
> To take Florence's example slightly extended:
>
>> v1<-c(1,2,3);v2<-c("a","b","c");v3<-c("1","2","3")
>
> Note that the third vector is a character with numerical contents.
>
>> data.frame(v1,v2,v3)
>   v1 v2 v3
> 1  1  a  1
> 2  2  b  2
> 3  3  c  3
>
> so it looks OK, but
>
>> str(data.frame(v1,v2,v3))
> `data.frame': 3 obs. of 3 variables:
> $ v1: num 1 2 3
> $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
> $ v3: Factor w/ 3 levels "1","2","3": 1 2 3
>
> reveals the nasty truth!
>
> whereas
>
>> str(data.frame(v1,v2,I(v3)))
> `data.frame': 3 obs. of 3 variables:
> $ v1: num 1 2 3
> $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
> $ v3:Class 'AsIs' chr [1:3] "1" "2" "3"
>
> just keeps the character v3 as characters. I want it to be interpreted as
> numeric so:
>
>> str(data.frame(v1,v2,as.numeric(v3)))
> `data.frame': 3 obs. of 3 variables:
> $ v1 : num 1 2 3
> $ v2 : Factor w/ 3 levels "a","b","c": 1 2 3
> $ as.numeric.v3.: num 1 2 3
>
> actually gives me what I need.
>
> The only problem is that I have to do everything column by column and
> there are 15 columns in all. So it makes for particularly ugly coding to
> reproduce an as.is read from a .csv file.
>
> The other solutions from Baz and Carlos would also work of course, but
> they are still pretty horrible. Perhaps another way to do this is to
> write it out using cat then read it in again using as.is=TRUE!! ;)
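
The write-and-reread idea does not strictly need a file on disk. A minimal sketch, assuming a made-up character matrix `m` (hypothetical) and using `capture.output()` together with the `text` argument of `read.csv()` in place of `cat`:

```r
# Hypothetical character matrix with named columns
m <- cbind(ss = c("1", "10", "2"), id = c("a", "b", "c"))

# "Write" the matrix as CSV lines into a character vector instead of a file
csv_lines <- capture.output(write.csv(m, row.names = FALSE, quote = FALSE))

# Read the lines back in; as.is = TRUE keeps strings as character,
# while columns that look numeric are converted to numbers
dat <- read.csv(text = csv_lines, as.is = TRUE)
str(dat)
```

This relies on the same type guessing that `read.csv` applies to a real file, so it should reproduce what an `as.is` read of the original .csv would give.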
>
> Thanks to one and all
>
> Best wishes
>
> John
>
> John Logsdon
> Quantex Research Ltd, Manchester UK
> j.logsdon at quantex-research.com
> +44(0)161 445 4951/G:+44(0)7717758675 www.quantex-research.com
>
> "Try to make things as simple as possible but not simpler"
> a.einstein at relativity.org
>
>
> On Fri, 25 Nov 2005, Florence Combes wrote:
>
>> John,
>>
>> at ?factor, you can see :
>>
>> " Be careful only to compare factors with the
>> same set of levels (in the same order). In particular,
>> 'as.numeric' applied to a factor is meaningless, and may happen by
>> implicit coercion. To "revert" a factor 'f' to its original
>> numeric values, 'as.numeric(levels(f))[f]' is recommended and
>> slightly more efficient than 'as.numeric(as.character(f))'. "
>>
>> 'as.numeric(levels(f))[f]' worked well for me in a similar situation,
>> i.e. to get back numeric values from a factor.
>> But see also the I() "option" of the data.frame() function, which allows
>> you not to obtain a factor (from a character vector only) if it is not
>> what you want.
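
The difference between the conversions mentioned in ?factor can be seen on a small made-up factor (the values below are illustrative only):

```r
# A factor whose levels are numbers stored as text (levels are sorted as strings)
f <- factor(c("10", "2", "7", "10"))

levels(f)                             # the level labels, in string order

codes  <- as.numeric(f)               # internal integer codes, not the data
vals   <- as.numeric(levels(f))[f]    # recommended: convert levels, then index by f
vals2  <- as.numeric(as.character(f)) # same values, converting every element

codes
vals
```

Here `codes` holds the positions of each value in `levels(f)`, while `vals` and `vals2` recover the original numbers.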
>>
>> from ?data.frame :
>>
>> "Objects passed to 'data.frame' should have the same number of
>> rows, but atomic vectors, factors and character vectors protected
>> by 'I' will be recycled a whole number of times if necessary."
>>
>>
>> see this example:
>> --------------------------------------------------
>> > v1<-c(1,2,3)
>> > v2<-c("a","b","c")
>> > df.A<-data.frame(v1,v2)
>> > str(df.A)
>> `data.frame': 3 obs. of 2 variables:
>> $ v1: num 1 2 3
>> $ v2: Factor w/ 3 levels "a","b","c": 1 2 3
>> > df.B<-data.frame(v1,I(v2))
>> > str(df.B)
>> `data.frame': 3 obs. of 2 variables:
>> $ v1: num 1 2 3
>> $ v2:Class 'AsIs' chr [1:3] "a" "b" "c"
>> -------------------------------------------------
>>
>> hope this helps,
>>
>> Florence.
>>
>>
>>
>>
>>
>> On 11/25/05, John Logsdon <j.logsdon at quantex-research.com> wrote:
>> >
>> > I have an ordering and factor problem to which there must be a simple
>> > solution! The version is R 2.0.1 (2004-11-15) on a Linux platform.
>> >
>> > A data frame H is read in from a .csv file using read.csv with as.is=TRUE.
>> >
>> > Another data frame HN is constructed from data and I want to compare two
>> > columns both named ss of the (sorted) data frames that are the same
>> > length.
>> >
>> > The problem is that HN$ss is always treated as a factor whatever I do
>> > while H$ss is treated as an integer, which is what I want. Somewhere R is
>> > making an implicit transformation but I can't see how to correct it.
>> >
>> > The data are all integers in the range 1:13 - in fact with no gaps. If I
>> > tabulate from H:
>> >
>> > > table(H$ss)
>> >
>> >   1   2   3   4   5   6   7    8    9   10   11   12   13
>> > 176 176 176 176 176 176 341 8726 8784 8777 8773 8749 8747
>> >
>> > and for HN:
>> >
>> > > table(HN$ss)
>> >
>> >   1   10   11   12   13   2   3   4   5   6   7    8    9
>> > 176 8777 8773 8749 8747 176 176 176 176 176 341 8726 8784
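
The order 1, 10, 11, 12, 13, 2, ... is what results when the values are sorted as strings, which is how `factor()` builds its levels from character data. A small sketch with made-up data (not the real HN) illustrates this:

```r
# Values 1..13 held as text, as they might be after passing through
# a character matrix
ss_chr <- as.character(1:13)

# factor() sorts the unique strings to build its levels
lev_default <- levels(factor(ss_chr))
lev_default

# Supplying the levels explicitly in numeric order restores the expected order
ss_fac <- factor(ss_chr, levels = as.character(1:13))
levels(ss_fac)
```

Seeing this ordering in `table()` output is therefore a hint that the column is a factor or character vector rather than numeric.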
>> >
>> > At some time while constructing HN, I have to make it a character matrix -
>> > otherwise gsub doesn't work when removing surplus blanks for example - but
>> > I have turned it back into a data frame in the end.
>> >
>> > If I check the modes, both data frames are lists and both columns are
>> > numeric - HN is not reported as a factor. Yet it appears to be treated as
>> > a factor, for example:
>> >
>> > > table(formatC(H$ss,dig=0,width=2,format="f",flag="0"))
>> >
>> >  01  02  03  04  05  06  07   08   09   10   11   12   13
>> > 176 176 176 176 176 176 341 8726 8784 8777 8773 8749 8747
>> >
>> > yet:
>> >
>> > > table(formatC(HN$ss,dig=0,width=2,format="f",flag="0"))
>> >
>> >   1   10   11   12   13   2   3   4   5   6   7    8    9
>> > 176 8777 8773 8749 8747 176 176 176 176 176 341 8726 8784
>> > Warning messages:
>> > 1: "+" not meaningful for factors in: Ops.factor(x, ifelse(x == 0, 1, 0))
>> > 2: "<" not meaningful for factors in: Ops.factor(x, 0)
>> >
>> > I have tried as.numeric but then I get the factor's internal integer code
>> > rather than the level label returned:
>> >
>> > > table(formatC(as.numeric(HN$ss),dig=0,width=2,format="f",flag="0"))
>> >
>> >  01   02   03   04   05  06  07  08  09  10  11   12   13
>> > 176 8777 8773 8749 8747 176 176 176 176 176 341 8726 8784
>> >
>> > which obviously is a tabulation of the internal level codes rather than
>> > the data.
>> >
>> > TIA
>> >
>> > John
>> >
>> > John Logsdon
>> > Quantex Research Ltd, Manchester UK
>> > j.logsdon at quantex-research.com
>> > +44(0)161 445 4951/G:+44(0)7717758675 www.quantex-research.com
>> >
>> > "Try to make things as simple as possible but not simpler"
>> > a.einstein at relativity.org
>> >
>> > ______________________________________________
>> > R-help at stat.math.ethz.ch mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide!
>> > http://www.R-project.org/posting-guide.html
>> >
>>
>
>