[R] factor : how does it work ?

Petr Pikal petr.pikal at precheza.cz
Fri Oct 7 09:33:56 CEST 2005


Hi

it seems that the real question is why all your numeric columns 
were converted to factors. If the data came from a spreadsheet, 
there may have been some unnoticed ***spaces*** in blank cells, 
which would turn a numeric column into a factor column.
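
A minimal sketch of how one might look for such entries and recover the numbers. The vector x below is a made-up stand-in for a column like merged$Pcc_0h_A, with one entry that does not parse as a number; it is not taken from the original data.

```r
# Hypothetical stand-in for one column: numbers stored as a factor,
# with one entry (" ") that is not a number.
x <- factor(c("12.276", "11.958", " ", "9.999"))

# as.numeric() on a factor returns the internal integer codes, i.e. the
# position of each value among the levels, not the values themselves.
codes <- as.numeric(x)

# Going through the labels gives the actual numbers;
# labels that are not numbers become NA.
values <- suppressWarnings(as.numeric(as.character(x)))

# Levels that fail to convert are the entries worth inspecting.
lv <- levels(x)
bad <- lv[is.na(suppressWarnings(as.numeric(lv)))]

print(codes)
print(values)
print(bad)
```

Running the last three lines on a real column (with merged$Pcc_0h_A in place of x) should show which levels, if any, are not plain numbers.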

HTH
Petr


On 6 Oct 2005 at 17:08, Florence Combes wrote:

Date sent:      	Thu, 6 Oct 2005 17:08:17 +0200
From:           	Florence Combes <fcombes at gmail.com>
To:             	Duncan Murdoch <murdoch at stats.uwo.ca>, r-help at stat.math.ethz.ch
Subject:        	Re: [R] factor : how does it work ?
Send reply to:  	Florence Combes <fcombes at gmail.com>

> > head(merged)
>           ID                              Name Pcc_0h_A Pcc_0h_swapped_A
> 3302  301495 Q0010_01 |Q0010||Hypothetical ORF   12.276           11.716
> 6943  309175 Q0010_01 |Q0010||Hypothetical ORF   11.958           11.271
> 14065 298935 Q0017_01 |Q0017||Hypothetical ORF   14.098           13.122
> 6420  306615 Q0017_01 |Q0017||Hypothetical ORF   13.843           13.061
> 5066  296375 Q0032_01 |Q0032||Hypothetical ORF   12.451           11.467
> 12707 304055 Q0032_01 |Q0032||Hypothetical ORF   11.745           11.482
>       Pcc_0h_M Pcc_0h_swapped_M
> 3302    -0.249            0.316
> 6943    -0.115            0.780
> 14065   -0.053            0.263
> 6420     0.009            0.323
> 5066     0.015            0.687
> 12707    0.074            0.768
> 
> > str(merged)
> `data.frame': 12202 obs. of 6 variables:
>  $ ID              : Factor w/ 12202 levels "295080","295081",..: 5076 11177 3046 9147 1009 7110 5136 11237 3106 9207 ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
>  $ Name            : Factor w/ 6101 levels "Q0010_01 ..",..: 1 1 2 2 3 3 4 4 5 5 ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
>  $ Pcc_0h_A        : Factor w/ 5386 levels "10.001","10.002",..: 1812 1547 3308 3114 1960 1370 NA NA NA NA ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
>  $ Pcc_0h_swapped_A: Factor w/ 5082 levels "10.001","10.002",..: 1256 885 2533 2477 1051 1064 NA NA NA NA ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
>  $ Pcc_0h_M        : Factor w/ 1940 levels " 0.000"," 0.001",..: 499 231 107 18 30 148 NA NA NA NA ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
>  $ Pcc_0h_swapped_M: Factor w/ 2343 levels " 0.000"," 0.001",..: 632 1453 526 646 1319 1434 NA NA NA NA ...
>   ..- attr(*, "names")= chr "3302" "6943" "14065" "6420" ...
> 
> 
> 
> > > a last question, and thanks a million for your patience and your
> > > explanations ...
> > >
> > >
> > > I tried with a df called "merged" and a column named "Pcc_0h_A"
> > > (which is numeric values):
> > >
> > >> length(as.vector(merged$Pcc_0h_A))
> > > [1] 12202
> > >> as.numeric(as.vector(merged$Pcc_0h_A)[1:10])
> > > [1] 12.276 11.958 14.098 13.843 12.451 11.745 NA NA NA NA
> > >> ord<-ordered(merged$Pcc_0h_A)
> > >> length(ord)
> > > [1] 12202
> > >> ord[1:10]
> > > [1] 12.276 11.958 14.098 13.843 12.451 11.745 <NA> <NA> <NA> <NA>
> > > 5386 Levels: 10.001 < 10.002 < 10.003 < 10.005 < 10.006 < 10.010 <
> > > ... < 9.999
> > >
> > > here I have <NA> instead of NA because ord is a factor and the
> > > notation is different ?
> >
> > >
> > >> length(as.numeric(merged$Pcc_0h_A))
> > > [1] 12202
> > >> as.numeric(merged$Pcc_0h_A[1:10])
> > > [1] 1812 1547 3308 3114 1960 1370 NA NA NA NA
> > >
> > > are these the levels names converted into numbers ? I don't think
> > > because levels are like 10.001, 10.002 etc and 1812, 1547 etc are
> > > not in this form.
> 
> 
> 
> with the str(merged) value I guess that 1812, 1547 etc are a sort of
> rank , am I right ?
> 
> >
> > > thanks a million
> > >
> > > florence;
> > >
> >
> >
> 

Petr Pikal
petr.pikal at precheza.cz



