[R] "order" issue
Zoppoli, Gabriele (NIH/NCI) [G]
zoppolig at mail.nih.gov
Mon May 24 01:57:52 CEST 2010
after read.delim:
'data.frame': 60 obs. of 4 variables:
$ Cell : Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 23 51 20 25 34 16 44 3 60 55 ...
$ hsa-miR-204: num -4.37 -4.34 -4.33 -4.29 -4.26 ...
$ hsa-miR-210: num -0.223 1.575 1.66 1.668 0.373 ...
$ Tissue : Factor w/ 9 levels "Breast","CNS",..: 5 8 5 5 6 3 7 1 9 9 ...
before:
chr [1:60, 1:4] "ME:SK_MEL_5" "ME:SK_MEL_28" "ME:SK_MEL_2" ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:60] "48" "47" "46" "50" ...
..$ : chr [1:4] "Product" "hsa.miR.204" "hsa.miR.210" "Tissue"
It looks like the issue is that, after the first time I read the txt file with read.delim, I removed the first three rows by doing
x=x[-c(1:3),]
because the first three rows were characters (parameters like "probe name", "chromosomal position", etc.)
So maybe R remembers that the columns were character and not numeric... How would you "explain" to R (sorry for the naive wording; I've learnt R by myself over time and I misuse some words, I hope it's clear anyway) that a matrix is all numeric? Doing as.numeric(x) turns everything into one long column of numbers, but loses the matrix structure...
Thank you all guys! You're really precious!
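One possible way to answer the question above, as a minimal sketch rather than a definitive recipe: as.numeric() drops the dim attribute, whereas assigning a new storage mode converts the values and keeps the matrix shape. The toy matrix and the names flat, num and df are invented for illustration; only the column names and a few values are taken from the output shown in this thread.

```r
# Toy stand-in for the character matrix in the post (three rows only).
x <- cbind(Product = c("LC:NCI_H226", "LC:NCI_H460", "ME:MDA_MB_435"),
           hsa.miR.204 = c("-4.37408", "0.0042", "-0.1915"),
           hsa.miR.210 = c("-0.22311", "-0.6023", "-0.16744"),
           Tissue = c("Lung", "Lung", "Melanoma"))

# as.numeric() returns a plain vector: dim and dimnames are dropped.
flat <- as.numeric(x[, 2:3])

# Changing the storage mode converts the values and keeps dim and dimnames.
# Only the columns that really hold numbers are selected; converting
# "Product" or "Tissue" would give NA with a warning.
num <- x[, c("hsa.miR.204", "hsa.miR.210")]
storage.mode(num) <- "double"

# Alternatively, a data frame lets each column keep its own type,
# so the text columns and the numeric columns can stay together.
df <- data.frame(Product = x[, "Product"], num, Tissue = x[, "Tissue"],
                 stringsAsFactors = FALSE)
df <- df[order(df$hsa.miR.204), ]
```

If a column has been read as a factor instead of character, as.numeric() would return the internal level codes, so as.numeric(as.character(f)) is the usual way to recover the printed values.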
Gabriele Zoppoli, MD
Ph.D. Fellow, Experimental and Clinical Oncology and Hematology, University of Genova, Genova, Italy
Guest Researcher, LMP, NCI, NIH, Bethesda MD
Work: 301-451-8575
Mobile: 301-204-5642
Email: zoppolig at mail.nih.gov
________________________________________
From: William Dunlap [wdunlap at tibco.com]
Sent: Sunday, May 23, 2010 7:05 PM
To: Zoppoli, Gabriele (NIH/NCI) [G]; ted.harding at manchester.ac.uk
Cc: R-help at r-project.org
Subject: RE: [R] "order" issue
> -----Original Message-----
> From: r-help-bounces at r-project.org
> [mailto:r-help-bounces at r-project.org] On Behalf Of Zoppoli,
> Gabriele (NIH/NCI) [G]
> Sent: Sunday, May 23, 2010 3:44 PM
> To: ted.harding at manchester.ac.uk
> Cc: R-help at r-project.org
> Subject: Re: [R] "order" issue
>
> crazy stuff!!! I tried to reload the txt file, and now it's working...
When you "reloaded" the txt file (with what function?) it
probably was made into a "data.frame", with some columns
factors or characters and some columns numerics. It looks
like your original problem arose after you converted that
data.frame into a "matrix", all of whose columns must be
the same (character in this case). Sorting character
representations of numbers is different than sorting the
numbers as numbers.
> sort(c(1, 0.05, 0.0000, -0.10, -2))
[1] -2.00 -0.10 0.00 0.05 1.00
> sort(as.character(c(1, 0.05, 0.0000, -0.10, -2)))
[1] "-0.1" "-2" "0" "0.05" "1"
Use str(x) again to see if this is what is happening.
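If str(x) does show a character matrix, one way to sort it by value without rebuilding it is to convert only the sort key inside order(). The following is a minimal sketch on a four-row toy matrix (values copied from the table in this thread; the names by_string and by_value are invented for illustration). The method = "radix" argument is used here only so that the string comparison follows C-locale collation and the result does not depend on the local settings.

```r
# Toy character matrix mimicking the situation described.
x <- cbind(Product = c("ME:SK_MEL_5", "LC:NCI_H226", "ME:MDA_MB_435", "LC:NCI_H460"),
           A = c("1.74749", "-4.37408", "-0.1915", "0.0042"))

# order() on the character column compares strings, not numbers:
# "-0.1915" comes before "-4.37408" because "0" sorts before "4".
by_string <- x[order(x[, "A"], method = "radix"), ]

# Converting the key to numeric inside order() sorts by value
# and leaves the matrix itself untouched (still character).
by_value <- x[order(as.numeric(x[, "A"])), ]
```

Here by_string puts ME:MDA_MB_435 first, which matches the pattern in the original post, while by_value puts LC:NCI_H226 (-4.37408) first.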
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
>
> this is the original (attached)
>
> thanks!
>
> ________________________________________
> From: Ted Harding [Ted.Harding at manchester.ac.uk]
> Sent: Sunday, May 23, 2010 6:31 PM
> To: Zoppoli, Gabriele (NIH/NCI) [G]
> Cc: R help
> Subject: RE: [R] "order" issue
>
> On 23-May-10 21:39:06, Zoppoli, Gabriele (NIH/NCI) [G] wrote:
> > Hi everybody, this is a real dummy thing.
> >
> > I sorted a matrix based on a given column, and what I get is right,
> > until it comes to columns of negative and positive values; then,
> > "order" orders everything from max to min in the negative values, and
> > then AGAIN from max to min in the positive values!!!
> >
> > Why isn't everything ordered from max to min, and that's it?
> > Thank you!!!
> >
> > Attached is the txt file I use; try:
> >
> > x=x[order(x[,2]),]
> >
> > What I get is:
> >
> > print(x)
> >
> > Product A B Tissue
> > 44 ME:MDA_MB_435 -0.1915 -0.16744 Melanoma
> > 17 CNS:SNB_75 -0.23183 1.03945 CNS
> > 37 LE:K_562 -0.58218 1.8581 Leukemia
> > 43 ME:MALME_3M -0.67327 -1.33493 Melanoma
> > 49 ME:UACC_257 -0.72431 -1.84753 Melanoma
> > 42 ME:M14 -0.73942 -0.73904 Melanoma
> > 40 LE:SR -0.93541 2.95346 Leukemia
> > 25 CO:SW_620 -1.53265 -1.35446 Colon
> > 63 RE:CAKI_1 -2.48443 0.43245 Renal
> > 39 LE:RPMI_8226 -2.59561 -1.9448 Leukemia
> > 26 LC:A549 -2.66221 0.71215 Lung
> > 61 RE:A498 -2.89402 0.93287 Renal
> > 9 BR:HS578T -2.94118 1.1217 Breast
> > 34 LC:NCI_H522 -2.94381 0.3859 Lung
> > 66 RE:TK_10 -2.95281 1.26245 Renal
> > 52 OV:NCI_ADR_RES -3.04456 0.17046 Ovarian
> > 57 OV:SK_OV_3 -3.04477 2.15405 Ovarian
> > 53 OV:OVCAR_3 -3.0705 -0.31743 Ovarian
> > 14 CNS:SF_295 -3.09348 -1.00095 CNS
> > 54 OV:OVCAR_4 -3.13137 -0.47497 Ovarian
> > 36 LE:HL_60 -3.16745 -3.16745 Leukemia
> > 38 LE:MOLT_4 -3.20055 -1.72841 Leukemia
> > 11 BR:MDA_MB_231 -3.24907 1.58326 Breast
> > 59 PR:PC_3 -3.36612 1.39328 Prostate
> > 19 CO:HCT_116 -3.39764 0.43061 Colon
> > 12 BR:T47D -3.41228 1.13818 Breast
> > 22 CO:HCT_15 -3.45342 0.16357 Colon
> > 64 RE:RXF_393 -3.49615 2.59144 Renal
> > 28 LC:HOP_62 -3.4968 0.67884 Lung
> > 60 RE:786_0 -3.5086 1.75056 Renal
> > 35 LE:CCRF_CEM -3.54526 -2.09262 Leukemia
> > 29 LC:HOP_92 -3.60636 0.87116 Lung
> > 21 CO:HCC_2998 -3.61457 -0.32362 Colon
> > 13 CNS:SF_268 -3.63916 2.54378 CNS
> > 20 CO:COLO205 -3.64656 0.54344 Colon
> > 56 OV:OVCAR_8 -3.66053 -0.9594 Ovarian
> > 24 CO:KM12 -3.68703 2.19991 Colon
> > 55 OV:OVCAR_5 -3.7852 2.43038 Ovarian
> > 8 BR:BT_549 -3.80239 -0.43099 Breast
> > 15 CNS:SF_539 -3.86184 1.39114 CNS
> > 65 RE:SN12C -3.90776 0.85244 Renal
> > 31 LC:NCI_H23 -3.91625 -1.14955 Lung
> > 62 RE:ACHN -3.96246 -0.62365 Renal
> > 67 RE:UO_31 -3.99791 -1.09215 Renal
> > 10 BR:MCF7 -4.00187 1.46303 Breast
> > 51 OV:IGROV1 -4.02758 2.04324 Ovarian
> > 23 CO:HT29 -4.11624 -0.02799 Colon
> > 41 ME:LOXIMVI -4.2572 0.37259 Melanoma
> > 32 LC:NCI_H322M -4.28534 1.66783 Lung
> > 27 LC:EKVX -4.32847 1.66042 Lung
> > 58 PR:DU_145 -4.33961 1.57548 Prostate
> > 30 LC:NCI_H226 -4.37408 -0.22311 Lung
> > 33 LC:NCI_H460 0.0042 -0.6023 Lung
> > 18 CNS:U251 0.01263 1.66389 CNS
> > 16 CNS:SNB_19 0.16583 0.03737 CNS
> > 45 ME:MDA_N 0.21077 0.05502 Melanoma
> > 50 ME:UACC_62 0.52503 0.1605 Melanoma
> > 46 ME:SK_MEL_2 0.55255 -1.6667 Melanoma
> > 47 ME:SK_MEL_28 1.7425 1.45266 Melanoma
> > 48 ME:SK_MEL_5 1.74749 -1.47817 Melanoma
> >
> > Gabriele Zoppoli, MD
>
> Somewhat strange indeed! The only further question I can think of
> is to ask what "x" looked like before you re-ordered it.
> Using the "x.txt" file you supplied, I get:
>
> x <- read.table("x.txt")
> str(x)
> # 'data.frame': 60 obs. of 4 variables:
> # $ Product: Factor w/ 60 levels "BR:BT_549","BR:HS578T",..: 37 10 30 36 42 35 33 18 56 32 ...
> # $ A : num -0.192 -0.232 -0.582 -0.673 -0.724 ...
> # $ B : num -0.167 1.039 1.858 -1.335 -1.848 ...
> # $ Tissue : Factor w/ 9 levels "Breast","CNS",..: 6 2 4 6 6 6 4 3 9 4 ...
>
>
> so x[,2] and x[,3] are indeed numeric. Then (similar to yours):
>
> X<-x[order(x[,2]),]
> print(X)
> # Product A B Tissue
> # 30 LC:NCI_H226 -4.37408 -0.22311 Lung
> # 58 PR:DU_145 -4.33961 1.57548 Prostate
> # 27 LC:EKVX -4.32847 1.66042 Lung
> # 32 LC:NCI_H322M -4.28534 1.66783 Lung
> # 41 ME:LOXIMVI -4.25720 0.37259 Melanoma
> # 23 CO:HT29 -4.11624 -0.02799 Colon
> # 51 OV:IGROV1 -4.02758 2.04324 Ovarian
> # 10 BR:MCF7 -4.00187 1.46303 Breast
> # 67 RE:UO_31 -3.99791 -1.09215 Renal
> # 62 RE:ACHN -3.96246 -0.62365 Renal
> # 31 LC:NCI_H23 -3.91625 -1.14955 Lung
> # 65 RE:SN12C -3.90776 0.85244 Renal
> # 15 CNS:SF_539 -3.86184 1.39114 CNS
> # 8 BR:BT_549 -3.80239 -0.43099 Breast
> # 55 OV:OVCAR_5 -3.78520 2.43038 Ovarian
> # 24 CO:KM12 -3.68703 2.19991 Colon
> # 56 OV:OVCAR_8 -3.66053 -0.95940 Ovarian
> # 20 CO:COLO205 -3.64656 0.54344 Colon
> # 13 CNS:SF_268 -3.63916 2.54378 CNS
> # 21 CO:HCC_2998 -3.61457 -0.32362 Colon
> # 29 LC:HOP_92 -3.60636 0.87116 Lung
> # 35 LE:CCRF_CEM -3.54526 -2.09262 Leukemia
> # 60 RE:786_0 -3.50860 1.75056 Renal
> # 28 LC:HOP_62 -3.49680 0.67884 Lung
> # 64 RE:RXF_393 -3.49615 2.59144 Renal
> # 22 CO:HCT_15 -3.45342 0.16357 Colon
> # 12 BR:T47D -3.41228 1.13818 Breast
> # 19 CO:HCT_116 -3.39764 0.43061 Colon
> # 59 PR:PC_3 -3.36612 1.39328 Prostate
> # 11 BR:MDA_MB_231 -3.24907 1.58326 Breast
> # 38 LE:MOLT_4 -3.20055 -1.72841 Leukemia
> # 36 LE:HL_60 -3.16745 -3.16745 Leukemia
> # 54 OV:OVCAR_4 -3.13137 -0.47497 Ovarian
> # 14 CNS:SF_295 -3.09348 -1.00095 CNS
> # 53 OV:OVCAR_3 -3.07050 -0.31743 Ovarian
> # 57 OV:SK_OV_3 -3.04477 2.15405 Ovarian
> # 52 OV:NCI_ADR_RES -3.04456 0.17046 Ovarian
> # 66 RE:TK_10 -2.95281 1.26245 Renal
> # 34 LC:NCI_H522 -2.94381 0.38590 Lung
> # 9 BR:HS578T -2.94118 1.12170 Breast
> # 61 RE:A498 -2.89402 0.93287 Renal
> # 26 LC:A549 -2.66221 0.71215 Lung
> # 39 LE:RPMI_8226 -2.59561 -1.94480 Leukemia
> # 63 RE:CAKI_1 -2.48443 0.43245 Renal
> # 25 CO:SW_620 -1.53265 -1.35446 Colon
> # 40 LE:SR -0.93541 2.95346 Leukemia
> # 42 ME:M14 -0.73942 -0.73904 Melanoma
> # 49 ME:UACC_257 -0.72431 -1.84753 Melanoma
> # 43 ME:MALME_3M -0.67327 -1.33493 Melanoma
> # 37 LE:K_562 -0.58218 1.85810 Leukemia
> # 17 CNS:SNB_75 -0.23183 1.03945 CNS
> # 44 ME:MDA_MB_435 -0.19150 -0.16744 Melanoma
> # 33 LC:NCI_H460 0.00420 -0.60230 Lung
> # 18 CNS:U251 0.01263 1.66389 CNS
> # 16 CNS:SNB_19 0.16583 0.03737 CNS
> # 45 ME:MDA_N 0.21077 0.05502 Melanoma
> # 50 ME:UACC_62 0.52503 0.16050 Melanoma
> # 46 ME:SK_MEL_2 0.55255 -1.66670 Melanoma
> # 47 ME:SK_MEL_28 1.74250 1.45266 Melanoma
> # 48 ME:SK_MEL_5 1.74749 -1.47817 Melanoma
>
> and now the values in X[,2] are indeed in the correct numerical order,
> yet essentially the same command as yours has been executed.
>
> I have not succeeded in reproducing your result by ordering on other
> columns of "x" or on the row-names of "x".
>
> So it is a mystery! The only thing I can think of is that the
> columns of "x" (as seen by R) are different from what you think
> they should be. Since your file "x.txt" looks like the value
> of "x" after your re-ordering, it is impossible to test such
> guesses on the original "x".
>
> Ted.
>
>
> --------------------------------------------------------------------
> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> Fax-to-email: +44 (0)870 094 0861
> Date: 23-May-10 Time: 23:31:25
> ------------------------------ XFMail ------------------------------
>