[BioC] Re: Fw: problem with GEO parser
Saurin D. Jani
jani at musc.edu
Mon Apr 11 20:09:44 CEST 2005
I forgot to tell you that this works only when there is exactly one soft file
in the current directory. Put your soft file in the current directory, start
an R session, and cut and paste the parser code below. It should work: I ran it
on my computer two minutes ago and it worked fine.
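A quick way to confirm the "exactly one soft file" condition before pasting the parser is sketched below. This is only an illustration: findSingleSoft() is a hypothetical helper name, not part of the parser or of any package, and it assumes the file name ends in ".soft".

```r
# Hypothetical helper: return the single .soft file in a directory,
# or stop with a message if there are none or several.
findSingleSoft <- function(dir = ".") {
  softFiles <- list.files(dir, pattern = "\\.soft$", full.names = TRUE)
  if (length(softFiles) != 1) {
    stop("expected exactly one .soft file in ", normalizePath(dir),
         ", found ", length(softFiles))
  }
  softFiles
}
```

Calling getwd() first shows which directory R treats as "current"; findSingleSoft() with no arguments then checks that directory.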
With GDS461.soft as the soft file, you get:
> eset
Expression Set (exprSet) with
12625 genes
10 samples
phenoData object with 0 variables and 0 cases
varLabels
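Since the original question was about clustering, here is a rough sketch of hierarchical clustering of the samples using only base R. It is an assumption-laden illustration, not part of the parser: exprMat is a simulated stand-in for the expression matrix (with Biobase loaded, exprs(eset) would supply the real one), and the correlation distance, average linkage and k = 2 are arbitrary choices.

```r
# Simulated stand-in for the expression matrix: genes in rows, samples in columns.
set.seed(1)
exprMat <- matrix(rnorm(200 * 10), nrow = 200, ncol = 10,
                  dimnames = list(paste0("gene", 1:200), paste0("sample", 1:10)))

# Distance between samples: 1 - Pearson correlation of their expression profiles.
sampleDist <- as.dist(1 - cor(exprMat))

# Average-linkage hierarchical clustering of the samples.
sampleTree <- hclust(sampleDist, method = "average")

# Cut the tree into two groups (k = 2 is an arbitrary choice for illustration).
groups <- cutree(sampleTree, k = 2)
```

plot(sampleTree) draws the dendrogram; to cluster genes instead of samples, the same steps can be applied to t(exprMat).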
-----------cut starts here-----------
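#-- packages the parser relies on
library(Biobase);  # defines the exprSet class
library(limma);    # provides trimWhiteSpace()
#-- reading soft file (expects exactly one match in the local directory)
softFile <- list.files(pattern = "soft");
#-- these system() calls need the Unix tools cp, grep and rm on the PATH
system("cp *.soft file1.soft");
system("grep -on \"ID_REF\" file1.soft > b.txt" );  # line numbers of ID_REF matches
system("grep \"dataset_platform\" file1.soft > d.txt");
ln <- as.matrix(readLines("b.txt"));
lm <- as.matrix(readLines("d.txt"));
system("rm b.txt");
system("rm d.txt");
system("rm file1.soft");
#-- ln[2] is taken to be the table header row; skip everything up to and including it
lnX <- as.matrix(unlist(strsplit(ln[2],":")));
Skpnum <- as.numeric(lnX[1]);
#-- chip type is the text after "=" on the dataset_platform line
lmX <- as.matrix(unlist(strsplit(lm[1],"=")));
chiptype <- trimWhiteSpace(lmX[2]);
GDSN <- softFile;
emX <- read.table(softFile,skip = Skpnum,comment.char = "");
Colm <- ncol(emX);
#-- column 1 holds the row IDs; expression values start at column 3
Rnames <- as.matrix(emX["V1"]);
temp_emX <- emX;
temp2 <- temp_emX[3:Colm];
temp2 <- as.matrix(temp2);
rownames(temp2) <- Rnames;
#--making expression set out of soft file,
#soft file has normalized data,so I am assuming here
#that this data is also normalized
esetX <- as.matrix(temp2);
eset <- new("exprSet", exprs = esetX);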
-----------cut ends here-----------
Now paste it into your R session.
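The parser above shells out to cp, grep and rm, which are not available on every system (the quoted report below mentions "cp (or grep) was not found"). The following is a minimal sketch of the same steps in base R only. It is not the code from this message: parseSoftTable() is a hypothetical name, and the sketch assumes the data table's header row starts with "ID_REF", the table is tab-separated, missing values are written as "null", and SOFT marker lines start with "!", "^" or "#".

```r
# Hypothetical base-R variant of the parser: no external cp/grep/rm calls.
parseSoftTable <- function(softFile) {
  lines <- readLines(softFile)

  # Assumption: the table header row is the first line that begins with "ID_REF".
  headerIdx <- grep("^ID_REF", lines)[1]
  if (is.na(headerIdx)) stop("no ID_REF header row found in ", softFile)

  # Keep only data rows after the header; drop SOFT marker lines and blanks.
  body <- lines[-seq_len(headerIdx)]
  body <- body[nzchar(body) & !grepl("^[!^#]", body)]

  tab <- read.table(text = c(lines[headerIdx], body), sep = "\t", header = TRUE,
                    quote = "", comment.char = "", na.strings = c("null", "NA"),
                    stringsAsFactors = FALSE, check.names = FALSE)

  # Columns 1 and 2 are row ID and identifier; the rest are per-sample values.
  exprMat <- as.matrix(tab[, -(1:2), drop = FALSE])
  rownames(exprMat) <- tab[["ID_REF"]]

  # Chip type: the text after "=" on the first dataset_platform line.
  platformLine <- grep("dataset_platform", lines, value = TRUE)[1]
  chiptype <- trimws(sub("^[^=]*=", "", platformLine))

  list(exprs = exprMat, chiptype = chiptype)
}
```

With res <- parseSoftTable("GDS461.soft"), the matrix res$exprs could be passed to new("exprSet", exprs = res$exprs) in the same way as esetX above, provided Biobase is loaded.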
Saurin
--
|------------------------------------------------
| Saurin Jani,MS
| Statistical and Research Analyst
|
| Department of Cell Biology and Anatomy
| Medical University of South Carolina (MUSC)
| 173 Ashley Ave
| Charleston,SC - 29407 (US)
|
| Email: jani at musc.edu
| Phone: (843)792-5483
|------------------------------------------------
Quoting guillaume deplaine <guillaume.deplaine at neuf.fr>:
> Hello,
>
> In April I wrote you a message about my problem with your GEO parser. It's
> extremely important for me to open and cluster this file in R. I don't know
> what the problem with your program is.
> Could you help me, please?
> Thanks a lot
> ----- Original Message -----
> From: "guillaume deplaine" <guillaume.deplaine at neuf.fr>
> To: "Saurin D. Jani" <jani at musc.edu>
> Sent: Friday, April 01, 2005 1:33 PM
> Subject: problem with GEO parser
>
>
> > Dear colleague,
> >
> > You sent me a GEO parser you wrote some time ago.
> > I have a problem: when I run R, I can find the soft file with the
> > command line softFile<-list.files(,"GDS461.soft"). But afterwards, with the
> > command system("cp *.soft file1.soft") or with
> > system(grep-on\"ID_REF\" file1.soft > b.txt"),
> > the R console says: cp (or grep) was not found.
> >
> > You wrote "#put your GEO file" but I don't know where GDS461.soft must be
> > written.
> > Perhaps it's a version problem (I work with R 2.0.1), or I forgot a space
> > in a command line.
> > I attach the GDS461.soft file to my message.
> >
> > Could you explain the problem to me and tell me where, in your script,
> > GDS461.soft must be written?
> > Thanks for your help.
> >
> >
> > Guillaume Deplaine
> >
> > INSERM U36
> > Collège de France
> > 11, place Marcellin Berthelot
> > 75231 Paris Cedex 05
> >
> > Tél. : 01 44 27 16 54
> > Fax. : 01 44 27 16 91
> > Portable : 06 19 94 82 77
> > E-mail : guillaume.deplaine at neuf.fr
> > ----- Original Message -----
> > From: "Saurin D. Jani" <jani at musc.edu>
> > To: "Guillaume Deplaine" <guillaume.deplaine at college-de-france.fr>
> > Cc: <bioconductor at stat.math.ethz.ch>
> > Sent: Tuesday, March 29, 2005 4:18 PM
> > Subject: Re: [BioC] problem with GEO site
> >
> >
> >>> I was wondering if it's possible to do a clustering
> >>> analysis of this file with R?
> >>
> >> You need to parse this file and make an expression set in R. For that you
> >> need a GEO parser; below is a GEO parser that I wrote some time ago.
> >>
> >> ##================================================================
> >> ## GEO SOFT FILES
> >> ##================================================================
> >> # GEO soft file parser(1.0) - Saurin Jani
> >>
> >> #-- reading soft file
> >>
> >> softFile <- list.files(,"soft"); # from local directory
> >>
> >> system("cp *.soft file1.soft");
> >> system("grep -on \"ID_REF\" file1.soft > b.txt");
> >> # put your GEO soft file in the current directory; b.txt will be created on your computer
> >>
> >> system("grep \"dataset_platform\" file1.soft > d.txt");
> >> ln <- as.matrix(readLines("b.txt"));
> >> lm <- as.matrix(readLines("d.txt"));
> >>
> >> system("rm b.txt");
> >> system("rm d.txt");
> >> system("rm file1.soft");
> >>
> >> lnX <- as.matrix(unlist(strsplit(ln[2],":")))
> >> Skpnum <- as.numeric(lnX[1]);
> >>
> >> lmX <- as.matrix(unlist(strsplit(lm[1],"=")))
> >> chiptype <- trimWhiteSpace(lmX[2]);
> >> GDSN <- softFile;
> >>
> >> emX <- read.table(softFile,skip = Skpnum,comment.char = "");
> >> Colm <- ncol(emX);
> >>
> >> Rnames <- as.matrix(emX["V1"]);
> >> temp_emX <- emX;
> >>
> >> temp2 <- temp_emX[3:Colm];
> >> temp2 <- as.matrix(temp2);
> >> rownames(temp2) <- Rnames;
> >>
> >> #--making expression set out of soft file, soft file has normalized
> >> #---data, so I am assuming here that this data is also normalized
> >>
> >> esetX <- as.matrix(temp2);
> >> eset <- new("exprSet", exprs = esetX);
> >>
> >>
> >> you can use eset for clustering.
> >>
> >>
> >> Saurin
> >>
> >>
> >> Quoting Guillaume Deplaine <guillaume.deplaine at college-de-france.fr>:
> >>
> >>> Hello,
> >>>
> >>> I found a file on the GEO web site. This file was processed with MASS 4
> >>> up to normalization. I was wondering if it's possible to do a clustering
> >>> analysis of this file with R?
> >>>
> >>> My second question is whether it's possible to retrieve the raw data of
> >>> this file processed with MASS 4.
> >>>
> >>> Thanks for your answer
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>
> >>>
> >>
> >
>