[R] reading files with name columns and row columns

William Dunlap wdunlap at tibco.com
Thu Sep 3 01:36:47 CEST 2015


  y <- as.matrix(read.table("FILE_NAME",header=T,row.names=1))
  colnames(y) <- gsub("X","", colnames(y))

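Use read.table's check.names=FALSE argument so it won't mangle
the column names, instead of trying to demangle them with gsub() afterwards.
(gsub("X","",...) also deletes any X that belongs to the original name, and it
cannot restore characters that were replaced by ".".)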

E.g.,
  txt <- "   50%  100%\nA   5     8\nB  13    14\n"
  cat(txt)
  #   50%  100%
  #A   5     8
  #B  13    14
  read.table(text=txt, head=TRUE, row.names=1)
  #  X50. X100.
  #A    5     8
  #B   13    14
  read.table(text=txt, head=TRUE, row.names=1, check.names=FALSE)
  #  50% 100%
  #A   5    8
  #B  13   14
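
The same idea applied to a file shaped like the one in the original question
can be sketched as follows. The cell values are made up for illustration (the
posted excerpt was all zeros), and the data are passed via text= rather than
read from a file, so treat this as a sketch rather than the poster's actual
setup. The first call shows that the X prefix comes from make.names(), which
read.table applies to the header when check.names=TRUE.

```r
# What check.names=TRUE does to the headers: it runs them through make.names().
make.names(c("0", "200000", "50%", "100%"))
# [1] "X0"      "X200000" "X50."    "X100."

# Hypothetical data laid out like the file in the question
# (non-zero values invented so that individual cells can be told apart).
txt <- "0 200000 400000\n0 1 2 3\n200000 4 5 6\n400000 7 8 9\n"

# check.names=FALSE keeps the numeric-looking headers exactly as written.
y <- as.matrix(read.table(text = txt, header = TRUE, row.names = 1,
                          check.names = FALSE))
colnames(y)
# [1] "0"      "200000" "400000"

# Index by name with character strings, not numbers.
y["200000", "400000"]
# [1] 6
```

Because the row and column names are character strings, y["200000", "400000"]
selects by name, whereas y[200000, 400000] would be treated as positions.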

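The help-file remark quoted further down the thread recommends scan() for
large matrices. One possible way to do that is sketched below. It assumes
whitespace-separated fields, a header line with one fewer field than the data
lines, and all-numeric cells. The variable names and the made-up values are
illustrative only and do not come from the thread.

```r
# Hypothetical input in the same layout as the question's file.
txt <- "0 200000 400000\n0 1 2 3\n200000 4 5 6\n400000 7 8 9\n"
lines <- strsplit(txt, "\n")[[1]]   # for a real file: lines <- readLines("FILE_NAME")

# Header line: column names, read as character so nothing is altered.
col_names <- scan(text = lines[1], what = "", quiet = TRUE)

# Remaining lines: row name followed by the values, all read as character.
body <- scan(text = lines[-1], what = "", quiet = TRUE)
m <- matrix(body, ncol = length(col_names) + 1, byrow = TRUE)

# First column holds the row names; the rest is converted to numeric.
y <- matrix(as.numeric(m[, -1]), nrow = nrow(m),
            dimnames = list(m[, 1], col_names))
y["200000", "400000"]
# [1] 6
```

No name checking is involved here, so the headers stay as they appear in the
file.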

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Sep 2, 2015 at 4:08 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:

> Thanks, Bert! In the meantime I solved the problem by using:
>
> y <- as.matrix(read.table("FILE_NAME",header=T,row.names=1))
>
> colnames(y) <- gsub("X","", colnames(y))
>
>
> On Wed, Sep 2, 2015 at 3:59 PM, Bert Gunter <bgunter.4567 at gmail.com> wrote:
>
> > Please read the Help file carefully before posting:
> >
> > "read.table is not the right tool for reading large matrices,
> > especially those with many columns: it is designed to read data frames
> > which may have columns of very different classes. Use scan instead for
> > matrices."
> >
> > But the answer to your question can be found in
> >
> > ?make.names
> >
> > for what constitutes a syntactically valid name in R.
> >
> >
> > Cheers,
> > Bert
> >
> > Bert Gunter
> >
> > "Data is not information. Information is not knowledge. And knowledge
> > is certainly not wisdom."
> >    -- Clifford Stoll
> >
> >
> > On Wed, Sep 2, 2015 at 3:11 PM, Bogdan Tanasa <tanasa at gmail.com> wrote:
> > > Dear all,
> > >
> > > would appreciate a piece of help with a simple question: I am reading in R
> > > a file that is formatted as a matrix (an example is shown below, although
> > > it is more complex, a matrix of 1000 * 1000 ):
> > >
> > > the names of the columns are 0, 10000, 40000, 80000, etc
> > > the names of the rows are 0, 10000, 40000, 80000, etc
> > >
> > >        0 200000 400000
> > > 0      0      0      0
> > > 200000 0      0      0
> > > 400000 0      0      0
> > >
> > > If I use the command:
> > >
> > > y <- read.table("file",row.names=1, header=T)
> > >
> > > the result is:
> > >
> > >> y[1:3,1:3]
> > >        X0 X200000 X400000
> > > 0       0       0       0
> > > 200000  0       0       0
> > > 400000  0       0       0
> > >
> > > The question is: why does R add an X to the names of the columns, e.g. X0,
> > > X200000, X400000, when they should be only 0, 200000, 400000? Thanks!
> > >
> > > -- bogdan
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
>
