[R] Get rid of space padding

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Thu Dec 23 02:00:33 CET 2004


On 23-Dec-04 Gene Cutler wrote:
> I'm currently using the below function from some library (MASS?) for
> writing my data out to file.  I'm using it instead of plain old "write"
> because it does buffering.  The problem that I'm having is that the
> numbers are space padded, but I need true tab-delineated files.  It
> looks like the spaces are coming from 'format', but I don't see
> an option for format to not pad numbers, the closest I see has to do
> with stripping spaces from strings.  Am I missing something obvious?
> 
> 
> write.matrix <- function (x, file = "", sep = "\t", blocksize=2000)
> {
>      x <- as.matrix(x)
>      p <- ncol(x)
>      cn <- colnames(x)
>      if (!missing(blocksize) && blocksize > 0) {
>          cat(cn, file = file, sep = c(rep(sep, p - 1), "\n"))
>          nlines <- 0
>          nr <- nrow(x)
>          while (nlines < nr) {
>              nb <- min(blocksize, nr - nlines)
>              cat(format(t(x[nlines + (1:nb), ])), file = file,
>                  append = TRUE, sep = c(rep(sep, p - 1), "\n"))
>              nlines <- nlines + nb
>          }
>      }
>      else cat(c(cn, format(t(x))), file = file,
>               sep = c(rep(sep, p - 1), "\n"))
> }

I think this may depend on your operating system. I just tried
your function as defined above:

> x<-rnorm(1000)
> x<-cbind(x,x,x)
> write.matrix(x,file="temp.write")

and then:

$ od -c temp.write | less
0000000   x  \t   x  \t   x  \n       0   .   3   8   7   7   4   7   9
0000020   2   1  \t       0   .   3   8   7   7   4   7   9   2   1  \t
0000040       0   .   3   8   7   7   4   7   9   2   1  \n   -   0   .
0000060   7   8   9   3   7   9   5   5   4  \t   -   0   .   7   8   9
0000100   3   7   9   5   5   4  \t   -   0   .   7   8   9   3   7   9
0000120   5   5   4  \n   -   1   .   3   3   0   4   9   1   1   8   9
0000140  \t   -   1   .   3   3   0   4   9   1   1   8   9  \t   -   1
[etc]

so, for me, the tabs are coming through as such.

(R-1.8.0, RH9 Linux)
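
If 'od' is not available, the same check can be made from within R. The following is only a sketch: the temporary file and its contents are made up for illustration, and it simply counts tab (0x09) and space (0x20) bytes in the file.

```r
# Write a small tab-separated file to a temporary path (made-up data).
tmp <- tempfile()
cat("x\tx\n 0.5\t-0.5\n", file = tmp)

# Read the file back as raw bytes; 0x09 is a tab, 0x20 is a space.
bytes <- readBin(tmp, what = "raw", n = file.info(tmp)$size)
n_tabs <- sum(bytes == as.raw(0x09))
n_spaces <- sum(bytes == as.raw(0x20))
cat("tabs:", n_tabs, " spaces:", n_spaces, "\n")
```

A non-zero space count in a file that should hold only numbers and tabs points to padding in the data itself rather than to tab expansion by a viewer.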

What tells you that the "\t" has been expanded to spaces?
Often, sending a file to a display, or loading it into an
editor (though you should be able to turn this off), expands tabs
to the appropriate number of spaces. So you may be seeing spaces
when the underlying file uses tabs. However, the output above does
contain an occasional true space, before each number that is not
negative (this is what lines up negative and positive numbers).
Maybe this is your trouble:

$ less temp.write
x       x       x
 0.387747921     0.387747921     0.387747921
-0.789379554    -0.789379554    -0.789379554
-1.330491189    -1.330491189    -1.330491189
[etc.] (the tabs "\t" line up the "-"s; for positive numbers
an extra " " is needed).

Since 'format' is for "pretty-printing" it seems likely that
you will get such interspersed spacings whatever you do.
Why do you need to use format?
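
To illustrate the padding, here is a minimal sketch using the three values shown above (typed in by hand, so the number of digits printed will differ from the file listing). It also tries the 'trim' argument of 'format', which in current versions of R suppresses the leading blanks; whether that behaves the same in older versions is an assumption you should check.

```r
x <- c(0.387747921, -0.789379554, -1.330491189)

# format() pads every element to a common width, so the positive
# value gets a leading space to line up with the "-" signs.
padded <- format(x)
print(padded)
print(nchar(padded))

# trim = TRUE drops the leading blanks used for justification.
trimmed <- format(x, trim = TRUE)
print(trimmed)
```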

I just tried a version of your function in which each occurrence
of "format(...)" is replaced by "...", i.e. delete "format(" and ")".
It seems to work:

> write.matrix(x,file="temp.write")

$ less temp.write
x       x       x
0.387747920949271       0.387747920949271       0.387747920949271
-0.789379554483419      -0.789379554483419      -0.789379554483419
-1.33049118922334       -1.33049118922334       -1.33049118922334

$ od -c temp.write
0000000   x  \t   x  \t   x  \n   0   .   3   8   7   7   4   7   9   2
0000020   0   9   4   9   2   7   1  \t   0   .   3   8   7   7   4   7
0000040   9   2   0   9   4   9   2   7   1  \t   0   .   3   8   7   7
0000060   4   7   9   2   0   9   4   9   2   7   1  \n   -   0   .   7

so now it's just "\t" with no spaces. Of course this now prints
more decimal places, but you can control this with 'round'.
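
A compact version of that idea might look like the sketch below. The name 'write_matrix_plain' and its 'digits' argument are hypothetical (they are not part of MASS), the block buffering of the original is left out for brevity, and the numbers are converted with 'as.character' after rounding, so no padding is added.

```r
# Hypothetical helper, not part of MASS: writes a matrix as
# tab-separated text without format(), so no padding spaces appear.
write_matrix_plain <- function(x, file = "", sep = "\t", digits = 9) {
    x <- as.matrix(x)
    p <- ncol(x)
    cn <- colnames(x)
    # A separator after each value, a newline at the end of each row.
    seps <- c(rep(sep, p - 1), "\n")
    has_header <- !is.null(cn)
    if (has_header) cat(cn, file = file, sep = seps)
    # t(x) lays the values out row by row; round() limits the decimal
    # places and as.character() converts without padding.
    vals <- as.character(round(t(x), digits))
    # Append only if a header line has already been written.
    cat(vals, file = file, sep = seps, append = has_header)
    invisible(NULL)
}

# Small example with made-up values.
m <- cbind(a = c(0.5, -1.25), b = c(2, -3))
f <- tempfile()
write_matrix_plain(m, file = f)
out <- readLines(f)
print(out)
```

Rounding before conversion keeps the file compact while leaving the choice of precision to the caller.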

Please state: Your R version, your OS, and in what context you
are seeing spaces rather than tabs --  and why it matters!

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861  [NB: New number!]
Date: 23-Dec-04                                       Time: 01:00:33
------------------------------ XFMail ------------------------------



