[R] Get rid of space padding
(Ted Harding)
Ted.Harding at nessie.mcc.ac.uk
Thu Dec 23 09:46:17 CET 2004
On 23-Dec-04 Gene Cutler wrote:
>
> On Dec 22, 2004, at 5:00 PM, (Ted Harding) wrote:
>>
>> so, for me, the tabs are coming through as such.
>>
>> (R-1.8.0, RH9 Linux)
>>
>> What gives you the information that "\t" has expanded to spaces?
>> Often, writing a file out to a display, or importing it into an
>> editor (though you should be able to turn this off) expands tabs
>>
>
>
> I am getting tabs, but the values are being padded out to the tabs.
> Here is one sample line with spaces replaced by '.' and tabs replaced
> by '^'.
>
> 5...................^45597241............^16734145............^2.7128169
> 8016131e-06^0.622804755173039...^0.91743119266055....^GB-4858-1-
> A.........^GB-4873-1-A.........
>
> This is with R 2.0.1 both on Mandrake linux and Mac OS X.
>
> Also, I know it's not an issue of a text editor altering the data as
> these file get read in by perl scripts and R as well as my text editor
> of choice.
OK Gene, getting a clue here. I can reproduce similar behaviour
using your version of write.matrix by setting some elements to
character values:
x<-matrix(rnorm(30),ncol=3)
x[1,1]<-"A"
x[2,2]<-"B"
x[3,3]<-"C"
write.matrix(x,file="temp.write")
A 0.0855822398994265 1.02493287358937
2.17769486851001 B -0.310203876654049
-1.46891720382270 -0.756931913255919 C
0.177454935470461 -1.06532248163526 -0.413338129170855
and 'od -c temp.write' gives:
0000000 A
0000020 \t 0 . 0 8 5 5 8 2 2 3 9 8 9
0000040 9 4 2 6 5 \t 1 . 0 2 4 9 3 2 8 7
0000060 3 5 8 9 3 7 \n 2 . 1 7 7 6 9
0000100 4 8 6 8 5 1 0 0 1 \t B
0000120 \t -
0000140 0 . 3 1 0 2 0 3 8 7 6 6 5 4 0 4
0000160 9 \n - 1 . 4 6 8 9 1 7 2 0 3 8 2
0000200 2 7 0 \t - 0 . 7 5 6 9 3 1 9 1
0000220 3 2 5 5 9 1 9 \t C
0000240 \n 0 . 1 7 7
0000260 4 5 4 9 3 5 4 7 0 4 6 1 \t - 1
0000300 . 0 6 5 3 2 2 4 8 1 6 3 5 2 6
0000320 \t - 0 . 4 1 3 3 3 8 1 2 9 1 7 0
0000340 8 5 5 \n - 0 . 8 2 4 0 5 9 9 6 4
where, as in your example, "short" results are padded out to
the position of the next tab with spaces.
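The padding can be seen in isolation with a small sketch. It assumes the padding comes from calling format() on a character vector, which pads every element to the width of the widest one; the three values are simply copied from the first row of the output above.

```r
# One "short" entry next to two long numeric strings,
# as in the first row of the matrix above.
vals <- c("A", "0.0855822398994265", "1.02493287358937")

padded <- format(vals)   # character input: padded to a common width

print(nchar(vals))       # widths differ
print(nchar(padded))     # all widths are now the same

# With cat(), the padding spaces land just before each tab.
cat(padded, sep = c("\t", "\t", "\n"))
cat(vals,   sep = c("\t", "\t", "\n"))
```

The second cat() call, which skips format(), writes the values with nothing but tabs between them.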
However, when (as I suggested last time) I modify your function
'write.matrix' so as to remove occurrences of "format("...")"
(leaving only the ... ) then it seems to be OK.
Now the first few lines of 'cat temp.write' are
A 0.0855822398994265 1.02493287358937
2.17769486851001 B -0.310203876654049
-1.46891720382270 -0.756931913255919 C
0.177454935470461 -1.06532248163526 -0.413338129170855
and 'od -c temp.write' gives
0000000 A \t 0 . 0 8 5 5 8 2 2 3 9 8 9 9
0000020 4 2 6 5 \t 1 . 0 2 4 9 3 2 8 7 3
0000040 5 8 9 3 7 \n 2 . 1 7 7 6 9 4 8 6
0000060 8 5 1 0 0 1 \t B \t - 0 . 3 1 0 2
0000100 0 3 8 7 6 6 5 4 0 4 9 \n - 1 . 4
0000120 6 8 9 1 7 2 0 3 8 2 2 7 0 \t - 0
0000140 . 7 5 6 9 3 1 9 1 3 2 5 5 9 1 9
0000160 \t C \n 0 . 1 7 7 4 5 4 9 3 5 4 7
0000200 0 4 6 1 \t - 1 . 0 6 5 3 2 2 4 8
0000220 1 6 3 5 2 6 \t - 0 . 4 1 3 3 3 8
so that all the spaces have now disappeared, leaving only tabs.
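The same check can be made from within R instead of with od. The following is only a sketch: the 2x2 matrix and the temporary files are invented for illustration, and the two cat() calls stand in for the versions of the function with and without format().

```r
# Small character matrix with entries of different widths.
x <- matrix(c("A", "2.5", "10.25", "B"), ncol = 2)

with_format    <- tempfile()
without_format <- tempfile()

# cat() writes t(x) row by row: a tab between fields, newline at row end.
cat(format(t(x)), file = with_format,    sep = c("\t", "\n"))
cat(t(x),         file = without_format, sep = c("\t", "\n"))

# Count how often a given single character occurs as a byte in a file.
count_byte <- function(path, ch) {
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  sum(bytes == charToRaw(ch))
}

spaces_with    <- count_byte(with_format, " ")
spaces_without <- count_byte(without_format, " ")
tabs_without   <- count_byte(without_format, "\t")

print(c(spaces_with = spaces_with,
        spaces_without = spaces_without,
        tabs_without = tabs_without))
```

A non-zero space count for the first file and a zero count for the second corresponds to the difference between the two od listings.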
The revised definition of "write.matrix" is:
write.matrix <- function (x, file = "", sep = "\t", blocksize=2000)
{
    x <- as.matrix(x)
    p <- ncol(x)
    cn <- colnames(x)
    if (!missing(blocksize) && blocksize > 0) {
        # header line first, then the rows in blocks of 'blocksize'
        cat(cn, file = file, sep = c(rep(sep, p - 1), "\n"))
        nlines <- 0
        nr <- nrow(x)
        while (nlines < nr) {
            nb <- min(blocksize, nr - nlines)
            # no format() here, so entries are written unpadded
            cat(t(x[nlines + (1:nb), ]), file = file,
                append = TRUE, sep = c(rep(sep, p - 1), "\n"))
            nlines <- nlines + nb
        }
    }
    # blocksize not supplied: write header and all rows in one call
    else cat(c(cn, t(x)), file = file,
             sep = c(rep(sep, p - 1), "\n"))
}
Hence, back to my earlier question: Why do you need "format" in
your function? This is what is generating the effect which you
don't want!
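If all that is wanted is a tab-separated file with no padding, base R's write.table may be another route. This is an alternative and not what the function above does; the small matrix and the temporary file are made up for illustration.

```r
# Character matrix with entries of different widths.
x <- matrix(c("A", "2.5", "10.25", "B"), ncol = 2)

out <- tempfile()

# quote = FALSE suppresses the quotes around character entries;
# row and column names are switched off to write the bare values.
write.table(x, file = out, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

lines <- readLines(out)
print(lines)
```

Each line read back should hold the fields of one row separated by a single tab.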
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 23-Dec-04 Time: 08:46:17
------------------------------ XFMail ------------------------------