[R] data file import - numbers and letters in a matrix(!)
Adaikalavan Ramasamy
ramasamy at cancer.org.uk
Thu Apr 12 17:34:00 CEST 2007
Here are the contents of my "testdata.txt":
-----------------------------------------------------
START OF HEIGHT DATA
S= 0 y=0.0 x=0.00000000
S= 0 y=0.1 x=0.00055643
S= 9 y=4.9 x=1.67278117
S= 9 y=5.0 x=1.74873257
S=10 y=0.0 x=0.00000000
S=10 y=0.1 x=0.00075557
S=99 y=5.3 x=1.94719490
END OF HEIGHT DATA
-----------------------------------------------------
If you have access to a shell, you can try preprocessing the input
file for read.delim with the pipeline below (the \t in the sed
replacement assumes GNU sed):
cat testdata.txt | grep -v "^START" | grep -v "^END" | sed 's/ //g' |
sed 's/S=//' | sed 's/y=/\t/' | sed 's/x=/\t/'
Or here is my ugly fix in R:
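my.read.file <- function(file){
  v1 <- readLines(con=file, n=-1)
  # drop the START/END marker lines (grepl is safe when nothing matches)
  v2 <- v1[ !grepl("^START|^END", v1) ]
  # remove all spaces, then turn each "S=", "y=", "x=" label into a separator
  v3 <- gsub(" ", "", v2)
  v4 <- gsub("S=|y=|x=", " ", v3)
  v5 <- gsub("^ ", "", v4)
  # split each line into its three fields and convert them to numeric
  m <- t( sapply( strsplit(v5, split=" "), as.numeric ) )
  colnames(m) <- c("S", "y", "x")
  return(m)
}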
my.read.file( "testdata.txt" )
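The original question mentions files that hold several measurements, each wrapped in its own START/END pair. The function above pools all data lines into one matrix, so here is a rough sketch of one way to keep the blocks apart. The name read.height.blocks is made up for illustration, and the sketch assumes that blocks are not nested, that every START line has a matching END line, and that every data line begins with "S=".

```r
# Hypothetical helper (not part of the original fix): returns a list
# with one numeric matrix per START/END block found in the file.
read.height.blocks <- function(file) {
  lines  <- readLines(file)
  starts <- grep("^START OF HEIGHT DATA", lines)
  ends   <- grep("^END OF HEIGHT DATA", lines)
  # Assumption: blocks are not nested and each START has a matching END.
  stopifnot(length(starts) == length(ends), all(starts < ends))

  lapply(seq_along(starts), function(i) {
    # Lines strictly between the two marker lines
    block <- lines[seq_len(ends[i] - starts[i] - 1) + starts[i]]
    # Keep only lines that look like data records
    block <- grep("^S=", block, value = TRUE)
    # Replace the "S=", "y=", "x=" labels with spaces and let scan()
    # split the remaining numbers on whitespace
    nums <- scan(text = gsub("[Sxy]=", " ", block), quiet = TRUE)
    matrix(nums, ncol = 3, byrow = TRUE,
           dimnames = list(NULL, c("S", "y", "x")))
  })
}
```

With something like blocks <- read.height.blocks("data.dat"), blocks[[1]] would then be the first measurement and blocks[[2]] the second, without having to hard-code skip= and n= values for each file.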
Regards, Adai
Felix Wave wrote:
> Hello,
> I have a problem with the import of a data file. It seems very tricky.
> I have a text file (see the end of the mail). Every file has a different number of measurements,
> which start with "START OF HEIGHT DATA" and end with "END OF HEIGHT DATA".
>
> I imported the file into a matrix, but the letters before the numbers are my problem
> (S= ,S=,x=,y=).
> Because of the letters and the space after "S=", I get a different number
> of columns in my matrix, and with letters in my matrix I can't do any calculations.
>
>
> My question: is it possible to import the file so that I get only 3 columns of numbers,
> with no letters like x=, y=?
>
> Thanks a lot
> Felix
>
>
>
>
> My R Code:
> ----------
>
> # na.strings = "S="
>
> Measure1 <- matrix(scan("data.dat", n= 5063 * 4, skip = 20, what = character() ), 5063, 3, byrow = TRUE)
> Measure2 <- matrix(scan("data.dat", n= 5063 * 4, skip = 5220, what = character() ), 5063, 3, byrow = TRUE)
>
>
>
> My data file:
> -----------
>
> FILEDATE:02.02.2007
> ...
>
> START OF HEIGHT DATA
> S= 0 y=0.0 x=0.00000000
> S= 0 y=0.1 x=0.00055643
> ...
> S= 9 y=4.9 x=1.67278117
> S= 9 y=5.0 x=1.74873257
> S=10 y=0.0 x=0.00000000
> S=10 y=0.1 x=0.00075557
> ...
> S=99 y=5.3 x=1.94719490
> END OF HEIGHT DATA
> ...
>
> START OF HEIGHT DATA
> S= 0 y=0.0 x=0.00000000
> S= 0 y=0.1 x=0.00055643
>
>
>
> The imported matrix:
> [,1] [,2] [,3] [,4]
> [6,] "S=" "9" "y=4.9" "x=1.67278117"
> [7,] "S=" "9" "y=5.0" "x=1.74873257"
> [8,] "S=10" "y=0.0" "x=0.00000000" "S=10"
> [9,] "y=0.1" "x=0.00075557" "S=10" "y=0.2"
> [10,] "x=0.00277444" "S=10" "y=0.3" "x=0.00605958"
>