data file import - numbers and letters in a matrix(!)
Gabor Grothendieck
ggrothendieck at gmail.com
Thu Apr 12 16:19:32 CEST 2007
Try pasting this into an R session:
Lines.raw <- "FILEDATE:02.02.2007
...
START OF HEIGHT DATA
S= 0 y=0.0 x=0.00000000
S= 0 y=0.1 x=0.00055643
...
S= 9 y=4.9 x=1.67278117
S= 9 y=5.0 x=1.74873257
S=10 y=0.0 x=0.00000000
S=10 y=0.1 x=0.00075557
...
S=99 y=5.3 x=1.94719490
END OF HEIGHT DATA
...
START OF HEIGHT DATA
S= 0 y=0.0 x=0.00000000
S= 0 y=0.1 x=0.00055643
"
# with a real file, the next line would be replaced by
# something like: Lines <- readLines("myfile.dat")
Lines <- readLines(textConnection(Lines.raw))
# extract those lines that contain an =
Lines <- grep("=", Lines, value = TRUE)
# get col names by removing all but letters & spaces from line 1
cn <- gsub("[^a-zA-Z ]", "", Lines[1])
cn <- scan(textConnection(cn), what = "")
# remove anything that is not a number, dot or space and read in
Lines <- gsub("[^ .0-9]", "", Lines)
DF <- read.table(textConnection(Lines), col.names = cn)
closeAllConnections()
DF
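
Since each file holds several measurements, one possible extension is to tag every row with the number of the block it came from and then split on that tag. The following is only a minimal sketch. It assumes every block opens with a line containing "START OF HEIGHT DATA" and every data line begins with "S=". The names block, is.data and Measures and the shortened sample lines are made up for illustration, and read.table(text = ) needs R 2.14.0 or later:

```r
# shortened sample; with a real file use: Lines <- readLines("myfile.dat")
Lines <- c("FILEDATE:02.02.2007",
           "START OF HEIGHT DATA",
           "S= 0 y=0.0 x=0.00000000",
           "S= 0 y=0.1 x=0.00055643",
           "S=10 y=0.0 x=0.00000000",
           "END OF HEIGHT DATA",
           "START OF HEIGHT DATA",
           "S= 0 y=0.0 x=0.00000000",
           "S= 0 y=0.1 x=0.00055643",
           "END OF HEIGHT DATA")

# running count of START markers = block number of every line
block <- cumsum(grepl("START OF HEIGHT DATA", Lines, fixed = TRUE))

# assumption: data lines are exactly those beginning with "S="
is.data <- grepl("^S=", Lines)

# remove anything that is not a digit, dot or space and read in
num <- gsub("[^ .0-9]", "", Lines[is.data])
DF <- read.table(text = num, col.names = c("S", "y", "x"))
DF$block <- block[is.data]

# list with one numeric 3-column matrix per measurement
Measures <- lapply(split(DF[c("S", "y", "x")], DF$block), as.matrix)
```

Measures[[1]] is then the first measurement and Measures[[2]] the second, so no line counts or skip values have to be worked out by hand.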
On 4/12/07, Felix Wave <felix-wave at vr-web.de> wrote:
> Hello,
> I have a problem with the import of a data file. It seems very tricky.
> I have a text file (end of the mail). Every file has a different number of measurements
> which start with "START OF HEIGHT DATA" and end with "END OF HEIGHT DATA".
>
> I imported the file into a matrix, but the letters before the numbers are my problem
> (S= ,S=,x=,y=).
> Because of the letters and the space after "S=" I get a different number
> of columns in my matrix, and with letters in my matrix I can't calculate.
>
>
> My question: is it possible to import the file so that I get only 3 columns with numbers and
> no letters like x=, y=?
>
> Thanks a lot
> Felix
>
>
>
>
> My R Code:
> ----------
>
> # na.strings = "S="
>
> Measure1 <- matrix(scan("data.dat", n= 5063 * 4, skip = 20, what = character() ), 5063, 3, byrow = TRUE)
> Measure2 <- matrix(scan("data.dat", n= 5063 * 4, skip = 5220, what = character() ), 5063, 3, byrow = TRUE)
>
>
>
> My data file:
> -----------
>
> FILEDATE:02.02.2007
> ...
>
> START OF HEIGHT DATA
> S= 0 y=0.0 x=0.00000000
> S= 0 y=0.1 x=0.00055643
> ...
> S= 9 y=4.9 x=1.67278117
> S= 9 y=5.0 x=1.74873257
> S=10 y=0.0 x=0.00000000
> S=10 y=0.1 x=0.00075557
> ...
> S=99 y=5.3 x=1.94719490
> END OF HEIGHT DATA
> ...
>
> START OF HEIGHT DATA
> S= 0 y=0.0 x=0.00000000
> S= 0 y=0.1 x=0.00055643
>
>
>
> The imported matrix:
> >
> [,1] [,2] [,3] [,4]
> [6,] "S=" "9" "y=4.9" "x=1.67278117"
> [7,] "S=" "9" "y=5.0" "x=1.74873257"
> [8,] "S=10" "y=0.0" "x=0.00000000" "S=10"
> [9,] "y=0.1" "x=0.00075557" "S=10" "y=0.2"
> [10,] "x=0.00277444" "S=10" "y=0.3" "x=0.00605958"
>
