[R] gsub with regular expression
Gabor Grothendieck
ggrothendieck at gmail.com
Fri Jun 25 17:11:21 CEST 2010
On Fri, Jun 25, 2010 at 10:48 AM, Sebastian Kruk
<residuo.solow at gmail.com> wrote:
> If I have a text with 7 words per line and I would like to put first
> and second word joined in a vector and the rest of words one per
> column in a matrix how can I do it?
>
> First 2 lines of my text file:
> "2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido"
> "2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca"
>
> Results:
>
> Vector:
> 2008/12/31 12:23:31
> 2010/02/01 02:35:31
>
> Matrix
> "numero" 343.233.233 "Rodeo" "Vaca" "Ruido"
> "palabra" 111.111.222 "abejorro" "Rodeo" "Vaca"
>
Here are two solutions. Each is three statements long (read in the
data, display the vector, display the matrix). To read from a file
instead, replace textConnection(Lines) with "myfile.dat", say, in each.
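1. Here is a sub solution (Lines is defined in solution 2 below):
L <- readLines(textConnection(Lines))
# the vector: keep only the first two words of each line
sub("(\\S+ \\S+) .*", "\\1", L)
# the rest: drop the first two words (one string per line)
sub("\\S+ \\S+ ", "", L)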
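Note that the second sub() call returns the remaining words as one string per line, not as a matrix with one word per column. A minimal base R sketch of one way to get an actual character matrix follows, assuming every line has the same number of space-separated words (as in the example). The names words, v and m are only illustrative.

```r
Lines <- "2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido
2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca"
L <- readLines(textConnection(Lines))

# Split each line on single spaces and bind the pieces into a matrix,
# one row per line and one column per word (assumes equal word counts).
words <- do.call(rbind, strsplit(L, " ", fixed = TRUE))

# The vector: first and second word joined by a space.
v <- paste(words[, 1], words[, 2])

# The matrix: all remaining words, one per column.
m <- words[, -(1:2), drop = FALSE]

v
m
```

Everything in m is character, so a field such as 343.233.233 stays a string.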
2. Here is a solution using zoo:
Lines <- "2008/12/31 12:23:31 numero 343.233.233 Rodeo Vaca Ruido
2010/02/01 02:35:31 palabra 111.111.222 abejorro Rodeo Vaca"
library(zoo)
z <- read.zoo(textConnection(Lines), index = 1:2,
    FUN = function(x) paste(x[,1], x[,2]))
time(z) # the vector
coredata(z) # the matrix
Another possibility would be to convert to chron or POSIXct at the
same time as reading it in:
# chron
library(chron)
z <- read.zoo(textConnection(Lines), index = 1:2,
    FUN = function(x) as.chron(paste(x[,1], x[,2]),
        format = "%Y/%m/%d %H:%M:%S"))
# POSIXct
z <- read.zoo(textConnection(Lines), index = 1:2,
    FUN = function(x) as.POSIXct(paste(x[,1], x[,2]),
        format = "%Y/%m/%d %H:%M:%S"))
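The same date-time conversion can also be applied to a plain character vector of joined date and time strings, without zoo. This is only a sketch: the name stamps and the choice of tz = "UTC" are assumptions made here so that the result does not depend on the local time zone.

```r
# Joined "date time" strings, as produced by pasting words 1 and 2.
stamps <- c("2008/12/31 12:23:31", "2010/02/01 02:35:31")

# Parse with the same format string as above; tz = "UTC" is an
# assumption, use whatever zone the timestamps were recorded in.
times <- as.POSIXct(stamps, format = "%Y/%m/%d %H:%M:%S", tz = "UTC")

times                       # date-times instead of strings
format(times, "%Y-%m-%d")   # e.g. reformat to ISO dates
diff(times)                 # time elapsed between the two lines
```

Once the values are POSIXct they sort and subtract as times rather than as text.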