[R] Parsing a data file - Help

Chuck Cleland ccleland at optonline.net
Wed Jun 11 20:38:42 CEST 2008


On 6/11/2008 1:29 PM, A Ezhil wrote:
> Hi All,
> 
> I have the data in the following format:
> 
> idkt	saap		lahto		pidg
> 5266	19911111	19911114	3078A
> 5266	19921005	19921030	2968A
> 6666	19930208	19930209	3074A
> 6666	20020329	20020402	F322
> 6666	20020402	20020409	F322
> 6866	19810713	19810917	29800
> 6866	19811109	19811120	29550
> 6866	19820203	19820219	29550
> 
> I would like to parse the data and reformat into a single row for each unique idkt, something like:
> 5266  19911111	19911114 3078A 19921005	19921030 2968A
> 
> I have tried with 
> 
> f <- read.table("file.txt", sep="\t", header=TRUE);
> attach(f);
> fac <- factor(f[,1]);
> id <- matrix(length(fac), 4);
> for(i in fac) id[i] <- f[idkt %in% fac[i], ]; 
> 
> I am not able to turn the list id into a single row. Could you please show me how I can do this?

   If you can create a variable that differentiates multiple records 
from the same idkt, you can use reshape() like this:

f <- "idkt saap lahto pidg
5266 19911111 19911114 3078A
5266 19921005 19921030 2968A
6666 19930208 19930209 3074A
6666 20020329 20020402 F322
6666 20020402 20020409 F322
6866 19810713 19810917 29800
6866 19811109 19811120 29550
6866 19820203 19820219 29550"

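# Read the sample data from the string above
fdata <- read.table(textConnection(f), sep=" ", header=TRUE)

# Number the records within each idkt (assumes the rows are already
# sorted by idkt, which is the order in which table() counts them)
fdata$time <- unlist(lapply(table(fdata$idkt), function(x){1:x}))

# One row per idkt, one set of saap/lahto/pidg columns per record number
reshape(fdata, idvar = "idkt", timevar = "time", direction="wide")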

  idkt   saap.1  lahto.1 pidg.1   saap.2  lahto.2 pidg.2   saap.3  lahto.3 pidg.3
1 5266 19911111 19911114  3078A 19921005 19921030  2968A       NA       NA   <NA>
3 6666 19930208 19930209  3074A 20020329 20020402   F322 20020402 20020409   F322
6 6866 19810713 19810917  29800 19811109 19811120  29550 19820203 19820219  29550
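
   If the rows are not already grouped by idkt, the record counter can 
be built in a way that does not depend on row order. The following is 
only a sketch under that assumption: the shuffled data frame is made up 
for illustration (a subset of the sample data in a different order), and 
ave() with seq_along() is one possible way to do it, not the only one.

```r
# Illustrative data: a subset of the sample records, deliberately unsorted
fdata <- data.frame(
  idkt  = c(6666, 5266, 6666, 5266, 6666),
  saap  = c(19930208, 19911111, 20020329, 19921005, 20020402),
  lahto = c(19930209, 19911114, 20020402, 19921030, 20020409),
  pidg  = c("3074A", "3078A", "F322", "2968A", "F322"),
  stringsAsFactors = FALSE
)

# ave() applies seq_along() within each idkt group and returns the
# result in the original row order, so no sorting is needed
fdata$time <- ave(seq_along(fdata$idkt), fdata$idkt, FUN = seq_along)

# Same reshape() call as before: one row per idkt
wide <- reshape(fdata, idvar = "idkt", timevar = "time", direction = "wide")
wide
```

   As before, an idkt with fewer records than the maximum gets NA in the 
unused columns.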

> Thanks in advance.
> 
> Kind regards,
> Ezhil

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


