[R] Parsing a data file - Help
Chuck Cleland
ccleland at optonline.net
Wed Jun 11 20:38:42 CEST 2008
On 6/11/2008 1:29 PM, A Ezhil wrote:
> Hi All,
>
> I have the data in the following format:
>
> idkt saap lahto pidg
> 5266 19911111 19911114 3078A
> 5266 19921005 19921030 2968A
> 6666 19930208 19930209 3074A
> 6666 20020329 20020402 F322
> 6666 20020402 20020409 F322
> 6866 19810713 19810917 29800
> 6866 19811109 19811120 29550
> 6866 19820203 19820219 29550
>
> I would like to parse the data and reformat into a single row for each unique idkt, something like:
> 5266 19911111 19911114 3078A 19921005 19921030 2968A
>
> I have tried with
>
> f <- read.table("file.txt", sep="\t", header=TRUE);
> attach(f);
> fac <- factor(f[,1]);
> id <- matrix(length(fac), 4);
> for(i in fac) id[i] <- f[idkt %in% fac[i], ];
>
> I am not able to make the list id into a single row. Could you please help me with how to do this?
If you can create a variable that differentiates multiple records
from the same idkt, you can use reshape() like this:
f <- "idkt saap lahto pidg
5266 19911111 19911114 3078A
5266 19921005 19921030 2968A
6666 19930208 19930209 3074A
6666 20020329 20020402 F322
6666 20020402 20020409 F322
6866 19810713 19810917 29800
6866 19811109 19811120 29550
6866 19820203 19820219 29550"
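# Read the sample data from the string above
fdata <- read.table(textConnection(f), sep=" ", header=TRUE)
# Number the records within each idkt (assumes the rows are sorted by idkt)
fdata$time <- unlist(lapply(table(fdata$idkt), function(x){1:x}))
# One row per idkt, with the other columns spread out by record number
reshape(fdata, idvar = "idkt", timevar = "time", direction="wide")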
  idkt   saap.1  lahto.1 pidg.1   saap.2  lahto.2 pidg.2   saap.3  lahto.3 pidg.3
1 5266 19911111 19911114  3078A 19921005 19921030  2968A       NA       NA   <NA>
3 6666 19930208 19930209  3074A 20020329 20020402   F322 20020402 20020409   F322
6 6866 19810713 19810917  29800 19811109 19811120  29550 19820203 19820219  29550
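The record counter built from table() only lines up with the data when the rows are already sorted by idkt. If that is not guaranteed, one possible variation is to build the counter with ave(), which numbers the records within each idkt and keeps the original row order. This is only a sketch: the names txt, fdata2 and wide are made up for illustration, and the rows of the sample data are deliberately shuffled here to show that sorting is not required.

```r
# Same records as in the example, shuffled so they are NOT sorted by idkt
txt <- "idkt saap lahto pidg
6666 19930208 19930209 3074A
5266 19911111 19911114 3078A
6866 19810713 19810917 29800
6666 20020329 20020402 F322
5266 19921005 19921030 2968A
6866 19811109 19811120 29550
6666 20020402 20020409 F322
6866 19820203 19820219 29550"
fdata2 <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# ave() applies seq_along() within each idkt group and returns the result
# in the original row order, so no prior sorting is needed
fdata2$time <- ave(seq_along(fdata2$idkt), fdata2$idkt, FUN = seq_along)

# Reshape to one row per idkt, then order the rows by idkt for readability
wide <- reshape(fdata2, idvar = "idkt", timevar = "time", direction = "wide")
wide <- wide[order(wide$idkt), ]
wide
```

As before, any idkt with fewer records than the maximum gets NA in the unused columns.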
> Thanks in advance.
>
> Kind regards,
> Ezhil
--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894