[R] Merging vector data into one file
jim holtman
jholtman at gmail.com
Mon Feb 1 13:43:39 CET 2010
You never really said what your data structure looks like. It appears
that the 'single row' might be a named vector. It would be good to
follow the posting guide and supply sample data. Here is one way of
doing it, depending on exactly what your data looks like:
> # create sample data of a list of named vectors with counts
> x <- replicate(5,table(sample(letters,20,TRUE)),simplify=FALSE)
> x
[[1]]
a b c d e f h i j k o p q t y z
1 1 1 1 1 1 1 3 2 1 1 1 2 1 1 1
[[2]]
a b c d g h i j o q r s t u v w x y
1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1
[[3]]
a b c e g j k n p q r s t v y z
1 1 1 1 1 1 3 2 1 1 1 1 1 2 1 1
[[4]]
c d g j k l m o q t u v y z
1 1 1 2 1 1 2 1 2 1 1 2 2 2
[[5]]
b d f g i j k m n o q s w y
1 2 1 1 2 1 1 1 1 2 3 1 1 2
> # create a 'long' table of variables and counts
> x.long <- do.call(rbind, lapply(x, stack))
> head(x.long)
values ind
1 1 a
2 1 b
3 1 c
4 1 d
5 1 e
6 1 f
> tapply(x.long$values, x.long$ind, sum) # sum the counts for each name
a b c d e f h i j k o p q t y z g r s u v w x n l m
3 4 4 5 2 2 2 6 7 6 5 2 9 4 7 4 4 2 3 2 6 2 2 3 1 3
>
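If each element really is a named numeric vector, a shorter route is to flatten the list and sum by name. This is only a sketch: the two-element `data` list below is made up from the fragments in the original question (the elided columns are left out), and it assumes the list itself is unnamed so that unlist() keeps the inner names unchanged.

```r
# Hypothetical sample data: a list of named count vectors, built from the
# two partial elements shown in the question (assumption, not the real data).
data <- list(
  c("Column A-B" = 1, "Column Z-S" = 2, "Column A-S" = 5),
  c("Column Z-B" = 2, "Column A-S" = 1, "Column A-B" = 3)
)

# Flatten to one named vector; with an unnamed list the inner names survive.
flat <- unlist(data)

# Sum the counts that share a name; names seen only once pass through as-is.
result <- tapply(flat, names(flat), sum)
result
```

The result is ordered by name rather than by first appearance, so look values up by name (for example result["Column A-B"]) instead of by position.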
On Sun, Jan 31, 2010 at 11:22 PM, Mark Altaweel <maltaweel at anl.gov> wrote:
> Hi,
>
> I had another question. If you had say a vector (e.g., called data) with 235
> elements and each element looked like the following
>
> data[[1]]
> Column A-B Column Z-S Column A-S....
> 1 2 5 .......
>
>
> data[[2]]
> Column Z-B Column A-S Column A-B....
> 2 1 3 .......
>
>
>
> Anyway, each element consists of one row that lists the names of the columns
> and the second row is the counts of those columns. What I wanted to do is
> merge all the elements from the vector so that I aggregate the counts for
> every column in the vector elements. So if a column name (e.g., Column A-B)
> is present in two elements, then I would want those elements that have the
> same column name to have their counts aggregated; however, if the column name is
> unique then I simply just want to integrate that column with the total. The
> example below shows the result of what I mean using the two elements above:
>
>
> result=data[[1]] + data[[2]].......
>
> Column A-B Column Z-S Column A-S Column Z-B.........
> 4 2 6 2.......
>
>
>
> So what I want would take the 235 elements, aggregate the column names and
> counts, and produce one output variable (e.g., called result) that has all
> the column names and counts present in the 235 elements. I tried using
> sapply, (e.g., sapply(data,function(.df){sum(.df)}) ), but this only just
> provided aggregate counts without producing the column names. I tried an
> aggregate() function, but that didn't aggregate my data exactly the way I
> wanted, perhaps I got the syntax wrong though. Anyway, is there a better and
> easier way to do this?
>
> Thanks in advance again.
>
> Mark
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?