[R] Associative array?

rkevinburton at charter.net
Tue Jul 15 14:48:54 CEST 2008


It seems the R console took them out. Here is what I tried:

> for(i in 1:length(sc))
+ {
+     sum(sc[[i]]]$Quantity)
Error: unexpected ']' in:
"{
    sum(sc[[i]]]"
> }
Error: unexpected '}' in "}"
> 
> 
> 

What I entered is the sum line that follows the '+' prompt.

Thank you.

Kevin
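
For reference, a minimal sketch of the loop with balanced brackets and parentheses. The data frame below is made-up stand-in data; only the column names Category, SubCategory and Quantity come from this thread. Note that a value computed inside a for loop is not auto-printed at the console, so the sketch stores the sums instead; sapply() is shown as one possible alternative to the explicit loop.

```r
# Made-up stand-in data; only the column names come from the thread.
x <- data.frame(
  Category    = c("A", "A", "A", "B", "B"),
  SubCategory = c("a", "a", "b", "a", "a"),
  Quantity    = c(1, 2, 3, 4, 5),
  stringsAsFactors = TRUE
)

sc <- split(x, list(x$Category, x$SubCategory), drop = TRUE)

# Explicit loop: results inside a for loop are not auto-printed,
# so store them (or wrap the call in print()).
totals <- numeric(length(sc))
for (i in seq_along(sc)) {
  totals[i] <- sum(sc[[i]]$Quantity)
}
names(totals) <- names(sc)

# Same sums without an explicit loop.
totals2 <- sapply(sc, function(d) sum(d$Quantity))
```

seq_along(sc) is used rather than 1:length(sc) so that the loop body is skipped when the list is empty.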

---- jim holtman <jholtman at gmail.com> wrote: 
> You don't have a closing parenthesis on the 'sum'.
> 
> On Mon, Jul 14, 2008 at 11:25 PM,  <rkevinburton at charter.net> wrote:
> > One more question. I am trying to iterate through this array.
> >
> > I have:
> >
> > sc <- split(x, list(x$Category, x$SubCategory), drop=TRUE)
> >
> > I think I understand 'length(sc)': it would be the total number of non-empty category and sub-category pairs (in this case 2415).
> >
> > I don't seem to be able to iterate through this list. My first try is:
> >
> > for(i in 1:length(sc))
> > {
> >     sum(sc[[i]]$Quantity
> > }
> >
> > This gives an error:
> >
> > Error: unexpected ']' in:
> > "{
> >    sum(sc[[i]]]"
> >> }
> > Error: unexpected '}' in "}"
> >>
> >
> > sc[[1]] refers to an array of data corresponding to a specific Category/SubCategory pair. Since this is a vector, every element of sc[[1]]$Category is the same, as is every element of sc[[1]]$SubCategory. Is there any way to access just the Category and SubCategory? R seems to be able to access this information. I would just like to be able to access this. Or is it just as efficient to use sc[[1]]$Category[1]? When I do this in R I get:
> >
> >> sc[[4]]$Category[1]
> > [1] ADDITIONAL GUEST
> > 46 Levels: (Unknown) 10" Plates 7" Plates   (Dessert) ... WOMEN
> >>
> >
> > What are 'Levels'?
> >
> > Thank you for your assistance.
> >
> > Kevin
> >
> > ---- jim holtman <jholtman at gmail.com> wrote:
> >> On Sun, Jul 13, 2008 at 5:45 PM,  <rkevinburton at charter.net> wrote:
> >> > Thank you, I will try drop=TRUE.
> >> >
> >> > In the meantime, do you know how I can access the members (for lack of a better term) of the result of a split? In the sample you provided below you have:
> >> >
> >> > z <- split(x, list(x$cat, x$a), drop=TRUE)
> >>
> >> You can do 'str(z)' to see the structure of 'z'.  In most cases, you
> >> should be able to reference by the keys, if they exist:
> >>
> >> > n <- 20
> >> > set.seed(1)
> >> > x <- data.frame(a=sample(LETTERS[1:2], n,TRUE), b=sample(letters[1:4], n, TRUE), val=runif(n))
> >> > z <- split(x, list(x$a, x$b), drop=TRUE)
> >> > str(z)
> >> List of 8
> >>  $ A.a:'data.frame':    2 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 1 1
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 1 1
> >>   ..$ val: num [1:2] 0.647 0.245
> >>  $ B.a:'data.frame':    3 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 2 2 2
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 1 1 1
> >>   ..$ val: num [1:3] 0.5530 0.0233 0.5186
> >>  $ A.b:'data.frame':    3 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 1 1 1
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 2 2 2
> >>   ..$ val: num [1:3] 0.530 0.693 0.478
> >>  $ B.b:'data.frame':    4 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 2 2 2 2
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 2 2 2 2
> >>   ..$ val: num [1:4] 0.789 0.477 0.438 0.407
> >>  $ A.c:'data.frame':    3 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 1 1 1
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 3 3 3
> >>   ..$ val: num [1:3] 0.8612 0.0995 0.6620
> >>  $ B.c:'data.frame':    1 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 2
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 3
> >>   ..$ val: num 0.783
> >>  $ A.d:'data.frame':    1 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 1
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 4
> >>   ..$ val: num 0.821
> >>  $ B.d:'data.frame':    3 obs. of  3 variables:
> >>   ..$ a  : Factor w/ 2 levels "A","B": 2 2 2
> >>   ..$ b  : Factor w/ 4 levels "a","b","c","d": 4 4 4
> >>   ..$ val: num [1:3] 0.7323 0.0707 0.3163
> >>
> >> Here are some examples of accessing the data:
> >>
> >> > z$B.d
> >>    a b        val
> >> 9  B d 0.73231374
> >> 15 B d 0.07067905
> >> 17 B d 0.31627171
> >> > # or just the value (it is a vector)
> >> > z$B.d$val
> >> [1] 0.73231374 0.07067905 0.31627171
> >> > # or by name
> >> > z[["B.d"]]$val
> >> [1] 0.73231374 0.07067905 0.31627171
> >> > # or by absolute number
> >> > z[[8]]$val
> >> [1] 0.73231374 0.07067905 0.31627171
> >> > # take the mean
> >> > mean(z$B.d$val)
> >> [1] 0.3730882
> >> > # get the length
> >> > length(z$B.d$val)
> >> [1] 3
> >> >
> >>
> >>
> >>
> >> >
> >> > Now I can print out z[1], z[2], etc. This is nice, but what if I want to access/iterate through all of the members of a particular column in z? You have given some methods like z[[1]]$b to access the specific columns in z. I notice that for your example z[[1]]$b prints out two values. Can I assume that z[[1]]$b is a vector? So if I want to find the mean I can use 'mean(z[[1]]$b)' and it will give me the mean value of the b column in z (similarly sum, range, etc.)? Does nrows(z[[1]]$b) return two in your example below? I would like to find out how many elements are in z[1]. Or would it be just as fast to do 'nrows(z[1])'?
> >> >
> >> > Thank you for this extended session on data frames, matrices, and vectors. I feel much more comfortable with the concepts now.
> >> >
> >> > Kevin
> >> > ---- jim holtman <jholtman at gmail.com> wrote:
> >> >> The reason for the empty levels was I did not put drop=TRUE on the
> >> >> split to remove unused levels.  Here is the revised script:
> >> >>
> >> >> > set.seed(1)  # start with a known number
> >> >> > x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=sample(letters[1:4], 20, TRUE), b=runif(20))
> >> >> > x
> >> >>    cat a          b
> >> >> 1    A d 0.82094629
> >> >> 2    B a 0.64706019
> >> >> 3    B c 0.78293276
> >> >> 4    C a 0.55303631
> >> >> 5    A b 0.52971958
> >> >> 6    C b 0.78935623
> >> >> 7    C a 0.02333120
> >> >> 8    B b 0.47723007
> >> >> 9    B d 0.73231374
> >> >> 10   A b 0.69273156
> >> >> 11   A b 0.47761962
> >> >> 12   A c 0.86120948
> >> >> 13   C b 0.43809711
> >> >> 14   B a 0.24479728
> >> >> 15   C d 0.07067905
> >> >> 16   B c 0.09946616
> >> >> 17   C d 0.31627171
> >> >> 18   C a 0.51863426
> >> >> 19   B c 0.66200508
> >> >> 20   C b 0.40683019
> >> >> > # drop unused groups from the split
> >> >> > (z <- split(x, list(x$cat, x$a), drop=TRUE))
> >> >> $B.a
> >> >>    cat a         b
> >> >> 2    B a 0.6470602
> >> >> 14   B a 0.2447973
> >> >>
> >> >> $C.a
> >> >>    cat a          b
> >> >> 4    C a 0.55303631
> >> >> 7    C a 0.02333120
> >> >> 18   C a 0.51863426
> >> >>
> >> >> $A.b
> >> >>    cat a         b
> >> >> 5    A b 0.5297196
> >> >> 10   A b 0.6927316
> >> >> 11   A b 0.4776196
> >> >>
> >> >> $B.b
> >> >>   cat a         b
> >> >> 8   B b 0.4772301
> >> >>
> >> >> $C.b
> >> >>    cat a         b
> >> >> 6    C b 0.7893562
> >> >> 13   C b 0.4380971
> >> >> 20   C b 0.4068302
> >> >>
> >> >> $A.c
> >> >>    cat a         b
> >> >> 12   A c 0.8612095
> >> >>
> >> >> $B.c
> >> >>    cat a          b
> >> >> 3    B c 0.78293276
> >> >> 16   B c 0.09946616
> >> >> 19   B c 0.66200508
> >> >>
> >> >> $A.d
> >> >>   cat a         b
> >> >> 1   A d 0.8209463
> >> >>
> >> >> $B.d
> >> >>   cat a         b
> >> >> 9   B d 0.7323137
> >> >>
> >> >> $C.d
> >> >>    cat a          b
> >> >> 15   C d 0.07067905
> >> >> 17   C d 0.31627171
> >> >>
> >> >> > # access the value ('b' in this instance); two ways- should be the same
> >> >> > z[[1]]$b
> >> >> [1] 0.6470602 0.2447973
> >> >> > z$B.a$b
> >> >> [1] 0.6470602 0.2447973
> >> >> >
> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >> On Sun, Jul 13, 2008 at 1:26 AM,  <rkevinburton at charter.net> wrote:
> >> >> > This is almost it. Maybe it is as good as can be expected. The only problem that I see is that this seems to form a Category/SubCategory pair where none existed in the original data. For example, A might have two sub-categories a and b, and B might have two sub-categories c and d. As far as I can tell, the method that you outlined forms a Category/SubCategory pair like B a or B b where none existed. This results in a lot of empty lists and it seems to take a long time to generate. But if that is as good as it gets then I can live with it.
> >> >> >
> >> >> > I know that I said one more question, but I have run into a problem. c <- split(x, x$Category) returns a vector of the rows in each of the categories. Now I would like to access the "Quantity" column within this split vector. I can see it listed; I just can't access it. I have tried c[1]$Quantity and c[1,2], both of which give me errors. Any ideas?
> >> >> >
> >> >> > Sorry this is so hard for me. I am more used to C type arrays and C type arrays of structures. This seems to be somewhat different.
> >> >> >
> >> >> > Thank you.
> >> >> >
> >> >> > Kevin
> >> >> > ---- jim holtman <jholtman at gmail.com> wrote:
> >> >> >> Is this something like what you were asking for?  The output of a
> >> >> >> 'split' will be a list of the dataframe subsets for the categories you
> >> >> >> have specified.
> >> >> >>
> >> >> >> > x <- data.frame(g1=sample(LETTERS[1:2],30,TRUE),
> >> >> >> +     g2=sample(letters[1:2], 30, TRUE),
> >> >> >> +     g3=1:30)
> >> >> >> > y <- split(x, list(x$g1, x$g2))
> >> >> >> > str(y)
> >> >> >> List of 4
> >> >> >>  $ A.a:'data.frame':    7 obs. of  3 variables:
> >> >> >>   ..$ g1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1
> >> >> >>   ..$ g2: Factor w/ 2 levels "a","b": 1 1 1 1 1 1 1
> >> >> >>   ..$ g3: int [1:7] 3 4 6 8 9 13 24
> >> >> >>  $ B.a:'data.frame':    7 obs. of  3 variables:
> >> >> >>   ..$ g1: Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2
> >> >> >>   ..$ g2: Factor w/ 2 levels "a","b": 1 1 1 1 1 1 1
> >> >> >>   ..$ g3: int [1:7] 10 11 16 17 18 20 25
> >> >> >>  $ A.b:'data.frame':    6 obs. of  3 variables:
> >> >> >>   ..$ g1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1
> >> >> >>   ..$ g2: Factor w/ 2 levels "a","b": 2 2 2 2 2 2
> >> >> >>   ..$ g3: int [1:6] 2 12 23 26 27 29
> >> >> >>  $ B.b:'data.frame':    10 obs. of  3 variables:
> >> >> >>   ..$ g1: Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2 2 2 2
> >> >> >>   ..$ g2: Factor w/ 2 levels "a","b": 2 2 2 2 2 2 2 2 2 2
> >> >> >>   ..$ g3: int [1:10] 1 5 7 14 15 19 21 22 28 30
> >> >> >> > y
> >> >> >> $A.a
> >> >> >>    g1 g2 g3
> >> >> >> 3   A  a  3
> >> >> >> 4   A  a  4
> >> >> >> 6   A  a  6
> >> >> >> 8   A  a  8
> >> >> >> 9   A  a  9
> >> >> >> 13  A  a 13
> >> >> >> 24  A  a 24
> >> >> >>
> >> >> >> $B.a
> >> >> >>    g1 g2 g3
> >> >> >> 10  B  a 10
> >> >> >> 11  B  a 11
> >> >> >> 16  B  a 16
> >> >> >> 17  B  a 17
> >> >> >> 18  B  a 18
> >> >> >> 20  B  a 20
> >> >> >> 25  B  a 25
> >> >> >>
> >> >> >> $A.b
> >> >> >>    g1 g2 g3
> >> >> >> 2   A  b  2
> >> >> >> 12  A  b 12
> >> >> >> 23  A  b 23
> >> >> >> 26  A  b 26
> >> >> >> 27  A  b 27
> >> >> >> 29  A  b 29
> >> >> >>
> >> >> >> $B.b
> >> >> >>    g1 g2 g3
> >> >> >> 1   B  b  1
> >> >> >> 5   B  b  5
> >> >> >> 7   B  b  7
> >> >> >> 14  B  b 14
> >> >> >> 15  B  b 15
> >> >> >> 19  B  b 19
> >> >> >> 21  B  b 21
> >> >> >> 22  B  b 22
> >> >> >> 28  B  b 28
> >> >> >> 30  B  b 30
> >> >> >>
> >> >> >> > y[[2]]
> >> >> >>    g1 g2 g3
> >> >> >> 10  B  a 10
> >> >> >> 11  B  a 11
> >> >> >> 16  B  a 16
> >> >> >> 17  B  a 17
> >> >> >> 18  B  a 18
> >> >> >> 20  B  a 20
> >> >> >> 25  B  a 25
> >> >> >> >
> >> >> >> >
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >> On Sat, Jul 12, 2008 at 8:51 PM,  <rkevinburton at charter.net> wrote:
> >> >> >> > OK. Now I know that I am dealing with a data frame. One last question on this topic. a <- read.csv() gives me a dataframe. If I have 'c <- split(x, x$Category)', then what is returned by split in this case? c[1] seems to be OK but c[2] is not right in my mind. If I run ci <- split(nrow(a), a$Category), then ci[1] seems to be the rows associated with the first category, ci[2] is the indices/rows associated with the second category, etc. But this seems different than c[1], c[2], etc.
> >> >> >> >
> >> >> >> > Using the techniques below I can get the information on the categories. Now as an extra level of complexity there are SubCategories within each Category. Assume that the SubCategory names are not unique within the dataset, so if I want the SubCategory data I need to retrieve the indices (or data) for the Category and SubCategory pair. In other words, if I have a Category that ranges from 'A' to 'Z', it is possible that I might have a subcategory A a, A b (where a and b are the sub category names). I also might have B a, B b. I want all of the sub categories A a, NOT the subcategories a (because that might include B a which would be different). I am guessing that this will take more than a simple 'split'.
> >> >> >> >
> >> >> >> > Thank you.
> >> >> >> >
> >> >> >> > Kevin
> >> >> >> >
> >> >> >> > ---- Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> >> >> >> >> On 12/07/2008 3:59 PM, rkevinburton at charter.net wrote:
> >> >> >> >> > I am sorry but if read.csv returns a dataframe and a dataframe is like a matrix and I have a set of input like below and a[1,] gives me the first row, what is the second index? From what I read and your input I am guessing that it is the column number. So a[1,1] would return the DayOfYear column for the first row, right? What does a$DayOfYear return?
> >> >> >> >>
> >> >> >> >> a$DayOfYear would be the same as a[,1] or a[,"DayOfYear"], i.e. it would
> >> >> >> >> return the entire first column.
> >> >> >> >>
> >> >> >> >> Duncan Murdoch
> >> >> >> >>
> >> >> >> >> >
> >> >> >> >> > Thank you for your patience.
> >> >> >> >> >
> >> >> >> >> > Kevin
> >> >> >> >> >
> >> >> >> >> > ---- Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
> >> >> >> >> >> On 12/07/2008 12:31 PM, rkevinburton at charter.net wrote:
> >> >> >> >> >>> I am using a simple R statement to read in the file:
> >> >> >> >> >>>
> >> >> >> >> >>> a <- read.csv("Sample.dat", header=TRUE)
> >> >> >> >> >>>
> >> >> >> >> >>> There is a lot of data but the first few lines look like:
> >> >> >> >> >>>
> >> >> >> >> >>> DayOfYear,Quantity,Fraction,Category,SubCategory
> >> >> >> >> >>> 1,82,0.0000390392720794458,(Unknown),(Unknown)
> >> >> >> >> >>> 2,78,0.0000371349173438631,(Unknown),(Unknown)
> >> >> >> >> >>> . . .
> >> >> >> >> >>> 71,2,0.0000009521773677913,WOMEN,Piratesses
> >> >> >> >> >>> 72,4,0.0000019043547355827,WOMEN,Piratesses
> >> >> >> >> >>> 73,3,0.0000014282660516870,WOMEN,Piratesses
> >> >> >> >> >>> 74,14,0.0000066652415745395,WOMEN,Piratesses
> >> >> >> >> >>> 75,2,0.0000009521773677913,WOMEN,Piratesses
> >> >> >> >> >>>
> >> >> >> >> >>> If I read the data in as above, the command
> >> >> >> >> >>>
> >> >> >> >> >>> a[1]
> >> >> >> >> >>>
> >> >> >> >> >>> results in the output
> >> >> >> >> >>>
> >> >> >> >> >>> [ reached getOption("max.print") -- omitted 16193 rows ]]
> >> >> >> >> >>>
> >> >> >> >> >>> Shouldn't this be the first row?
> >> >> >> >> >> No, the first row would be a[1,].  read.csv() returns a dataframe, and
> >> >> >> >> >> those are indexed with two indices to treat them like a matrix, or with
> >> >> >> >> >> one index to treat them like a list of their columns.
> >> >> >> >> >>
> >> >> >> >> >> Duncan Murdoch
> >> >> >> >> >>
> >> >> >> >> >>> a$Category[1]
> >> >> >> >> >>>
> >> >> >> >> >>> results in the output
> >> >> >> >> >>>
> >> >> >> >> >>> [1] (Unknown)
> >> >> >> >> >>> 4464 Levels:   Tags ... WOMEN
> >> >> >> >> >>>
> >> >> >> >> >>> But
> >> >> >> >> >>>
> >> >> >> >> >>> a$Category[365]
> >> >> >> >> >>>
> >> >> >> >> >>> gives me:
> >> >> >> >> >>>
> >> >> >> >> >>> [1] 7 Plates   (Dessert),Western\n120,5,0.0000023804434194784,7 Plates   (Dessert)
> >> >> >> >> >>> 4464 Levels:   Tags ... WOMEN
> >> >> >> >> >>>
> >> >> >> >> >>> There is something fundamental about either vectors or the read.csv command that I am missing here.
> >> >> >> >> >>>
> >> >> >> >> >>> Thank you.
> >> >> >> >> >>>
> >> >> >> >> >>> Kevin
> >> >> >> >> >>>
> >> >> >> >> >>> ---- jim holtman <jholtman at gmail.com> wrote:
> >> >> >> >> >>>> Please provide commented, minimal, self-contained, reproducible code,
> >> >> >> >> >>>> or at least a before/after of what your data would look like.  Taking a
> >> >> >> >> >>>> guess at what you are asking, here is one way of doing it:
> >> >> >> >> >>>>
> >> >> >> >> >>>>
> >> >> >> >> >>>>> x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=1:20, b=runif(20))
> >> >> >> >> >>>>> x
> >> >> >> >> >>>>    cat  a          b
> >> >> >> >> >>>> 1    B  1 0.65472393
> >> >> >> >> >>>> 2    C  2 0.35319727
> >> >> >> >> >>>> 3    B  3 0.27026015
> >> >> >> >> >>>> 4    A  4 0.99268406
> >> >> >> >> >>>> 5    C  5 0.63349326
> >> >> >> >> >>>> 6    A  6 0.21320814
> >> >> >> >> >>>> 7    C  7 0.12937235
> >> >> >> >> >>>> 8    A  8 0.47811803
> >> >> >> >> >>>> 9    A  9 0.92407447
> >> >> >> >> >>>> 10   A 10 0.59876097
> >> >> >> >> >>>> 11   A 11 0.97617069
> >> >> >> >> >>>> 12   A 12 0.73179251
> >> >> >> >> >>>> 13   B 13 0.35672691
> >> >> >> >> >>>> 14   C 14 0.43147369
> >> >> >> >> >>>> 15   C 15 0.14821156
> >> >> >> >> >>>> 16   C 16 0.01307758
> >> >> >> >> >>>> 17   B 17 0.71556607
> >> >> >> >> >>>> 18   B 18 0.10318424
> >> >> >> >> >>>> 19   C 19 0.44628435
> >> >> >> >> >>>> 20   B 20 0.64010105
> >> >> >> >> >>>>> # create a list of the indices of the data grouped by 'cat'
> >> >> >> >> >>>>> split(seq(nrow(x)), x$cat)
> >> >> >> >> >>>> $A
> >> >> >> >> >>>> [1]  4  6  8  9 10 11 12
> >> >> >> >> >>>>
> >> >> >> >> >>>> $B
> >> >> >> >> >>>> [1]  1  3 13 17 18 20
> >> >> >> >> >>>>
> >> >> >> >> >>>> $C
> >> >> >> >> >>>> [1]  2  5  7 14 15 16 19
> >> >> >> >> >>>>
> >> >> >> >> >>>>> # or do you want the data
> >> >> >> >> >>>>> split(x, x$cat)
> >> >> >> >> >>>> $A
> >> >> >> >> >>>>    cat  a         b
> >> >> >> >> >>>> 4    A  4 0.9926841
> >> >> >> >> >>>> 6    A  6 0.2132081
> >> >> >> >> >>>> 8    A  8 0.4781180
> >> >> >> >> >>>> 9    A  9 0.9240745
> >> >> >> >> >>>> 10   A 10 0.5987610
> >> >> >> >> >>>> 11   A 11 0.9761707
> >> >> >> >> >>>> 12   A 12 0.7317925
> >> >> >> >> >>>>
> >> >> >> >> >>>> $B
> >> >> >> >> >>>>    cat  a         b
> >> >> >> >> >>>> 1    B  1 0.6547239
> >> >> >> >> >>>> 3    B  3 0.2702601
> >> >> >> >> >>>> 13   B 13 0.3567269
> >> >> >> >> >>>> 17   B 17 0.7155661
> >> >> >> >> >>>> 18   B 18 0.1031842
> >> >> >> >> >>>> 20   B 20 0.6401010
> >> >> >> >> >>>>
> >> >> >> >> >>>> $C
> >> >> >> >> >>>>    cat  a          b
> >> >> >> >> >>>> 2    C  2 0.35319727
> >> >> >> >> >>>> 5    C  5 0.63349326
> >> >> >> >> >>>> 7    C  7 0.12937235
> >> >> >> >> >>>> 14   C 14 0.43147369
> >> >> >> >> >>>> 15   C 15 0.14821156
> >> >> >> >> >>>> 16   C 16 0.01307758
> >> >> >> >> >>>> 19   C 19 0.44628435
> >> >> >> >> >>>>
> >> >> >> >> >>>>
> >> >> >> >> >>>> On Sat, Jul 12, 2008 at 3:32 AM,  <rkevinburton at charter.net> wrote:
> >> >> >> >> >>>>> I have searched the archive and I could not find what I need, so I will try to ask the question here.
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> I read a table in (read.table)
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> a <- read.table(.....)
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> The table has column names like DayOfYear, Quantity, and Category.
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> The values in the row for Category are strings (characters).
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> I want to get all of the rows grouped by Category. The number of unique category names could be around 50. Say for argument's sake the number of categories is exactly 50. Can I somehow get a vector of length 50 containing the rows corresponding to the category (another vector)? I realize I can access any row a[i]$Category (right?). But I want a vector containing the rows corresponding to each distinct Category name.
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> Thank you.
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> Kevin
> >> >> >> >> >>>>>
> >> >> >> >> >>>>> ______________________________________________
> >> >> >> >> >>>>> R-help at r-project.org mailing list
> >> >> >> >> >>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >> >> >> >>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> >> >> >> >>>>> and provide commented, minimal, self-contained, reproducible code.
> >> >> >> >> >>>>>
> >> >> >> >> >>>>
> >> >> >> >> >>>> --
> >> >> >> >> >>>> Jim Holtman
> >> >> >> >> >>>> Cincinnati, OH
> >> >> >> >> >>>> +1 513 646 9390
> >> >> >> >> >>>>
> >> >> >> >> >>>> What is the problem you are trying to solve?
> >> >> >> >>
> >> >> >> >
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >>
> >>
> >
> >
> 
> 
> 

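Two questions quoted above go unanswered in this message: what 'Levels' are, and how to get at just the Category and SubCategory of each piece of the split. The 'Levels' line is printed because the column is a factor; the levels are the full set of distinct values the factor can take, and a subset keeps all of them even when only one value actually occurs in it. A minimal sketch, assuming made-up stand-in data (only the column names come from the thread):

```r
# Made-up stand-in data; stringsAsFactors = TRUE makes the text columns
# factors, as read.csv() did by default at the time of this thread.
x <- data.frame(
  Category    = c("A", "A", "B"),
  SubCategory = c("a", "b", "a"),
  Quantity    = c(10, 20, 30),
  stringsAsFactors = TRUE
)
sc <- split(x, list(x$Category, x$SubCategory), drop = TRUE)

# The names of the list are the "Category.SubCategory" keys.
keys <- names(sc)

# A factor element still carries every level of the original column;
# as.character() gives just the label without the Levels line.
first_cat  <- as.character(sc[[1]]$Category[1])
all_levels <- levels(sc[[1]]$Category)

# One row per piece holding the key columns, taken from each piece's first row.
key_table <- do.call(
  rbind,
  lapply(sc, function(d) d[1, c("Category", "SubCategory")])
)
```

Taking the keys from the first row of each piece, as in key_table, avoids parsing names(sc), which would be ambiguous if a category name itself contained a period.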

