[R] Populating then sorting a matrix and/or data.frame

David Winsemius dwinsemius at comcast.net
Fri Nov 12 01:02:19 CET 2010


On Nov 11, 2010, at 6:38 PM, Noah Silverman wrote:

> That makes perfect sense.  All of my numbers are being coerced into
> strings by the c() function.  Subsequently, my data.frame contains all
> strings.
>
> I can't know the length of the data.frame ahead of time, so can't
> predefine it like your example.
> One thought would be to make it arbitrarily long filled with 0 and
> delete off the unused rows.  But this seems rather wasteful.

It might be faster, though. Here is a non-c() method that uses the
list() function instead (with options(stringsAsFactors=FALSE) set).
list() does not coerce its elements to a common mode, and
rbind.data.frame will accept a list as a row argument:

results <- data.frame(a=vector(mode="character", length=0) ,
                       b=vector(mode="numeric", length=0),
                      cc=vector(mode="numeric", length=0), # note: avoid "c" as name
                       d=vector(mode="numeric", length=0))
  n = 10
  for(i in 1:n){
      a = LETTERS[i];
      b = i;
      cc = 3*i + 2
      d = rnorm(1);
      results <- rbind(results, list(a=a,b=b,cc=cc,d=c))
               }
  results
    a  b cc  d
2  A  1  5  5
21 B  2  8  8
3  C  3 11 11
4  D  4 14 14
5  E  5 17 17
6  F  6 20 20
7  G  7 23 23
8  H  8 26 26
9  I  9 29 29
10 J 10 32 32

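Oops: I used d=c in the list() call, and there was a "c" object hanging
around in my workspace to be picked up, so column d in the output above
is wrong. It should have been d=d.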
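
For reference, a minimal sketch of the same loop with the last list
element written as d=d. It assumes nothing beyond base R; the column
names and the value of n are the ones used above, and passing
stringsAsFactors=FALSE to data.frame() stands in for setting the global
option.

```r
# Start from a zero-row data.frame with the intended column types.
results <- data.frame(a  = character(0),
                      b  = numeric(0),
                      cc = numeric(0),   # "cc" rather than "c" as a name
                      d  = numeric(0),
                      stringsAsFactors = FALSE)

n <- 10
for (i in seq_len(n)) {
  a  <- LETTERS[i]
  b  <- i
  cc <- 3 * i + 2
  d  <- rnorm(1)
  # list() keeps each element's own mode, so the numbers stay numeric.
  results <- rbind(results, list(a = a, b = b, cc = cc, d = d))
}

sapply(results, class)   # one class per column
```

Growing a data.frame with rbind() on every pass copies it each time, so
this is a sketch for small n rather than a recommendation for large
loops.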

-- 
David.
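
P.S. The "make it arbitrarily long and delete the unused rows" idea can
be sketched as below. This is only an illustration under assumptions:
max_rows is a hypothetical upper bound picked for the example, used is a
hypothetical counter, and the fixed 1:10 loop stands in for a loop whose
length is not known in advance.

```r
max_rows <- 100   # hypothetical upper bound on the number of rows

# Preallocate typed columns: "" for character, 0 for numeric.
results <- data.frame(a  = character(max_rows),
                      b  = numeric(max_rows),
                      cc = numeric(max_rows),
                      d  = numeric(max_rows),
                      stringsAsFactors = FALSE)

used <- 0         # number of rows filled so far
for (i in 1:10) { # stands in for a loop of unknown length
  used <- used + 1
  results$a[used]  <- LETTERS[i]
  results$b[used]  <- i
  results$cc[used] <- 3 * i + 2
  results$d[used]  <- rnorm(1)
}

# Drop the rows that were never filled.
results <- results[seq_len(used), ]
```

Each column is assigned separately, so nothing passes through c() and
the numeric columns are never coerced to character.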

>
> -N
>
> On 11/11/10 2:02 PM, Peter Langfelder wrote:
>> On Thu, Nov 11, 2010 at 1:19 PM, William Dunlap <wdunlap at tibco.com>  
>> wrote:
>>> Peter,
>>>
>>> Your example doesn't work for me unless I
>>> set options(stringsAsFactors=TRUE) first.
>>> (If I do set that, then all columns of 'results'
>>> have class "character", which I doubt the user
>>> wants.)
>> You probably mean stringsAsFactors=FALSE.
>>
>> What you say makes sense, because the c() function produces a vector
>> in which all components have the same type, and it will be character.
>> If you don't want to have characters, my solution would be
>>
>> n = 10
>> results <- data.frame(a = rep("", n), b = rep(0, n), c = rep(0, n), d
>> = rep(0, n))
>> for(i in 1:n){
>>   a = LETTERS[i];
>>   b = i;
>>   c = 3*i + 2
>>   d = rnorm(1);
>>   results$a[i] = a
>>   results$b[i] = b
>>   results$c[i] = c
>>   results$d[i] = d
>> }
>>
>>> results
>>   a  b  c           d
>> 1  A  1  5 -1.31553805
>> 2  B  2  8  0.09198054
>> 3  C  3 11 -0.05860804
>> 4  D  4 14  0.77796136
>> 5  E  5 17  1.28924697
>> 6  F  6 20  0.47631483
>> 7  G  7 23 -1.23727076
>> 8  H  8 26  0.83595295
>> 9  I  9 29  0.69435349
>> 10 J 10 32 -0.30922930
>>
>>> mode(results[, 1])
>> [1] "character"
>>> mode(results[, 2])
>> [1] "numeric"
>>> mode(results[, 3])
>> [1] "numeric"
>>> mode(results[, 4])
>> [1] "numeric"
>>
>>
>>
>> or alternatively
>>
>> n = 10
>> num <- data.frame(b = rep(0, n), c = rep(0, n), d = rep(0, n))
>> labels = rep("", n);
>> for(i in 1:n){
>>   a = LETTERS[i];
>>   b = i;
>>   c = 3*i + 2
>>   d = rnorm(1);
>>   labels[i] = a
>>   num[i, ] = c(b, c, d)
>> }
>> results = data.frame(a = labels, num)
>>
>>> results
>>   a  b  c           d
>> 1  A  1  5 -0.47150097
>> 2  B  2  8 -1.30507313
>> 3  C  3 11 -1.09860425
>> 4  D  4 14  0.91326330
>> 5  E  5 17 -0.09732841
>> 6  F  6 20 -0.75134162
>> 7  G  7 23  0.31360908
>> 8  H  8 26 -1.54406716
>> 9  I  9 29 -0.36075743
>> 10 J 10 32 -0.23758269
>>> mode(results[, 1])
>> [1] "character"
>>> mode(results[, 2])
>> [1] "numeric"
>>> mode(results[, 3])
>> [1] "numeric"
>>> mode(results[, 4])
>> [1] "numeric"
>>
>>
>> Peter
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT


