[R] How to add no data entries into current dataframe?

Hiroyuki Sato hiroysato at gmail.com
Fri Mar 25 07:53:54 CET 2016


Hello Ulrik and Jeff

Thank you for replying.

I succeeded in creating the data frame with the following steps.

s <- structure(list(ID = c(101L, 102L, 103L, 103L),
                    DATE = c(20160301L, 20160301L, 20160301L, 20160302L),
                    VAR = c(1L, 1L, 1L, 1L),
                    CODE = structure(c(1L, 2L, 3L, 3L),
                                     .Label = c("PDT1", "PDT2", "PDT3"),
                                     class = "factor")),
               .Names = c("ID", "DATE", "VAR", "CODE"),
               class = "data.frame", row.names = c(NA, -4L))

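library(reshape2)  # provides dcast()

# IDs in the full range that do not appear in s
missing.id <- 100:105
missing.id <- missing.id[!missing.id %in% s$ID]

# Placeholder rows with VAR = 0 so the missing IDs survive the cast
df2 <- data.frame(ID = missing.id, DATE = 20160301L, CODE = "PDT1", VAR = 0)

r <- dcast(rbind(s, df2), ID ~ CODE, value.var = "VAR", fun.aggregate = sum)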

r
   ID PDT1 PDT2 PDT3
1 100    0    0    0
2 101    1    0    0
3 102    0    1    0
4 103    0    0    2
5 104    0    0    0
6 105    0    0    0
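
For reference, the same table can probably be produced without placeholder rows. The following is only a minimal sketch in base R, not something suggested in this thread: it assumes the full ID range is known in advance (all.ids, s2, tab and r2 are names made up for this example) and relies on xtabs() keeping unused factor levels by default. The sample data is rebuilt with data.frame() for readability.

```r
# Sample data from this thread, rebuilt with data.frame() for readability.
s <- data.frame(
  ID   = c(101L, 102L, 103L, 103L),
  DATE = c(20160301L, 20160301L, 20160301L, 20160302L),
  VAR  = c(1L, 1L, 1L, 1L),
  CODE = factor(c("PDT1", "PDT2", "PDT3", "PDT3"))
)

# Assumption: the complete set of IDs is known up front.
all.ids <- 100:105

# Declare every possible ID as a factor level, including the absent ones.
s2 <- transform(s, ID = factor(ID, levels = all.ids))

# xtabs() sums VAR per ID/CODE cell and keeps empty levels as zeros.
tab <- xtabs(VAR ~ ID + CODE, data = s2)

# Convert the contingency table back to a plain data frame with an ID column.
r2 <- data.frame(ID = all.ids, as.data.frame.matrix(tab), row.names = NULL)
r2
```

Because the columns come from the levels of CODE, this should scale to any number of codes without listing them by hand.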

Thanks again.

P.S.
I sent this e-mail from Gmail, not Inbox, so the content should be plain text.



2016-03-25 14:58 GMT+09:00 Ulrik Stervbo <ulrik.stervbo at gmail.com>:
> You could make a vector with all possible IDs. Use %in% to get just those
> that are missing.
>
> missing.id <- c(101:1000)
> missing.id <- missing.id[!missing.id %in% s$ID]
>
> df2 <- data.frame(ID = missing.id,
>                   CODE = paste0("PDT", missing.id),
>                   VAR = 0)
>
> Modify your original data.frame so you can rbind df2 and cast as you already
> do.
>
> I haven't tested the above and you might have to tweak it a bit.
>
> Hope it helps
> Ulrik
>
> Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote on Fri, 25 Mar 2016, 06:37:
>>
>> Suggested reading
>>
>> An Introduction to R, section 5.3
>>
>> The Posting Guide, mentioned at the bottom of this message, which mentions
>> that this is a plain-text mailing list, so don't post in HTML (it gets
>> mangled).
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On March 24, 2016 10:09:46 PM PDT, Hiroyuki Sato <hiroysato at gmail.com>
>> wrote:
>>>
>>> Hello Ulrik
>>>
>>> Thank you for replying.
>>>
>>> The real data has many IDs (about 3,000), so I want to find the missing
>>> ones with a function or something similar.
>>> If 104 is not in s, then add a row for 104 with all columns set to zero.
>>>
>>> The real data also has many columns (80 to 5,000; the number is not
>>> fixed), so I would like to add those values with a function as well.
>>>
>>> For example, I can't write out a statement like the following:
>>>   data.frame(ID = c(104, 105), PDT1 = 0, PDT2 = 0, PDT3 = 0, ... PDT5000 = 0)
>>>
>>> Do you have any good ideas?
>>>
>>> Regards.
>>>
>>>
>>>
>>>
>>> On Fri, 25 Mar 2016 at 13:58, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>>>
>>>>  Hi Hiroyuki,
>>>>
>>>>  The row bind function rbind() is what you need
>>>>
>>>>  s <- dcast(s,ID ~ CODE, value.var="VAR",sum)
>>>>  df2 <- data.frame(ID = c(104, 105), PDT1 = 0, PDT2 = 0, PDT3 = 0)
>>>>  rbind(s, df2)
>>>>
>>>>  hope this helps
>>>>  Ulrik
>>>>
>>>>  On Fri, 25 Mar 2016 at 05:52 Hiroyuki Sato <hiroysato at gmail.com> wrote:
>>>>
>>>>>  Hello members
>>>>>
>>>>>  Question
>>>>>
>>>>>  Could you tell me how to add rows for IDs 100, 104, and 105 with zero values?
>>>>>
>>>>>  1. Source data
>>>>>
>>>>>
>>>>>  IDs 100, 104, and 105 have no values.
>>>>>
>>>>>
>>>>>>  s
>>>>>
>>>>>  ID DATE VAR CODE
>>>>>  1 101 20160301 1 PDT1
>>>>>  2 102 20160301 1 PDT2
>>>>>  3 103 20160301 1 PDT3
>>>>>  4 103 20160302 1 PDT3
>>>>>
>>>>>  s <- structure(list(ID = c(101L, 102L, 103L, 103L),
>>>>>                      DATE = c(20160301L, 20160301L, 20160301L, 20160302L),
>>>>>                      VAR = c(1L, 1L, 1L, 1L),
>>>>>                      CODE = structure(c(1L, 2L, 3L, 3L),
>>>>>                                       .Label = c("PDT1", "PDT2", "PDT3"),
>>>>>                                       class = "factor")),
>>>>>                 .Names = c("ID", "DATE", "VAR", "CODE"),
>>>>>                 class = "data.frame", row.names = c(NA, -4L))
>>>>>
>>>>>  src <- 100:106
>>>>>
>>>>>
>>>>>  2. Expected output
>>>>>
>>>>>  ID PDT1 PDT2 PDT3
>>>>>  1 100 0 0 0
>>>>>  2 101 1 0 0
>>>>>  3 102 0 1 0
>>>>>  4 103 0 0 2
>>>>>  5 104 0 0 0
>>>>>  6 105 0 0 0
>>>>>
>>>>>  3. Conversion process
>>>>>
>>>>>  I can convert the data "s" as follows.
>>>>>
>>>>>>  library(reshape2)
>>>>>>  dcast(s,ID ~ CODE, value.var="VAR",sum)
>>>>>
>>>>>  ID PDT1 PDT2 PDT3
>>>>>  1 101 1 0 0
>>>>>  2 102 0 1 0
>>>>>  3 103 0 0 2
>>>>>
>>>>>  Could you tell me how to add rows for 100, 104, and 105 to the converted
>>>>> result?
>>>>>
>>>>>
>>>>>  Regards.
>>>>>
>>>>>          [[alternative HTML version deleted]]
>>>>>
>>>>> ________________________________
>>>>>
>>>>>  R-help at r-project.org mailing
>>>>> list -- To UNSUBSCRIBE and more, see
>>>>>  https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>  PLEASE do read the posting guide
>>>>>  http://www.R-project.org/posting-guide.html
>>>>>  and provide commented, minimal, self-contained, reproducible code.



-- 
Hiroyuki Sato
