[R] create function for compare two dataframe.

Hiroyuki Sato hiroysato at gmail.com
Mon Feb 29 12:04:27 CET 2016


Hello Petr.

Thank you for replying.
Your approach is better than mine!

I added one step:

> s.dcast <- dcast(s.m, ID+variable~dat)
> subset(s.dcast,df1!=df2)
   ID variable df1 df2
1 ID1     VAL1   2   0
5 ID2     VAL2   3   2
9 ID3     VAL3   4   2

This output is exactly what I wanted!
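
For reference, the whole sequence can be sketched as one self-contained
script. This is only a sketch under assumptions: the sample tables are
embedded with read.table(text = ...) instead of read from files or the
clipboard, strip.white = TRUE is added to guard against stray spaces,
rbind() is swapped in for merge(..., all=T), and the reshape2 package is
assumed to be installed. The name "diffs" is made up for illustration.

```r
library(reshape2)  # assumed to be installed; provides melt() and dcast()

# Inline copies of the two sample tables from the thread.
# strip.white = TRUE drops stray spaces around the ID values.
s1 <- read.table(text = "ID,VAL1,VAL2,VAL3
ID1,2,2,3
ID2,0,3,3
ID3,0,2,4", header = TRUE, sep = ",", strip.white = TRUE)

s2 <- read.table(text = "ID1,0,2,3
ID2,0,2,3
ID3,0,2,2", header = FALSE, sep = ",", strip.white = TRUE,
                 col.names = names(s1))

# Tag each row with its source table, then stack the two tables.
s1$dat <- "df1"
s2$dat <- "df2"
s <- rbind(s1, s2)

# Long format, then one column per source table, then keep the mismatches.
s.m <- melt(s, id.vars = c("ID", "dat"))
s.dcast <- dcast(s.m, ID + variable ~ dat, value.var = "value")
diffs <- subset(s.dcast, df1 != df2)
print(diffs)
```

Naming id.vars and value.var explicitly avoids relying on the guesses
that melt() and dcast() otherwise make.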


P.S.

When I asked this question, I indented the sample data for readability,
so the extra space was probably introduced there.
When I run the same commands on my actual files, there is no extra space:

> str(s1)
'data.frame': 3 obs. of  4 variables:
 $ ID  : Factor w/ 3 levels "ID1","ID2","ID3": 1 2 3
 $ VAL1: int  2 0 0
 $ VAL2: int  2 3 2
 $ VAL3: int  3 3 4
> str(s2)
'data.frame': 3 obs. of  4 variables:
 $ ID  : Factor w/ 3 levels "ID1","ID2","ID3": 1 2 3
 $ VAL1: int  0 0 0
 $ VAL2: int  2 2 2
 $ VAL3: int  3 3 2
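
If indentation does end up in the data, the labels can be cleaned before
comparing. The sketch below uses hypothetical ID vectors (not the actual
files) to show how leading spaces make otherwise equal labels differ, and
how trimws() from base R (3.2.0 or later) removes them.

```r
# Two ID vectors that differ only by leading spaces (hypothetical values).
id_indented <- c("  ID1", "  ID2", "  ID3")
id_plain    <- c("ID1", "ID2", "ID3")

# As typed, none of the labels match, so they count as different IDs.
matches_raw <- id_indented == id_plain

# trimws() removes leading and trailing whitespace from each string.
matches_trimmed <- trimws(id_indented) == id_plain

print(matches_raw)
print(matches_trimmed)
```

When reading from a file, passing strip.white = TRUE to read.table() has
a similar effect for delimited fields.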

Best regards.


On Mon, Feb 29, 2016 at 18:16, PIKAL Petr <petr.pikal at precheza.cz> wrote:

> Hi
>
> You do not need to create a function; you can use functions that are
> already available.
>
> > s1<- read.table("clipboard", header=T, sep=",")
> > s2<- read.table("clipboard", sep=",")
>
> You presented the second table without column names, so I copied them
> from the first.
> > names(s2) <- names(s1)
>
> > s1
>      ID VAL1 VAL2 VAL3
> 1   ID1    2    2    3
> 2   ID2    0    3    3
> 3   ID3    0    2    4
> > s2
>      ID VAL1 VAL2 VAL3
> 1   ID1    0    2    3
> 2   ID2    0    2    3
> 3   ID3    0    2    2
>
> You need to add a column that identifies the source data frame, and then
> merge the two.
> > s1$dat<-"df1"
> > s2$dat<-"df2"
> > s<-merge(s1,s2, all=T)
>
> Now you need to reshape your data
>
> > library(reshape2)
>
> > s.m<-melt(s)
> Using ID, dat as id variables
> > s.m
>       ID dat variable value
> 1    ID1 df1     VAL1     2
> 2    ID2 df2     VAL1     0
> 3    ID2 df1     VAL1     0
> 4    ID3 df2     VAL1     0
> 5    ID3 df1     VAL1     0
> 6    ID1 df2     VAL1     0
> 7    ID1 df1     VAL2     2
> 8    ID2 df2     VAL2     2
> 9    ID2 df1     VAL2     3
> 10   ID3 df2     VAL2     2
> 11   ID3 df1     VAL2     2
> 12   ID1 df2     VAL2     2
> 13   ID1 df1     VAL3     3
> 14   ID2 df2     VAL3     3
> 15   ID2 df1     VAL3     3
> 16   ID3 df2     VAL3     2
> 17   ID3 df1     VAL3     4
> 18   ID1 df2     VAL3     3
>
> And cast the new structure.
> > dcast(s.m, ID+variable~dat)
>       ID variable df1 df2
> 1    ID1     VAL1   2  NA
> 2    ID1     VAL2   2  NA
> 3    ID1     VAL3   3  NA
> 4    ID2     VAL1   0   0
> 5    ID2     VAL2   3   2
> 6    ID2     VAL3   3   3
> 7    ID3     VAL1   0   0
> 8    ID3     VAL2   2   2
> 9    ID3     VAL3   4   2
> 10   ID1     VAL1  NA   0
> 11   ID1     VAL2  NA   2
> 12   ID1     VAL3  NA   3
>
> This result surprised me at first, but I found the reason. Your ID
> variable has four distinct values rather than three: although ID1 looks
> the same in both tables, one copy carries an extra space, so ID1 in s1
> differs from ID1 in s2.
>
> That is why it is recommended to use dput() for exchanging data with
> others.
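
As an illustration of the dput() advice, here is a small sketch. The data
frame is rebuilt by hand from the values shown in this thread, and the
name s1_copy is made up for the example.

```r
# A small data frame standing in for s1 (values taken from the thread).
s1 <- data.frame(ID   = c("ID1", "ID2", "ID3"),
                 VAL1 = c(2L, 0L, 0L),
                 VAL2 = c(2L, 3L, 2L),
                 VAL3 = c(3L, 3L, 4L))

# dput() prints R code that recreates the object, including any stray
# whitespace inside the ID values, so nothing is lost in an email.
dput(s1)

# Evaluating the printed text rebuilds the object on the reader's side.
dput_text <- paste(capture.output(dput(s1)), collapse = "\n")
s1_copy <- eval(parse(text = dput_text))
```

In practice the output of dput(s1) is simply pasted into the message, and
the reader assigns it to a name of their choice.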
>
> > str(s)
> 'data.frame':   6 obs. of  5 variables:
>  $ ID  : Factor w/ 4 levels "  ID1","  ID2",..: 1 2 2 3 3 4
>  $ VAL1: int  2 0 0 0 0 0
>  $ VAL2: int  2 2 3 2 2 2
>  $ VAL3: int  3 3 3 2 4 3
>  $ dat : chr  "df1" "df2" "df1" "df2" ...
>
> > s1$ID
> [1]   ID1   ID2   ID3
> Levels:   ID1   ID2   ID3
> > s2$ID
> [1] ID1     ID2   ID3
> Levels:   ID2   ID3 ID1
>
> > str(s1)
> 'data.frame':   3 obs. of  5 variables:
>  $ ID  : Factor w/ 3 levels "  ID1","  ID2",..: 1 2 3
>  $ VAL1: int  2 0 0
>  $ VAL2: int  2 3 2
>  $ VAL3: int  3 3 4
>  $ dat : chr  "df1" "df1" "df1"
> > str(s2)
> 'data.frame':   3 obs. of  5 variables:
>  $ ID  : Factor w/ 3 levels "  ID2","  ID3",..: 3 1 2
>  $ VAL1: int  0 0 0
>  $ VAL2: int  2 2 2
>  $ VAL3: int  3 3 2
>  $ dat : chr  "df2" "df2" "df2"
> >
>
> Cheers
> Petr
>
>
> > -----Original Message-----
> > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
> > Hiroyuki Sato
> > Sent: Monday, February 29, 2016 9:44 AM
> > To: r-help at r-project.org
> > Subject: [R] create function for compare two dataframe.
> >
> > Hello
> >
> > I would like to create a function that builds a new data frame
> > comparing the values of two data frames, like this:
> >
> >   No.  COLUMN DF1  DF2
> >   "1"  "VAL1" "2"  "0" # <- compare ID1,VAL1
> >   "2"  "VAL2" "3"  "2" # <- compare ID2,VAL2
> >   "3"  "VAL3" "4"  "2" # <- compare ID3,VAL3
> >
> > s1 <- read.table("sample1.txt",header=T,sep=',')
> > s2 <- read.table("sample2.txt",header=T,sep=',')
> > comp_data(s1,s2)
> >
> > sample1.txt
> >   ID,VAL1,VAL2,VAL3
> >   ID1,2,2,3
> >   ID2,0,3,3
> >   ID3,0,2,4
> >
> > sample2.txt
> >   ID1,0,2,3
> >   ID2,0,2,3
> >   ID3,0,2,2
> >
> > I wrote the function below, but I got the following warnings.
> > Could you tell me how to add new rows to a data frame?
> > Or is there an alternative way?
> >
> >   1: In `[<-.factor`(`*tmp*`, ri, value = "3") :
> >     invalid factor level, NA generated
> >   2: In `[<-.factor`(`*tmp*`, ri, value = "VAL3") :
> >     invalid factor level, NA generated
> >   3: In `[<-.factor`(`*tmp*`, ri, value = "4") :
> >     invalid factor level, NA generated
> >
> >
> >
> >   comp_data <- function(df1,df2) {
> >     #
> >     # create null data.frame
> >     out <- data.frame(matrix(rep(NA,4),nrow=1))[numeric(0), ]
> >     colnames(out) <- c("ID","Site","df1","df2")
> >
> >     # column names
> >     col_names <- colnames(df1)
> >
> >     # col_size
> >     col_size <- ncol(df1)
> >     row_size <- nrow(df1)
> >
> >     for( col in 2:col_size ){
> >       for( row in 1:row_size ){
> >         if( df1[row,col] != df2[row,col] ){
> >           out <- rbind(out,
> >                        c(df1[row,1],col_names[col],df1[row,col],df2[row,col]))
> >         }
> >       }
> >     }
> >     out
> >   }
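
The "invalid factor level" warnings come from appending new strings to
factor columns. One possible base R alternative, given here only as a
sketch, avoids row-by-row rbind() altogether. It assumes that both data
frames have the same shape, the same row and column order, and the
identifier in column 1. The output column names are chosen for
illustration.

```r
# Sketch of a vectorised comp_data(): one output row per differing cell.
comp_data <- function(df1, df2) {
  vals1 <- as.matrix(df1[-1])   # value columns only, identifier dropped
  vals2 <- as.matrix(df2[-1])
  # Row/column positions of every cell that differs (NA cells are skipped).
  idx <- which(vals1 != vals2, arr.ind = TRUE)
  data.frame(ID       = as.character(df1[[1]])[idx[, "row"]],
             variable = colnames(vals1)[idx[, "col"]],
             df1      = vals1[idx],
             df2      = vals2[idx],
             stringsAsFactors = FALSE)  # keep strings as plain character
}

# Sample values from this thread.
s1 <- data.frame(ID = c("ID1", "ID2", "ID3"),
                 VAL1 = c(2L, 0L, 0L), VAL2 = c(2L, 3L, 2L), VAL3 = c(3L, 3L, 4L))
s2 <- data.frame(ID = c("ID1", "ID2", "ID3"),
                 VAL1 = c(0L, 0L, 0L), VAL2 = c(2L, 2L, 2L), VAL3 = c(3L, 3L, 2L))

result <- comp_data(s1, s2)
print(result)
```

Because the comparison is positional, the rows of both tables must
already be in the same ID order; sorting or merging by ID first would be
needed otherwise.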
> >
> > Best regards.
> >
> >       [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-
> > guide.html
> > and provide commented, minimal, self-contained, reproducible code.