[R] Problem with comparing multiple data sets

Mohammad Alimohammadi mxalimohamma at ualr.edu
Wed May 27 16:37:24 CEST 2015


Thanks John,

I really hope it can be answered. Yes, all 3 data sets have the same items.
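
For anyone following the thread, the task described in the quoted messages below (take the three classifications of each row and keep the most frequent one) could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for the full data: `most_frequent` is a hypothetical helper name, the `toy` data frame is made up for illustration, and ties are assumed to go to the smallest class value.

```r
# Hypothetical helper: return the most frequent value in an integer vector.
# table() sorts by value and which.max() takes the first maximum,
# so ties are resolved in favour of the smallest class value (an assumption).
most_frequent <- function(x) {
  counts <- table(x)
  as.integer(names(counts)[which.max(counts)])
}

# Made-up toy data shaped like the combined data frame in this thread
toy <- data.frame(
  class.1 = c(0L, 0L, 2L, 1L),
  class.2 = c(2L, 0L, 2L, 1L),
  class.3 = c(0L, 0L, 1L, 0L),
  terms   = c("#dac", "#dac", "#mac,#security", "#mac,#security"),
  stringsAsFactors = FALSE
)

# Row-wise majority vote over the three class columns
class_cols <- c("class.1", "class.2", "class.3")
toy$majority <- apply(toy[class_cols], 1, most_frequent)

toy$majority  # 0 0 2 1
```

If the three classifications live in separate data frames (dat1, dat2, dat3) with rows in the same order, they would first have to be put side by side, for example with data.frame(class.1 = dat1$class.1, class.2 = dat2$class.2, class.3 = dat3$class.3, terms = dat1$terms). That the rows line up is an assumption that should be checked before trusting the result.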

On Wed, May 27, 2015 at 9:32 AM, John Kane <jrkrideau at inbox.com> wrote:

> Thanks Mohammad.
> The data appear to have come through just fine. This probably means you
> can ignore some of the questions I just sent you -- our emails are crossing.
>
> I probably will not get a chance  to look at this til this afternoon
> (10:25 here now). We can hope someone with more skill than I have will have
> solved the problem by then.
>
> This is starting to sound a bit like a psychometric inter-rater
> reliability study.  Does each data set contain the same set of items ?
>
>
> John Kane
> Kingston ON Canada
>
> -----Original Message-----
> From: mxalimohamma at ualr.edu
> Sent: Wed, 27 May 2015 09:18:12 -0500
> To: jrkrideau at inbox.com, r-help at r-project.org
> Subject: Re: [R] Problem with comparing multiple data sets
>
> Hi John,
>
> I created the original data set with dput(). This time I only loaded 50
> values for each data set (dat1, dat2, dat3).
>
> About your question, 0, 1 and 2 each indicate a specific class. The
> task is to compare 3 independent classifications of a certain term and
> determine the actual class of the term by finding the most frequently
> assigned number for that term.
>
> I thought it might be easier to combine them into 1 data frame but either
> way is fine.
>
> Let me know if it shows up clean. I saved the dput in txt file and copied
> here from that file. I assume this is the right way to do it. I might be
> wrong.
>
> ==============================================
>
> dat1
>
> structure(list(class.1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 1L, 1L,
> 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L), terms = structure(c(1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 2L, 2L, 2L),
> .Label = c("#dac", "#mac,#security", "accountability,anonymous",
> "data security,encryption,security"), class = "factor")),
> .Names = c("class.1", "terms"), class = "data.frame",
> row.names = c(NA, -49L))
>
> dat2
>
> structure(list(class.2 = c(2L, 2L, 2L, 2L, 0L, 0L, 2L, 0L, 0L,
> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 0L, 2L, 2L, 2L, 1L, 1L, 2L,
> 2L, 0L, 0L, 0L, 0L, 1L, 1L, 1L), terms = structure(c(1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 2L, 2L, 2L),
> .Label = c("#dac", "#mac,#security", "accountability,anonymous",
> "data security,encryption,security"), class = "factor")),
> .Names = c("class.2", "terms"), class = "data.frame",
> row.names = c(NA, -49L))
>
> dat3
>
> structure(list(class.3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 1L, 1L,
> 1L, 0L, 0L, 0L, 0L, 2L, 1L, 2L), terms = structure(c(1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 2L, 2L, 2L),
> .Label = c("#dac", "#mac,#security", "accountability,anonymous",
> "data security,encryption,security"), class = "factor")),
> .Names = c("class.3", "terms"), class = "data.frame",
> row.names = c(NA, -49L))
>
> =============================================
>
> On Wed, May 27, 2015 at 8:05 AM, John Kane <jrkrideau at inbox.com> wrote:
>
>         Hi Mohammad,
>
>  I went back and reread your original statement of the problem above and I
> think I kinda grasp it. It is actually quite clear and I misunderstood it
> completely.
>
>  At the moment I have no idea how to approach it.  As Jim Lemon said, it
> looks easy but may not be.  I'll go back and re-examine Jim's approach.
>
>  You might want to create three sample data sets of the original data
> layouts and upload them, in dput() format, to the list.  It may be easier
> to tackle from that approach.
>
>  In any case, in the existing data set is a 2 a numeric value 2 or just an
> on/off indicator?
>
>  John Kane
>  Kingston ON Canada
>
>  > -----Original Message-----
>  > From: mxalimohamma at ualr.edu
>  > Sent: Tue, 26 May 2015 20:11:08 -0500
>  > To: r-help at r-project.org
>  > Subject: Re: [R] Problem with comparing multiple data sets
>  >
>  > Thank you John. Yes, as you mentioned, this is not really what I am
>  > looking for.
>  >
>  > It's interesting because I was really thinking that it should be pretty
>  > easy. All I need to do is just compare class1, class2 and class3 for each
>  > text and put the most frequent number next to it in each row. Repeat it
>  > for all the rows. Apparently it's not that simple.
>  >
>  > Sorry I didn't notice that I sent it only to you! Thanks for letting me
>  > know.
>  >
>  > I would appreciate it if anybody can help on this.
>  >
>  > Thank you.
>  >
>  >
>  >
>  >
>  > On Tue, May 26, 2015 at 7:27 PM, John Kane <jrkrideau at inbox.com> wrote:
>  >
>  >> Hi Mohammad,
>  >>
>  >> The data came through beautifully despite the fact that you posted in
>  >> HTML.  Please, post in plain text.
>  >>
>  >> Oh, just as I was ready to push Send, I noticed you only replied to me.
>  >> You really should reply to the R-help list since there are a lot more
>  >> and better people to help there. Besides it's a world-wide list. Others
>  >> can play with the problem while we sleep :) .
>  >>
>  >> I will just reply to you but I really suggest sending all of this to the
>  >> list.
>  >>
>  >> Now I am wondering what to do with the data. As a first swipe I just
>  >> added up all the values in each class by each text value. Results are
>  >> below. Not what you want by any means but perhaps a small step.
>  >>
>  >> Then I started to think are we really interested in the sum or should we
>  >> be looking at incidence, that is should we be looking at the frequency
>  >> rather than the sum?
>  >>
>  >> Is
>  >> class.1  class.2  class.3  terms
>  >>   0        2        0      #dac
>  >>
>  >> a value of 2 (sum) or a hit of 1 (count or freq) ?
>  >>
>  >> Anyway below is what I have tried so far -- it may not be anywhere near
>  >> what you want but if it makes any sense then I think we just need to
>  >> pick off the highest values for each combination of terms and class to
>  >> give you what you want.
>  >>
>  >> I suspect our real data-munging gurus can do all this faster and better
>  >> than I can but hopefully it is a start.
>  >>
>  >> Where your data set is dat1
>  >> #=====================================
>  >> # If reshape2 is not installed.
>  >> install.packages("reshape2")
>  >> #=====================================
>  >>
>  >> library(reshape2)
>  >>  mdat  <-  melt(dat1, id.vars= c("terms"),
>  >>        variable.name = "class",
>  >>        value.name = "value",
>  >>        na.rm = FALSE)
>  >>
>  >> mdat1  <-  aggregate(value ~ terms + class, data = mdat, sum)
>  >>
>  >> mdat1[order(mdat1$terms, mdat1$class), ]
>  >>
>  >> #=====================================
>  >>
>  >>
>  >> John Kane
>  >> Kingston ON Canada
>  >>
>  >> -----Original Message-----
>  >> From: mxalimohamma at ualr.edu
>  >> Sent: Tue, 26 May 2015 09:50:43 -0500
>  >> To: jrkrideau at inbox.com
>  >> Subject: Re: [R] Problem with comparing multiple data sets
>  >>
>  >> Thank you John for being patient with me.
>  >>
>  >> My original post was to compare 3 sets of data which differed in
>  >> their class value for the same text. However, I thought it might be
>  >> easier to combine those 3 data sets into one that shows the 3 different
>  >> classes and then find the most frequent class value for the text. So
>  >> that's what I did. Now I only want to add the most frequent class value
>  >> in a new column.
>  >>
>  >> I tried to create a dput version of the data set (only a small part of
>  >> it) so you can see. I hope it works.
>  >>
>  >>> Tweet1<- read.csv(file="part1_complete.csv",head=TRUE,sep= ",")
>  >>
>  >>> dput(head(Tweet1, 100))
>  >>
>  >> structure(list(class.1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>  >>
>  >> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>  >>
>  >> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 1L, 1L,
>  >>
>  >> 1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 0L, 1L, 2L, 2L, 2L, 1L, 1L, 1L,
>  >>
>  >> 1L, 2L, 1L, 1L, 1L, 0L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>  >>
>  >> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>  >>
>  >> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L), class.2 = c(2L,
>  >>
>  >> 2L, 2L, 2L, 0L, 0L, 2L, 0L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>  >>
>  >> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>  >>
>  >> 2L, 0L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 0L, 0L, 0L, 0L, 1L, 1L, 1L,
>  >>
>  >> 0L, 1L, 1L, 1L, 0L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L,
>  >>
>  >> 1L, 0L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 2L, 2L, 1L, 1L, 1L, 1L, 1L,
>  >>
>  >> 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>  >>
>  >> 1L, 1L, 1L), class.3 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>  >>
>  >> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
>  >>
>  >> 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 2L, 1L, 1L, 1L,
>  >>
>  >> 1L, 0L, 0L, 0L, 0L, 2L, 1L, 2L, 0L, 2L, 2L, 0L, 2L, 1L, 1L, 1L,
>  >>
>  >> 1L, 0L, 0L, 0L, 2L, 1L, 0L, 0L, 1L, 0L, 0L, 2L, 2L, 2L, 2L, 2L,
>  >>
>  >> 0L, 2L, 2L, 1L, 0L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
>  >>
>  >> 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L), terms = structure(c(9L,
>  >>
>  >> 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
>  >>
>  >> 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L,
>  >>
>  >> 9L, 9L, 9L, 9L, 69L, 69L, 69L, 69L, 69L, 40L, 40L, 40L, 40L,
>  >>
>  >> 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 98L, 98L, 98L, 98L, 98L,
>  >>
>  >> 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 98L, 23L, 87L, 87L, 87L,
>  >>
>  >> 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L,
>  >>
>  >> 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L, 87L,
>  >>
>  >> 87L, 87L), .Label = c("#accountability",
>  >> "#accountability,#anonymity,anonymity",
>  >>
>  >> "#accountability,recovery", "#anonymity,anonymity",
>  >> "#anonymous,anonymous",
>  >>
>  >> "#attacker,security", "#authentication,access control",
> "#confidential",
>  >>
>  >> "#dac", "#encryption,#privacy,#security", "#identifier",
>  >> "#identifier,identifier",
>  >>
>  >> "#intrusion,#security,security", "#mac", "#mac,#security",
>  >> "#mac,password",
>  >>
>  >> "#mac,security", "#password,privacy", "#password,security",
>  >> "#prevention,prevention",
>  >>
>  >> "#privacy,#security,password", "#privacy,identifiable",
>  >> "#privacy,information privacy,privacy",
>  >>
>  >> "#privacy,intrusion", "#privacy,location privacy,privacy",
>  >> "#privacy,password,security",
>  >>
>  >> "#privacy,personal data", "#privacy,personal information,privacy",
>  >>
>  >> "#privacy,security", "#pseudonym", "#pseudonymity",
>  >> "#security,authentication,identity management",
>  >>
>  >> "#security,identity management,security", "#security,mac,security",
>  >>
>  >> "#security,malicious,security", "#security,personal information",
>  >>
>  >> "#security,retention", "#token", "#token,token",
>  >> "accountability,anonymous",
>  >>
>  >> "accountability,audit trail", "accountability,confidential",
>  >>
>  >> "accountability,security", "accountability,token", "adversary,pin",
>  >>
>  >> "anonymity,authentication", "anonymity,security",
>  >> "anonymous,disclosure",
>  >>
>  >> "anonymous,password", "authentication,password,security",
>  >> "authorization,mac",
>  >>
>  >> "authorization,permission", "confidential,disclosure",
>  >> "confidential,disclosure,security",
>  >>
>  >> "confidential,mac", "confidential,personal information",
>  >> "confidential,pin",
>  >>
>  >> "confidential,privilege", "confidentiality,security", "consent",
>  >>
>  >> "dac", "dac,pcm", "data aggregation,privacy", "data controller",
>  >>
>  >> "data protection,encryption", "data protection,recovery", "data
>  >> protection,security",
>  >>
>  >> "data quality,security", "data security,encryption,security",
>  >>
>  >> "data security,mac,security", "data security,personal data,security",
>  >>
>  >> "data security,prevention,security", "detection", "detection,mac",
>  >>
>  >> "detection,password", "deterrence,prevention", "digital signature",
>  >>
>  >> "disclosure,password", "disclosure,private information",
>  >> "disclosure,security",
>  >>
>  >> "encryption,password,recovery", "encryption,private data", "id
>  >> management,privacy",
>  >>
>  >> "id management,security", "identifier", "identifier,token", "location
>  >> privacy,privacy",
>  >>
>  >> "mac,password,security", "mac,permission", "mac,prevention",
>  >>
>  >> "mac,privacy", "mac,pseudonym", "malicious,prevention",
>  >> "non-repudiation",
>  >>
>  >> "password,prevention,security", "password,private information",
>  >>
>  >> "password,recovery", "password,user id", "permission,personal data",
>  >>
>  >> "permission,privacy,privacy policy", "personal data", "personal
>  >> identification number,pin",
>  >>
>  >> "personal information", "personal information,security", "prevention",
>  >>
>  >> "prevention,privilege", "privacy,privacy policy", "privacy,privacy
>  >> preferences",
>  >>
>  >> "private information,security", "recovery,retention", "recovery,token",
>  >>
>  >> "retention,token", "sensitive data", "token"), class = "factor")),
>  >> .Names
>  >> = c("class.1",
>  >>
>  >> "class.2", "class.3", "terms"), row.names = c(NA, 100L), class =
>  >> "data.frame")
>  >>
>  >> On Mon, May 25, 2015 at 2:04 PM, John Kane <jrkrideau at inbox.com> wrote:
>  >>
>  >>         Hi Mohammad,
>  >>
>  >>  If you are just starting with R a sense of total confusion is often the
>  >> first feeling.  Welcome :).
>  >>
>  >>  If you are a SAS or SPSS user this may help
>  >>
>  >> https://science.nature.nps.gov/im/datamgmt/statistics/r/documents/r_for_sas_spss_users.pdf
>  >>
>  >>  If anything,  I am even more lost than before.
>  >>
>  >>  Did Jim Lemon's approach help? Confuse ?
>  >>
>  >>  Perhaps one of the problems is that the data did not come through
>  >> cleanly.  You posted in HTML and the R-help list strips out all HTML so
>  >> the result often is mangled beyond any real use.
>  >>
>  >>  I may have imagined that your data are more complicated than they
>  >> really are if all you really want is some kind of frequency count,
>  >> possibly by some conditioning variable. Is this it?
>  >>
>  >>   It seems too simple but that is what I read that Excel is doing (as
>  >> incompetently as usual; I had not realised it was possible to be even
>  >> less impressed with Excel than I already was).
>  >>
>  >>  Can you send us some more data in dput() format? See the links I
>  >> provided earlier or have a look at ?dput for more information.
>  >>
>  >>  If you have lot of data, a representative sample is fine.  It is often
>  >> enough to do something like :
>  >>  dput(head(mydata, 100))
>  >>  which supplies 100 rows of data.
>  >>
>  >>  Just output the dput() data, copy and paste into your email,  et voilà
>  >> we have the exact same data.
>  >>
>  >>  The reason for dput() is that it provides a snapshot of exactly how the
>  >> data exist on your machine. Given all sorts of differences between
>  >> OS's, personal settings, human languages and so on, what I or another
>  >> R-help reader see or read in may not correspond to what you have. Using
>  >> dput() avoids all of this.
>  >>
>  >>  Here is a simple example of what I mean. If you look at dat1 and dat2
>  >> they 'look' the same but ... I could read in data either way depending
>  >> on all sorts of variables and have no idea which, if either, is how you
>  >> see the data.
>  >>
>  >>   Data are supplied in dput() format, just copy and paste into R.
>  >>  =====
>  >>  dat1  <- structure(list(aa = structure(1:10, .Label = c("1", "2", "3",
>  >>  "4", "5", "6", "7", "8", "9", "10"), class = "factor"), bb = c(10L,
>  >>  9L, 8L, 7L, 6L, 5L, 4L, 3L, 2L, 1L)), .Names = c("aa", "bb"),
>  >>  row.names = c(NA, -10L), class = "data.frame")
>  >>
>  >>  dat2  <-  structure(list(aa = 1:10, bb = c(10L, 9L, 8L, 7L, 6L, 5L, 4L,
>  >>  3L, 2L, 1L)), .Names = c("aa", "bb"), row.names = c(NA, -10L),
>  >>  class = "data.frame")
>  >>
>  >>  dat1
>  >>  dat2  # looks a lot like dat1
>  >>
>  >>  with(dat1, aa*bb)
>  >>  with(dat2 , aa*bb)
>  >>
>  >>  str(dat1)
>  >>  str(dat2)
>  >>
>  >>  =======
>  >>
>  >>  John Kane
>  >>  Kingston ON Canada
>  >>
>  >>  -----Original Message-----
>  >>  From: mxalimohamma at ualr.edu
>  >>  Sent: Mon, 25 May 2015 12:14:46 -0500
>  >>  To: jrkrideau at inbox.com
>  >>  Subject: Re: [R] Problem with comparing multiple data sets
>  >>
>  >>  Hi John.
>  >>
>  >>  Thank you for your response.
>  >>
>  >>  Here is a small portion of my actual data set. What I am supposed to do
>  >> is to use a function similar to the mode function in Excel to find the
>  >> most frequent value (class) for each term.
>  >>
>  >>     V1       V2       V3       V4
>  >>  1  class 1  class 2  class 3  terms
>  >>  2  0        2        0        #dac
>  >>  3  0        2        0        #dac
>  >>  4  0        2        0        #dac
>  >>  5  0        2        0        #dac
>  >>  6  1        0        1        #dac
>  >>  7  0        0        0        #dac
>  >>  ....
>  >>
>  >>  Since I just started using R, I don't know where I am going with this.
>  >> I would appreciate any help.
>  >>
>  >>  On Sat, May 23, 2015 at 8:23 AM, John Kane <jrkrideau at inbox.com> wrote:
>  >>
>  >>          Hi Mohammad
>  >>
>  >>   Welcome to the R-help list.
>  >>
>  >>   There probably is a fairly easy way to do what you want but I think we
>  >> probably need a bit more background information on what you are trying
>  >> to achieve.  I know I'm not exactly clear on your decision rule(s).
>  >>
>  >>   It would also be very useful to see some actual sample data in usable
>  >> R format. Have a look at these links
>  >> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
>  >> and http://adv-r.had.co.nz/Reproducibility.html for some hints on what you
>  >> might want to include in your question.
>  >>
>  >>   In particular, read up about dput()  in those links and/or see ?dput.
>  >> This is the generally preferred way to supply sample or illustrative
>  >> data to the R-help list.  It basically creates a perfect copy of the data
>  >> as it exists on 'your' machine so that R-help readers see exactly what
>  >> you do.
>  >>
>  >>   John Kane
>  >>   Kingston ON Canada
>  >>
>  >>   > -----Original Message-----
>  >>   > From: mxalimohamma at ualr.edu
>  >>   > Sent: Fri, 22 May 2015 12:37:50 -0500
>  >>   > To: r-help at r-project.org
>  >>   > Subject: [R] Problem with comparing multiple data sets
>  >>   >
>  >>   > Hi everyone,
>  >>   >
>  >>   > I am very new to R and I have a task to do. I appreciate any help. I
>  >>   > have 3 data sets. Each data set has 4 columns. For example:
>  >>   >
>  >>   > Class  Comment   Term   Text
>  >>   > 0           com1        aac    text1
>  >>   > 2           com2        aax    text2
>  >>   > 1           com3        vvx    text3
>  >>   >
>  >>   > Now I need to compare the class section between 3 data sets and
>  >>   > assign the most frequent class to that text. For example, if text1 is
>  >>   > assigned to class 0 in data sets 1 & 2 but assigned to class 2 in data
>  >>   > set 3, then it should be assigned to class 0. If they are all the
>  >>   > same, the class will be the same. The ideal thing would be to keep the
>  >>   > same format and just update the class. Is there any easy way to do
>  >>   > this?
>  >>   >
>  >>   > Thanks a lot.
>  >>   >
>  >>
>  >>   >       [[alternative HTML version deleted]]
>  >>   >
>  >>   > ______________________________________________
>  >>   > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>  >>   > https://stat.ethz.ch/mailman/listinfo/r-help
>  >>   > PLEASE do read the posting guide
>  >>   > http://www.R-project.org/posting-guide.html
>  >>   > and provide commented, minimal, self-contained, reproducible code.
>  >>
>  >>
>  >>  --
>  >>
>  >>  Mohammad Alimohammadi | Graduate Assistant
>  >>  University of Arkansas at Little Rock | College of Science
>  >> and Mathematics (CSAM)
>  >>
>  >>  501.346.8007 | mxalimohamma at ualr.edu | ualr.edu
>  >>
>  >>  Public URL: http://scholar.google.com/citations?user=MsfN_i8AAAAJ
>  >>
>  >>
>  >>
>  >>
>  >
>  >
>
>
>
>


-- 
Mohammad Alimohammadi | Graduate Assistant
University of Arkansas at Little Rock | College of Science and Mathematics
(CSAM)
501.346.8007 | mxalimohamma at ualr.edu | ualr.edu

Public URL: http://scholar.google.com/citations?user=MsfN_i8AAAAJ

	[[alternative HTML version deleted]]


