[R] compute differences
Petr PIKAL
petr.pikal at precheza.cz
Wed Sep 23 16:58:47 CEST 2009
Hi
You can use outer(). If your data are in a data frame called test, then
DIFF <- as.vector(t(outer(test$val, test$val, "-")))
returns the differences as a vector; you just need to add suitable labels for the rows.
CASE <- as.vector(t(outer(test$ID, test$ID, paste, sep="-")))
data.frame(CASE, DIFF)
will put it together.
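
For illustration, here is a minimal, self-contained sketch of the outer() approach using the five rows from the question. The data frame name test and the columns ID and val follow the post; stringsAsFactors = FALSE and the name result are assumptions added only for this example.

```r
# Example data from the original question (assumed to live in a data frame called test)
test <- data.frame(ID  = c("A", "B", "C", "D", "E"),
                   val = c(0.3, 1.2, 3.4, 2.2, 2.0),
                   stringsAsFactors = FALSE)

# outer() builds the n x n matrix whose [i, j] entry is val[i] - val[j].
# as.vector() reads a matrix column by column, so transposing first
# yields the row-wise order A-A, A-B, A-C, ...
DIFF <- as.vector(t(outer(test$val, test$val, "-")))

# The same trick with paste() produces the matching pair labels
CASE <- as.vector(t(outer(test$ID, test$ID, paste, sep = "-")))

result <- data.frame(CASE, DIFF)
head(result, 5)
```

The first five rows should correspond to the A-A through A-E entries of the table asked for in the question. Without t(), the pairs would come out in the order A-A, B-A, C-A, ... instead.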
Regards
Petr
r-help-bounces at r-project.org wrote on 23.09.2009 16:42:45:
> Alessandro Carletti wrote:
>
> Hi,
>
> I have a problem.
>
> I have a data frame looking like:
>
> ID  val
> A   0.3
> B   1.2
> C   3.4
> D   2.2
> E   2.0
>
> I need to create the following table:
>
> CASE  DIFF
> A-A   0
> A-B   -0.9
> A-C   -3.1
> A-D   -1.9
> A-E   -1.7
> B-A   ...
> B-B   ...
> B-C
> B-D
> B-E
> C-A
> ...
>
> where CASE is the pair of elements considered and DIFF is the
> computed difference between their values.
>
> Could you give me suggestions?
>
> Solution:
>
> Besides the suggestions given by others, you can use the sqldf package
> to do this (leveraging SQL, if you know it). If you join your data
> frame with itself without a join condition, you get the Cartesian
> product of the two data frames, which seems to be exactly what you
> need. A warning is in order: generally, when you join two (or more)
> data frames you do NOT want the Cartesian product but want to join the
> data frames by some key. The solution to your particular problem,
> however, can be implemented easily using the Cartesian product.
>
> mydata <- data.frame(id = rep(c('A','B','C','D','E'), each = 2),
>                      val = sample(1:5, 10, replace = TRUE))
> mydata
>
> library(sqldf)
>
> # Merge the data frame with itself to create a Cartesian product;
> # this is normally NOT what you want.
> # Note: 'case' is a keyword in SQL, so I use cases for the column name.
> # Likewise, diff is a function in R, so I use diffr.
> mydata2 <- sqldf("select a.id as id1, a.val as val1, b.id as id2,
>                   b.val as val2, a.id || ' - ' || b.id as cases,
>                   a.val - b.val as diffr from mydata a, mydata b")
> dim(mydata2)  # check dimensions of the merged dataset
> head(mydata2) # examine the first 6 records
>
> # If you want only the columns cases and diffr, use this SQL code:
> mydata3 <- sqldf("select a.id || ' - ' || b.id as cases,
>                   a.val - b.val as diffr from mydata a, mydata b")
> dim(mydata3)  # check dimensions of the merged dataset
> head(mydata3) # examine the first 6 records
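
The same Cartesian-product idea can be sketched in base R without sqldf, by pairing every row index with every other via expand.grid(). This is only an illustrative alternative, not part of the posted solution: the five-row mydata below uses the values from the question rather than the random sample above, and the names idx and pairs_df are made up for the example.

```r
# Five-row example with the values from the question
# (an assumption; the post above builds mydata from a random sample)
mydata <- data.frame(id  = c("A", "B", "C", "D", "E"),
                     val = c(0.3, 1.2, 3.4, 2.2, 2.0),
                     stringsAsFactors = FALSE)

n <- nrow(mydata)

# expand.grid() varies its first argument fastest, so listing j first
# gives the pairs (i = 1, j = 1..n), (i = 2, j = 1..n), ...
idx <- expand.grid(j = seq_len(n), i = seq_len(n))

# Every row of idx is one pair from the Cartesian product of mydata with itself
pairs_df <- data.frame(
  cases = paste(mydata$id[idx$i], mydata$id[idx$j], sep = " - "),
  diffr = mydata$val[idx$i] - mydata$val[idx$j]
)

dim(pairs_df)   # n * n rows, 2 columns
head(pairs_df)
```

As with the SQL version, the result has one row per ordered pair, including each element paired with itself.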
>
>
>
> Hope this helps.
>
>
>
> Jude
>
> ___________________________________________
> Jude Ryan
> Director, Client Analytical Services
> Strategy & Business Development
> UBS Financial Services Inc.
> 1200 Harbor Boulevard, 4th Floor
> Weehawken, NJ 07086-6791
> Tel. 201-352-1935
> Fax 201-272-2914
> Email: jude.ryan at ubs.com