[R] Help with transpose please.

Bill Poling Bill.Poling at zelis.com
Tue Jun 26 14:15:30 CEST 2018


Yep, thanks Dan, that’s got it. Thank you to everyone who responded as well.

WHP

  ClaimServiceID  ClaimID    Dx1 Dx10 Dx11 Dx12   Dx2  Dx3  Dx4   Dx5   Dx6   Dx7  Dx8  Dx9
1      183056004 78044473   C562 <NA> <NA> <NA>  C778 C784 C786 C7961 C7982 C7989 <NA> <NA>
2      183056008 78044473   C562 <NA> <NA> <NA>  C778 C784 C786 C7961 C7982 C7989 <NA> <NA>
3      183139945 78078925 M79606 <NA> <NA> <NA> M7989 R600 <NA>  <NA>  <NA>  <NA> <NA> <NA>
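
The Dx columns in this output come out in text order (Dx1, Dx10, Dx11, Dx12, Dx2, ...), presumably because the names are sorted as character strings. If numeric order is wanted, something like the following might work. It is only a sketch that operates on the column names shown above; `wide` in the last comment is a hypothetical name for the data frame holding the result.

```r
# Column names in the order they appear in the output above
cols <- c("ClaimServiceID", "ClaimID", "Dx1", "Dx10", "Dx11", "Dx12",
          "Dx2", "Dx3", "Dx4", "Dx5", "Dx6", "Dx7", "Dx8", "Dx9")

# Pick out the diagnosis columns and sort them by their numeric suffix
dx <- grep("^Dx[0-9]+$", cols, value = TRUE)
dx <- dx[order(as.integer(sub("^Dx", "", dx)))]

# Identifier columns first, then Dx1 ... Dx12
ord <- c(setdiff(cols, dx), dx)
ord

# With the result in a data frame called `wide` (hypothetical name):
# wide <- wide[, ord]
```

Sorting on the integer suffix rather than on the name itself is what avoids Dx10 landing between Dx1 and Dx2.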

From: Daniel Nordlund [mailto:djnordlund using gmail.com]
Sent: Saturday, June 23, 2018 6:04 AM
To: Bill Poling <Bill.Poling using zelis.com>; r-help (r-help using r-project.org) <r-help using r-project.org>
Subject: Re: [R] Help with transpose please.

On 6/22/2018 4:43 AM, Bill Poling wrote:
> Good morning.
>
>
> I have data in the form:
>
> head(Edit041IA, n=25)
> ClaimServiceID ClaimID DiagnosisCode
> 1 183056004 78044473 C562
> 2 183056004 78044473 C778
> 3 183056004 78044473 C784
> 4 183056004 78044473 C786
> 5 183056004 78044473 C7961
> 6 183056004 78044473 C7982
> 7 183056004 78044473 C7989
> 8 183056008 78044473 C562
> 9 183056008 78044473 C778
> 10 183056008 78044473 C784
> 11 183056008 78044473 C786
> 12 183056008 78044473 C7961
> 13 183056008 78044473 C7982
> 14 183056008 78044473 C7989
> 15 183139945 78078925 M79606
> 16 183139945 78078925 M7989
> 17 183139945 78078925 R600
> 18 183236728 78119632 H02831
> 19 183236728 78119632 H02832
> 20 183236728 78119632 H02834
> 21 183236728 78119632 H02835
> 22 183236728 78119632 H04123
> 23 183236728 78119632 Z411
> 24 183236728 78119632 H2513
> 25 183236728 78119632 H43813
>
> I wish to transpose this to a single record per ClaimServiceID and ClaimID, with columns Dx1, Dx2, Dx3, etc.:
>
> There can be multiple ClaimServiceIDs for a ClaimID, so I want the unique ClaimServiceID as the identifier when I join this data back into a longer single record length file by that column.
>
> claimServiceID ClaimID Dx1 Dx2 Dx3 ...etc
> 1 183056004 78044473 C562 C778 C784 C786 C7961 ...etc
> 2 183056008 78044473 C562 C778 C784 C786 C7961 ...etc
>
>
> (If you would prefer the complete dput of the 1272 records I will gladly provide .Names = c("ClaimServiceID",
> "ClaimID", "DiagnosisCode"), class = "data.frame", row.names = c(NA, -1272L))
>
>
>
> At the moment the classes are:
>
> classes <- as.character(sapply(Edit041IA, class))
>
> classes
>
> # [1] "integer" "integer" "character" <--- but they do not have to be, if that helps; it's just how the csv load pulled them in
>
> The max number of columns based on this transpose of the DiagnosisCode column (in this dataset) is 12 if that is important to know.
>
> I have looked at a variety of webpages and cannot get this right,
>
> dta2 <- melt(dta1, id=c("ClaimServiceID","ClaimID"))
> View(dta2)
> # https://www.r-bloggers.com/pivot-tables-in-r/
>
> # https://stackoverflow.com/questions/18449938/pivot-on-data-table-similar-to-rehape-melt-function
>
>
> dta3 <- cast(Edit041IA, ClaimServiceID ~ DiagnosisCode, ClaimID)
> View(dta3)
> dta3 <- cast(Edit041IA, DiagnosisCode ~ ClaimServiceID, ClaimID)
> View(dta3)
>
> dta3 <- melt(Edit041IA, id=c("ClaimServiceID"))
> View(dta3)
>
> dta3 <- aggregate(Edit041IA, by=list(ClaimServiceID, ClaimID, DiagnosisCode))
> View(dta3)
>
>
> dta3 <- aggregate(Edit041IA, by=list(ClaimServiceID))
> View(dta3)
> # https://www.r-statistics.com/tag/transpose/
>
> dta3 <- aggregate(Edit041IA, by=list(DiagnosisCode, ClaimServiceID, ClaimID))
> View(dta3)
>
>
> I am sure it's a basic, simple procedure, but I am pressed for time on this one. Any support will be greatly appreciated. Thank you.
>
> WHP

Bill,

You have received some good suggestions, and since you are pressed for
time this may be too late. However, here is a solution using the ave()
function and cast() from the reshape package.

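library(reshape)   # provides cast()
have <- Edit041IA  # assumption: 'have' is the long-format data frame from the original post
# create diagnosis variable names (Dx1, Dx2, ...), numbered within each ClaimServiceID/ClaimID pair
dxnames <- paste('Dx', ave(rep(1, nrow(have)), have[,1:2], FUN = seq_along), sep='')
# cast the data into wide format: one row per ClaimServiceID/ClaimID, one column per Dx name
cast(cbind(have, dxnames), ClaimServiceID + ClaimID ~ dxnames, value='DiagnosisCode')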
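
For anyone who wants to try the idea without the reshape package, the same two steps (number the diagnoses within each claim service with ave(), then spread them into columns) can be sketched with base R's reshape(). This is a swapped-in technique, not the cast() call above, and the small data frame below is made up from the first few rows of the original post purely for illustration.

```r
# Toy long-format data mirroring a few rows of the original post (illustrative only)
have <- data.frame(
  ClaimServiceID = c(183056004L, 183056004L, 183056008L, 183056008L, 183139945L),
  ClaimID        = c(78044473L, 78044473L, 78044473L, 78044473L, 78078925L),
  DiagnosisCode  = c("C562", "C778", "C562", "C778", "M79606"),
  stringsAsFactors = FALSE
)

# Number the diagnoses 1, 2, ... within each ClaimServiceID/ClaimID pair
have$dx <- ave(seq_len(nrow(have)), have$ClaimServiceID, have$ClaimID,
               FUN = seq_along)

# Base-R reshape(): one row per id pair, one DiagnosisCode column per dx value
wide <- reshape(have, idvar = c("ClaimServiceID", "ClaimID"),
                timevar = "dx", direction = "wide", sep = "")

# Rename DiagnosisCode1, DiagnosisCode2, ... to Dx1, Dx2, ...
names(wide) <- sub("^DiagnosisCode\\.?", "Dx", names(wide))
wide
```

Claim services with fewer diagnoses than the maximum get NA in the unused Dx columns, as in the cast() result.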


Hope this is helpful,

Dan

--
Daniel Nordlund
Port Townsend, WA USA
