[R] Unique in discerning missing values NA
Pancho Mulongeni
p.mulongeni at namibia.pharmaccess.org
Fri Jul 5 12:40:34 CEST 2013
Yes, thanks, this is what I ended up doing, but I thought there would be an 'internal' way to disregard NAs in unique.
Thanks for the tip on dput.
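As far as I know, unique() has no argument that drops NAs, so the closest thing to an 'internal' way is a small wrapper. The sketch below is only an illustration: the name unique_no_na is made up here and is not part of base R, and it filters the NAs out before calling unique().

```r
# Hypothetical helper (not part of base R): drop NAs, then deduplicate
unique_no_na <- function(x) {
  unique(x[!is.na(x)])
}

# Same fake data as in the reply quoted below
x1 <- c(1:3, NA, 4, NA, 2:9)
res <- unique_no_na(x1)
res
```

Filtering before or after unique() gives the same values; filtering first just means unique() never sees an NA.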
-----Original Message-----
From: Rui Barradas [mailto:ruipbarradas at sapo.pt]
Sent: 05 July 2013 11:39
To: Pancho Mulongeni
Cc: r-help at r-project.org
Subject: Re: [R] Unique in discerning missing values NA
Hello,
Your data example is difficult to read into an R session. Next time, post the output of ?dput. Like this:
dput(menPatients[1:40, 1]) # post the output of this
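To show why dput output is easier to work with, here is a small sketch. The vector ids is invented stand-in data, not taken from your file. dput() prints an R expression that rebuilds the object, so anyone can paste it straight into a session.

```r
# Invented stand-in for a few patient IDs (an assumption, not the real data)
ids <- factor(c("AFR0003/001", NA, "AIS0492/001", "AFR0003/001"))

# dput() prints an R expression that recreates the object
dput(ids)

# Round trip: capture that text, parse it, and rebuild the object
txt <- capture.output(dput(ids))
restored <- eval(parse(text = txt))
identical(ids, restored)
```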
The help page for unique says that "Missing values are regarded as equal" so you should expect one NA to still be present in the final result.
If you want to remove NAs, use ?is.na. With fake data,
x1 <- c(1:3, NA, 4, NA, 2:9)
x2 <- unique(x1)
x3 <- x2[!is.na(x2)]
x3
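Your column is a factor, not a numeric vector, so here is the same idea as a sketch on a factor. The IDs are a few values copied from your post, and the object names are made up for the illustration. The droplevels() step is optional; it only discards levels that no longer occur.

```r
# A small factor with duplicates and NAs (values borrowed from the post)
all_ids <- factor(c("AFR0003/001", "AFR0003/001", NA,
                    "AIS0492/001", NA, "AMA0011/001"))
test_ids <- all_ids[1:5]      # subsetting a factor keeps all its levels

u <- unique(test_ids)         # duplicates removed, a single NA remains
u_clean <- u[!is.na(u)]       # remove that remaining NA

# Optional: discard levels that are no longer used
u_final <- droplevels(u_clean)
as.character(u_final)
```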
Hope this helps,
Rui Barradas
Em 05-07-2013 10:28, Pancho Mulongeni escreveu:
> Hi,
> I am trying to remove duplicate Patient numbers in a clinical record,
> I used unique.
> menPatients[1:40,1]
> [1] abr1160(C)/001 ABR1363(A)/001 ABR1363(A)/001 ABR1363(A)/001 abr1772(B)/001
> [6] AFR0003/001 AFR0003/001 afr0290(C)/001 afr1861(B)/001 Aga0007/001
> [11] AGA1548(A)/001 AGA1548(A)/001 AGA1548(A)/001 AGU1680(A)/001 AGU1680(A)/001
> [16] AIS0492/001 AIS0492/001 AKO4268(C)/001 AKO4268(C)/001 AKT0042(B)/001
> [21] AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001
> [26] AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001 ALF1735(A)/001
> [31] ALF1735(A)/001 ALP4321(C)/001 <NA> <NA> ALU4262(B)/001
> [36] ALV4286(C)/001 ALW2579(C)/001 <NA> ALW4330(B)/001 AMA0011/001
> 3886 Levels: 0750/002 0751/001 0984/002 ABE2560(C)/001 ... zul1737(B)/001
>
> testData<-menPatients[1:40,1]
>
> I then used unique, please note the NAs at positions 33, 34 and 38 in testData
> testUnique<-unique(testData)
> testUnique
> [1] abr1160(C)/001 ABR1363(A)/001 abr1772(B)/001 AFR0003/001 afr0290(C)/001
> [6] afr1861(B)/001 Aga0007/001 AGA1548(A)/001 AGU1680(A)/001 AIS0492/001
> [11] AKO4268(C)/001 AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001
> [16] ALF1735(A)/001 ALP4321(C)/001 <NA> ALU4262(B)/001 ALV4286(C)/001
> [21] ALW2579(C)/001 ALW4330(B)/001 AMA0011/001
>
> The missing value NA, originally at positions 33, 34 and 38 in testData, is still there, now at position 18. Why is this? How can I prevent this?
> I tried using incomparables=c(NA), but this did not work.
>
> Thanks
>
>
> Pancho Mulongeni
> Research Assistant
> PharmAccess Foundation
> 1 Fouché Street
> Windhoek West
> Windhoek
> Namibia
>
> Tel: +264 61 419 000
> Fax: +264 61 419 001/2
> Mob: +264 81 4456 286
>