[R] Unique in discerning missing values NA

Rui Barradas ruipbarradas at sapo.pt
Fri Jul 5 12:38:46 CEST 2013


Hello,

Your data example is difficult to read into an R session. Next time, 
post the output of dput() (see ?dput). Like this:

dput(menPatients[1:40, 1])  # post the output of this
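
To illustrate why dput() output is easy for others to use, here is a small
sketch. The vector `ids` is made up for illustration (it is not your data),
and the round trip through capture.output() and parse() only mimics what a
reader does when pasting the posted text into a session.

```r
# Hypothetical stand-in for a small piece of a patient-number column
ids <- factor(c("ABR1363(A)/001", "ABR1363(A)/001", NA, "AFR0003/001", NA))

# dput() prints an R expression that rebuilds the object exactly
dput(ids)

# Mimic copy and paste: capture the printed text, then parse and evaluate it
txt <- capture.output(dput(ids))
rebuilt <- eval(parse(text = txt))

identical(ids, rebuilt)
```

Since your column is a factor with several thousand levels, wrapping the
subset in droplevels() before calling dput() would probably keep the posted
output short.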


The help page for unique says that "Missing values are regarded as 
equal", so you should expect one NA to still be present in the final result.
If you want to remove NAs, use is.na() (see ?is.na). With fake data,

x1 <- c(1:3, NA, 4, NA, 2:9)
x2 <- unique(x1)
x3 <- x2[!is.na(x2)]
x3
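
The same idea should carry over to a factor such as your patient numbers.
The sketch below uses an invented factor `ids` (an assumption, not your
data). It also shows what I believe happened with incomparables: according
to ?duplicated, values listed there are never marked as duplicated, so every
NA is kept rather than removed.

```r
# Hypothetical factor with one unused level, standing in for the real column
ids <- factor(
  c("ABR1363(A)/001", "ABR1363(A)/001", NA, "AFR0003/001", NA, "AKT0042(B)/001"),
  levels = c("ABR1363(A)/001", "AFR0003/001", "AKT0042(B)/001", "zul1737(B)/001")
)

u <- unique(ids)                     # duplicates collapse, a single NA survives
clean <- droplevels(u[!is.na(u)])    # drop that NA, then drop unused levels
clean

# incomparables = NA keeps all NAs, because they are never treated as duplicates
kept <- unique(c(1, NA, NA, 2), incomparables = NA)
kept
```

Note that unique() on a factor keeps all the original levels, which is why
droplevels() is applied at the end.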


Hope this helps,

Rui Barradas


On 05-07-2013 10:28, Pancho Mulongeni wrote:
> Hi,
> I am trying to remove duplicate patient numbers in a clinical record, so I used unique. Here is the data:
> menPatients[1:40,1]
>   [1] abr1160(C)/001 ABR1363(A)/001 ABR1363(A)/001 ABR1363(A)/001 abr1772(B)/001
>   [6] AFR0003/001    AFR0003/001    afr0290(C)/001 afr1861(B)/001 Aga0007/001
> [11] AGA1548(A)/001 AGA1548(A)/001 AGA1548(A)/001 AGU1680(A)/001 AGU1680(A)/001
> [16] AIS0492/001    AIS0492/001    AKO4268(C)/001 AKO4268(C)/001 AKT0042(B)/001
> [21] AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001 AKT0042(B)/001
> [26] AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001 ALF1735(A)/001
> [31] ALF1735(A)/001 ALP4321(C)/001 <NA>           <NA>           ALU4262(B)/001
> [36] ALV4286(C)/001 ALW2579(C)/001 <NA>           ALW4330(B)/001 AMA0011/001
> 3886 Levels: 0750/002 0751/001 0984/002 ABE2560(C)/001 ... zul1737(B)/001
>
> testData<-menPatients[1:40,1]
>
> I then used unique. Please note the NAs at positions 33, 34 and 38 in testData
> testUnique<-unique(testData)
> testUnique
>   [1] abr1160(C)/001 ABR1363(A)/001 abr1772(B)/001 AFR0003/001    afr0290(C)/001
>   [6] afr1861(B)/001 Aga0007/001    AGA1548(A)/001 AGU1680(A)/001 AIS0492/001
> [11] AKO4268(C)/001 AKT0042(B)/001 alb4423(C)/001 ALF1651(A)/001 alf1722(B)/001
> [16] ALF1735(A)/001 ALP4321(C)/001 <NA>           ALU4262(B)/001 ALV4286(C)/001
> [21] ALW2579(C)/001 ALW4330(B)/001 AMA0011/001
>
> The missing value NA, first seen at position 33 in testData, is still there; it is now at position 18. Why is this? How can I prevent this?
> I tried using incomparables=c(NA), but this did not work.
>
> Thanks
>
>
> Pancho Mulongeni
> Research Assistant
> PharmAccess Foundation
> 1 Fouché Street
> Windhoek West
> Windhoek
> Namibia
>
> Tel:   +264 61 419 000
> Fax:  +264 61 419 001/2
> Mob: +264 81 4456 286
>