[R-sig-Epi] Identify medicines names

Felipe Barletta
Mon Apr 5 20:25:20 CEST 2021


Hi friends,

I need to identify medicine names in a data set.
I have a list of antibiotic names and I need to find those names in a
sample.

When the name of the medicine is a single word, my solution works:

# List of medicine names to search for, stored in an object called patterns.
patterns <- c("Oritavancina", "Oxacilina", "Pefloxacino", "Penicilina",
              "Pexiganan", "Piperacilina", "Piperacilina-tazobactam",
              "Pirazinamida", "Plazomicina", "Polimixina B",
              "Posilozid")


# Sample data frame in which I need to find the names from the list above.
df <- data.frame(name =
                     c("CLORETO DE POTASSIO DRAGEA 600MG",
                       "CLORETO DE SODIO 0,9% SERINGA PREENCHIDA 5ML",
                       "CLORETO DE SODIO SOLUCAO INJETAVEL 0,9% 10ML",
                       "CODEINA FOSFATO SOLUCAO ORAL 3MGML 10ML ISCMPA @",
                       "CODEINA FOSFATO SOLUCAO ORAL 3MGML 5ML ISCMPA @",
                       "DipiRONA SOLUCAO INJETAVEL 500MGML 2ML",
                       "DipiRONA SOLUCAO INJETAVEL 500MGML 2ML",
                       "FUROSEMIDA SOLUCAO INJETAVEL 10MGML 2ML",
                       "HIDROCORTISONA SUCCINATO SODICO PO LIOFILO INJETAVEL 100MG",
                       "ONDANSETRONA CLORIDRATO SOLUCAO INJETAVEL 2MGML 4ML",
                       "ONDANSETRONA CLORIDRATO SOLUCAO INJETAVEL 2MGML 4ML",
                       "Penicilina G BENZATINA PO LIOFILO INJETAVEL 1200000UI",
                       "Penicilina G BENZATINA PO LIOFILO INJETAVEL 1200000UI",
                       "PIPERACILINA SODICA 4G + TAZOBACTAM SODICA 0,5G PO LIOFILO INJETAVEL"))




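library(fuzzyjoin)

# regex_left_join() expects a data frame on both sides of the join.
results <- regex_left_join(df,
                           data.frame(pattern = patterns),
                           by = c(name = "pattern"),
                           ignore_case = TRUE)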

head(results)

# Alternative: identify the matching rows with grep().
matches <- unlist(sapply(patterns, function(p) grep(p, df$name,
                                                    value = FALSE,
                                                    ignore.case = TRUE)))

# drop = FALSE keeps the result a data frame even with a single column.
anti <- df[matches, , drop = FALSE]

However, it does not work when the name is compound (for example,
"Piperacilina-tazobactam", whose two components appear separated by other
text in the sample).

Can anyone help me with this issue?
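
One possible direction, as a minimal sketch rather than a definitive answer:
split each compound name into its components and require every component to
appear somewhere in the text. This assumes that compound names in the list
always use "-" as the separator and that the order of the components in the
sample does not matter. The helper match_all_parts() and the shortened
patterns and df below are hypothetical and only for illustration.

```r
# Shortened, illustrative versions of the list and the sample.
patterns <- c("Penicilina", "Piperacilina", "Piperacilina-tazobactam")

df <- data.frame(name = c(
  "DipiRONA SOLUCAO INJETAVEL 500MGML 2ML",
  "Penicilina G BENZATINA PO LIOFILO INJETAVEL 1200000UI",
  "PIPERACILINA SODICA 4G + TAZOBACTAM SODICA 0,5G PO LIOFILO INJETAVEL"
))

# Hypothetical helper: TRUE where every "-"-separated component of
# `pattern` occurs in `text`, ignoring case (assumption: "-" marks a
# compound name).
match_all_parts <- function(pattern, text) {
  parts <- strsplit(pattern, "-", fixed = TRUE)[[1]]
  hits <- lapply(parts, function(p) grepl(p, text, ignore.case = TRUE))
  Reduce(`&`, hits)
}

# Logical matrix: one row per sample entry, one column per pattern.
hit_matrix <- sapply(patterns, match_all_parts, text = df$name)

# Keep the rows that match at least one pattern.
anti <- df[rowSums(hit_matrix) > 0, , drop = FALSE]

# For each row, list the patterns that matched.
matched <- apply(hit_matrix, 1,
                 function(r) paste(colnames(hit_matrix)[r], collapse = "; "))
```

With this sketch the piperacillin row is flagged for both "Piperacilina" and
"Piperacilina-tazobactam", so a later step would still have to decide which
of the two labels to keep.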
