[R] convert microns to nm in a messy dataset

peter dalgaard pdalgd sending from gmail.com
Fri May 10 15:29:57 CEST 2019


To go from nm to micron, _divide_ by 1000 (as you likely know).

What are the units of the first value? Looks like micron in your example, but is there a rule?

Basically, it is a "last observation carried forward" type problem, so something like this:


my.data <- structure(list(V1 = c("2019/05/10", "#", "#", "#", "2019/05/10",
"2019/05/10", "2019/05/10", "#", "#", "#", "2019/05/10", "#", "#", "#",
"2019/05/10", "#", "#", "#", "2019/05/10", "2019/05/10"), V19 =
c("0.2012800083", "45", "Sq", "µm", "0.3634383236", "0.4360454777",
"0.3767733568", "45", "Sq", "nm", "102.013048", "45", "Sq", "µm",
"0.1413840498", "45", "Sq", "nm", "65.4459715", "46.45802917")), row.names =
c(NA, 20L), class = "data.frame")

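y <- my.data$V19
# keep only the unit strings; every other entry becomes NA
u <- ifelse(y=="nm" | y=="µm", y, NA)
# rows holding actual values have a date (not "#") in V1
num <- my.data$V1 != "#"
# carry the last seen unit forward; na.rm=FALSE keeps leading NAs so lengths match
uu <- zoo::na.locf(u, na.rm=FALSE)
data.frame(val = as.numeric(y[num]), units = uu[num])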

giving 
          val units
1   0.2012800  <NA>
2   0.3634383    µm
3   0.4360455    µm
4   0.3767734    µm
5 102.0130480    nm
6   0.1413840    µm
7  65.4459715    nm
8  46.4580292    nm
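
If zoo is not available, the carry-forward step can also be sketched in base R. This is only an illustration on a made-up toy vector (the vector `u` below is an assumption, not the poster's data); it relies on cumsum() to count how many non-NA entries have been seen so far and uses that count as a lookup index.

```r
# Toy unit column (hypothetical): NA wherever the row is not a unit line
u <- c(NA, NA, "µm", NA, NA, "nm", NA, "µm", NA)

# Number of non-NA entries seen up to each position (0 if none yet)
idx <- cumsum(!is.na(u))

# Prepend NA so that a count of 0 maps to NA, then look up by idx + 1
uu <- c(NA, u[!is.na(u)])[idx + 1]
uu
```

Like na.locf(u, na.rm=FALSE), this leaves leading NAs in place, so the result has the same length as the input.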

and you can surely take it from there.
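
One possible way to finish, as a sketch only: the data frame below simply retypes the values and units from the result above so that it runs on its own, and the names `res` and `val_um` are made up for illustration. It also assumes that a value with no preceding unit line (the first row) is in µm, which is exactly the open question about the first value, so adjust that line if the rule is different.

```r
# Values and units retyped from the result above (self-contained sketch)
res <- data.frame(
  val   = c(0.2012800083, 0.3634383236, 0.4360454777, 0.3767733568,
            102.013048, 0.1413840498, 65.4459715, 46.45802917),
  units = c(NA, "µm", "µm", "µm", "nm", "µm", "nm", "nm"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: a value with no preceding unit line is already in µm
res$units[is.na(res$units)] <- "µm"

# nm -> µm: divide by 1000; µm values are left unchanged
res$val_um <- ifelse(res$units == "nm", res$val / 1000, res$val)
res
```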

-pd


> On 10 May 2019, at 13:54 , Ivan Calandra <calandra using rgzm.de> wrote:
> 
> Dear useRs,
> 
> Below is a sample of my dataset (I have more rows and columns).
> 
> As you can see in the 2nd column, there are values, the name of the parameter
> ('Sq' in that case), some integer ('45' in that case) and the unit ('µm' or
> 'nm').
> I know how to extract the rows of interest (those with values), but they are
> expressed in different units. All values following a line with the unit are
> expressed in that unit, but the number of lines is not constant (sometimes each
> value is expressed in a different unit so there will be a new unit line, but
> there are sometimes several values in a row expressed in the same unit so
> without unit lines in between). I hope this is clear (it should be with the
> example provided).
> This messy dataset comes from an external software so I don't have any means to
> format the ways the data are collated. I have to find a way to deal with it in
> R.
> 
> What I would like to do is convert the values in nm to µm; I just need to
> multiply by 1000.
> 
> What I don't know is how to identify the values that are expressed in nm (all
> values that follow a line with 'nm' until there is a line with 'µm').
> 
> I don't even know how I should search online because I don't know how this kind
> of operation is called.
> Any help is appreciated.
> 
> Thank you in advance.
> Ivan
> 
> 
> my.data <- structure(list(V1 = c("2019/05/10", "#", "#", "#", "2019/05/10",
> "2019/05/10", "2019/05/10", "#", "#", "#", "2019/05/10", "#", "#", "#",
> "2019/05/10", "#", "#", "#", "2019/05/10", "2019/05/10"), V19 =
> c("0.2012800083", "45", "Sq", "µm", "0.3634383236", "0.4360454777",
> "0.3767733568", "45", "Sq", "nm", "102.013048", "45", "Sq", "µm",
> "0.1413840498", "45", "Sq", "nm", "65.4459715", "46.45802917")), row.names =
> c(NA, 20L), class = "data.frame")
> 
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes using cbs.dk  Priv: PDalgd using gmail.com


