From nm to micron, _divide_ by 1000... (as you likely know). What are the units of the first value? Looks like micron in your example, but is there a rule?
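To make the factor concrete, here is a minimal sketch of the conversion step, assuming you already have one unit label per value. The helper name to_micron is made up for illustration; it is not from any package.

```r
# Hypothetical helper (name is an assumption, not an existing function):
# convert values to µm, given one unit label ("nm" or "µm") per value.
to_micron <- function(val, units) {
  # 1 µm = 1000 nm, so values in nm are divided by 1000;
  # values already in µm are returned unchanged.
  # Where the unit is NA, ifelse() returns NA for that value.
  ifelse(units == "nm", val / 1000, val)
}

to_micron(c(0.3634383236, 102.013048), c("µm", "nm"))
```

A value whose unit is unknown (NA) comes back as NA rather than being silently treated as µm, so such rows still need a rule of their own.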
Basically, it is a "last observation carried forward" type problem, so something like this:

my.data <- structure(list(V1 = c("2019/05/10", "#", "#", "#", "2019/05/10",
"2019/05/10", "2019/05/10", "#", "#", "#", "2019/05/10", "#", "#", "#",
"2019/05/10", "#", "#", "#", "2019/05/10", "2019/05/10"), V19 =
c("0.2012800083", "45", "Sq", "µm", "0.3634383236", "0.4360454777",
"0.3767733568", "45", "Sq", "nm", "102.013048", "45", "Sq", "µm",
"0.1413840498", "45", "Sq", "nm", "65.4459715", "46.45802917")), row.names =
c(NA, 20L), class = "data.frame")

y <- my.data$V19
u <- ifelse(y=="nm" | y=="µm", y, NA)
num <- my.data$V1 != "#"
uu <- zoo::na.locf(u, na.rm=FALSE)
data.frame(val = as.numeric(y[num]), units = uu[num])

giving

          val units
1   0.2012800  <NA>
2   0.3634383    µm
3   0.4360455    µm
4   0.3767734    µm
5 102.0130480    nm
6   0.1413840    µm
7  65.4459715    nm
8  46.4580292    nm

and you can surely take it from there.

-pd

> On 10 May 2019, at 13:54 , Ivan Calandra <calan...@rgzm.de> wrote:
>
> Dear useRs,
>
> Below is a sample of my dataset (I have more rows and columns).
>
> As you can see in the 2nd column, there are values, the name of the
> parameter ('Sq' in that case), some integer ('45' in that case) and the
> unit ('µm' or 'nm').
> I know how to extract the rows of interest (those with values), but they
> are expressed in different units. All values following a line with the unit
> are expressed in that unit, but the number of lines is not constant
> (sometimes each value is expressed in a different unit so there will be a
> new unit line, but there are sometimes several values in a row expressed in
> the same unit so without unit lines in between). I hope this is clear (it
> should be with the example provided).
> This messy dataset comes from an external software so I don't have any
> means to format the ways the data are collated. I have to find a way to
> deal with it in R.
>
> What I would like to do is convert the values in nm to µm; I just need to
> multiply by 1000.
>
> What I don't know is how to identify the values that are expressed in nm
> (all values that follow a line with 'nm' until there is a line with 'µm').
>
> I don't even know how I should search online because I don't know how this
> kind of operation is called.
> Any help is appreciated.
>
> Thank you in advance.
> Ivan
>
>
> my.data <- structure(list(V1 = c("2019/05/10", "#", "#", "#", "2019/05/10",
> "2019/05/10", "2019/05/10", "#", "#", "#", "2019/05/10", "#", "#", "#",
> "2019/05/10", "#", "#", "#", "2019/05/10", "2019/05/10"), V19 =
> c("0.2012800083", "45", "Sq", "µm", "0.3634383236", "0.4360454777",
> "0.3767733568", "45", "Sq", "nm", "102.013048", "45", "Sq", "µm",
> "0.1413840498", "45", "Sq", "nm", "65.4459715", "46.45802917")), row.names =
> c(NA, 20L), class = "data.frame")
>
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd....@cbs.dk  Priv: pda...@gmail.com