[R] < symbols in a data frame

Sarah Goslee sarah.goslee at gmail.com
Wed Jul 9 19:26:54 CEST 2014


Hi Sam,

I'd take a similar tack, but remove the < instead. Note that if you
import the data frame using the stringsAsFactors=FALSE argument, you
don't need the first step (the as.character() conversion).

metals$Cedar.Creek <- as.character(metals$Cedar.Creek)
metals$Cedar.Creek <- gsub("<", "", metals$Cedar.Creek)
metals$Cedar.Creek <- as.numeric(metals$Cedar.Creek)

R> str(metals)
'data.frame':    19 obs. of  2 variables:
 $ Parameter  : Factor w/ 20 levels "Antimony","Arsenic",..: 1 2 3 4 6 7 8 9 10 11 ...
 $ Cedar.Creek: num  100 100 500 100 10 1000 100 516 550 10 ...
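
If you also want to keep track of which values were reported as "<"
(they become plain numbers once the symbol is stripped), something
along these lines might work. This is only a rough sketch: csv_text is
a made-up stand-in for the lab's file, whose real name and layout
aren't given here, and metals2, below_limit and measured are names
I've invented for illustration.

```r
# Hypothetical stand-in for the lab's file (assumed layout, not the real data file)
csv_text <- "Parameter,Cedar.Creek
Antimony,<100
Barium,<500
Cobalt,516
Copper,550"

# stringsAsFactors = FALSE keeps Cedar.Creek as character on import,
# so the as.character() step is not needed
metals2 <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Flag the "<" values before stripping the symbol
metals2$below_limit <- grepl("<", metals2$Cedar.Creek, fixed = TRUE)

# Strip the symbol and convert to numeric
metals2$Cedar.Creek <- as.numeric(gsub("<", "", metals2$Cedar.Creek, fixed = TRUE))

# Keep only the rows that were reported without a "<"
measured <- metals2[!metals2$below_limit, ]
```

With the flag in place, metals2[!metals2$below_limit, ] gives the rows
that had a plain number, and the full column is still numeric for
everything else.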

Sarah


On Wed, Jul 9, 2014 at 1:19 PM, Sam Albers <tonightsthenight at gmail.com> wrote:
> Hello,
>
> I have recently received a dataset from a metal analysis company. The
> dataset is filled with less-than symbols. What I am looking for is an
> efficient way to subset for any whole numbers from the dataset. The column
> is automatically formatted as a factor because of the "<" symbols, making it
> difficult to deal with the numbers in a useful way.
>
> So in sum any ideas on how I could subset the example below for only whole
> numbers?
>
> Thanks in advance!
>
> Sam
>
> #code
>
> metals <-
> structure(list(Parameter = structure(c(1L, 2L, 3L, 4L, 6L, 7L,
> 8L, 9L, 10L, 11L, 12L, 13L, 15L, 16L, 17L, 18L, 19L, 20L, 1L),
> .Label = c("Antimony",
> "Arsenic", "Barium", "Beryllium", "Boron (Hot Water Soluble)",
> "Cadmium", "Chromium", "Cobalt", "Copper", "Lead", "Mercury",
> "Molybdenum", "Nickel", "pH 1:2", "Selenium", "Silver", "Thallium",
> "Tin", "Vanadium", "Zinc"), class = "factor"), Cedar.Creek = structure(c(3L,
> 3L, 7L, 3L, 2L, 4L, 3L, 34L, 36L, 2L, 5L, 7L, 3L, 7L, 3L, 45L,
> 4L, 4L, 3L), .Label = c("<1", "<10", "<100", "<1000", "<200",
> "<5", "<500", "0.1", "0.13", "0.5", "0.8", "1.07", "1.1", "1.4",
> "1.5", "137", "154", "163", "165", "169", "178", "2.3", "2.4",
> "22", "24", "244", "27.2", "274", "3", "3.1", "40.2", "43", "50",
> "516", "53.3", "550", "569", "65", "66.1", "68", "7.6", "72",
> "77", "89", "951"), class = "factor")), .Names = c("Parameter",
> "Cedar.Creek"), row.names = c(NA, 19L), class = "data.frame")
>

-- 
Sarah Goslee
http://www.functionaldiversity.org


