[R] subset data using a vector

DIGHE, NILESH [AG/2362] nilesh.dighe at monsanto.com
Mon Nov 23 17:26:29 CET 2015


Michael:  I like to use the actual range id's listed in column "rangestouse" to subset my data and not the length of that vector.

Thanks.
Nilesh

-----Original Message-----
From: Michael Dewey [mailto:lists at dewey.myzen.co.uk] 
Sent: Monday, November 23, 2015 10:17 AM
To: DIGHE, NILESH [AG/2362]; r-help at r-project.org
Subject: Re: [R] subset data using a vector

length(strsplit(as.character(mydata$ranges2use), ","))

was that what you expected? I think not.

On 23/11/2015 16:05, DIGHE, NILESH [AG/2362] wrote:
> Dear R users,
>                  I like to split my data by a vector created by using variable "ranges".  This vector will have the current range (ranges), preceding range (ranges - 1), and post range (ranges + 1) for a given plotid.  If the preceding or post ranges in this vector are outside the levels of ranges in the data set then I like to drop those ranges and only include the ranges that are available.  Variable "rangestouse" includes all the desired ranges I like to subset a given plotid.  After I subset these dataset using these desired ranges, then I like to extract the yield data for checks in those desired ranges and adjust yield of my data by dividing yield of a given plotid with the check average for the desired ranges.
>
> I have created this function (fun1) but when I run it, I get the following error:
>
> Error in m1[[i]] : subscript out of bounds
>
> Any help will be highly appreciated!
> Thanks, Nilesh
>
> Dataset:
> dput(mydata)
> structure(list(rows = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
> 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
> 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
> 4L, 4L, 4L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"), 
> cols = structure(c(1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L, 
> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L, 
> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L, 
> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L, 
> 5L, 6L, 7L, 8L, 9L), .Label = c("1", "2", "3", "4", "5", "6", "7", 
> "8", "9", "10", "11", "12", "13", "14", "15", "16"), class = "factor"),
>      plotid = c(289L, 298L, 299L, 300L, 301L, 302L, 303L, 304L,
>      290L, 291L, 292L, 293L, 294L, 295L, 296L, 297L, 384L, 375L,
>      374L, 373L, 372L, 371L, 370L, 369L, 383L, 382L, 381L, 380L,
>      379L, 378L, 377L, 376L, 385L, 394L, 395L, 396L, 397L, 398L,
>      399L, 400L, 386L, 387L, 388L, 389L, 390L, 391L, 392L, 393L,
>      480L, 471L, 470L, 469L, 468L, 467L, 466L, 465L, 479L, 478L,
>      477L, 476L, 475L, 474L, 473L, 472L), yield = c(5.1, 5, 3.9,
>      4.6, 5, 4.4, 5.1, 4.3, 5.5, 5, 5.5, 6.2, 5.1, 5.5, 5.2, 5,
>      5.6, 4.7, 5.4, 4.8, 4.6, 3.9, 4.2, 4.4, 5.3, 5.5, 5.8, 4.6,
>      5.8, 4.8, 5.3, 5.5, 5.6, 4.2, 4.6, 4.2, 4.2, 4, 3.9, 4.5,
>      5, 4.8, 4.9, 5.2, 5.3, 4.6, 4.8, 5.3, 4.5, 4.5, 5.1, 4.9,
>      5.2, 4.6, 4.8, 5.4, 5.9, 4.9, 5.8, 5.3, 4.8, 4.7, 5.2, 5.8
>      ), linecode = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L,
>      2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>      1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>      2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L), .Label = c("check",
>      "variety"), class = "factor"), ranges = c(1L, 1L, 1L, 1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
>      3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
>      4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L
>      ), rangestouse = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
>      1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>      2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
>      3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
>      4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1,2",
>      "1,2,3", "2,3,4", "3,4"), class = "factor")), .Names = c("rows", 
> "cols", "plotid", "yield", "linecode", "ranges", "rangestouse"
>
> ), class = "data.frame", row.names = c(NA, -64L))
>
> Function:
>
> fun1<- function (dataset, plot.id, ranges2use, control)
>
> {
>
>      m1 <- strsplit(as.character(dataset$ranges2use), ",")
>
>      dat1 <- data.frame()
>
>      m2 <- c()
>
>      row_check_mean <- c()
>
>      row_check_adj_yield <- c()
>
>      x <- length(plot.id)
>
>      for (i in (1:x)) {
>
>          m2[i] <- m1[[i]]
>
>          dat1 <- dataset[dataset$ranges %in% m2[i], ]
>
>          row_check_mean[i] <- tapply(dat1$trait, dat1$control,
>
>              mean, na.rm = TRUE)[1]
>
>          row_check_adj_yield[i] <- ifelse(control[i] == "variety",
>
>              trait[i]/dataset$row_check_mean[i], trait[i]/trait[i])
>
>      }
>
>      data.frame(dataset, row_check_adj_yield)
>
> }
>
> Apply function:
> fun1(mydata, plot.id=mydata$plotid, ranges2use = 
> mydata$rangestouse,control=mydata$linecode)
>
> Error:
>
> Error in m1[[i]] : subscript out of bounds
>
> Session info:
>
> R version 3.2.1 (2015-06-18)
>
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> Running under: Windows 7 x64 (build 7601) Service Pack 1
>
>
>
> locale:
>
> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United 
> States.1252
>
> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>
> [5] LC_TIME=English_United States.1252
>
>
>
> attached base packages:
>
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
>
> loaded via a namespace (and not attached):
>
>   [1] magrittr_1.5    plyr_1.8.3      tools_3.2.1     reshape2_1.4.1  Rcpp_0.12.1     stringi_1.0-1
>
>   [7] grid_3.2.1      agridat_1.12    stringr_1.0.0   lattice_0.20-31
>
>
> Nilesh Dighe
> (806)-252-7492 (Cell)
> (806)-741-2019 (Office)
>
>
> This e-mail message may contain privileged and/or confidential 
> information, and is intended to be received only by persons entitled 
> to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.
>
> All e-mails and attachments sent and received are subject to 
> monitoring, reading and archival by Monsanto, including its subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
> Monsanto, along with its subsidiaries, accepts no liability for any 
> damage caused by any such code transmitted by or accompanying this e-mail or any attachment.
>
>
> The information contained in this email may be subject to the export 
> control laws and regulations of the United States, potentially 
> including but not limited to the Export Administration Regulations 
> (EAR) and sanctions regulations issued by the U.S. Department of Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all applicable U.S. export laws and regulations.
>
> 	[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see 
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Michael
http://www.dewey.myzen.co.uk/home.html
This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying
this e-mail or any attachment.


The information contained in this email may be subject to the export control laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all
applicable U.S. export laws and regulations.



More information about the R-help mailing list