[R] HOW TO FILTER DATA
macqueen1 at llnl.gov
Thu Jan 4 17:41:36 CET 2018
Just a couple of minor comments:
No vignettes or demos or help files found with alias or concept or
title matching 'read_delim' using regular expression matching.
read_delim is not part of base R; it must come from some unnamed non-base package. I'd recommend using base R as much as possible for someone who is new to R, as I suspect the original poster is.
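For what it's worth, a base R version of that read might look like the sketch below. It assumes the file is pipe-delimited with a header row, as the read_delim call suggests. The temporary file and its columns (appln_id, IPC) are made up for illustration and are not the original poster's data.

```r
# Hypothetical pipe-delimited sample, written to a temporary file so the
# example is self-contained.
tmp <- tempfile(fileext = ".csv")
writeLines(c("appln_id|IPC",
             "1|H04M001/02",
             "2|C07K016/26",
             "3|A61K031/00"), tmp)

# read.table ships with base R (package utils). sep = "|" sets the delimiter,
# strip.white = TRUE trims whitespace around fields, and quote = "" together
# with comment.char = "" stops quote and '#' characters from being treated
# specially.
df <- read.table(tmp, header = TRUE, sep = "|", quote = "",
                 comment.char = "", strip.white = TRUE,
                 stringsAsFactors = FALSE)
str(df)
```

Setting stringsAsFactors = FALSE keeps IPC as a character column, which matches the original poster's description of the codes as strings.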
The call to subset would be better written as
df_new <- subset(df, IPC == 'H04M001/02' | IPC == 'C07K016/26' )
instead of
df_new <- subset(df, df$IPC == 'H04M001/02' | df$IPC == 'C07K016/26' )
IPC is a variable within the data frame, so it is unnecessary to include the data frame's name in the logical expression.
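A small illustration of the point, using a made-up data frame (the appln_id column and the third IPC code are invented for the example): inside subset, the condition is evaluated with the data frame's columns in scope, so the bare name and the df$ form select the same rows.

```r
# Hypothetical data standing in for the patent file.
df <- data.frame(appln_id = 1:4,
                 IPC = c("H04M001/02", "C07K016/26",
                         "A61K031/00", "H04M001/02"),
                 stringsAsFactors = FALSE)

# Bare column name: IPC is looked up inside df.
a <- subset(df, IPC == 'H04M001/02' | IPC == 'C07K016/26')

# Redundant form: works, but repeats the data frame's name.
b <- subset(df, df$IPC == 'H04M001/02' | df$IPC == 'C07K016/26')

identical(a, b)

# The same filter with plain bracket indexing, where df$ is required
# because the condition is evaluated outside the data frame.
by_index <- df[df$IPC == 'H04M001/02' | df$IPC == 'C07K016/26', ]
```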
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
Lab cell 925-724-7509
On 1/3/18, 12:54 PM, "R-help on behalf of Leilei Ruan" <r-help-bounces at r-project.org on behalf of ruanleilei at gmail.com> wrote:
Try the code below:
df <- read_delim("C:/Users/lruan1/Desktop/1112.csv", "|",
                 escape_double = FALSE, trim_ws = TRUE)
df_new <- subset(df,df$IPC == 'H04M001/02'| df$IPC == 'C07K016/26' )
You can add more conditions with "|" in the subset function. Good luck!
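When the list of codes grows, chaining == with | becomes unwieldy. One common base R alternative is %in%, sketched here with a made-up data frame and a hypothetical vector named wanted holding the codes to keep.

```r
# Hypothetical data standing in for the patent file.
df <- data.frame(IPC = c("H04M001/02", "C07K016/26", "A61K031/00"),
                 stringsAsFactors = FALSE)

# The codes to keep; extend this vector rather than adding more "|" terms.
wanted <- c("H04M001/02", "C07K016/26")

# %in% is TRUE for each row whose IPC appears anywhere in wanted.
df_new <- subset(df, IPC %in% wanted)
df_new
```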
On Wed, Jan 3, 2018 at 2:53 PM, Saptorshee Kanto Chakraborty <
chkstr at unife.it> wrote:
> I have patent data from the OECD in delimited text format, with IPC being
> one column. I want to filter the data by keeping only certain IPC codes in that
> column and deleting the rows that do not have the IPCs I need. Can
> anybody please guide me in doing this? The IPC codes are string variables.
> The data is somewhat like below, but it's a huge dataset containing more
> than 11 million rows.
> Thanking You
> [[alternative HTML version deleted]]
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> PLEASE do read the posting guide http://www.R-project.org/
> and provide commented, minimal, self-contained, reproducible code.