[R] Filtering an Entire Dataset based on Several Conditions

Paul Bernal
Mon May 9 18:44:04 CEST 2022


Dear Rui,

I was trying to dput() the datasets I am working on, but since they are
rather large (42,000 rows by 60 columns) I couldn't capture the full
structure of the data to include here, so I am attaching a couple of files.
One is the raw data (called trainFeatures42k), which is the data I need to
normalize, and the other is Normalized_Data, which is the normalized data
(or at least I think I managed to normalize it).
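
One way to check whether data really is normalized, sketched here on
made-up data rather than the attached files (the columns x and y are
hypothetical): after scale(), every column should have mean 0 and standard
deviation 1, up to floating-point error.

```r
# Hypothetical stand-in for the raw data (not the attached files)
set.seed(42)
raw <- data.frame(x = rnorm(100, mean = 10, sd = 2), y = runif(100))

# scale() centers each column and divides by its sample standard deviation
z <- scale(raw)

# Column means should be ~0 and column standard deviations ~1
col_means <- colMeans(z)
col_sds <- apply(z, 2, sd)

is_normalized <- all(abs(col_means) < 1e-8) && all(abs(col_sds - 1) < 1e-8)
is_normalized
```

The same two checks could be run on a data frame read from a CSV file,
provided only its numeric columns are passed to colMeans() and apply().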

 Normalized_Data.csv
<https://drive.google.com/file/d/143I1O710gAqWjzx48Gt1bwUbrG0mbpfa/view?usp=drive_web>
 trainFeatures42k.xls
<https://drive.google.com/file/d/1deMzGMkJyeVsnRzTKirmm4VqIBRzbvzV/view?usp=drive_web>

I have tried some of the code you and other friends from the community have
kindly shared, but I have not been able to keep only the values > -3 and < 3.
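
As a minimal sketch of what I am after (made-up data standing in for the
attached files; the column names id, a and b are hypothetical), the row
filter from the reply quoted below could be adapted to a data frame that
also has a non-numeric column. It uses abs(z) < 3, which is the same test
as z > -3 & z < 3, and it assumes there are no missing values.

```r
# Hypothetical stand-in for the real data: an ID column plus two numeric columns
set.seed(1)
raw <- data.frame(id = sprintf("r%03d", 1:200),
                  a  = c(rnorm(199), 50),  # one deliberately extreme value in the last row
                  b  = rnorm(200))

# Scale only the numeric columns
num <- vapply(raw, is.numeric, logical(1))
z <- scale(raw[num])

# TRUE for rows in which every z-score lies strictly between -3 and 3
keep <- rowSums(abs(z) < 3) == ncol(z)

# Subset the original rows (the ID column is kept) and the scaled rows
filtered_raw <- raw[keep, ]
filtered_z <- as.data.frame(z[keep, , drop = FALSE])
```

With missing values present, keep would contain NA for the affected rows,
so those rows would need to be handled separately before subsetting.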

Thank you all for your valuable help always.
Best,
Paul

On Mon, 9 May 2022 at 4:22, Rui Barradas (<ruipbarradas using sapo.pt>)
wrote:

> Hello,
>
> Something like this?
> First normalize the data.
> Then an apply() loop creates a logical matrix showing which numbers are
> strictly between -3 and 3.
> If they are all TRUE then their sum by rows is equal to the number of
> columns. This creates a logical index i.
> Use that index i to subset the scaled data set.
>
> # test data set, remove the Species column (not numeric)
> df1 <- iris[-5]
>
> df1_norm <- scale(df1)
> i <- rowSums(apply(df1_norm, 2, \(x) x > -3 & x < 3)) == ncol(df1_norm)
>
> # returns a matrix
> df1_norm[i, ]
>
> # returns a data.frame
> as.data.frame(df1_norm[i,])
>
>
> Hope this helps,
>
> Rui Barradas
>
> At 09:23 on 09/05/2022, Paul Bernal wrote:
> > Dear friends,
> >
> > I have a dataframe in which every single (i,j) entry (i standing for the
> > ith row, j for the jth column) has been normalized (converted to z-scores).
> >
> > Now I want to filter or subset the dataframe so that I end up with a
> > dataframe containing only entries greater than -3 and less than 3.
> >
> > How could I accomplish this?
> >
> > Best,
> > Paul