[R] Removing variables from data frame with a wild card
Andrew Simmons
akwsimmo using gmail.com
Sun Feb 12 23:30:15 CET 2023
drop = FALSE means that if the indexing selects exactly one column, the
result is still a data frame with one column, instead of the object in that
column. It's usually not necessary, but I've messed up some data before by
assuming the indexing always returns a data frame when it doesn't, so
drop = FALSE lets me be sure that I will always get a data frame.
```
x <- data.frame(V1 = 1:5, V2 = letters[1:5])
x[, "V2"]
x[, "V2", drop = FALSE]
```
You'll notice that the first returns a character vector, "a" through "e",
whereas the second returns a data frame with one column, where the object in
the column is the same character vector.
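To tie this back to the grepl() line from earlier in the thread, here is a
small sketch of the case where only one column survives the removal. The data
frame is made up for illustration (mydata, weight, yr3 and yr4 are stand-ins,
not the real data):

```r
# Hypothetical data: only 'weight' is left once the "yr" columns are removed
mydata <- data.frame(weight = c(1.5, 2.5, 3.5), yr3 = 1:3, yr4 = 4:6)

# TRUE for the columns to keep (those whose names do not start with "yr")
keep <- !grepl("^yr", colnames(mydata))

without_drop <- mydata[, keep]                # collapses to a plain numeric vector
with_drop    <- mydata[, keep, drop = FALSE]  # stays a one-column data frame

is.data.frame(without_drop)
is.data.frame(with_drop)
colnames(with_drop)
```

Without drop = FALSE, later code that relies on colnames() or ncol() would be
handed a plain vector, for which both return NULL.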
You could alternatively use
x["V2"]
which should be identical to x[, "V2", drop = FALSE], but some people don't
like that because it doesn't look like matrix indexing anymore.
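If you want to check that on your own R version, a quick sketch using the
same x as above (the names list_style and matrix_style are just for
illustration):

```r
x <- data.frame(V1 = 1:5, V2 = letters[1:5])

list_style   <- x["V2"]                  # one index: x treated as a list of columns
matrix_style <- x[, "V2", drop = FALSE]  # two indices: matrix-style, dropping disabled

identical(list_style, matrix_style)
class(list_style)
```

Both forms keep the data frame class, so the choice between them is mostly a
matter of taste.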
On Sun, Feb 12, 2023, 17:18 Steven T. Yen <styen using ntu.edu.tw> wrote:
> In the line suggested by Andrew Simmons,
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> what does drop=FALSE do? Thanks.
>
> On 1/14/2023 8:48 PM, Steven Yen wrote:
>
> Thanks to all. Very helpful.
>
> Steven from iPhone
>
> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo using gmail.com> wrote:
>
> You'll want to use grep() or grepl(). By default, grep() uses extended
> regular expressions to find matches, but you can also use perl regular
> expressions and globbing (after converting to a regular expression).
> For example:
>
> grepl("^yr", colnames(mydata))
>
> will tell you which 'colnames' start with "yr". If you'd rather use
> globbing:
>
> grepl(glob2rx("yr*"), colnames(mydata))
>
> Then you might write something like this to remove the columns starting
> with yr:
>
> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>
> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen using ntu.edu.tw> wrote:
>
>
> I have a data frame containing variables "yr3",...,"yr28".
>
>
> How do I remove them with a wild card, something similar to "del yr*"
> in Windows/DOS? Thank you.
>
>
> colnames(mydata)
>  [1] "year" "weight" "confeduc" "confothr" "college"
>  [6] ...
> [41] "yr3" "yr4" "yr5" "yr6" "yr7"
> [46] "yr8" "yr9" "yr10" "yr11" "yr12"
> [51] "yr13" "yr14" "yr15" "yr16" "yr17"
> [56] "yr18" "yr19" "yr20" "yr21" "yr22"
> [61] "yr23" "yr24" "yr25" "yr26" "yr27"
> [66] "yr28"...