[R] Removing variables from data frame with a wild card

Sun Feb 12 23:57:36 CET 2023

x["V2"]

is more efficient than using drop=FALSE, and perfectly normal syntax (data frames are lists of columns).  I would ignore the naysayers, or put a comment in if you want to accelerate their uptake.

As I understand it, one of the main reasons tibbles exist is the drop=TRUE default of standard data frames. List-slice (single-dimension) indexing works equally well with both standard and tibble types of data frames.
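
As a small sketch of the list-style form applied to the original question (the toy data frame and its column names below are made up for illustration, not the poster's real data), a logical vector inside single brackets keeps or drops whole columns and returns a data frame:

```r
# Toy data; the column names are illustrative only.
mydata <- data.frame(year = 2001:2003, weight = c(1.2, 0.8, 1.0),
                     yr3 = 0L, yr4 = 1L)

# List-style indexing: one index and no comma, so no 'drop' argument is involved.
keep <- !grepl("^yr", names(mydata))
kept <- mydata[keep]

# Even when only one column is selected, the result is still a data frame.
one <- mydata["weight"]
```

Here kept should contain only the year and weight columns, and one should be a one-column data frame rather than a bare numeric vector.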

On February 12, 2023 2:30:15 PM PST, Andrew Simmons <akwsimmo using gmail.com> wrote:
>drop = FALSE means that should the indexing select exactly one column, then
>return a data frame with one column, instead of the object in the column.
>It's usually not necessary, but I've messed up some data before by assuming
>the indexing always returns a data frame when it doesn't, so drop = FALSE
>lets me know that I will always get a data frame.
>
>```
>x <- data.frame(V1 = 1:5, V2 = letters[1:5])
>x[, "V2"]
>x[, "V2", drop = FALSE]
>```
>
>You'll notice that the first returns a character vector, a through e, where
>the second returns a data frame with one column where the object in the
>column is the same character vector.
>
>You could alternatively use
>
>x["V2"]
>
>which should be identical to x[, "V2", drop = FALSE], but some people don't
>like that because it doesn't look like matrix indexing anymore.
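
A quick way to check that equivalence, recreating the same x so the snippet runs on its own (the names a, b and v are just placeholders for this sketch):

```r
x <- data.frame(V1 = 1:5, V2 = letters[1:5])

a <- x["V2"]                   # list-style indexing
b <- x[, "V2", drop = FALSE]   # matrix-style indexing with dropping suppressed
v <- x[, "V2"]                 # matrix-style indexing with the default drop = TRUE

same <- identical(a, b)        # compare the two one-column results
is.data.frame(a)               # TRUE
is.data.frame(v)               # FALSE: v is the column itself, not a data frame
```
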
>
>
>On Sun, Feb 12, 2023, 17:18 Steven T. Yen <styen using ntu.edu.tw> wrote:
>
>> In the line suggested by Andrew Simmons,
>>
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
>>
>> what does drop=FALSE do? Thanks.
>>
>> On 1/14/2023 8:48 PM, Steven Yen wrote:
>>
>> Thanks to all. Very helpful.
>>
>> Steven from iPhone
>>
>> On Jan 14, 2023, at 3:08 PM, Andrew Simmons <akwsimmo using gmail.com> wrote:
>>
>> You'll want to use grep() or grepl(). By default, grep() uses extended
>> regular expressions to find matches, but you can also use perl regular
>> expressions and globbing (after converting to a regular expression).
>> For example:
>>
>> grepl("^yr", colnames(mydata))
>>
>> will tell you which 'colnames' start with "yr". If you'd rather
>> use globbing:
>>
>> grepl(glob2rx("yr*"), colnames(mydata))
>>
>> Then you might write something like this to remove the columns starting
>> with yr:
>>
>> mydata <- mydata[, !grepl("^yr", colnames(mydata)), drop = FALSE]
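
A sketch of both patterns on a made-up vector of names (nms is hypothetical, chosen to resemble the columns in the original question). Note that "^yr" does not match "year", because the second character differs:

```r
nms <- c("year", "weight", "yr3", "yr10", "yr28")

by_regex <- grepl("^yr", nms)            # regular expression anchored at the start
by_glob  <- grepl(glob2rx("yr*"), nms)   # glob converted to a regular expression first

nms[!by_regex]                           # "year" "weight"
```
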
>>
>> On Sat, Jan 14, 2023 at 1:56 AM Steven T. Yen <styen using ntu.edu.tw> wrote:
>>
>>
>> I have a data frame containing variables "yr3",...,"yr28".
>>
>>
>> How do I remove them with a wild card, something similar to "del yr*"
>> in Windows/DOS? Thank you.
>>
>>
>> colnames(mydata)
>>
>>   [1] "year"       "weight"     "confeduc"   "confothr" "college"
>>
>>   [6] ...
>>
>>  [41] "yr3"        "yr4"        "yr5"        "yr6" "yr7"
>>
>>  [46] "yr8"        "yr9"        "yr10"       "yr11" "yr12"
>>
>>  [51] "yr13"       "yr14"       "yr15"       "yr16" "yr17"
>>
>>  [56] "yr18"       "yr19"       "yr20"       "yr21" "yr22"
>>
>>  [61] "yr23"       "yr24"       "yr25"       "yr26" "yr27"
>>
>>  [66] "yr28"...
>>
>>
>> ______________________________________________
>>
>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>
>> https://stat.ethz.ch/mailman/listinfo/r-help
>>
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>>
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>

-- 
Sent from my phone. Please excuse my brevity.