[R] Row exclude
Rui Barradas
ru|pb@rr@d@@ @end|ng |rom @@po@pt
Sat Jan 29 09:46:33 CET 2022
Hello,
Getting creative, here is another way with mapply.
regex <- list("[[:digit:]]", "[[:alpha:]]", "[[:alpha:]]")
i <- mapply(\(x, r) grepl(r, x), dat1, regex)
dat1[rowSums(i) == 0L, ]
#   Name Age Weight
#2   Bob  25    142
#3 Carol  24    120
#5  Katy  35    160
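
For anyone who wants to run this on its own, here is a minimal self-contained sketch of the same idea. It rebuilds dat1 as in David's message quoted below and writes the anonymous function as function(x, r), since the \(x, r) shorthand needs R 4.1 or later. The names bad and clean are only illustrative.

```r
# Rebuild the sample data; strip.white = TRUE drops the spaces after the commas
dat1 <- read.table(text = "Name, Age, Weight
Alex, 20, 13X
Bob, 25, 142
Carol, 24, 120
John, 3BC, 175
Katy, 35, 160
Jack3, 34, 140", sep = ",", header = TRUE, stringsAsFactors = FALSE,
strip.white = TRUE)

# One "forbidden" pattern per column, in column order:
# digits are not allowed in Name, letters are not allowed in Age or Weight
regex <- list("[[:digit:]]", "[[:alpha:]]", "[[:alpha:]]")

# mapply() walks the columns and the patterns in parallel; because every
# call returns a logical vector of the same length, the result simplifies
# to a logical matrix with one row per record and one column per variable
bad <- mapply(function(x, r) grepl(r, x), dat1, regex)

# Keep only the rows in which no cell matched its forbidden pattern
clean <- dat1[rowSums(bad) == 0L, ]
clean
```

Adding a column only requires adding one more pattern to the list, as long as the list stays in the same order as the columns.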
Hope this helps,
Rui Barradas
At 06:30 on 29/01/2022, David Carlson via R-help wrote:
> Given that you know which columns should be numeric and which should be
> character, finding characters in numeric columns or numbers in character
> columns is not difficult. Your data frame consists of three character
> columns so you can use regular expressions as Bert mentioned. First you
> should strip the whitespace out of your data:
>
> dat1 <- read.table(text="Name, Age, Weight
> Alex, 20, 13X
> Bob, 25, 142
> Carol, 24, 120
> John, 3BC, 175
> Katy, 35, 160
> Jack3, 34, 140",sep=",", header=TRUE, stringsAsFactors=FALSE,
> strip.white=TRUE)
>
> Now check to see if all of the fields are character as expected.
>
> sapply(dat1, typeof)
> #        Name         Age      Weight
> # "character" "character" "character"
>
> Now identify character variables containing numbers and numeric variables
> containing characters:
>
> BadName <- which(grepl("[[:digit:]]", dat1$Name))
> BadAge <- which(grepl("[[:alpha:]]", dat1$Age))
> BadWeight <- which(grepl("[[:alpha:]]", dat1$Weight))
>
> Next remove those rows:
>
> (dat2 <- dat1[-unique(c(BadName, BadAge, BadWeight)), ])
> #    Name Age Weight
> # 2   Bob  25    142
> # 3 Carol  24    120
> # 5  Katy  35    160
>
> You still need to convert Age and Weight to numeric, e.g. dat2$Age <-
> as.numeric(dat2$Age).
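
A hedged sketch of that last step, together with a variant of the row removal that indexes with a logical vector. Negative indexing with which() results returns zero rows when no bad rows are found, because -integer(0) is still integer(0); the logical form avoids that edge case. The names bad_rows and num_cols are illustrative and not part of the original message.

```r
# Sample data with all three columns stored as character
dat1 <- data.frame(
  Name   = c("Alex", "Bob", "Carol", "John", "Katy", "Jack3"),
  Age    = c("20", "25", "24", "3BC", "35", "34"),
  Weight = c("13X", "142", "120", "175", "160", "140"),
  stringsAsFactors = FALSE
)

# TRUE for every row with a digit in Name or a letter in Age/Weight
bad_rows <- grepl("[[:digit:]]", dat1$Name) |
  grepl("[[:alpha:]]", dat1$Age) |
  grepl("[[:alpha:]]", dat1$Weight)

# Logical indexing keeps every row when nothing is flagged
dat2 <- dat1[!bad_rows, ]

# Convert the cleaned columns from character to numeric in one step
num_cols <- c("Age", "Weight")
dat2[num_cols] <- lapply(dat2[num_cols], as.numeric)

str(dat2)
```

Converting only after the bad rows are gone means as.numeric() should not have to coerce anything to NA.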
>
> David Carlson
>
>
> On Fri, Jan 28, 2022 at 11:59 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
>> As character 'polluted' entries will cause a column to be read in (via
>> read.table and relatives) as factor or character data, this sounds like a
>> job for regular expressions. If you are not familiar with this subject,
>> time to learn. And, yes, some heavy lifting will be required.
>> See ?regexp for a start maybe? Or the stringr package?
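
As a small illustration of the regular-expression route, one possible sketch is to describe what a valid field looks like and anchor the pattern with ^ and $, instead of searching for forbidden characters. The rules used here (letters only for names, digits only for numbers) and the helper name is_valid are assumptions made for this example, not something stated in the thread.

```r
# Hypothetical helper: TRUE where the whole string matches the pattern
is_valid <- function(x, pattern) grepl(pattern, x)

# A few values taken from the sample data
name   <- c("Alex", "Jack3", "Bob")
weight <- c("13X", "140", "142")

# ^ and $ anchor the match to the entire field, so a single stray
# character anywhere in the string makes the field invalid
ok <- is_valid(name, "^[[:alpha:]]+$") & is_valid(weight, "^[[:digit:]]+$")
ok
```

Note that such strict patterns would also reject empty strings and decimal values like 12.5, so they would need loosening for real data.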
>>
>> Cheers,
>> Bert
>>
>>
>>
>>
>> On Fri, Jan 28, 2022, 7:08 PM Val <valkremk using gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I want to remove rows that contain a character string in an integer
>>> column or a digit in a character column.
>>>
>>> Sample data
>>>
>>> dat1 <- read.table(text="Name, Age, Weight
>>> Alex, 20, 13X
>>> Bob, 25, 142
>>> Carol, 24, 120
>>> John, 3BC, 175
>>> Katy, 35, 160
>>> Jack3, 34, 140",sep=",",header=TRUE,stringsAsFactors=FALSE)
>>>
>>> If the Age or Weight column contains any letter(s), then remove that row;
>>> if the Name column contains a digit, then remove that row.
>>> Desired output
>>>
>>> Name Age weight
>>> 1 Bob 25 142
>>> 2 Carol 24 120
>>> 3 Katy 35 160
>>>
>>> Thank you,
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>