[R] grep() exclude certain patterns?

Fri Dec 4 22:18:46 CET 2009

On Fri, Dec 4, 2009 at 3:06 PM, Peng Yu <pengyu.ut at gmail.com> wrote:
> On Fri, Dec 4, 2009 at 2:35 PM, Greg Snow <Greg.Snow at imail.org> wrote:
>> The invert argument seems a likely candidate, you could also do perl=TRUE and use negations within the pattern (but that is probably overkill for your original question).
>
> I don't see 'invert' in the R version (2.7.1) that I use. Here is the
> snippet from ?grep:
>
> Usage:
>
>     grep(pattern, x, ignore.case = FALSE, extended = TRUE,
>          perl = FALSE, value = FALSE, fixed = FALSE, useBytes = FALSE)
>
>     sub(pattern, replacement, x,
>         ignore.case = FALSE, extended = TRUE, perl = FALSE,
>         fixed = FALSE, useBytes = FALSE)
>
>     gsub(pattern, replacement, x,
>          ignore.case = FALSE, extended = TRUE, perl = FALSE,
>          fixed = FALSE, useBytes = FALSE)
>
>     regexpr(pattern, text, ignore.case = FALSE, extended = TRUE,
>             perl = FALSE, fixed = FALSE, useBytes = FALSE)
>
>     gregexpr(pattern, text, ignore.case = FALSE, extended = TRUE,
>              perl = FALSE, fixed = FALSE, useBytes = FALSE)
>
>
>> Could you explain to us the process that you use to search for answers to your questions before posting?  You have been asking quite a few questions that have answers out there if you can find them.  If you tell us where you are looking (and why) then we may be able to suggest some different search strategies that will help you find the answers quicker.  Also knowing your thought process may help us in designing future help/tutorials that cater more to people learning R for the first time, things that seem obvious to those of us who have been using the current documentation, apparently are not that obvious to some new users (but also realize that the first place that you may think to look may not even occur to some of us that learned computers in a different time, see fortune(89) ).
>
> For this particular problem in the original post, it is due to the
> fact that I use an older R.
>
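For the record, here is a minimal sketch of how non-matching elements can be selected with and without 'invert'. The vector x and the pattern are made up for illustration. The setdiff() variant is only my assumption of a workaround for a grep() that lacks 'invert'; the last line uses the perl=TRUE negation that Greg mentioned.

```r
x <- c("apple", "banana", "cherry", "pineapple")

# Indices of the elements that match the pattern
hits <- grep("apple", x)

# Without 'invert': keep every index that is not a hit.
# setdiff() also behaves when there are no hits, whereas x[-hits]
# would return an empty vector in that case.
excluded_old <- x[setdiff(seq_along(x), hits)]

# Where the 'invert' argument is available
excluded_invert <- grep("apple", x, invert = TRUE, value = TRUE)

# Perl-style negative lookahead: match only strings without "apple"
excluded_perl <- grep("^(?!.*apple)", x, perl = TRUE, value = TRUE)
```

All three results should hold the same elements, i.e. the ones that do not contain "apple".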
> But in general, the structure of the R help pages and their examples
> should be improved. Just as when we write a paper, it is better to
> have a hierarchical description (similar to the flow of abstract ->
> introduction -> main text): earlier sections should give readers the
> general idea, and each later section should give more detailed
> information.

Here is another bad example. See ?rep. The Usage section has 'rep(x,
...)'. However, '...' is only explained later in Arguments. I know
that it is probably because '...' is passed on to functions underlying
rep(). But it does not matter to end users whether the arguments come
from an underlying function or not. Why not put the arguments in the
Usage section?

Similar cases can be found in the help of many functions.
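
To illustrate what I mean, here is a small sketch of the arguments that go through '...' in rep(). The input vectors are made up; the argument names times, each and length.out are the ones described in the Arguments section of ?rep.

```r
# Repeat the whole vector three times
by_times <- rep(1:2, times = 3)         # 1 2 1 2 1 2

# Repeat each element twice before moving on to the next one
by_each <- rep(1:2, each = 2)           # 1 1 2 2

# Recycle the vector until the result has five elements
by_length <- rep(1:3, length.out = 5)   # 1 2 3 1 2
```

None of these three arguments can be guessed from 'rep(x, ...)' alone.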

> The current way of organizing the help is less satisfactory:
> Description->Usage->Arguments
>
> This may be good if you already know what you are looking for. But if
> you are new to it, you will easily get lost. For example, many
> functions are listed together in Usage without the differences between
> them being explained until very late, or without any explicit
> explanation at all. But having such descriptions of the differences
> can help users choose the appropriate one.
>
> Some informative examples should be moved up to help newbies
> understand how to use each function, rather than being put at the end
> of the help page. Many examples in the help pages require prior
> knowledge of other functions. In general, it is better for the
> information on each help page to be self-contained.
>
> Another problem is not due to the help of R, but to the design of R
> itself: there are many special cases in how a function behaves. For
> example, x[1:2,] is a matrix but x[1,] is a vector.
>
> > x=matrix(1:6,nr=3)
> > x[1:2,]
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> > x[1,]
> [1] 1 4
>
> I know somebody who has worked with R for over 10 years and doesn't
> know why (it may be because he doesn't care). But I had to ask the
> mailing list to learn that I have to use the option 'drop' in order
> to get a matrix as the returned value.
> > x[1,,drop=F]
>      [,1] [,2]
> [1,]    1    4
>
> If I were the original designer of R, I would make the interface more
> orthogonal (this is the usual way to reduce complexity in software).
> For example, [] would always return a matrix; if I wanted to reduce
> its dimension, I would have another function to do so.
>
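A rough sketch of the kind of interface I have in mind, written with what R already offers. The helper rows() is hypothetical (it is not part of base R) and simply wraps drop=FALSE; drop() is the existing base function that removes dimensions of length one as an explicit, separate step.

```r
x <- matrix(1:6, nrow = 3)

# Hypothetical helper (not part of base R): row selection that never
# drops dimensions, so the result is always a matrix.
rows <- function(m, i) m[i, , drop = FALSE]

one_row  <- rows(x, 1)     # 1 x 2 matrix
two_rows <- rows(x, 1:2)   # 2 x 2 matrix

# Explicit dimension reduction as a separate step, using base R's drop()
as_vector <- drop(one_row) # plain vector holding 1 and 4
```

With this split, code that expects a matrix does not need to special-case a selection of a single row.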
> Having many special cases may be convenient in some situations. But
> they may also cause confusion and subtle bugs that are hard to figure
> out, especially for newbies.
>
> The above are my current thoughts. Let me know if it makes sense to you or not.
>
>>> -----Original Message-----
>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Peng Yu
>>> Sent: Friday, December 04, 2009 12:43 PM
>>> To: r-help at stat.math.ethz.ch
>>> Subject: Re: [R] grep() exclude certain patterns?
>>>
>>> On Fri, Dec 4, 2009 at 11:54 AM, Duncan Murdoch <murdoch at stats.uwo.ca>
>>> wrote:
>>> > On 04/12/2009 12:52 PM, Peng Yu wrote:
>>> >>
>>> >> The external grep program has an option -v to select non-matching
>>> >> lines. I'm wondering how to exclude certain patterns in grep() in
>>> >> R?
>>> >>
>>> >
>>> > ?grep
>>>
>>> I don't see which argument to use.
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>