[R] Popularity of R, SAS, SPSS, Stata...

Allan Engelhardt allane at cybaea.com
Sat Jun 26 17:19:10 CEST 2010



On 26/06/10 16:07, Muenchen, Robert A (Bob) wrote:
> I've been trying to make sense of Google Scholar searches. I'm obviously
> missing something basic. Here are two searches on www.google.com:
>
> sas - gets 68M hits
> sas OR spss - gets 74.3M hits. A bigger number as "OR" would imply.
>
> But when I do the same searches on scholar.google.com, here's what I
> get:
>
> sas - gets 4.6M hits
> sas OR spss - gets 1.65M hits
>
> How on earth can an "OR" get you less??
>    

Because the search for SAS alone stems the word, so you get hits on SA 
alone (SAS obviously (!) being the plural of SA), as you will see from 
the first few hits (hint: the matched word is highlighted in bold).  
With the OR you don't stem (weird but true).  Put quotes around the 
single search term to avoid (some of) the stemming:

SAS - 4.62M
"SAS" - 1.62M
"SPSS" - 0.635M
"SAS" OR "SPSS" - 1.52M

It is obviously still not right, but closer.  Happy reading of the 
articles by D. Sas, S.A.S. Eddington, etc.
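
A quick sanity check on these counts, as a minimal sketch and not anything Google documents: whatever the engine does internally, the hit count for "A OR B" should be at least the larger of the single-term counts and, when every term's count is known, at most their sum. The function name or_count_is_plausible and its parameters are made up for illustration; the numbers are the ones quoted in this thread, in millions of hits.

```python
def or_count_is_plausible(or_count, term_counts, all_terms_known=True):
    """Check an "A OR B" hit count against basic set-union bounds.

    or_count        -- reported hits for the OR query
    term_counts     -- reported hits for the single terms we know about
    all_terms_known -- set to False if some term's count is missing,
                       in which case only the lower bound is checked
    """
    # A union can never be smaller than its largest member.
    if or_count < max(term_counts):
        return False
    # A union can never be larger than the sum of its members.
    if all_terms_known and or_count > sum(term_counts):
        return False
    return True


# www.google.com: sas = 68M, sas OR spss = 74.3M (spss alone not reported)
web = or_count_is_plausible(74.3, [68.0], all_terms_known=False)

# scholar.google.com, unquoted: sas = 4.6M, sas OR spss = 1.65M
scholar_unquoted = or_count_is_plausible(1.65, [4.6], all_terms_known=False)

# scholar.google.com, quoted: "SAS" = 1.62M, "SPSS" = 0.635M, OR = 1.52M
scholar_quoted = or_count_is_plausible(1.52, [1.62, 0.635])

print(web, scholar_unquoted, scholar_quoted)
```

Only the plain web search passes. The quoted Scholar query misses the lower bound by a much smaller margin than the unquoted one, which is one way to read "still not right, but closer".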

Any follow-ups probably belong on a different mailing list - I think 
there are forums for Google search.


Allan


> Thanks,
> Bob
>
>
>> http://www.google.com/insights/search/#q=code%20for%20r%2Ccode%20for%20SAS%2Ccode%20for%20SPSS%2Ccode%20for%20matlab&cmpt=q
>>
>> This one is nice too. You can see that the bump in the autumn semester
>> for R is replacing the one for Matlab. Then in the spring semester
>> Matlab stays high but R drops. And both the US and India always have a
>> very large search index, whereas the rest of the world is essentially
>> worthless. Which leads me to the conclusion that: 1) the results are
>> probably coming from google.com, excluding local versions, and 2) in
>> the US (and India), statistics is mainly taught in the autumn
>> semester. Given the fact that daylight has a beneficial effect on
>> emotional well-being, the unpopularity of statistics is likely caused
>> by unfortunate scheduling.
>>
>> Forget Excel. Google rocks! ;-)
>>
>> Cheers
>> Joris
>>
>>
>>> Once you go the phrase route, you gain precision but end up with zero
>>> counts on various phrases. I avoided that by combining them with "+" to
>>> get enough to plot. The resulting graph shows SAS dominant until
>>> mid-2006 when SPSS takes the top position, followed by R, SAS, Stata in
>>> order:
>>>
>>> http://www.google.com/insights/search/#q=%22r%20code%20for%22%2B%22r%20manual%22%2B%22r%20tutorial%22%2B%22r%20graph%22%2C%22sas%20code%20for%22%2B%22sas%20manual%22%2B%22sas%20tutorial%22%2B%22sas%20graph%22%2C%22spss%20code%20for%22%2B%22spss%20manual%22%2B%22spss%20tutorial%22%2B%22spss%20graph%22%2C%22stata%20code%20for%22%2B%22stata%20manual%22%2B%22stata%20tutorial%22%2B%22stata%20graph%22%2C%22s-plus%20code%20for%22%2B%22s-plus%20manual%22%2Bs-plus%20tutorial%22%2B%22s-plus%20graph%22&cmpt=q
>>>
>>> This might be a good one to add to http://r4stats.com/popularity
>>>
>>> Bob
>>>
>>>
>>>> I see that there's a car, the R Code Mustang, that adding "for" gets rid
>>>> of.
>>>>
>>>> Thanks for getting me back on a topic that I had given up on!
>>>>
>>>> Bob
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>>>>> On Behalf Of Joris Meys
>>>>> Sent: Thursday, June 24, 2010 7:56 PM
>>>>> To: Dario Solari
>>>>> Cc: r-help at r-project.org
>>>>> Subject: Re: [R] Popularity of R, SAS, SPSS, Stata...
>>>>>
>>>>> Nice idea, but quite sensitive to search terms, if you compare your
>>>>> result on "... code" with "... code for":
>>>>> http://www.google.com/insights/search/#q=r%20code%20for%2Csas%20code%20for%2Cspss%20code%20for&cmpt=q
>>>>>
>>>>> On Thu, Jun 24, 2010 at 10:48 PM, Dario Solari <dario.solari at gmail.com> wrote:
>>>>>> First: excuse my English.
>>>>>>
>>>>>> My opinion: a useful source for measuring "popularity" can be Google
>>>>>> Insights for Search - http://www.google.com/insights/search/#
>>>>>>
>>>>>> Every person using software like R, SAS, or SPSS first needs to learn
>>>>>> it, so they probably do a web search for a manual, a tutorial, or a
>>>>>> guide. One can measure the share of this kind of search query.
>>>>>> Such results can be useful for determining trends in
>>>>>> "popularity".
>>>>>>
>>>>>> Example 1: "R tutorial/manual/guide", "SAS tutorial/manual/guide",
>>>>>> "SPSS tutorial/manual/guide"
>>>>>>
>>>>>> http://www.google.com/insights/search/#q=%22r%20tutorial%22%2B%22r%20manual%22%2B%22r%20guide%22%2B%22r%20vignette%22%2C%22spss%20tutorial%22%2B%22spss%20manual%22%2B%22spss%20guide%22%2C%22sas%20tutorial%22%2B%22sas%20manual%22%2B%22sas%20guide%22&cmpt=q
>>>>>>
>>>>>> Example 2: "R software", "SAS software", "SPSS software"
>>>>>>
>>>>>> http://www.google.com/insights/search/#q=%22r%20software%22%2C%22spss%20software%22%2C%22sas%20software%22&cmpt=q
>>>>>>
>>>>>> Example 3: "R code", "SAS code", "SPSS code"
>>>>>>
>>>>>> http://www.google.com/insights/search/#q=%22r%20code%22%2C%22spss%20code%22%2C%22sas%20code%22&cmpt=q
>>>>>>
>>>>>> Example 4: "R graph", "SAS graph", "SPSS graph"
>>>>>>
>>>>>> http://www.google.com/insights/search/#q=%22r%20graph%22%2C%22spss%20graph%22%2C%22sas%20graph%22&cmpt=q
>>>>>>
>>>>>> Example 5: "R regression", "SAS regression", "SPSS regression"
>>>>>>
>>>>>> http://www.google.com/insights/search/#q=%22r%20regression%22%2C%22spss%20regression%22%2C%22sas%20regression%22&cmpt=q
>>>>>>
>>>>>> Some examples are cross-software (learning needs - Example 1); others can
>>>>>> be biased by the traditional use of that software (in SPSS you usually
>>>>>> don't manipulate graphs, I think).
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>>              
>>>>>
>>>>>
>>>>> --
>>>>> Joris Meys
>>>>> Statistical consultant
>>>>>
>>>>> Ghent University
>>>>> Faculty of Bioscience Engineering
>>>>> Department of Applied mathematics, biometrics and process control
>>>>>
>>>>> tel : +32 9 264 59 87
>>>>> Joris.Meys at Ugent.be
>>>>> -------------------------------
>>>>> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>>>>>


