[R] SQL and R - tangential

avi.e.gross using gmail.com
Wed Dec 11 21:03:05 CET 2024


Actually, tangentially, JC, I have a deep suspicion that many computer
languages are not written to solve problems. They are sometimes an effort by
someone to implement a new paradigm different from what others have tried
before, or to protect programmers from themselves, or to demand such
extensive rigor that the result almost proves itself, and so on.

Only after the language is in place might they look for a problem that
could be solved using it.

Kidding aside, some aspects of R may not have been as useful early on, in
that multi-dimensional arrays are not a common need in many areas of
endeavor. But they have taken on more meaning in areas like AI. Other
languages that did not support such concepts well have often had to add
them.

The problem today in looking at computer languages is how ALIKE they are
becoming, as features keep being grafted on so that each language has
features already found in another.

If you look at the question about SQL and R, you might note that although
base R had lots of functionality, some aspects you see in extensions like
dplyr are attempts to mimic SQL in some way, or to do things differently or
even better. Consider the many clauses allowed in a SQL "select" statement
and map them onto a dplyr query written as a pipeline of what seems like
lots of clauses. Base R before the pipe did not lend itself so easily to
that.
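
To make that clause-by-clause mapping concrete, here is a minimal sketch. It
is in Python rather than R only because the built-in sqlite3 module lets one
self-contained script run real SQL. The table "sales" and its rows are
invented purely for illustration. The second half redoes the query as
separate steps, one per clause, roughly the way dplyr's filter(),
group_by(), summarise() and arrange() would be chained in a pipeline.

```python
import sqlite3

# Hypothetical toy data, invented for illustration only.
rows = [
    ("east", "apples", 10),
    ("east", "pears", 4),
    ("west", "apples", 7),
    ("west", "pears", 12),
    ("north", "apples", 2),
    ("north", "pears", 5),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, item TEXT, qty INTEGER)")
con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# One SQL statement carrying several clauses at once.
sql_result = con.execute(
    """
    SELECT region, SUM(qty) AS total
    FROM sales
    WHERE qty >= 4
    GROUP BY region
    HAVING SUM(qty) > 10
    ORDER BY total DESC
    """
).fetchall()
con.close()

# The same logic as a sequence of steps, one per clause.
# WHERE: keep only rows that pass the row-level test (dplyr: filter).
kept = [r for r in rows if r[2] >= 4]

# GROUP BY + SUM: one total per region (dplyr: group_by + summarise).
totals = {}
for region, _item, qty in kept:
    totals[region] = totals.get(region, 0) + qty

# HAVING: a second filter, applied to the grouped totals (dplyr: filter).
big = [(region, total) for region, total in totals.items() if total > 10]

# ORDER BY ... DESC: sort the surviving groups (dplyr: arrange).
pipeline_result = sorted(big, key=lambda pair: pair[1], reverse=True)

print(sql_result)
print(pipeline_result)
```

Both print statements show the same two regions in the same order. The SQL
version states everything in one declarative statement, while the stepwise
version reads top to bottom in execution order, which is much of what a
pipeline buys you.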


-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of J C Nash
Sent: Wednesday, December 11, 2024 11:58 AM
To: r-help using r-project.org
Subject: Re: [R] SQL and R - tangential

My late friend Morven Gentleman, not long after he stepped down from being
chair of Computer Science at Waterloo, said that it seemed computer
scientists had to create a new computer language for every new problem they
encountered.

If we could use least squares to measure this approximation, we'd likely be
suspicious of a terribly small error measure or overly high R^2.

JN

On 2024-12-11 11:11, avi.e.gross using gmail.com wrote:
> Akshay,
> 
> Your question has way too many answers.
> 
> SQL has a long history and early versions came long before R arrived on
> the scene. There is a huge embedded base of hardware and software
> dedicated to managing databases. It has some features that most R
> programs do not even dream of doing. Besides easily handling massive
> amounts of data or sometimes tweaking queries to possibly run more
> efficiently, there are all kinds of issues of how to manage multiple
> people accessing and changing the data at about the same time, or rolling
> the data back to an earlier checkpoint.
> 
> R came along later and, as Ben pointed out, adds all kinds of things SQL
> does not have and likely does not need, or alternate ways to do things.
> 
> For many people now, the workload is to use a programming language (and R
> is not the only one used) that has been enhanced with packages or modules
> that allow access in a fairly general way to one or many databases running
> various versions of SQL. The programmer uses this API in many ways.
> 
> In some ways, it is just a way to tell the database what to do without
> much other processing. You can ask to open a connection to the server, do
> a query that gets translated to SQL (or you can provide the actual SQL)
> and let the remote (or local) machine do much of the work. For example,
> imagine a database with terabytes of data and all you want is a few
> rows/columns that meet your query. In R, you might have to open a
> collection of huge CSV files and fill more memory than you have and do the
> query somehow. If the data is remote, we are talking about a huge
> receiving of data. Using SQL divides the work so you do parts here and
> parts there.
> 
> Why use a local MySQL? Part of the answer is that you have a fairly
> optimized and debugged system that does it well and lets the programmer
> focus on the parts they need to add within R, like complex analyses. Part
> is portability, as you can later move the data outside your machine and,
> with minor changes, your program should still work. And there are many
> other scenarios, such as wanting to gather data from different sources:
> connecting to multiple remote databases, getting filtered data, doing an
> analysis across that data and perhaps updating them.
> 
> R used in ways like this provides lots of flexibility. But part of the
> question is like asking why there are a hundred programming languages
> still in use out there. Why do we need so many? In short, we don't
> necessarily need all or even most of them but they are there because
> various people developed them and used them and it is not trivial to get
> people to switch and maybe abandon all the older software or try to
> rewrite it.
> 
> Having said that, I think a large fraction of R users have never had any
> particular reason to learn SQL. Many have never used it directly or even
> indirectly. I know someone I have programmed for who calls on some expert
> to do a SQL query and save the results in CSV files and then works
> directly in R on those files. I have pointed out to them that their life
> could be even easier if they got a more focused dump of the SQL data, with
> some of the added processing done in SQL and then a smaller amount of data
> coming into the R side.
> 
> I also note that languages like R and Python can have parts that run
> fairly slowly. Arguably, most versions of SQL have been tuned over
> decades ...
> 
> 
> -----Original Message-----
> From: R-help <r-help-bounces using r-project.org> On Behalf Of akshay kulkarni
> Sent: Wednesday, December 11, 2024 8:17 AM
> To: R help Mailing list <r-help using r-project.org>
> Subject: [R] SQL and R
> 
> Dear Members,
> I have recently started studying SQL and MySQL. My question is, what
> exactly is SQL used for? That is, whatever can be done by SQL, like
> subsetting and filtering of data sets, can also be done by R. What, then,
> is the advantage of SQL? It is OK if you tag this question as off-topic,
> but I couldn't find any info on the web. Can you please refer me to some
> online resources that shed some light on this? Finally, how does SQL
> complement R? Are both dependent?
> 
> Thanking you,
> Yours sincerely,
> AKSHAY M KULKARNI
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> https://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
