[R] SQL and R
@vi@e@gross m@iii@g oii gm@ii@com
Wed Dec 11 17:11:27 CET 2024
Akshay,
Your question has way too many answers.
SQL has a long history, and early versions came long before R arrived on the
scene. There is a huge installed base of hardware and software dedicated to
managing databases. It has some features that most R programs do not even
dream of having. Besides easily handling massive amounts of data, or sometimes
tweaking queries to run more efficiently, there are all kinds of
issues around managing multiple people accessing and changing the data at
about the same time, or rolling the data back to an earlier checkpoint.
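
To make the point about rolling data back concrete, here is a minimal sketch of a transaction being undone. It uses Python's built-in sqlite3 module purely as a convenient stand-in for a database server; the `accounts` table, the names, and the amounts are invented for illustration, and a multi-user server such as MySQL handles far more than this shows.

```python
import sqlite3

# In-memory database; the table and values are hypothetical.
# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries below are the explicit BEGIN / COMMIT / ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")

# Try to move more money than alice has, as one all-or-nothing unit.
conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = balance - 500 WHERE name = 'alice'")
conn.execute("UPDATE accounts SET balance = balance + 500 WHERE name = 'bob'")

overdrawn = conn.execute(
    "SELECT COUNT(*) FROM accounts WHERE balance < 0"
).fetchone()[0]

if overdrawn:
    conn.execute("ROLLBACK")  # both updates are undone together
else:
    conn.execute("COMMIT")

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the rollback, `balances` holds the original values, as if neither update had been attempted.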
R came along later and, as Ben pointed out, adds all kinds of things SQL
does not have and likely does not need, or alternate ways to do things.
For many people now, the workflow is to use a programming language (and R is
not the only one used) that has been enhanced with packages or modules that
allow access, in a fairly general way, to one or many databases running
various versions of SQL. The programmer uses this API in many ways.
In some ways, it is just a way to tell the database what to do without much
other processing. You can ask to open a connection to the server, do a query
that gets translated to SQL (or you can provide the actual SQL) and let the
remote (or local) machine do much of the work. For example, imagine a
database with terabytes of data and all you want is a few rows/columns that
meet your query. In R alone, you might have to open a collection of huge CSV
files, fill more memory than you have, and do the query somehow. If the data is
remote, we are talking about transferring a huge amount of data. Using SQL
divides the work so you do parts here and parts there.
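
The division of labor described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database standing in for a large remote server, and the `readings` table and its contents are made up. The pattern (open a connection, send a query, receive only the matching rows) is the same one the database packages for R follow.

```python
import sqlite3

# An in-memory database stands in for a large remote one (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (station TEXT, year INTEGER, temp REAL)")

# Hypothetical data: 100 stations, 100 readings each.
rows = [
    (f"S{i % 100}", 2000 + (i // 100) % 25, float(i % 40))
    for i in range(10000)
]
conn.executemany("INSERT INTO readings VALUES (?, ?, ?)", rows)

# The filtering happens inside the database engine. Only the rows and
# columns that match are handed back to the calling program.
query = """
    SELECT year, temp
    FROM readings
    WHERE station = ? AND year >= ?
    ORDER BY year
"""
wanted = conn.execute(query, ("S7", 2020)).fetchall()
```

Here `wanted` is a small list of `(year, temp)` tuples, while the other rows never leave the database. The `?` placeholders let the driver pass the values safely rather than pasting them into the SQL text.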
Why use a local MySQL? Part of the answer is that you have a fairly
optimized and debugged system that does it well and lets the programmer
focus on the parts they need to add within R like complex analyses. Part is
portability, as you can later move the data outside your machine and with
minor changes, your program should still work. And, there are many other
scenarios such as wanting to gather data from different sources such as
connecting to multiple remote databases and getting filtered data and doing
an analysis across that data and perhaps updating them.
R used in ways like this provides lots of flexibility. But part of the
question is like asking why there are a hundred programming languages still
in use out there. Why do we need so many? In short, we don't necessarily
need all or even most of them but they are there because various people
developed them and used them and it is not trivial to get people to switch
and maybe abandon all the older software or try to rewrite it.
Having said that, I think a large fraction of R users have never had any
particular reason to learn SQL. Many have never used it directly or even
indirectly. I know someone I have programmed for who asks an expert
to run a SQL query and save the results in CSV files, and then works directly
in R on those files. I have pointed out to them that their life could be
even easier if they got a more focused dump of the SQL data, with some of the
added processing done in SQL, so that a smaller amount of data comes into
the R side.
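
As a rough sketch of what "some of the added processing done in SQL" could look like, the example below lets the database compute per-group summaries so that only one row per group reaches the analysis side. It uses Python's built-in sqlite3 module for convenience; the `sales` table and its values are hypothetical.

```python
import sqlite3

# Hypothetical table of individual sales records.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("north", 10.0),
        ("north", 15.0),
        ("south", 7.5),
        ("south", 2.5),
        ("west", 30.0),
    ],
)

# The database groups and sums; the program receives one row per region
# instead of every individual record.
summary = conn.execute(
    """
    SELECT region, COUNT(*) AS n, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY region
    """
).fetchall()
```

With millions of records and a handful of regions, the difference between exporting everything to CSV and exporting `summary` is the difference between a huge file and a few lines.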
I also note that languages like R and Python can have parts that run fairly
slowly. Arguably, most versions of SQL have been tuned over decades ...
-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of akshay kulkarni
Sent: Wednesday, December 11, 2024 8:17 AM
To: R help Mailing list <r-help using r-project.org>
Subject: [R] SQL and R
Dear Members,
I have recently started studying SQL and MySQL.
My question is, what exactly is SQL used for? That is, whatever can be done
by SQL, like subsetting and filtering of data sets, can also be done by R.
What's, then, the advantage of SQL? It is OK if you tag this question as
off-topic, but I couldn't find any info on the web. Can you please refer me
to some online resources that shed some light on this? Finally, how does SQL
complement R? Are both dependent?
Thanking you,
Yours sincerely,
AKSHAY M KULKARNI