[R-sig-teaching] introducing R to high school students

Grant, Robert Robert.Grant at sgul.kingston.ac.uk
Thu Apr 19 11:27:31 CEST 2012


Dear Chris et al

I would strictly control how much of the session looks at their own datasets. This is incredibly demanding on a teacher and your time will vanish before you know it! First, you need to get them typing some basic code and getting nice graphs out. I would focus on things they can't do in Excel/SPSS, such as controlling options like cex and col with variables. But you could get them to work through some good datasets first and then set them loose on their own stuff, maybe in small groups so they can try to stretch beyond what you show them.
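
As a rough sketch of what "controlling cex and col with variables" can look like in base graphics, here is an example using the built-in mtcars data. The dataset, the three colours, and the size scaling are illustrative assumptions, not part of the original suggestion.

```r
# Sketch: point colour and size driven by variables (base graphics).
# mtcars ships with R; palette and size scaling are arbitrary choices.
cyl_f  <- factor(mtcars$cyl)                      # 4, 6 or 8 cylinders
pal    <- c("steelblue", "darkorange", "firebrick")
pt_col <- pal[as.integer(cyl_f)]                  # one colour per cylinder group
pt_cex <- 0.5 + 2 * mtcars$hp / max(mtcars$hp)    # larger point = more horsepower

plot(mtcars$wt, mtcars$mpg,
     col = pt_col, cex = pt_cex, pch = 19,
     xlab = "Weight (1000 lb)", ylab = "Miles per gallon")
legend("topright", legend = levels(cyl_f), col = pal, pch = 19,
       title = "Cylinders")
```

Students can then swap in their own variables for the colour and size mappings one at a time.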

I think the Titanic passenger list would be ideal for some binary variables. How topical can you get? (http://lib.stat.cmu.edu/S/Harrell/data/descriptions/titanic.html)
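
If downloading the passenger list is awkward, one possible stand-in is the Titanic table that ships with R. Note the assumption: it is a table of counts by class, sex, age and survival, not the passenger-level file linked above. A minimal sketch of a binary-outcome summary:

```r
# Stand-in data: the built-in Titanic table (counts), not the passenger list.
titanic_df <- as.data.frame(Titanic)   # columns: Class, Sex, Age, Survived, Freq

# Cross-tabulate sex against the binary outcome, weighting by the counts.
by_sex    <- xtabs(Freq ~ Sex + Survived, data = titanic_df)
surv_rate <- prop.table(by_sex, margin = 1)    # proportions within each sex
surv_rate

# Stacked bars: one bar per sex, split by survival.
barplot(t(surv_rate), legend.text = TRUE,
        ylab = "Proportion", main = "Survival by sex")
```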

Disclaimer: I'm a university lecturer and probably have typically unrealistic views of high school teaching!

Robert

-----Original Message-----
From: r-sig-teaching-bounces at r-project.org [mailto:r-sig-teaching-bounces at r-project.org] On Behalf Of Randall Pruim
Sent: 19 April 2012 06:03
To: Christopher W Ryan
Cc: R-sig-teaching at r-project.org
Subject: Re: [R-sig-teaching] introducing R to high school students


A few thoughts.  You can do what you want with them.

1) Use R formulas.

If you use lattice graphics, then lm() and plots have essentially the same
syntax and you can make nice connections between the graphs and the
analyses.  For example,

bwplot( weightLoss ~ diet ) or xyplot (weightLoss ~ diet) if the data  
set is small
lm( weightLoss ~ diet )

This approach should give you the "time to get to those topics" since  
the formula interface can be learned through graphical explorations  
first.
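
A runnable version of the same idea, as a sketch: the built-in chickwts data (weight by feed type) stands in for the hypothetical weightLoss/diet variables, which is an assumption made only so the code can be run as is.

```r
library(lattice)   # recommended package, included with standard R installs

# chickwts: weight (numeric) and feed (factor with six levels).
# print() is needed for lattice plots inside scripts or functions.
print(bwplot(weight ~ feed, data = chickwts))   # boxplots by group
print(xyplot(weight ~ feed, data = chickwts))   # raw points by group

# The same formula, now handed to a model-fitting function.
fit <- lm(weight ~ feed, data = chickwts)
coef(fit)
```

The point for students is that only the function name changes between the plot and the model.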

If you add the mosaic package to your arsenal, then you can also do  
numerical summaries this way

mean( weightLoss ~ diet )
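
If installing mosaic is not an option, base R can produce a similar grouped summary with a formula. This is a sketch using the built-in chickwts data as a stand-in; it is a different function from the mosaic one shown above, not a drop-in replacement.

```r
# Base R: group means via the formula interface of aggregate().
# chickwts (built in) stands in for the weightLoss/diet example.
group_means <- aggregate(weight ~ feed, data = chickwts, FUN = mean)
group_means

# With mosaic installed and loaded, the call in the text would take the form:
#   mean(weight ~ feed, data = chickwts)
```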

1a) Some quirks in R you might want to just avoid.

Out of the box, not all of the statistical test functions take a  
formula interface.  binom.test() and chisq.test() require summarized  
data.  t.test() uses a formula for 2-sample tests, but not for
1-sample tests.  If time is short, dealing with this might be more
hassle than it's worth.  You might want to limit yourself to lm() --  
perhaps augmented by glm().  You can do some fun examples in that  
context and see how well they are able to construct models and  
interpret their parameters.
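
To make the inconsistency concrete, here is a sketch of the four calling styles side by side. The built-in datasets (sleep, Titanic, mtcars) and the null values are illustrative assumptions.

```r
# Two-sample t-test: the formula interface works.
two <- t.test(extra ~ group, data = sleep)

# One-sample t-test: pass the numeric vector directly.
one <- t.test(sleep$extra, mu = 0)

# chisq.test(): here given a table of counts (Sex by Survived).
sex_surv <- margin.table(Titanic, c(2, 4))
chi <- chisq.test(sex_surv)

# binom.test(): takes the number of successes and the number of trials.
bin <- binom.test(x = sum(mtcars$am), n = nrow(mtcars), p = 0.5)

two$p.value; one$p.value; chi$p.value; bin$p.value
```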

If the students left your session(s) knowing how to make a handful of  
lattice plots, to create and fit models with lm(), and to interpret  
the resulting model fits, I would call that highly successful.

2) You may or may not find "rectangular data"

People storing data in Excel do all manner of things.  If the students  
do something other than "rectangular data", take some time to talk  
about R's convention and why it is important to know about  
observational units and variables.
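
A small sketch of what the conversion to "rectangular data" can involve. The spreadsheet layout, column names, and values below are hypothetical, made up only for illustration.

```r
# Hypothetical spreadsheet layout: one column per treatment group.
wide <- data.frame(control = c(4.2, 5.1, 3.8),
                   treated = c(6.0, 6.4, 5.7))

# Rectangular form: one row per observational unit, one column per variable.
long <- stack(wide)                    # gives columns "values" and "ind"
names(long) <- c("growth", "group")
long

# Once the data are in this shape, the formula interface applies directly.
lm(growth ~ group, data = long)
```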

3) If you have access to an RStudio server, then you can avoid all
installation and setup issues and students can work in a browser.

In my experience, you never know just what state high school  
technology will be in.

4) Find some good data sets for your examples.

What qualifies is a matter of taste, I suppose, but don't skimp on the  
data. If you have time, and if it is easy to get their data, you could  
do some examples with the students' data.

5) I don't know how much time you will be given, but it will go by too  
quickly.

Prepare lots of cool stuff, but don't rush to use it all.  Better to  
do less and do it well.  Leave them begging for more.

6) Teach a little bit about function syntax.

If your time is limited, you likely won't have much time to get into
programming, control structures, classes of objects, method dispatch,
lazy evaluation, ...  But you can do a lot with R in a
one-line-of-code-at-a-time sort of way.  One thing you do need to say a
bit about is functions, since nearly every one of these lines will
include one or more of an arithmetic computation, an assignment, or a
function call.  When I teach new functions to beginners, I ask them what
things the computer would need to know to produce the result we are
hoping to get.  Once they have identified the inputs and outputs, I tell
them the syntax used to provide the inputs to R and we look at the output
R returns.  I also emphasize the common pattern of function syntax:
name, open paren, comma-separated list of arguments, close paren.
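
A few one-liners that show the three ingredients (arithmetic, assignment, function call) and the name/paren/arguments pattern. The variable names and values are made up for illustration.

```r
# Assignment: name <- value.  c() is itself a function call.
heights <- c(150, 162, 171, 158, NA)

# Function call: name, open paren, comma-separated arguments, close paren.
mean(heights)                        # NA, because one value is missing

# What else does the computer need to know?  How to treat missing values.
avg <- mean(heights, na.rm = TRUE)   # named argument supplies that input
avg

# Arithmetic combined with a function call.
(171 - avg) / sd(heights, na.rm = TRUE)
```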

7) If the students are good and you can react quickly on your feet,  
ask them what they want to learn about and show it to them.

You'll probably have a good sense for their level after the first  
15-20 minutes.

8) Almost forgot... You could do some resampling stuff.

R is well suited for this.  You can simplify it a bit by using do()
from the mosaic package, or you can use replicate().  For example,
using do() with the HELPrct data:

 > lm( age ~ sex, HELPrct )

Call:
lm(formula = age ~ sex, data = HELPrct)

Coefficients:
(Intercept)      sexmale
     36.2523      -0.7841

 > do(5) * lm( age ~ shuffle(sex), HELPrct )
   Intercept     sexmale    sigma    r-squared
1  35.83178 -0.23350980 7.718169 1.658421e-04
2  35.34579  0.40276052 7.716905 4.933763e-04
3  35.69159 -0.04997029 7.718780 7.594658e-06
4  34.62617  1.34493004 7.697547 5.501535e-03
5  35.04673  0.79431149 7.711400 1.918961e-03
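
For comparison, a sketch of the replicate() route using only base R. The built-in ToothGrowth data stands in for HELPrct, and the seed and the 1000 shuffles are arbitrary choices.

```r
set.seed(2012)   # arbitrary seed, only so the run is reproducible

# Observed difference between the two supplement groups (slope on supp).
observed <- unname(coef(lm(len ~ supp, data = ToothGrowth))[2])

# Shuffle the group labels many times and refit; sample() permutes supp.
shuffled <- replicate(1000, {
  unname(coef(lm(len ~ sample(supp), data = ToothGrowth))[2])
})

# Share of shuffles at least as extreme as what was observed.
p_value <- mean(abs(shuffled) >= abs(observed))
p_value

hist(shuffled, main = "Shuffled group differences", xlab = "Coefficient")
abline(v = observed, lwd = 2)   # mark the observed value
```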

Have fun.  Hope it goes well for you.

---rjp





On Apr 18, 2012, at 10:47 PM, Christopher W Ryan wrote:

> After some interesting discussions on r-help list, the suggestion was
> made that I could also probably gain some useful insights on this
> teaching listserve, a resource that I didn't know about previously.
>
> I participate peripherally on a listserve for middle- and high-school
> science teachers. Sometimes questions about graphing or data analysis
> come up. I never miss an opportunity to advocate for R. However, the
> teachers are often skeptical that the students would be able to issue
> commands or write a little code; they think it would be too difficult.
> Perhaps this stems from the Microsoft- and spreadsheet-centered,
> pointy-clicky culture prevalent in most US public schools. Then again,
> I have little experience teaching this age group, besides my own kids
> and my Science Olympiad team, so I respect their concerns.
>
> Now I have to put my money where my mouth is. I've offered to visit a
> high school and introduce R to some fairly advanced students
> participating in a longitudinal 3-year science research class.  To be
> clear, they are already, for good or for ill, doing data analysis and
> graphics for their projects using software.  Mostly they are using
> Excel and SPSS.  My goal would be to introduce them to R as another
> (and better) tool for what they are currently doing. I would have to
> work hard to keep it at a very introductory level, but I don't see why
> plot(force, acceleration) should be any more conceptually difficult
> for high schoolers than clicking through a whole series of dialog
> boxes. The latter merely has the advantage of familiarity. But I can't
> help but wonder whether it would be better to give kids good
> scientific tools upfront, rather than have them spend many
> impressionable years using sub-optimal tools and then in graduate
> school try to entice them to switch.
>
> They all will have datasets of their own.  I imagine they will mostly
> be single, "rectangular" datasets, i.e., data frames.
>
> I tentatively anticipate a lot of graphics, of course, which I'm
> hoping they would find pretty cool and useful. I'd also like to
> introduce the concept of an object, just to the level of "there are
> different kinds, here's what some of the kinds are called, there's
> stuff inside them, and you can explore them with str(), head(),
> tail(), class()" and the like. Some simple descriptive statistics.
> They are already doing t-tests, Chi-squared tests, and linear
> regression (again, for good or for ill.)  I don't know whether I'd
> have time to get to those topics in R, probably not.
>
> There was a diversity of opinions on R-help about how to do this, and
> especially, whether to do it at all.
>
> Has anyone done anything with R in high schools?
>
> Thanks.
>
> --Chris Ryan
> SUNY Upstate Medical University
> Binghamton Clinical Campus
>
> _______________________________________________
> R-sig-teaching at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching



_______________________________________________
R-sig-teaching at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching

This email has been scanned for all viruses by the MessageLabs Email
Security System.



