[R-sig-teaching] introducing R to high school students

Randy Johnson randy.n.julie at gmail.com
Fri Apr 20 19:08:37 CEST 2012


I work with high school students, and while they are certainly capable, your time constraints and the class size may be the hardest things to work around. I would second Robert's advice on starting with a clean dataset and a short introductory tutorial.

It would also help a lot to have an additional assistant or two who are familiar enough with R to answer their questions as they begin to play on their own.

Best,
Randy

On Apr 19, 2012, at 5:27 AM, Grant, Robert wrote:

> Dear Chris et al
> 
> I would strictly control how much of the session looks at their own datasets. This is incredibly demanding on a teacher and your time will vanish before you know it! First, you need to get them typing some basic code and getting nice graphs out. I would focus on things they can't do in Excel/SPSS, such as controlling options like cex and col with variables. But you could get them to work through some good datasets first and then set them loose on their own stuff, maybe in small groups so they can try to stretch beyond what you show them.
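
As a rough illustration of driving col and cex from variables, here is a minimal sketch in base graphics. It uses the built-in mtcars data as a stand-in; the colour names, the size scaling, and the variable names are my own assumptions, not anything from Robert's session.

```r
# Map point colour to cylinder count and point size to horsepower (mtcars is built in).
cyl_levels   <- sort(unique(mtcars$cyl))                 # 4, 6, 8
palette_cols <- c("steelblue", "darkorange", "firebrick") # one colour per level (arbitrary choice)
point_col    <- palette_cols[match(mtcars$cyl, cyl_levels)]

# Rescale horsepower to a cex between 0.5 and 2.5 (arbitrary range).
point_cex <- 0.5 + 2 * (mtcars$hp - min(mtcars$hp)) / diff(range(mtcars$hp))

plot(mtcars$wt, mtcars$mpg, col = point_col, cex = point_cex, pch = 19,
     xlab = "Weight (1000 lb)", ylab = "Miles per gallon")
legend("topright", legend = cyl_levels, col = palette_cols, pch = 19,
       title = "Cylinders")
```

The point for students is that col and cex accept a whole vector, one value per observation, so a third and fourth variable can be shown on a two-dimensional plot.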
> 
> I think the Titanic passenger list would be ideal for some binary variables. How topical can you get? (http://lib.stat.cmu.edu/S/Harrell/data/descriptions/titanic.html)
> 
> Disclaimer: I'm a university lecturer and probably have typically unrealistic views of high school teaching!
> 
> Robert
> 
> -----Original Message-----
> From: r-sig-teaching-bounces at r-project.org [mailto:r-sig-teaching-bounces at r-project.org] On Behalf Of Randall Pruim
> Sent: 19 April 2012 06:03
> To: Christopher W Ryan
> Cc: R-sig-teaching at r-project.org
> Subject: Re: [R-sig-teaching] introducing R to high school students
> 
> 
> A few thoughts.  You can do what you want with them.
> 
> 1) Use R formulas.
> 
> If you use lattice graphics, then lm() and the plots have essentially the
> same syntax and you can make nice connections between the graphs and the
> analyses.  For example,
> 
> bwplot( weightLoss ~ diet )   # or xyplot( weightLoss ~ diet ) if the data set is small
> lm( weightLoss ~ diet )
> 
> This approach should give you the "time to get to those topics" since  
> the formula interface can be learned through graphical explorations  
> first.
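
A runnable version of the same idea, as a sketch only: weightLoss and diet above are placeholder names, so this swaps in the built-in chickwts data (chick weight by feed type). The lattice package ships with standard R installations.

```r
library(lattice)

# One formula, three uses: two plots and a model fit.
print(bwplot(weight ~ feed, data = chickwts))  # box-and-whisker plot per feed type
print(xyplot(weight ~ feed, data = chickwts))  # the raw points per feed type

fit <- lm(weight ~ feed, data = chickwts)      # same formula, now as a model
coef(fit)                                      # intercept plus one offset per remaining feed type
```

The print() calls are only needed when the code is run from a script; at the console the plots appear without them.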
> 
> If you add the mosaic package to your arsenal, then you can also do  
> numerical summaries this way
> 
> mean( weightLoss ~ diet )
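
If installing mosaic is not an option, base R's aggregate() accepts the same kind of formula. A small sketch, again using the built-in chickwts data as a stand-in for the weightLoss/diet example:

```r
# Group means via the formula interface in base R (no extra packages).
group_means <- aggregate(weight ~ feed, data = chickwts, FUN = mean)
group_means
```

The result is a data frame with one row per feed type, which students can inspect like any other data frame.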
> 
> 1a) Some quirks in R you might want to just avoid.
> 
> Out of the box, not all of the statistical test functions take a
> formula interface.  binom.test() and chisq.test() do not; they are
> typically given summarized data (counts or a table).  t.test() uses a
> formula for 2-sample tests, but not for 1-sample tests.  If time is
> short, dealing with this might be more hassle than it's worth.  You
> might want to limit yourself to lm() --
> perhaps augmented by glm().  You can do some fun examples in that  
> context and see how well they are able to construct models and  
> interpret their parameters.
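
To make the inconsistency concrete, here is a short sketch of the different calling styles, using built-in data sets (ToothGrowth, UCBAdmissions). The null values 18 and 0.5 and the counts passed to binom.test() are arbitrary illustrations.

```r
# Two-sample t-test: the formula interface works.
two_sample <- t.test(len ~ supp, data = ToothGrowth)

# One-sample t-test: pass the numeric vector itself (mu = 18 is an arbitrary null value).
one_sample <- t.test(ToothGrowth$len, mu = 18)

# Binomial test: takes counts, not raw data (7 successes in 10 trials, made up).
binom <- binom.test(x = 7, n = 10, p = 0.5)

# Chi-squared test: takes a table of counts.
admit_by_gender <- margin.table(UCBAdmissions, c(1, 2))  # collapse over department
chi <- chisq.test(admit_by_gender)

# The lm() route keeps a single formula syntax throughout.
fit <- lm(len ~ supp, data = ToothGrowth)
```

Four tests, three calling conventions: this is the overhead that sticking to lm() avoids in a short session.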
> 
> If the students left your session(s) knowing how to make a handful of  
> lattice plots, to create and fit models with lm(), and to interpret  
> the resulting model fits, I would call that highly successful.
> 
> 2) You may or may not find "rectangular data"
> 
> People storing data in Excel do all manner of things.  If the students  
> do something other than "rectangular data", take some time to talk  
> about R's convention and why it is important to know about  
> observational units and variables.
> 
> 3) If you have access to an RStudio server, then you can avoid all
> installation and setup issues and students can work in a browser.
> 
> In my experience, you never know just what state high school  
> technology will be in.
> 
> 4) Find some good data sets for your examples.
> 
> What qualifies is a matter of taste, I suppose, but don't skimp on the  
> data. If you have time, and if it is easy to get their data, you could  
> do some examples with the students' data.
> 
> 5) I don't know how much time you will be given, but it will go by too  
> quickly.
> 
> Prepare lots of cool stuff, but don't rush to use it all.  Better to  
> do less and do it well.  Leave them begging for more.
> 
> 6) Teach a little bit about function syntax.
> 
> If your time is limited, you likely won't have much time to get into
> programming, control structures, classes of objects, method dispatch,
> lazy evaluation, ...  But you can do a lot with R in a
> one-line-of-code-at-a-time sort of way.  One thing you do need to say
> a bit about is functions, since nearly every one of these lines will
> include one or more of an arithmetic computation, an assignment, or a
> function call.  When I teach new functions to beginners, I ask them
> what things the computer would need to know to produce the result we
> are hoping to get.  Once they have identified the inputs and outputs,
> I tell them the syntax used to provide the inputs to R and we look at
> the output R returns.  I also emphasize the common pattern of function
> syntax -- name, open paren, comma-separated list of arguments, close
> paren.
> 
> 7) If the students are good and you can react quickly on your feet,  
> ask them what they want to learn about and show it to them.
> 
> You'll probably have a good sense for their level after the first  
> 15-20 minutes.
> 
> 8) Almost forgot... You could do some resampling stuff.
> 
> R is well suited for this.  You can simplify it a bit by using do()  
> from the mosaic package, or you can use replicate()
> 
>> lm( age ~ sex, HELPrct )
> 
> Call:
> lm(formula = age ~ sex, data = HELPrct)
> 
> Coefficients:
> (Intercept)      sexmale
>     36.2523      -0.7841
> 
>> do(5) * lm( age ~ shuffle(sex), HELPrct )
>   Intercept     sexmale    sigma    r-squared
> 1  35.83178 -0.23350980 7.718169 1.658421e-04
> 2  35.34579  0.40276052 7.716905 4.933763e-04
> 3  35.69159 -0.04997029 7.718780 7.594658e-06
> 4  34.62617  1.34493004 7.697547 5.501535e-03
> 5  35.04673  0.79431149 7.711400 1.918961e-03
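
For the replicate() route mentioned above, a minimal base-R sketch of the same shuffling idea. HELPrct comes with mosaic, so this assumes the built-in ToothGrowth data instead; the seed and the 1000 repetitions are arbitrary.

```r
set.seed(2012)  # arbitrary seed, only so the run is reproducible

# Observed group difference: the slope for supplement type.
observed <- unname(coef(lm(len ~ supp, data = ToothGrowth))[2])

# Shuffle the group labels, refit, and keep the slope; repeat 1000 times.
shuffled <- replicate(1000, {
  unname(coef(lm(len ~ sample(supp), data = ToothGrowth))[2])
})

# Proportion of shuffled slopes at least as extreme as the observed one.
p_value <- mean(abs(shuffled) >= abs(observed))
p_value
```

sample(supp) plays the role of shuffle(sex) in the mosaic version: it breaks any real link between group and response, so the shuffled slopes show what chance alone produces.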
> 
> Have fun.  Hope it goes well for you.
> 
> ---rjp
> 
> 
> 
> 
> 
> On Apr 18, 2012, at 10:47 PM, Christopher W Ryan wrote:
> 
>> After some interesting discussions on r-help list, the suggestion was
>> made that I could also probably gain some useful insights on this
>> teaching listserve, a resource that I didn't know about previously.
>> 
>> I participate peripherally on a listserve for middle- and high-school
>> science teachers. Sometimes questions about graphing or data analysis
>> come up. I never miss an opportunity to advocate for R. However, the
>> teachers are often skeptical that the students would be able to issue
>> commands or write a little code; they think it would be too difficult.
>> Perhaps this stems from the Microsoft- and spreadsheet-centered,
>> pointy-clicky culture prevalent in most US public schools. Then again,
>> I have little experience teaching this age group, besides my own kids
>> and my Science Olympiad team, so I respect their concerns.
>> 
>> Now I have to put my money where my mouth is. I've offered to visit a
>> high school and introduce R to some fairly advanced students
>> participating in a longitudinal 3-year science research class.  To be
>> clear, they are already, for good or for ill, doing data analysis and
>> graphics for their projects using software.  Mostly they are using
>> Excel and SPSS.  My goal would be to introduce them to R as another
>> (and better) tool for what they are currently doing. I would have to
>> work hard to keep it at a very introductory level, but I don't see why
>> plot(force, acceleration) should be any more conceptually difficult
>> for high schoolers than clicking through a whole series of dialog
>> boxes. The latter merely has the advantage of familiarity. But I can't
>> help but wonder whether it would be better to give kids good
>> scientific tools upfront, rather than have them spend many
>> impressionable years using sub-optimal tools and then in graduate
>> school try to entice them to switch.
>> 
>> They all will have datasets of their own.  I imagine they will mostly
>> be single, "rectangular" datasets, i.e., data frames.
>> 
>> I tentatively anticipate a lot of graphics, of course, which I'm
>> hoping they would find pretty cool and useful. I'd also like to
>> introduce the concept of an object, just to the level of "there are
>> different kinds, here's what some of the kinds are called, there's
>> stuff inside them, and you can explore them with str(), head(),
>> tail(), class()" and the like. Some simple descriptive statistics.
>> They are already doing t-tests, Chi-squared tests, and linear
>> regression (again, for good or for ill.)  I don't know whether I'd
>> have time to get to those topics in R, probably not.
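
A first look at objects along those lines might be as short as the following sketch, which assumes the built-in iris data frame rather than a student's own data:

```r
class(iris)                 # what kind of object is this? "data.frame"
str(iris)                   # what is inside it: one line per variable
head(iris, 3)               # first three rows
tail(iris, 3)               # last three rows
class(iris$Species)         # a column is an object too: "factor"
summary(iris$Sepal.Length)  # simple descriptive statistics for one variable
```

Each line is one function call on one object, which keeps the focus on "there is stuff inside, and here is how to look at it".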
>> 
>> There was a diversity of opinions on R-help about how to do this, and
>> especially, whether to do it at all.
>> 
>> Has anyone done anything with R in high schools?
>> 
>> Thanks.
>> 
>> --Chris Ryan
>> SUNY Upstate Medical University
>> Binghamton Clinical Campus
>> 
>> _______________________________________________
>> R-sig-teaching at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
> 
> This email has been scanned for all viruses by the MessageLabs Email
> Security System.




