[R] Matrices

Mon Aug 10 19:54:22 CEST 2009

On Mon, 2009-08-10 at 13:33 -0400, mmv.listservs wrote:
> Gavin and Stefan,
> 
> Both the subset commands and the flag were exactly what I needed. On another
> note, I'm dealing with variables that are categorical and have long names
> like "Task XYZ", "Task ABC", "Task CCC".
> 
> When I try to plot against the probability it doesn't show me the Task name
> anymore. How can I map a number back to the actual task name so that I can
> understand my plots?

In what sense does it not "...show me [you] the Task name anymore"? Do
you mean it is cut off because the margin in the plot is not big enough?
If so, increase the size of the margin. First, determine which margin it
is and note its number:

bottom == 1, left == 2, top == 3, right == 4.

The default margins are in:

par("mar")

and are usually c(5,4,4,2) + 0.1, measured in lines of text. The numbers
are in the order I gave above. So if the Task XXX labels are on the left
side of the plot region (in the left margin), then you increase the left
margin:

## extreme left margin
op <- par(mar = c(5,10,4,2) + 0.1)
## do your plotting here
## reset to default "mar"
par(op)
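
If instead the axis shows integer codes where the task names should be,
the names are not lost: they are stored as the levels of the factor. A
minimal sketch, assuming the tasks are held in a factor (the 'task' and
'prob' vectors below are made up for illustration, not taken from your
data):

```r
## hypothetical data: a factor of task names and a made-up probability
task <- factor(c("Task XYZ", "Task ABC", "Task CCC", "Task ABC"))
prob <- c(0.2, 0.5, 0.1, 0.7)

## the integer codes R uses internally for the factor
codes <- as.integer(task)

## map the codes back to the names by indexing the levels
task_names <- levels(task)[codes]

## plot against the codes, but label the axis with the level names
op <- par(mar = c(5, 8, 4, 2) + 0.1)
plot(prob, codes, yaxt = "n", xlab = "Probability", ylab = "")
axis(2, at = seq_along(levels(task)), labels = levels(task), las = 1)
par(op)
```

Indexing levels(task) with the integer codes is the general way to get
from a number back to its label.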

Please read the posting guide and write posts to the list that explain
exactly what the problem is, with REPRODUCIBLE code. If my guess at your
problem is the correct one, here is one way of showing what the problem
is. Note I don't know what plots you were doing, so I'm making that bit
up, but here is a reproducible example, the sort of thing we want:

## dummy data
set.seed(123)
dummy <- data.frame(A = sample(paste("really long labels", 1:5), 100,
                               replace = TRUE),
                    B = rnorm(100))
str(dummy)
head(dummy)
## plot it
boxplot(B ~ A, data = dummy, horizontal = TRUE, las = 1)
## Hmm labels are truncated on left margin. How do I stop that?
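
One way to complete the example, assuming the truncated labels sit in
the left margin: widen that margin before plotting and restore the old
setting afterwards. The value 10 is a guess that happens to suit these
dummy labels, not a general rule.

```r
## same dummy data as above, repeated so this runs on its own
set.seed(123)
dummy <- data.frame(A = sample(paste("really long labels", 1:5), 100,
                               replace = TRUE),
                    B = rnorm(100))

## widen the left margin (second element of "mar"), saving the old value
op <- par(mar = c(5, 10, 4, 2) + 0.1)
## record the margin in effect while plotting
mar_used <- par("mar")
boxplot(B ~ A, data = dummy, horizontal = TRUE, las = 1)
## reset to the previous "mar"
par(op)
```

If the labels are still cut off, increase the second number further.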

HTH

G

> 
> Another problem is I have hundreds of tasks so it is virtually impossible to
> show the task names on a plot vs. the probability.
> 
> -Melanie
> 
> 
> 
> On Mon, Aug 10, 2009 at 1:18 PM, Gavin Simpson <gavin.simpson at ucl.ac.uk> wrote:
> 
> > On Mon, 2009-08-10 at 11:17 -0400, mmv.listservs wrote:
> > > yy<-poisson2[poisson2$Reboot.Id=="Reboot 2",
> > > poisson2$Task.Status=="F",,drop=FALSE]
> >
> > The above doesn't make any sense and can't be working or doing what you
> > think it is doing.
> >
> > Let's dissect this command:
> >
> > yy <- poisson2[poisson2$Reboot.Id=="Reboot 2",
> >                ^^^ so this bit is a flag as to whether we include
> >                certain rows
> >
> >                poisson2$Task.Status=="F", , drop=FALSE]
> >                ^^^ now this bit is saying include columns based on
> >                whether or not each of your 10000 Task.Status
> >                entries == "F"
> >
> > That doesn't make sense. If you want to combine the two clauses, so that
> > we return only rows where Reboot.Id=="Reboot 2" *and* Task.Status=="F"
> > are both TRUE, then you need to use the & operator, e.g.
> >
> > Reboot.Id=="Reboot 2" & Task.Status=="F"
> >
> > This should work:
> >
> > flag <- with(poisson2, Reboot.Id=="Reboot 2" & Task.Status=="F")
> > yy <- poisson2[flag, , drop = FALSE]
> >                     ^ the blank here means all columns.
> >
> > As a concrete example as you didn't provide us with the means of
> > replicating your problem (while you are reading some introductory
> > material on subsetting, also read the Posting Guide to see how to help
> > *us* help *you*), we use some dummy data, 3 variables and subset
> > conditional upon values of two of them, but return all three columns for
> > the result.
> >
> > ## first set the random seed so we get the same results
> > set.seed(123)
> > ## now produce some dummy data
> > dummy <- data.frame(A = sample(LETTERS[1:4], 100, replace = TRUE),
> >                    B = sample(c("T","F"), 100, replace = TRUE),
> >                    C = rnorm(100))
> > ## view first few rows
> > head(dummy)
> > ## Let's see which LETTERS we have
> > with(dummy, table(A))
> > ## Produce table of A vs B
> > with(dummy, table(A, B))
> > ## As example, select rows of 'dummy' where:
> > ## A == "D" *and* B == "F"
> > ## which, from table above, should contain 13 rows
> > flag <- with(dummy, A == "D" & B == "F")
> > want <- dummy[flag,]
> > want
> > ## notice we get column C as well, because we don't specify which
> > ## columns to return...
> > ## how many rows? Is this what we expected?
> > nrow(want)
> >
> > Does this help?
> >
> > The problem with your first posting is that you forgot the trailing
> > comma:
> >
> > all_column_attributes_for_reboot_1 <-
> >     poisson2[poisson2$Reboot.Id=="Reboot 1"]
> >                                            ^ needs a , here, before the ]
> >
> > 1) choose simpler names - you'll save yourself some RSI not having to
> > type them in
> > 2) the command should look something like this:
> >
> > res <- poisson2[poisson2$Reboot.Id=="Reboot 1", ]
> >
> > So now res will contain all rows of poisson2 where Reboot.Id ==
> > "Reboot 1", with all column attributes. To stop R dropping empty
> > dimensions, we might wish to extend this to:
> >
> > res <- poisson2[poisson2$Reboot.Id=="Reboot 1", , drop = FALSE]
> > ## Note the empty column indicator ------------^
> >
> > G
> >
> > >
> > > doesn't work either? Any other ideas?
> > >
> > > On Mon, Aug 10, 2009 at 11:01 AM, mmv.listservs <mmv.listservs at gmail.com> wrote:
> > >
> > > > How do you access all the column attributes associated with a column
> > > > reboot instance?
> > > >
> > > > The variables
> > > >
> > > > poisson2 ~ a matrix with 10,000 rows and 8 column attributes.
> > > >
> > > > Things I tried:
> > > >
> > > >
> > > > This command only returns a vector for one of the column attributes
> > > > x1_prob <- poisson2$Probability[poisson2$Reboot.Id=="Reboot 1"]
> > > >
> > > > The command below gave an error:
> > > > all_column_attributes_for_reboot_1 <-
> > > > poisson2[poisson2$Reboot.Id=="Reboot 1"]
> > > >
> > > > Thank you,
> > > >
> > > >
> > >
> > >       [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> >  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> >  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> >  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> >  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> >
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%