[R] Oddity: I seem to have a variable in a dataframe that doesn't show in colnames() - can anyone advise?

Chris Evans chrishold at psyctc.org
Sun May 29 16:52:42 CEST 2011


I may be being dopey, I surely am, but I'm baffled by this.  I've been
working, on and off, for a few days in R version 2.13.0 (2011-04-13)
i386-pc-mingw32/i386 (32-bit), running it through ESS.

I've got a dataframe created a couple of days back, during the session:
> dim(AllDat)
[1] 27270    94

I came back this morning and misremembered my variables and thought I
had a variable AllDat$PHQ and started using it and everything seemed
fine until I realised that I shouldn't have it (!) and that the variable
I was thinking of is AllDat$PHQ9 and that's there:
> colnames(AllDat)[grep("PHQ",colnames(AllDat))]
[1] "PHQ9"    "HasPHQ"  "ZeroPHQ"

and, as you can see, no AllDat$PHQ.  But I can do:

> head(table(AllDat$PHQ))
  0   1   2   3   4   5
731 527 764 845 872 915

Ooops ... so AllDat$PHQ _DOES_ exist.  Its contents exactly match
AllDat$PHQ9:
> table(abs(AllDat$PHQ - AllDat$PHQ9))
    0
19032
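
One diagnostic that might help narrow this down is to compare `$` with
`[[` on a small stand-in data frame. The sketch below assumes only base R;
the data frame `toy` and its values are made up for illustration, with
column names copied from the grep output above. It relies on the
documented behaviour (see ?Extract) that `$` allows partial matching of
names, whereas `[[` matches exactly by default.

```r
# Toy data frame: column names mirror those listed above, while the
# values are invented purely for illustration.
toy <- data.frame(PHQ9    = c(0, 3, 5),
                  HasPHQ  = c(TRUE, TRUE, FALSE),
                  ZeroPHQ = c(TRUE, FALSE, FALSE))

# Is there a column literally called "PHQ"?
has_phq <- "PHQ" %in% colnames(toy)

# `$` permits partial (prefix) matching of names; "PHQ" is a prefix of
# exactly one column name here, "PHQ9".
partial <- toy$PHQ

# `[[` uses exact matching by default, so an absent name gives NULL.
exact <- toy[["PHQ"]]

print(has_phq)
print(identical(partial, toy$PHQ9))
print(is.null(exact))
```

Running the same three checks against AllDat itself would show whether
AllDat$PHQ is a real column or just a partial match on PHQ9.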

I have searched through my ESS transcript back to the start of the
session and I can't see anywhere I've assigned to AllDat$PHQ (and I've
never used "attach").

However, I guess that somehow I must have managed to duplicate AllDat in
more than one open environment, so I checked and I have 16 environments
(I'm sure that's not the right terminology, apologies):
> search()
 [1] ".GlobalEnv"        "package:reshape2"
 [3] "package:Hmisc"     "package:survival"
 [5] "package:splines"   "package:nnet"
 [7] "package:MASS"      "package:gdata"
 [9] "package:stats"     "package:graphics"
[11] "package:grDevices" "package:utils"
[13] "package:datasets"  "package:methods"
[15] "Autoloads"         "package:base"

So I try:
> for (i in 1:16) { print(paste("i =",i,exists("AllDat",i,inherits = FALSE))) }
[1] "i = 1 TRUE"
[1] "i = 2 FALSE"
[1] "i = 3 FALSE"
[1] "i = 4 FALSE"
[1] "i = 5 FALSE"
[1] "i = 6 FALSE"
[1] "i = 7 FALSE"
[1] "i = 8 FALSE"
[1] "i = 9 FALSE"
[1] "i = 10 FALSE"
[1] "i = 11 FALSE"
[1] "i = 12 FALSE"
[1] "i = 13 FALSE"
[1] "i = 14 FALSE"
[1] "i = 15 FALSE"
[1] "i = 16 FALSE"

So I don't think I do have two different AllDat dataframes.
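
The same search-path check can be sketched without hard-coding the 16,
so that it still works if packages are loaded or detached. This is only
an illustration in base R: the dummy AllDat created here stands in for
the real data frame, and the names `path` and `found` are made up.

```r
# Dummy object standing in for the real data frame (hypothetical contents),
# placed explicitly in the global environment.
assign("AllDat", data.frame(PHQ9 = 1:3), envir = globalenv())

# For each position on the search path, ask whether an object called
# "AllDat" is defined directly there; inherits = FALSE stops the lookup
# from moving on to enclosing environments.
path  <- search()
found <- vapply(seq_along(path),
                function(i) exists("AllDat", where = i, inherits = FALSE),
                logical(1))
names(found) <- path

# Search-path entries that hold their own AllDat
print(path[found])
```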

Can anyone throw light on what's going on?  I have searched archives
etc. but can't think of sensible keywords and so far turned up nothing.
 Happy to be told RTFM or the equivalent but could someone point me to a
specific location?  Also happy to try any diagnostics anyone recommends.

Many thanks in advance,

Chris

-- 
Chris Evans <chris at psyctc.org> Skype: chris-psyctc
Consultant Psychiatrist in Psychotherapy, Notts. PDD network;
Professor, Psychotherapy, Nottingham University
*If I am writing from one of those roles, it will be clear. Otherwise*
*my views are my own and not representative of those institutions    *
If you have difficulty Emailing me on this address or getting a reply,
send again but cc to:       chris dot evans at nottshc dot nhs dot uk
and to:                     c dot evans at nottingham dot ac dot uk


