[R] Identifying records with the correct number of repeated measures

Sarah Goslee sarah.goslee at gmail.com
Mon Dec 19 00:35:22 CET 2011


Thank you for asking a clear question and including a reproducible
small example.

Here's one possible (2-line) solution to your main question, followed
by answers to both of the others:

> WW_Names <- table(WW_Sample_SI$Individual_ID)
> WW_Names <- names(WW_Names)[WW_Names == 9]
> WW_Names
[1] "WW_08I_01" "WW_08I_03"
>
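
The same counts can also be used to subset the data frame itself, not
just to get the names. A minimal sketch on made-up data (dat, id, value,
keep and dat_full are illustrative names, not from your dataset, and the
target count of 3 stands in for your 9):

```r
# Toy data: individuals "A" and "C" have 3 records each, "B" has 1
dat <- data.frame(id = factor(c("A", "A", "A", "B", "C", "C", "C")),
                  value = c(1.1, 1.2, 1.3, 2.1, 3.1, 3.2, 3.3))

# Count records per individual, then keep the names with the target count
counts <- table(dat$id)
keep <- names(counts)[counts == 3]

# Keep only the rows belonging to those individuals
dat_full <- dat[dat$id %in% keep, ]
```

With your data, the equivalent would presumably be
WW_Sample_SI[WW_Sample_SI$Individual_ID %in% WW_Names, ].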
> # By ROWID, do you mean row names? If so:
> row.names(WW_Sample_SI) <- 1:nrow(WW_Sample_SI)
> head(WW_Sample_SI)
  Individual_ID Site_Name Latitude Longitude FeatherPosition Delta13C
1     WW_08I_01     Anjan 63.72935  12.54022              P1   -18.30
2     WW_08I_01     Anjan 63.72935  12.54022              P2   -18.53
3     WW_08I_01     Anjan 63.72935  12.54022              P3   -19.55
4     WW_08I_01     Anjan 63.72935  12.54022              P4   -20.18
5     WW_08I_01     Anjan 63.72935  12.54022              P5   -20.96
6     WW_08I_01     Anjan 63.72935  12.54022              P6   -21.08
>
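
An alternative that I believe gives the same result is to assign NULL,
which resets the row names to the default 1..n sequence. A small sketch
on made-up data (dat and sub are illustrative names only):

```r
# Toy data frame, then a subset that keeps rows 2, 4 and 5
dat <- data.frame(x = 1:5, y = letters[1:5])
sub <- dat[c(2, 4, 5), ]

rownames(sub)          # old row names carried over: "2" "4" "5"

# Assigning NULL restores automatic row names
rownames(sub) <- NULL
rownames(sub)          # "1" "2" "3"
```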

# factor() can be used to eliminate unused levels
# your sample data doesn't have any, but here's an example:
> testdata <- factor(c("a", "a", "b", "c", "d"))
> str(testdata)
 Factor w/ 4 levels "a","b","c","d": 1 1 2 3 4
> testdata <- testdata[1:3]
> str(testdata)
 Factor w/ 4 levels "a","b","c","d": 1 1 2
> testdata <- factor(testdata)
> str(testdata)
 Factor w/ 2 levels "a","b": 1 1 2
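
If you would rather not call factor() on each column separately, a
reasonably recent version of R also has droplevels(), which can be
applied to a whole data frame at once. A sketch on made-up data (dat and
sub are illustrative names, not from your dataset):

```r
# Toy data frame with a 4-level factor column
dat <- data.frame(id = factor(c("a", "a", "b", "c", "d")), x = 1:5)

# Subsetting keeps all the original levels
sub <- dat[1:3, ]
nlevels(sub$id)        # 4

# droplevels() removes unused levels from every factor column
sub <- droplevels(sub)
nlevels(sub$id)        # 2
```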


Sarah

On Sun, Dec 18, 2011 at 5:38 PM, Keith Larson <keith.larson at biol.lu.se> wrote:
> Dear list,
>
> I have a dataset where we sampled multiple individuals either 1 or 9
> times. Our measurement variable is 'Delta13C' (see below sample
> dataset). I cannot figure out how to efficiently use a vector command
> (preferably) or a loop to create a new vector of the names of the
> individuals sampled 9 times. Note that the 'FeatherPosition' variable
> will only be "P1" for individuals sampled only once, while it will be
> %in% c('P1', 'P2', 'P3', 'P4', 'P5', 'P6', 'P7', 'P8', 'P9')  for
> individuals sampled 9 times. In my sample data below the new vector
> (e.g. WW_Names) would include only 'WW_08I_01' and 'WW_08I_03'.
>
> Two other quick questions: 1) how can I re-number my 'ROWID', as when
> I subset my complete dataset to a smaller dataset the old ROWID's are
> no longer meaningful, and 2) when I subset my dataset my 'factor'
> variables contain all the levels from the complete dataset, how can I
> reset these factor variables to condense my 'dump' file as much as
> possible?
>
> Many Holiday Cheers from a NEW R user!
> Keith
>
> Sample data:
>
> WW_Sample_SI <-
> structure(list(Individual_ID = structure(c(1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 5L
> ), .Label = c("WW_08I_01", "WW_08I_02", "WW_08I_03", "WW_08I_04",
> "WW_08I_05"), class = "factor"), Site_Name = structure(c(1L,
> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
> 1L, 1L, 1L, 1L), .Label = "Anjan", class = "factor"), Latitude = c(63.72935,
> 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935,
> 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935,
> 63.72935, 63.72935, 63.72935, 63.72935, 63.72935, 63.72935),
>    Longitude = c(12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
>    12.54022, 12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
>    12.54022, 12.54022, 12.54022, 12.54022, 12.54022, 12.54022,
>    12.54022, 12.54022, 12.54022, 12.54022), FeatherPosition = structure(c(1L,
>    2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 1L, 1L, 2L, 3L, 4L, 5L, 6L,
>    7L, 8L, 9L, 1L, 1L), .Label = c("P1", "P2", "P3", "P4", "P5",
>    "P6", "P7", "P8", "P9"), class = "factor"), Delta13C = c(-18.3,
>    -18.53, -19.55, -20.18, -20.96, -21.08, -21.5, -17.42, -13.18,
>    -19.95, -22.3, -22.2, -22.18, -22.14, -21.55, -20.85, -23.1,
>    -20.75, -20.9, -21.61, -22.24)), .Names = c("Individual_ID",
> "Site_Name", "Latitude", "Longitude", "FeatherPosition", "Delta13C"
> ), class = "data.frame", row.names = c("1282", "1277", "1279",
> "1270", "1272", "1274", "1280", "1276", "1271", "1284", "1289",
> "1290", "1295", "1293", "1292", "1288", "1291", "1285", "1297",
> "1298", "1299"))
>



-- 
Sarah Goslee
http://www.sarahgoslee.com


