[BioC] DataFrame(..., check.names=F); colnames change when assign data using '[[', not when using '[,]'
Ludo Pagie
ludo.pagie at gmail.com
Fri May 30 14:49:18 CEST 2014
Dear all,
I'm using GRanges objects ('df') to store read counts on 7M genomic
regions. I found that the colnames of mcols(df) do not behave nicely
when the names are not syntactically valid, and that the behavior
depends on whether columns are assigned using '[[' or using '['. I
assume this is not intended?
###############
# initialize a DataFrame with 3 columns and 2 rows; specify that colnames
# should not be checked for validity
df <- DataFrame(matrix(ncol=3, nrow=2), check.names=FALSE)
# assign names which would be changed if "check.names=TRUE"
names(df) <- c(1:3)
# names are as specified:
df
#DataFrame with 2 rows and 3 columns
#          1         2         3
#  <logical> <logical> <logical>
#1        NA        NA        NA
#2        NA        NA        NA
# assign values to 1st column using '['
df[,1] <- 1:2
# df still has original names:
df
#DataFrame with 2 rows and 3 columns
#          1         2         3
#  <integer> <logical> <logical>
#1         1        NA        NA
#2         2        NA        NA
# assign using '[['
df[[2]] <- 3:4
# colnames are now changed:
df
#DataFrame with 2 rows and 3 columns
#         X1        X2        X3
#  <integer> <integer> <logical>
#1         1         3        NA
#2         2         4        NA
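# the changed names match what base R's make.names() returns, i.e. what
# check.names=TRUE would have produced (assumption: this is only a guess
# at the cause, not something confirmed from the IRanges source)
make.names(as.character(1:3))
#[1] "X1" "X2" "X3"
# possible workaround until this is resolved (a sketch under the
# assumption that 'names<-' keeps leaving the names unchecked, as it did
# above): restore the intended names after assigning with '[['
nms <- c("1", "2", "3")   # the names that were intended
df[[3]] <- 5:6
names(df) <- nms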
# sessionInfo:
R version 3.0.2 (2013-09-25)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=nl_NL.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=nl_NL.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=nl_NL.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=nl_NL.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] IRanges_1.20.6 BiocGenerics_0.8.0
loaded via a namespace (and not attached):
[1] stats4_3.0.2
########################################################
Thanks, Ludo