[R] transpose and split dataframe

Jim Lemon drj|m|emon @end|ng |rom gm@||@com
Tue Apr 30 23:46:32 CEST 2019


Hi Matthew,
Is this what you are trying to do?

mmdf<-read.table(text="Regulator    hits
AT1G69490    AT4G31950,AT5G24110,AT1G26380,AT1G05675
AT2G55980    AT2G85403,AT4G89223",header=TRUE,
stringsAsFactors=FALSE)
# split the second column at the commas
# (as.character() is needed if hits is a factor, as in your real data)
hitsplit<-strsplit(as.character(mmdf$hits),",")
# define a function that pads a vector with NAs up to length n
NAfill<-function(x,n) x[seq_len(n)]
# get the maximum number of hits for any regulator
maxlen<-max(unlist(lapply(hitsplit,length)))
# pad every element of the list to that length with NAs
hitsplit<-lapply(hitsplit,NAfill,maxlen)
# name the list elements after the regulators
names(hitsplit)<-mmdf$Regulator
# convert to a data frame, keeping the hits as character strings
tmmdf<-as.data.frame(hitsplit,stringsAsFactors=FALSE)
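
A more compact sketch of the same steps, in case it is useful. It assumes R >= 3.2.0 (for lengths()) and builds the example data with hits as a factor to mimic the str() output in the question; the object name "padded" is only illustrative.

```r
# Example data with factor columns, as in the str() output of the question
mmdf <- data.frame(
  Regulator = c("AT1G69490", "AT2G55980"),
  hits = c("AT4G31950,AT5G24110,AT1G26380,AT1G05675",
           "AT2G85403,AT4G89223"),
  stringsAsFactors = TRUE
)
# strsplit() only accepts character input, so convert the factor first
hitsplit <- strsplit(as.character(mmdf$hits), ",")
# longest set of hits determines the number of rows
maxlen <- max(lengths(hitsplit))
# `length<-` extends each vector to maxlen, filling the tail with NA
padded <- lapply(hitsplit, `length<-`, maxlen)
# regulators become the column names
names(padded) <- as.character(mmdf$Regulator)
tmmdf <- as.data.frame(padded, stringsAsFactors = FALSE)
print(tmmdf)
```

Columns with fewer hits than the longest one end in NA, so something like na.omit(tmmdf$AT2G55980) recovers the original hits for a single regulator.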

Jim

On Wed, May 1, 2019 at 5:25 AM Matthew <mccormack using molbio.mgh.harvard.edu> wrote:
>
> I have a data frame that is a lot bigger, but for simplicity's sake we can
> say it looks like this:
>
> Regulator    hits
> AT1G69490    AT4G31950,AT5G24110,AT1G26380,AT1G05675
> AT2G55980    AT2G85403,AT4G89223
>
>     In other words:
>
> data.frame : 2 obs. of 2 variables
> $Regulator: Factor w/ 2 levels
> $hits         : Factor w/ 6 levels
>
>    I want to transpose it so that the Regulator values become the column
> headings and each of the comma-separated AGI numbers becomes a row. So,
> AT1G69490 is now the header of the first column and AT4G31950 is row 1
> of column 1, AT5G24110 is row 2 of column 1, etc. AT2G55980 is header of
> column 2 and AT2G85403 is row 1 of column 2, etc.
>
>    I have tried playing around with strsplit(TF2list[2:2]) and
> strsplit(as.character(TF2list[2:2])), but I am getting nowhere.
>
> Matthew
>


