[R] melt function chooses wrong id variable with large datasets
Jeff Newmiller
jdnewmil at dcn.davis.CA.us
Thu Apr 16 14:53:14 CEST 2015
Maybe what you really want is the ?stack function.
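
A minimal sketch of that idea, using a three-row stand-in for the posted data (values copied from the dput output quoted below; the object names dataset and long are only illustrative). stack() is base R and returns a data frame with two columns, values and ind. Since norm is stored as character in the posted data, the sketch converts it to numeric first; otherwise all stacked values would be coerced to character.

```r
# Small stand-in for the posted data; norm is character, as in the dput output.
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# Convert the character column so that every column is numeric.
dataset$norm <- as.numeric(dataset$norm)

# stack() puts all values in one column (values) and the originating
# column names in a factor (ind).
long <- stack(dataset)
print(long)
```

The result can be passed to a grouped test with a formula such as values ~ ind.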
---------------------------------------------------------------------------
Jeff Newmiller
DCN: <jdnewmil at dcn.davis.ca.us>
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
On April 16, 2015 4:59:47 AM PDT, Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be> wrote:
>Thanks,
>
>Indeed, norm should be in the same group as the months. Everything works
>fine when the dataset is small, but with big datasets (15 000 values)
>things seem to go wrong and I can't explain why. melt puts norm in a
>separate column instead of in the group of months, as it does when the
>dataset is small.
>
>Met vriendelijke groeten - With kind regards,
>
>Joachim Audenaert
>onderzoeker gewasbescherming - crop protection researcher
>
>PCS | proefcentrum voor sierteelt - ornamental plant research
>
>Schaessestraat 18, 9070 Destelbergen, België
>T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
>E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be
>
>
>
>From: PIKAL Petr <petr.pikal at precheza.cz>
>To: Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be>
>Cc: "r-help at r-project.org" <r-help at r-project.org>
>Date: 16/04/2015 13:41
>Subject: RE: [R] melt function chooses wrong id variable with
>large datasets
>
>
>
>Hi
>
>With this dataset I get
>
>> dd.m0 <- melt(dataset, na.rm=TRUE)
>Using norm as id variables
>> head(dd.m0)
>                 norm variable value
>1   45.8713463281901  januari  38.1
>2 24.047250681782984  januari  32.4
>3 3.7533684144746324  januari  34.5
>4 38.594241119279324  januari  20.7
>5 26.391897460120358  januari  21.5
>6 61.746470001194638  januari  23.1
>>
>or
>
>> dd.m <- melt(dataset, id.vars=NULL, na.rm=TRUE)
>> head(dd.m)
>  variable value
>1  januari  38.1
>2  januari  32.4
>3  januari  34.5
>4  januari  20.7
>5  januari  21.5
>6  januari  23.1
>> tail(dd.m)
>    variable              value
>255     norm  4.856812959269508
>256     norm 5.3982910143166514
>257     norm 46.553976273304215
>258     norm 17.566272518985429
>259     norm 20.552451905814117
>260     norm 61.894775704479279
>
>The latter puts norm in the same column as the months. Is that intended?
>
>Maybe you want
>
>> dd.m1 <- melt(dataset[,-13], na.rm=TRUE)
>No id variables; using all as measure variables
>> head(dd.m1)
>  variable value
>1  januari  38.1
>2  januari  32.4
>3  januari  34.5
>4  januari  20.7
>5  januari  21.5
>6  januari  23.1
>> tail(dd.m1)
>    variable value
>235 december  20.7
>236 december  30.9
>237 december  36.2
>238 december  21.0
>239 december  20.2
>240 december  21.3
>
>Cheers
>Petr
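
One way to see why the small and the large file behave differently is to check the class of each column before melting. The sketch below assumes, consistent with the "Using norm as id variables" message above, that melt() treats non-numeric columns as id variables when none are specified. The variable names col_classes and non_numeric are made up for illustration, and the data is a two-column stand-in for the posted dataset.

```r
# Stand-in for the posted data; norm arrives as character strings.
dataset <- data.frame(
  januari = c(38.1, 32.4, 34.5),
  norm    = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# Class of every column; a non-numeric column is the likely culprit.
col_classes <- sapply(dataset, class)
print(col_classes)

# Names of the columns that are not numeric.
non_numeric <- names(dataset)[!sapply(dataset, is.numeric)]
print(non_numeric)

# Convert them in place; entries that cannot be parsed become NA
# (with a warning), which helps locate stray text in a large import.
dataset[non_numeric] <- lapply(dataset[non_numeric], as.numeric)
stopifnot(all(sapply(dataset, is.numeric)))
```

If the conversion produces NA values, the rows holding them point to the cells that made the column non-numeric in the first place.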
>
>From: Joachim Audenaert [mailto:Joachim.Audenaert at pcsierteelt.be]
>Sent: Thursday, April 16, 2015 1:13 PM
>To: PIKAL Petr
>Cc: r-help at r-project.org
>Subject: RE: [R] melt function chooses wrong id variable with large
>datasets
>
>Hello,
>
>This is a part of my dataset:
>
>structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1,
>29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38,
>23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2,
>33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9,
>21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6,
>30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1,
>32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1,
>27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7,
>19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6,
>13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3,
>14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9,
>15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1,
>14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13,
>12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2,
>15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4,
>11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2,
>17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4,
>21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9,
>19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4,
>19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5,
>21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9,
>19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8,
>19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3,
>21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2,
>21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984",
>"3.7533684144746324", "38.594241119279324", "26.391897460120358",
>"61.746470001194638", "6.8321020448487992", "11.933109250115226",
>"51.951891096493924", "37.424611852237945", "5.1587836676942374",
>"36.552835044409434", "31.781209673851027", "29.09146215582853",
>"4.856812959269508", "5.3982910143166514", "46.553976273304215",
>"17.566272518985429", "20.552451905814117", "61.894775704479279"
>)), .Names = c("januari", "februari", "maart", "april", "mei",
>"juni", "juli", "augustus", "september", "oktober", "november",
>"december", "norm"), row.names = c(NA, 20L), class = "data.frame")
>
>I transform my dataset with the following script:
>
>y <- melt(dataset,na.rm=TRUE)
>variable <- y[,1]
>value <- y[,2]
>
>and can then perform a Levene test as follows:
>
>LEVENE <- leveneTest(value~variable,y)
>
>When the dataset is small, let's say fewer than 100 values per column,
>everything works great. I get the message:
>
>No id variables; using all as measure variables
>
>When the dataset is much bigger I get the following message:
>
>Using norm as id variables
>
>Why does this function pick norm as the id variable? And how can I tell R
>that each column title is my variable?
>
>
>Met vriendelijke groeten - With kind regards,
>
>Joachim Audenaert
>
>
>
>From: PIKAL Petr <petr.pikal at precheza.cz>
>To: Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be>,
>"r-help at r-project.org" <r-help at r-project.org>
>Date: 16/04/2015 12:13
>Subject: RE: [R] melt function chooses wrong id variable with
>large datasets
>
>
>
>
>Hi
>
>There is something weird with your data and melt function.
>
>AFAIK melt does not use the first row as id variables.
>
>What is the result of
>
>str(dataset)
>
>Instead of
>
>melt(dataset,id.vars=dataset[1,], na.rm=TRUE)
>
>melt expects something like
>
>melt(dataset, id.vars=c("norm", "jaar"), na.rm=TRUE)
>
>If you want a more specific answer, please show us part of your data,
>preferably by copying the output of
>
>dput(dataset[1:20,])
>
>into your mail.
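
A small sketch of what those two calls provide, on a two-row stand-in data frame (the object names dataset, txt and rebuilt are illustrative). str() reports the type of each column, and dput() prints R code that recreates the object, so a reader of the mail can paste it back into a session.

```r
# Tiny stand-in for the real data.
dataset <- data.frame(
  januari = c(38.1, 32.4),
  norm    = c("45.8713463281901", "24.047250681782984"),
  stringsAsFactors = FALSE
)

# Compact overview: one line per column with its type.
str(dataset)

# dput() writes the object as R code; capture it as text here
# instead of printing it to the console.
txt <- capture.output(dput(dataset))

# Anyone receiving that text can rebuild the same object from it.
rebuilt <- eval(parse(text = txt))
print(all.equal(rebuilt, dataset))
```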
>
>Cheers
>Petr
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of
>> Joachim Audenaert
>> Sent: Thursday, April 16, 2015 11:37 AM
>> To: r-help at r-project.org
>> Subject: [R] melt function chooses wrong id variable with large
>> datasets
>>
>> Hello all,
>>
>> I'm using a large dataset consisting of 2 groups of data: 2 columns in
>> Excel, each with a header (group name) and 15 000 rows of data. I would
>> like to compare these data, so I transform my dataset with the melt
>> function to get 1 column of data and 1 column of ID variables; then I
>> can apply different statistical tests. With small datasets this works
>> great: the melt function automatically chooses the name in row 1 as ID
>> variable and melts the data, giving me a matrix with all ID variables
>> in column 1 and the corresponding data in column 2.
>> With this big dataset, however, it chooses the whole first column as ID
>> variables instead of the first row. Is there a reason why this happens,
>> and how can I make sure the first row is chosen as ID variable and the
>> lower rows as data?
>>
>> If I specify that I want the first row to be the id variable, I also
>> get an error:
>>
>> melt(dataset,id.vars=dataset[1,], na.rm=TRUE)
>>
>> Error: id variables not found in data: norm, jaar
>>
>> Are there alternative ways to create a good reshaped dataset?
>>
>> Met vriendelijke groeten - With kind regards,
>>
>> Joachim Audenaert
>
>
>________________________________
>This e-mail and any documents attached to it may be confidential and
>are
>intended only for its intended recipients.
>If you received this e-mail by mistake, please immediately inform its
>sender. Delete the contents of this e-mail with all attachments and its
>
>copies from your system.
>If you are not the intended recipient of this e-mail, you are not
>authorized to use, disseminate, copy or disclose this e-mail in any
>manner.
>The sender of this e-mail shall not be liable for any possible damage
>caused by modifications of the e-mail or by delay with transfer of the
>email.
>
>In case that this e-mail forms part of business dealings:
>- the sender reserves the right to end negotiations about entering into
>a
>contract in any time, for any reason, and without stating any
>reasoning.
>- if the e-mail contains an offer, the recipient is entitled to
>immediately accept such offer; The sender of this e-mail (offer)
>excludes
>any acceptance of the offer on the part of the recipient containing any
>
>amendment or variation.
>- the sender insists on that the respective contract is concluded only
>upon an express mutual agreement on all its aspects.
>- the sender of this e-mail informs that he/she is not authorized to
>enter
>into any contracts on behalf of the company except for cases in which
>he/she is expressly authorized to do so in writing, and such
>authorization
>or power of attorney is submitted to the recipient or the person
>represented by the recipient, or the existence of such authorization is
>
>known to the recipient of the person represented by the recipient.
>------------------------------------------------------------------------
>
>______________________________________________
>R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.