[R] melt function chooses wrong id variable with large datasets

PIKAL Petr petr.pikal at precheza.cz
Thu Apr 16 13:41:47 CEST 2015


Hi

With this dataset I get

> dd.m0<-melt(dataset, na.rm=T)
Using norm as id variables
> head(dd.m0)
                norm variable value
1   45.8713463281901  januari  38.1
2 24.047250681782984  januari  32.4
3 3.7533684144746324  januari  34.5
4 38.594241119279324  januari  20.7
5 26.391897460120358  januari  21.5
6 61.746470001194638  januari  23.1
>
or

dd.m<-melt(dataset, id.vars=NULL, na.rm=T)

> head(dd.m)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m)
    variable              value
255     norm  4.856812959269508
256     norm 5.3982910143166514
257     norm 46.553976273304215
258     norm 17.566272518985429
259     norm 20.552451905814117
260     norm 61.894775704479279

The latter puts norm into the same column as the months. Is that intended?

Maybe you want

> dd.m1<-melt(dataset[,-13], na.rm=T)
No id variables; using all as measure variables
> head(dd.m1)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m1)
    variable value
235 december  20.7
236 december  30.9
237 december  36.2
238 december  21.0
239 december  20.2
240 december  21.3
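
A likely reason melt picks norm: in the dput() output the norm column is stored as character strings, and melt's default for data frames is to treat factor and character columns as id variables when neither id.vars nor measure.vars is given. The sketch below uses a three-row stand-in built from the first posted rows (the name `converted` is made up for illustration). It checks the column types and converts norm to numeric. Whether norm belongs next to the months at all is a separate question, so take this as one possible fix, not necessarily the intended one. The melt call assumes the reshape2 package and is skipped if it is not installed.

```r
# Three-row stand-in for the posted data; 'norm' is character, as in dput().
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# List the columns that are not numeric; these are the ones melt()
# would pick as id variables by default.
non_numeric <- names(dataset)[!sapply(dataset, is.numeric)]
print(non_numeric)

# Convert the offending column on a copy, then confirm all columns are numeric.
converted <- dataset
converted$norm <- as.numeric(converted$norm)
stopifnot(all(sapply(converted, is.numeric)))

# With reshape2 installed, melt() should now treat every column as a measure.
if (requireNamespace("reshape2", quietly = TRUE)) {
  long <- reshape2::melt(converted, na.rm = TRUE)
  print(head(long))
}
```

Running str(dataset) on the full data is the quickest way to see whether any other column was read in as character or factor.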

Cheers
Petr

From: Joachim Audenaert [mailto:Joachim.Audenaert at pcsierteelt.be]
Sent: Thursday, April 16, 2015 1:13 PM
To: PIKAL Petr
Cc: r-help at r-project.org
Subject: RE: [R] melt function chooses wrong id variable with large datasets

Hello,

This is a part of my dataset:

structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1,
29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38,
23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2,
33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9,
21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6,
30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1,
32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1,
27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7,
19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6,
13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3,
14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9,
15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1,
14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13,
12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2,
15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4,
11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2,
17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4,
21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9,
19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4,
19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5,
21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9,
19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8,
19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3,
21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2,
21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984",
"3.7533684144746324", "38.594241119279324", "26.391897460120358",
"61.746470001194638", "6.8321020448487992", "11.933109250115226",
"51.951891096493924", "37.424611852237945", "5.1587836676942374",
"36.552835044409434", "31.781209673851027", "29.09146215582853",
"4.856812959269508", "5.3982910143166514", "46.553976273304215",
"17.566272518985429", "20.552451905814117", "61.894775704479279"
)), .Names = c("januari", "februari", "maart", "april", "mei",
"juni", "juli", "augustus", "september", "oktober", "november",
"december", "norm"), row.names = c(NA, 20L), class = "data.frame")

I transform my dataset with the following script:

y <- melt(dataset,na.rm=TRUE)
variable <- y[,1]
value <- y[,2]

and can then perform a Levene test as follows:

LEVENE <- leveneTest(value~variable,y)
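
The script above picks columns by position (y[,1], y[,2]), and those positions shift as soon as melt puts an id column in front. A minimal sketch that refers to columns by name instead: it builds the long table with base R's stack() as a stand-in for melt (a swapped-in technique, not the poster's method) on a made-up three-row subset of the posted data. leveneTest is assumed to come from the car package and is only run if that package is installed.

```r
# Three-row stand-in for the posted data; 'norm' is character, as in dput().
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# Keep only the month columns, then stack them into long format.
months <- c("januari", "februari")
y <- stack(dataset[months])

# stack() calls its columns 'values' and 'ind'; rename them to match
# the names used in the Levene formula.
names(y) <- c("value", "variable")
str(y)

# Run the test by column name, so column order no longer matters.
if (requireNamespace("car", quietly = TRUE)) {
  LEVENE <- car::leveneTest(value ~ variable, data = y)
  print(LEVENE)
}
```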

When the dataset is small, let's say fewer than 100 values per column, everything works great. I get the message:

No id variables; using all as measure variables

When the dataset is much bigger, I get the following message:

Using norm as id variables

Why does this function pick norm as the id variable, and how can I tell R that each column title is my variable?


With kind regards,

Joachim Audenaert
crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research
________________________________

Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audenaert at pcsierteelt.be | W: www.pcsierteelt.be



From:        PIKAL Petr <petr.pikal at precheza.cz>
To:        Joachim Audenaert <Joachim.Audenaert at pcsierteelt.be>, "r-help at r-project.org" <r-help at r-project.org>
Date:        16/04/2015 12:13
Subject:        RE: [R]  melt function chooses wrong id variable with large datasets
________________________________



Hi

There is something odd about your data and the melt function.

AFAIK melt does not use the first row as id variables.

What is the result of

str(dataset)

Instead of

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

melt expects something like

melt(dataset, id.vars=c("norm", "jaar"), na.rm=TRUE)
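
To illustrate the point about id.vars: it takes column names (or column positions), not a row of the data. A small sketch on a made-up three-row stand-in, assuming melt comes from the reshape2 package; the melt calls are skipped if reshape2 is not installed. The names `by_name`, `by_pos` and `months_only` are hypothetical.

```r
# Three-row stand-in; 'norm' is kept as character, as in the posted data.
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# Every column except 'norm' is a month to be measured.
months <- setdiff(names(dataset), "norm")

if (requireNamespace("reshape2", quietly = TRUE)) {
  # id variable given by column name
  by_name <- reshape2::melt(dataset, id.vars = "norm", na.rm = TRUE)

  # the same id variable given by column position
  by_pos <- reshape2::melt(dataset, id.vars = 3, na.rm = TRUE)

  # dropping 'norm' before melting leaves only measure columns
  months_only <- reshape2::melt(dataset[months], na.rm = TRUE)

  print(names(by_name))
  print(names(months_only))
}
```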

If you want a more specific answer, you should show us part of your data, preferably by copying the output of

dput(dataset[1:20,])

into your mail.

Cheers
Petr

> -----Original Message-----
> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Joachim
> Audenaert
> Sent: Thursday, April 16, 2015 11:37 AM
> To: r-help at r-project.org
> Subject: [R] melt function chooses wrong id variable with large
> datasets
>
> Hello all,
>
> I'm using a large dataset consisting of 2 groups of data: 2 columns in
> Excel, each with a header (group name) and 15 000 rows of data. I would
> like to compare these data, so I transform my dataset with the melt
> function to get 1 column of data and 1 column of ID variables; then I
> can apply different statistical tests. With small datasets this works
> great: the melt function automatically chooses the name in row 1 as ID
> variable and melts the data, thus giving me a matrix with all ID
> variables in column one and the corresponding data in column 2.
> With this big dataset, however, it chooses the whole first column as ID
> variables instead of the first row. Is there a reason why this happens,
> and how can I make sure the first row is chosen as ID variable and the
> lower rows as data?
>
> If I specify that I want the first row to be the id variable, I also get
> an error.
>
> melt(dataset,id.vars=dataset[1,], na.rm=TRUE)
>
> Error: id variables not found in data: norm, jaar
>
> Are there alternative ways to create a good reshaped dataset?
>
> Met vriendelijke groeten - With kind regards,
>
> Joachim Audenaert









