[Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)

Dav Clark davclark at nyu.edu
Tue Feb 8 01:32:42 CET 2005


On Feb 7, 2005, at 6:38 PM, Peter Dalgaard wrote:

> davclark at nyu.edu writes:
>
>> Full_Name: Dav Clark
>> Version: 2.0.1
>> OS: OS X 10.3
>> Submission from: (NULL) (128.122.87.35)
>>
>>
>> When the timepoints that reshape uses (in direction="long") are negative or
>> fractional, the time label is assigned incorrectly.  It is easier to give an
>> example than to describe the problem abstractly:
>>
>> Assume you have a data.frame header with values related to peri-stimulus time
>> like this:
>>
>> "HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
>>
>> And you give reshape a split argument of a space " ".
>>
>> Then the labels will be assigned strangely, based on alphabetical ordering.  So
>> the above list order maps to:
>>
>> -2.5, -5, 0, 10, ... 2.5
>>
>> Items under the "HRF -5" column in wide format receive a -2.5 label, items under
>> "HRF 2.5" receive a label of 10, and so on.
>>
>> Somewhere, the time labels are being used before conversion to numbers.  But,
>> reshape returns an error if it is not possible to convert the timepoints to
>> numeric!  So obviously, more functionality could be provided, or at least the
>> documentation should reflect the current shortfall.
>>
>> For completeness, here is a minimal example demonstrating the bug:
>>
>> df <- data.frame(id="S1", V1="from -2", V2="from -1")
>> names(df)[2:3] <- c("vals.-2", "vals.-1")
>> df
>> reshape(df, direction="long", varying=2:3)
>
> Hmm, this looks messed up even without the negatives. The guess()
> function inside reshape always sorts before converting to numeric, so
> you get the 1 10 11 2 3 4 5 6 7 8 9 effect, but what is worse: the
> sorting decouples the values from the variable names, as demonstrated
> by modifying your example slightly
>
>> reshape(df, direction="long", varying=3:2)
>       id time    vals
> S1.-1 S1   -1 from -1
> S1.-2 S1   -2 from -2
>
> I'm not at all sure I understand what was supposed to happen here,
> perhaps the sort in
>
>     varying <- unique(nn[, 1])
>     times <- sort(unique(nn[, 2]))
>
> is a thinko? Over to Thomas, I think.
>
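
The sort-before-convert effect described above can be reproduced outside of
reshape. This is only a sketch of the ordering problem, not the code inside
guess(); the variable names are made up for illustration, and method = "radix"
is used so that the text sort follows C-locale collation regardless of the
session locale.

```r
# Timepoints as they would be parsed out of column names: still text.
labels <- as.character(1:11)

# Sorting text compares character by character, so "10" lands before "2".
sorted_as_text <- sort(labels, method = "radix")

# Converting to numeric before sorting gives the intended order.
sorted_as_numbers <- sort(as.numeric(labels))

# Negative labels compare by the digit after the minus sign, so "-1"
# sorts before "-2" even though -2 is the smaller number.
negative_as_text <- sort(c("-2", "-1"), method = "radix")

print(sorted_as_text)
print(sorted_as_numbers)
print(negative_as_text)
```

If the sorted labels are then paired with columns that are still in their
original order, each value ends up under the wrong time.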

Just to throw it out there, my current workaround is to convert to integers,
then zero-pad the row numbers as follows, so that they sort the same way as
text and as numbers:

    new.nums <- formatC(new.nums, flag="0",
                         width=max(nchar(new.nums)))
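
As a rough illustration of why the padding helps, here is the same call
applied to a made-up set of integer labels (the values 1 to 11 are an
assumption chosen to match the "1 10 11 2" example, not data from the
report):

```r
# Hypothetical integer labels standing in for the row numbers.
new.nums <- 1:11

# Pad with leading zeros to the width of the longest label.
padded <- formatC(new.nums, flag = "0", width = max(nchar(new.nums)))

# Text order now matches numeric order ("02" sorts before "10").
sorted_padded <- sort(padded, method = "radix")
print(sorted_padded)
```

This only covers non-negative integers; a label with a leading minus sign
would still be ordered by its text, which is why the conversion to integers
comes first.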

But thanks for the observation; I was scratching my head so hard it hurt.
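
Apart from padding, it may also be possible to bypass the name guessing
altogether by handing reshape the timepoints directly through its v.names,
timevar, times and idvar arguments. The sketch below uses the data frame from
the minimal example; it has not been checked against 2.0.1, and
stringsAsFactors = FALSE is added only to keep the values as plain text.

```r
# Wide data from the minimal example, with the awkward column names.
df <- data.frame(id = "S1", V1 = "from -2", V2 = "from -1",
                 stringsAsFactors = FALSE)
names(df)[2:3] <- c("vals.-2", "vals.-1")

# Supply the times explicitly, in the same order as the varying columns,
# so reshape does not have to guess them from the names.
long <- reshape(df, direction = "long",
                varying = list(c("vals.-2", "vals.-1")),
                v.names = "vals", timevar = "time",
                times = c(-2, -1), idvar = "id")
print(long)
```

The times vector has to be kept in step with the column order by hand, so
this trades one source of error for another.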

DC