[Rd] Incorrect behavior for ordering timepoints in "reshape" (PR#7669)
Dav Clark
davclark at nyu.edu
Tue Feb 8 01:32:42 CET 2005
On Feb 7, 2005, at 6:38 PM, Peter Dalgaard wrote:
> davclark at nyu.edu writes:
>
>> Full_Name: Dav Clark
>> Version: 2.0.1
>> OS: OS X 10.3
>> Submission from: (NULL) (128.122.87.35)
>>
>>
>> When the timepoints that reshape uses (in direction="long") are
>> negative or fractional, the time label is assigned incorrectly. It is
>> easier to give an example than to describe the problem abstractly:
>>
>> Assume you have a data.frame header with values related to
>> peri-stimulus time like this:
>>
>> "HRF -5" "HRF -2.5" "HRF 0" "HRF 2.5" ... "HRF 10"
>>
>> And you give reshape a split argument of a space " ".
>>
>> Then the labels will be assigned strangely, based on alphabetical
>> ordering. So the above list order maps to:
>>
>> -2.5, -5, 0, 10, ... 2.5
>>
>> Items under the "HRF -5" column in wide format receive a -2.5 label,
>> items under "HRF 2.5" receive a label of 10, and so on.
>>
>> Somewhere, the time labels are being used before conversion to
>> numbers. But reshape returns an error if it is not possible to
>> convert the timepoints to numeric! So obviously, more functionality
>> could be provided, or at least the documentation should reflect the
>> current shortfall.
>>
>> For completeness, here is a minimal example demonstrating the bug:
>>
>> df <- data.frame(id="S1", V1="from -2", V2="from -1")
>> names(df)[2:3] <- c("vals.-2", "vals.-1")
>> df
>> reshape(df, direction="long", varying=2:3)
>
> Hmm, this looks messed up even without the negatives. The guess()
> function inside reshape always sorts before converting to numeric, so
> you get the 1 10 11 2 3 4 5 6 7 8 9 effect, but what is worse: the
> sorting decouples the values from the variable names, as demonstrated
> by modifying your example slightly
>
>> reshape(df, direction="long", varying=3:2)
>       id time    vals
> S1.-1 S1   -1 from -1
> S1.-2 S1   -2 from -2
>
> I'm not at all sure I understand what was supposed to happen here,
> perhaps the sort in
>
> varying <- unique(nn[, 1])
> times <- sort(unique(nn[, 2]))
>
> is a thinko? Over to Thomas, I think.
>
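The "1 10 11 2 3 ..." effect described above can be reproduced outside of reshape. The following is only a sketch: the variable names are made up for illustration, and method = "radix" is used so the text ordering does not depend on the locale.

```r
# Hypothetical time labels, as they would look after splitting column names.
labels <- as.character(1:11)

# Sorting while the labels are still text gives an alphabetical order.
sorted_as_text <- sort(labels, method = "radix")

# Converting to numeric first gives the order one would expect.
sorted_as_numbers <- sort(as.numeric(labels))

print(sorted_as_text)
print(sorted_as_numbers)
```

If the sorted text labels are then paired with columns that are still in their original order, each column ends up with the wrong time value.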
Just to throw it out there, my current solution is to convert to
integers, then run the following on the row numbers:
new.nums <- formatC(new.nums, flag="0", width=max(nchar(new.nums)))
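To illustrate why the padding helps, here is a small sketch that assumes new.nums holds non-negative integers (the values 1 to 11 are made up for the example). Once every label has the same width, alphabetical order and numeric order agree.

```r
# Assumed example values; the real row numbers would go here.
new.nums <- 1:11

# Zero-pad to the width of the longest number, e.g. "01", "02", ..., "11".
padded <- formatC(new.nums, flag = "0", width = max(nchar(new.nums)))

# A text sort of the padded labels now leaves them in numeric order.
# method = "radix" keeps the comparison independent of the locale.
resorted <- sort(padded, method = "radix")

print(padded)
print(identical(resorted, padded))
```

This does not carry over directly to negative or fractional labels, which is why the conversion to integers comes first.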
But thanks for the observation; I was scratching my head so hard it
hurt.
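Another possible workaround, sketched here under the assumption that passing varying as a list together with explicit v.names and times makes reshape skip its name-guessing step, is to spell out the time values by hand. The column and variable names are taken from the minimal example in the report; the result name long is made up.

```r
# Same data as the minimal example in the report.
df <- data.frame(id = "S1", V1 = "from -2", V2 = "from -1")
names(df)[2:3] <- c("vals.-2", "vals.-1")

# Give the columns and their time values in matching order, so nothing
# has to be inferred (or sorted) from the column names.
long <- reshape(df, direction = "long",
                varying = list(c("vals.-2", "vals.-1")),
                v.names = "vals", timevar = "time",
                times = c(-2, -1), idvar = "id")

print(long)
```

The cost is that the times have to be kept in sync with the column order manually.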
DC