[R] follow-up on Error when reading a SAS transport file (with sasxport.get from Hmisc)
Frank E Harrell Jr
f.harrell at vanderbilt.edu
Fri Oct 10 22:45:31 CEST 2008
Peter Dalgaard wrote:
> Jean-Louis Abitbol wrote:
>> I have done what P. Dalgaard suggested, and I don't find a
>> discrepancy between the number of values and the number of labels: there
>> are 15 each...
>>
>> Any hint on what might be going wrong here?
>>
>
> Actually, I think you got it:
>
> > factor(1,c(NA,1:4),c(1:5))
> Error in factor(1, c(NA, 1:4), c(1:5)) :
> invalid labels; length 5 should be 1 or 4
>
> but
>
> > factor(1,c(NA,1:4),c(1:5),exclude=NULL)
> [1] 2
> Levels: 1 2 3 4 5
>
> so the issue is more than likely that your SAS format puts a label on
> "." (missing). You probably need something like
>
> factor(x, f$value, f$label, exclude=if (!any(is.na(f$value))) NA)
Thanks Peter. We will make this change in Hmisc for the next release.
Thomas - please take note. Thanks.
Frank
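
For readers following along, here is a minimal sketch of the change Peter
suggests. It assumes a hypothetical format list `f` with `value` and `label`
components (a shortened stand-in for VISITF) and a made-up data vector `x`;
it is not the actual Hmisc code.

```r
# Hypothetical stand-in for a SAS user format that puts a label on "." (missing)
f <- list(
  value = c(NA, -10, 0, 14),
  label = c("INEXTXT", "Visit 1 [Screening]", "Visit 2 [Baseline]", "Visit 3")
)
x <- c(0, 14, NA, -10)   # made-up data, including one missing value

# If the format has no entry for missing, keep the default exclude = NA.
# Otherwise the `if` without `else` yields NULL, so NA is kept as a level
# and the number of labels matches the number of levels again.
excl <- if (!any(is.na(f$value))) NA
v <- factor(x, f$value, f$label, exclude = excl)

print(v)
print(levels(v))
```

With this format the missing value in `x` is mapped to the label attached to
"." rather than to NA, which may or may not be what you want downstream.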
>> Here is the output
>>
>> The SAS format from proc contents
>>
>> VISITF
>>     . = INEXTXT
>>   -10 = Visit 1 [Screening]
>>     0 = Visit 2 [Baseline]
>>     1 = CRF Tracking
>>     6 = Visit 6
>>     7 = Tel.Contact (day 7)
>>    14 = Visit 3
>>    21 = Tel.Contact (day 21)
>>    28 = Visit 4
>>    35 = Tel.Contact (day 35)
>>    43 = Visit 5 [EOT]
>>    65 = Visit 6 [Follow-up]
>>   777 = End of Study
>>   888 = Concomitant Med.
>>   999 = Adverse Events
>>
>> The cat output with sep=" * " (manual CR edit due to line length):
>>
>> Processing SAS dataset ADMIN .
>> x= * 43 * 28 * 0 * 14 * 43 * 0 * 28 *
>> 14 * 28 * 43 * 14 * 0 * 28 * 14 * 0 * 43 * 43 * 28 * 14 * 0 * 0 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 14
>> * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>> * 43 * 14 * 0 * 0 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 0 * 28 * 43 *
>> 14 * 14 * 0 * 28 * 43 * 0 * 43 * 0 *
>> 43 * 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14
>> * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14
>> * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28
>> * 14 * 0 * 28 * 14 * 0 * 43 * 14 * 0 * 43 * 28 * 0 * 14 * 43 * 28 *
>> 14 * 43 * 28 * 0 * 28 * 14 * 0 * 43 * 0 * 43
>> * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 28 * 14 *
>> 0 * 43 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>> 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28
>> * 14 * 0 * 14 * 28 * 43 * 0 * 43 * 14
>> * 28 * 0 * 28 * 14 * 43 * 0 * 0 * 14 * 0 * 28 * 43 * 43 * 28 * 14 * 0
>> * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>> * 14 * 0 * 43 * 28 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>> * 43 * 28 * 14 * 0 * 14 * 0 * 43 * 28 * 0
>> * 43 * 28 * 14 * 28 * 43 * 0 * 14
>> * 0 * 43 * 28 * 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 *
>> 0 * 14 * 0 * 28 * 14 * 0 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>> 28 * 14 * 0 * 14 * 43 * 28 * 28 * 14 * 0 * 43 * 43 * 28 * 0 *
>> 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 *
>> 0 * 0 * 28 * 14 * 0 * 0 * 14 * 0 * 43 * 28
>> * 0 * 43 * 14 * 28 * 14 * 43 * 28 * 0 * 0 * 43 * 28 * 14 * 43
>> * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>> 28 * 14 * 0 * 43 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 28 *
>> 14 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>> 28 * 14 * 0 * 14 * 28 * 43 * 14 * 28 * 0 * 43 * 43 * 28 * 0 *
>> 14 * 28 * 0 * 14 * 43 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 *
>> 0 * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>> 14 * 0 * 28 * 43 * 14 * 0 * 43 * 28 * 43 * 28 * 0 * 14 * 43 *
>> 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 28 *
>> 14 * 14 * 0 * 28 * 14 * 0 * 43 * 28 * 14 *
>> 0 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0
>> * 43 * 28 * 14 * 0 * 28 * 14 * 0 * 43 *
>> 14 * 0 * 43 * 28 * 14 * 28 * 0 * 28 * 43 * 0 * 14 * 43 * 28 *
>> 14 * 0 * 14 * 0 * 43 * 28 * 43 * 28 * 14 *
>> 0 * 43 * 28 * 14 * 0 * 0 * 43 * 28 * 14 * 0 * 0 * 43 * 28 * 14
>> * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 *
>> 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 43 *
>> 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 43 *
>> 28 * 14 * 14 * 43 * 28 * 0 * 14 * 0 * 14 * 0 * 43 * 28 * 43 *
>> 14 * 0 * 28 * 0 * 43 * 28 * 14 * 43 * 28 *
>> 14 * 0 * 43 * 28 * 14 * 0 * 43 * 28 * 14 * 0 * 0 * 28 * 14 *
>> 28 * 14 * 0 *
>> f$value= * NA * -10 * 0 * 1 * 6 * 7 * 14 * 21 * 28 * 35 * 43 * 65 * 777
>> * 888 * 999 *
>> f$label= * INEXTXT * Visit 1 [Screening] * Visit 2 [Baseline] *
>> CRF Tracking * Visit 6 * Tel.Contact (day 7) * Visit 3 *
>> Tel.Contact (day 21) * Visit 4 * Tel.Contact (day 35) * Visit 5 [EOT] *
>> Visit 6 [Follow-up] * End of Study * Concomitant Med. * Adverse Events *
>> Error in factor(x, f$value, f$label) : invalid labels; length 15 should
>> be 1 or 14
>>
>> Thanks again, JL
>>
>>
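
The "15 should be 1 or 14" in the message fits the values printed above:
there are 15 values and 15 labels, but factor() drops NA from the levels by
default, leaving 14. A small sketch that reproduces the count, using the
VISITF values from the output and placeholder labels (the labels and the
variable names here are illustrative only):

```r
# Values of the VISITF format as printed in the cat output
value <- c(NA, -10, 0, 1, 6, 7, 14, 21, 28, 35, 43, 65, 777, 888, 999)

# With the default exclude = NA, factor() removes NA from the levels
# before comparing their number with the number of labels; this mirrors it
kept <- value[is.na(match(value, NA))]
n_values <- length(value)   # values supplied, including NA
n_kept   <- length(kept)    # levels left once NA is dropped

# Placeholder labels, one per value, are enough to trigger the same error
msg <- tryCatch(
  {
    factor(0, value, seq_along(value))
    "no error"
  },
  error = function(e) conditionMessage(e)
)

print(c(n_values = n_values, n_kept = n_kept))
print(msg)
```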
>> On Thu, 09 Oct 2008 17:33:06 +0200, "Peter Dalgaard"
>> <P.Dalgaard at biostat.ku.dk> said:
>>
>>> Jean-Louis Abitbol wrote:
>>>
>>>> Dear All,
>>>>
>>>> I get the following error when using either SASxport or Hmisc to
>>>> read an
>>>> .xpt file:
>>>>
>>>>
>>>>> w <- read.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>>>
>>>> Error in factor(x, f$value, f$label) : invalid labels; length
>>>> 15 should be 1 or 14
>>>>
>>>>> z<- sasxport.get("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>>>
>>>> Error in factor(x, f$value, f$label) : invalid labels; length
>>>> 15 should be 1 or 14
>>>>
>>>> I don't understand what is wrong with the labels! Is there a limit on
>>>> their length?
>>>> Could the problem be in the format labels?
>>> Hmmnoo...
>>>
>>> This is happening in R code, and the error is the same as you'd get from
>>>
>>>
>>>> factor(1,levels=1:4,labels=1:5)
>>>>
>>> Error in factor(1, levels = 1:4, labels = 1:5) :
>>> invalid labels; length 5 should be 1 or 4
>>>
>>> So, not going into the actual code, I would suspect that it is
>>> encountering a problem where a user format has values and labels out of
>>> sync. This could well be a bug in the package(s), but I wouldn't rule
>>> out that your data could have gotten into some inconsistent state. You
>>> might try debugging to the trouble spot and see what is actually in
>>> f$value and f$label at that point.
>>>
>>>
>>>> Just in case this might help, this is the output from
>>>> test <- lookup.xport("D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt")
>>>> print(test)
>>>>
>>>> for the first SAS dataset:
>>>> SAS xport file
>>>> --------------
>>>> Filename: `D:/consult/Trophos/dnp/base/TRO_ds_20081006.xpt'
>>>>
>>>> Variables in data set `ADMIN':
>>>> dataset name     type      format  flength fdigits iformat iflength ifdigits label                                  nobs
>>>> ADMIN   CEN      numeric                 5       0                0        0 Centre                                  696
>>>> ADMIN   PNO      numeric                 6       0                0        0 Pat./Subj. number                       696
>>>> ADMIN   VISIT    numeric   VISITF        0       0                0        0 Visit no.                               696
>>>> ADMIN   VISITR   numeric                 0       0                0        0 Visit repeat                            696
>>>> ADMIN   PRO      character               0       0                0        0 Project number                          696
>>>> ADMIN   STUDY    character               0       0                0        0 Study number                            696
>>>> ADMIN   COLLDAT  numeric   DATE          7       0                0        0 Date collected (study medication)       696
>>>> ADMIN   COMM_O   character               0       0                0        0 Comment                                 696
>>>> ADMIN   INEXMET  numeric   YESNOF        0       0                0        0 In-/exclusion criteria still met?       696
>>>> ADMIN   LABEL_NO numeric                 4       0                0        0 Medication number (label)               696
>>>> ADMIN   RAND_NO  numeric                 4       0                0        0 Lowest randomisation/medication number  696
>>>> ADMIN   RETMED   numeric                 4       0                0        0 Number of capsules returned             696
>>>> ADMIN   PAGE     numeric                 0       0                0        0 Page                                    696
>>>> ADMIN   PAGER    numeric                 0       0                0        0 Page repeat                             696
>>>> ADMIN   CT_RECID character $            40       0 $             40        0 for merge with notes and flags          696
>>>> ADMIN   RNO      numeric                 4       0                0        0 Randomisation number                    696
>>>> ADMIN   SAF      numeric   NOYESZF       0       0                0        0                                         696
>>>> ADMIN   ITT      numeric   NOYESZF       0       0                0        0                                         696
>>>> ADMIN   PP       numeric   NOYESZF       0       0                0        0                                         696
>>>> ADMIN   SEX      numeric   SEXF          0       0                0        0 Sex                                     696
>>>> ADMIN   AGE_C    numeric                 4       0                0        0 Age calc                                696
>>>> ADMIN   TRT      numeric   TRTF          0       0                0        0                                         696
>>>> ADMIN   CRF_VERS character               0       0                0        0 CRF Version no.                         696
>>>>
>>>> Thanks for any help,
>>>>
>>>> Best wishes, Jean-Louis
>>>>
>>>> PS: sessionInfo()
>>>> R version 2.7.1 RC (2008-06-20 r45965) i386-pc-mingw32
>>>> locale:
>>>> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>>>>
>>>>
>>>> attached base packages:
>>>> [1] stats graphics grDevices utils datasets methods
>>>> base
>>>> other attached packages:
>>>> [1] SASxport_1.2.3 Hmisc_3.4-3 foreign_0.8-29 RWinEdt_1.8-0
>>>> loaded via a namespace (and not attached):
>>>> [1] chron_2.3-24 cluster_1.11.11 grid_2.7.1 lattice_0.17-15
>>>>
>>>>
>>>> Jean-Louis Abitbol, MD
>>>> Chief Medical Officer
>>>> Trophos SA, Parc scientifique de Luminy, Case 931
>>>> Luminy Biotech Entreprises
>>>> 13288 Marseille Cedex 9 France
>>>> Email: jlabitbol at trophos.com ---- Backup Email: abitbol at sent.com
>>>> Cellular: (33) (0)6 24 47 59 34
>>>> Direct Line: (33) (0)4 91 82 82 73-Switchboard: (33) (0)4 91 82 82
>>>> 82 Fax: (33) (0)4 91 82 82 89
>>>>
>>>>
>>> --
>>> O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
>>> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>>> (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
>>> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk) FAX: (+45) 35327907
>>>
>>>
>>>
>>>
>
>
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University