[Bioc-devel] Non-ASCII in dataset from Biomart EMBL via Gviz package

Vincent Carey stvjc at channing.harvard.edu
Sun Oct 12 20:35:39 CEST 2014


I don't know exactly how you are triggering this warning.  If you have the
ability to prefilter your content before serializing, that may be best.
The following is from the gwascat package.  You have very little chance,
I believe, of getting an institutional guarantee that only ASCII will go
into their emissions.

fixNonASCII = function(df) {
 # TRUE if any element of x changes, or is NA, after conversion to ASCII
 hasNonASCII = function(x) {
   asc = iconv(x, "latin1", "ASCII")
   any(asc != x | is.na(asc))
   }
 # flag the columns that contain non-ASCII content
 havebad = sapply(df, hasNonASCII)
 if (!(any(havebad))) return(df)
 message("NOTE: input data had non-ASCII characters replaced by '*'.")
 badinds = which(havebad)
 # in each flagged column, replace every non-convertible byte with '*'
 for (i in seq_along(badinds))
   df[,badinds[i]] = iconv(df[,badinds[i]], to="ASCII", sub="*")
 df
}
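
A minimal sketch of how such a prefilter might be applied before saving
data, assuming the content lives in the character columns of a data frame.
The helper name scrubNonASCII and the toy data are hypothetical and are not
part of gwascat or Gviz; the object in the quoted message is a Gviz track
rather than a data frame, so this only illustrates the idea.

```r
# Hypothetical helper (not from gwascat): replace non-ASCII bytes in the
# character columns of a data frame, leaving all other columns untouched.
scrubNonASCII <- function(df, sub = "*") {
  is_chr <- vapply(df, is.character, logical(1))
  for (nm in names(df)[is_chr]) {
    # enc2utf8() gives iconv() a known source encoding
    df[[nm]] <- iconv(enc2utf8(df[[nm]]), from = "UTF-8", to = "ASCII",
                      sub = sub)
  }
  df
}

# Toy data echoing the 'M<c3><bc>ller cell' entry in the quoted warning.
# "\u00fc" is an escaped u-umlaut, so this snippet itself stays ASCII.
cells <- data.frame(
  id = 1:3,
  cell_type = c("alpha cell", "M\u00fcller cell", "Schwann cell"),
  stringsAsFactors = FALSE
)

clean <- scrubNonASCII(cells)

# TRUE when every remaining string consists of ASCII code points only
ascii_ok <- all(vapply(clean$cell_type,
                       function(s) all(utf8ToInt(s) < 128L),
                       logical(1)))
print(ascii_ok)
```

Filtering before save() keeps the stored object ASCII-only, at the cost of
losing the original spelling of the affected names.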



On Sun, Oct 12, 2014 at 2:14 PM, Martin, Tiphaine <tiphaine.martin at kcl.ac.uk> wrote:

> Hi,
>
>
> I need to create a BiomartGeneRegionTrack dataset with the Gviz package to
> run the examples in my package. But when I run "R CMD check coMET", I get
> the following warning from the data check:
>
>
>  checking data for non-ASCII characters ... WARNING
>   Warning: found non-ASCII strings
>   '[alpha cell,acidophil cell,acinar cell,adipoblast,adipocyte,amacrine
> cell,beta cell,capsular cell,cementocyte,chief
> cell,chondroblast,chondrocyte,chromaffin cell,chromophobic
> cell,corticotroph,delta cell,dendritic cell,enterochromaffin
> cell,ependymocyte,epithelium,erythroblast,erythrocyte,fibroblast,fibrocyte,follicular
> cell,germ cell,germinal epithelium,giant cell,glial cell,glioblast,goblet
> cell,gonadotroph,granulosa cell,haemocytoblast,hair
> cell,hepatoblast,hepatocyte,hyalocyte,interstitial cell,juxtaglomerular
> cell,keratinocyte,keratocyte,lemmal cell,leukocyte,luteal cell,lymphocytic
> stem cell,lymphoid cell,lymphoid stem cell,macroglial cell,mammotroph,mast
> cell,medulloblast,megakaryoblast,megakaryocyte,melanoblast,melanocyte,mesangial
> cell,mesothelium,metamyelocyte,monoblast,monocyte,mucous neck cell,muscle
> cell,myelocyte,myeloid cell,myeloid stem cell,myoblast,myoepithelial
> cell,myofibrobast,neuroblast,neuroepithelium,neuron,odontoblast,osteoblast,osteoclast,osteocyte,oxyntic
> cell,parafollicular cell,paraluteal cell,peptic
> cell,pericyte,phaeochromocyte,phalangeal cell,pinealocyte,pituicyte,plasma
> cell,platelet,podocyte,proerythroblast,promonocyte,promyeloblast,promyelocyte,pronormoblast,reticulocyte,retinal
> pigment epithelium,retinoblast,somatotroph,stem cell,sustentacular
> cell,teloglial cell,zymogenic cell,small cell,Th1,Cell Type,M<c3><bc>ller
> cell,primary oocyte,Claudius' cell,Th2,follicular dendritic
> cell,astrocyte,white,T-lymphoblast,basal cell,T-lymphocyte,helper induced
> T-lymphocyte:Th2,B-lymphocyte,neutrophil,oocyte,unclassifiable (Cell
> Type),natural killer cell,helper induced T-lymphocyte,brown,CD4+,Hensen
> cell,lymphocyte,cardiac muscle cell,lymphoblast,Paneth cell,alveolar
> macrophage,macrophage,squamous cell,oligodendrocyte,smooth muscle
> cell,gamete,spermatid,Schwann cell,CD34+,spermatocyte,helper induced
> T-lymphocyte:Th1,astroblast,eosinophil,oligodendroblast,basophil,peripheral
> blood mononuclear cell,histiocyte,Sertoli cell,endothelium,granulocyte,spermatozoon,Merkel
> cell,skeletal muscle cell,thymocyte,foam cell,ovum,secondary
> spermatocyte,Langerhans cell,primary
> spermatocyte,transitional,Purkinje cell,Kupffer cell,secondary
> oocyte,B-lymphoblast]' in object 'biomTrack'
>
>
> chrom <- "chr2"
> start <- 38290160
> end <- 38303219
> gen <- "hg19"
>
>   biomTrack <- BiomartGeneRegionTrack(genome = gen,
>                                       chromosome = chrom, start = start,
>                                       end = end,  name = "ENSEMBL",
>                                       fontcolor="black",
>                                       groupAnnotation = "group",
>                                       just.group = "above",showId=showId )
>
>
> Do you have any idea how to fix this warning? I think we need to discuss
> this with EMBL to get it corrected, don't we?
>
>
> Tiphaine
>
>
> ----------------------------
> Tiphaine Martin
> PhD Research Student | King's College
> The Department of Twin Research & Genetic Epidemiology | Genetics &
> Molecular Medicine Division
> St Thomas' Hospital
> 4th Floor, Block D, South Wing
> SE1 7EH, London
> United Kingdom
>
> email : tiphaine.martin at kcl.ac.uk
> Fax: +44 (0) 207 188 6761
>
> _______________________________________________
> Bioc-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
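
The 'M<c3><bc>ller cell' entry in the quoted warning appears to be a
u-umlaut shown as its two UTF-8 bytes in hex. A small sketch, using only
base R and a made-up two-element vector, of how one might locate such
strings and display their offending bytes:

```r
# The cell-type name from the warning, written with an escape so that this
# snippet itself contains only ASCII characters.
x <- c("alpha cell", "M\u00fcller cell")

# Flag elements that contain any code point above 127 (i.e. non-ASCII).
has_non_ascii <- vapply(
  x,
  function(s) any(utf8ToInt(enc2utf8(s)) > 127L),
  logical(1),
  USE.NAMES = FALSE
)

# sub = "byte" asks iconv() to show each non-convertible byte as <xx> hex
# rather than returning NA for the whole string.
shown <- iconv(enc2utf8(x[has_non_ascii]), from = "UTF-8", to = "ASCII",
               sub = "byte")

print(has_non_ascii)
print(shown)
```

Only the second element is flagged, which matches the single non-ASCII name
visible in the warning text.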
