[BioC] Gene names
Wolfgang Huber
huber at ebi.ac.uk
Sun Nov 6 13:08:48 CET 2005
Hi Narendra,
R is also very good for this sort of thing. Have a look at the strsplit
function.
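x = readLines("yourfile")
sp = strsplit(x, split="|", fixed=TRUE)
(fixed=TRUE is needed because strsplit otherwise treats "|" as a regular
expression and splits between every character; see the man page of
strsplit). From this you can construct e.g. a vector with the j-th column
through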
sapply(sp, "[", j)
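To go all the way to a four-column table, something along these lines might
work. It is only a sketch: it assumes that every line has exactly four
"|"-separated fields, and the column names and the output file name are made
up for illustration.

```r
# Example line taken from the question; in practice: x = readLines("yourfile")
x = c("SFTPB|NM_000542.1|4506904|surfactant, pulmonary-associated protein B")

# fixed=TRUE treats "|" as a literal character, not as a regular expression
sp = strsplit(x, split="|", fixed=TRUE)

# Assumption: every line has exactly four fields; stop early if not
stopifnot(all(sapply(sp, length) == 4))

# One row per input line, one column per field
tab = do.call(rbind, sp)
colnames(tab) = c("symbol", "accession", "id", "description")  # hypothetical names

# Write tab-separated text (hypothetical file name, placed in a temp directory)
outfile = file.path(tempdir(), "genes_tab.txt")
write.table(tab, file=outfile, sep="\t", quote=FALSE, row.names=FALSE)
```

Tab is used as the separator on output because the description field itself
contains commas and spaces.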
Cheers
Wolfgang
-------------------------------------
Wolfgang Huber
European Bioinformatics Institute
European Molecular Biology Laboratory
Cambridge CB10 1SD
England
Phone: +44 1223 494642
Fax: +44 1223 494486
Http: www.ebi.ac.uk/huber
-------------------------------------
J.delasHeras at ed.ac.uk wrote:
> Quoting Narendra Kaushik <kaushiknk at Cardiff.ac.uk>:
>
>
>>I have a gene file in this format, everything in one column (no spaces at all):
>>SFTPB|NM_000542.1|4506904|surfactant, pulmonary-associated protein B
>>Is there any way to convert it into this format (four columns) other than
>>manually?
>>
>>SFTPB NM_000542.1 4506904 surfactant, pulmonary-associated protein B
>>
>>Any suggestions?
>>
>>Narendra
>
>
> Maybe too obvious, but Excel is very good for this sort of thing.
> Functions like Search allow you to obtain the position of a particular
> character (like "|"), and knowing that you can select the text to the left
> or right of it... if you do that consecutively you can split it up like
> that. It'll take a minute.
>