[R] reading table
Gabor Grothendieck
ggrothendieck at gmail.com
Wed Jan 9 18:48:10 CET 2008
Read the lines in with readLines, delete all T and G characters,
and re-read the result with read.table. With the markers gone, every
line has nine fields, and "--" is read as NA:
Lines.raw <- "T 3 0 -- -- -- T -- -- -- 18.98
3 1 6.75 4.39 39 -- -- -- 18.58
3 2 6.90 4.90 43 -- -- -- 18.63
3 3 7.07 5.39 48 -- -- -- 18.78
G 4 0 7.41 5.54 47 G -- -- -- 18.90
4 1 7.44 5.99 30 10.93 5.30 23 18.95
4 2 7.27 6.05 23 11.16 5.74 19 18.96
4 3 7.27 5.54 27 11.58 5.95 18 18.97
"
# in reality next line would be Lines <- readLines("myfile.dat")
Lines <- readLines(textConnection(Lines.raw))
DF <- read.table(textConnection(gsub("[TG]", "", Lines)), na.strings = "--")
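To get the requested vector, here is a minimal sketch that builds on the same idea. It first uses count.fields to show why the plain read shifts values (the T/G rows have 11 fields, the others 9), then pulls out the column. It assumes the data look exactly like the sample above. The names con, n.fields and col4 are just illustrative. Note that once the markers are stripped, the "4th column" of the original file becomes V3 in DF.

```r
Lines.raw <- "T 3 0 -- -- -- T -- -- -- 18.98
3 1 6.75 4.39 39 -- -- -- 18.58
3 2 6.90 4.90 43 -- -- -- 18.63
3 3 7.07 5.39 48 -- -- -- 18.78
G 4 0 7.41 5.54 47 G -- -- -- 18.90
4 1 7.44 5.99 30 10.93 5.30 23 18.95
4 2 7.27 6.05 23 11.16 5.74 19 18.96
4 3 7.27 5.54 27 11.58 5.95 18 18.97
"

# Diagnose the shift: count whitespace-separated fields per line.
con <- textConnection(Lines.raw)
n.fields <- count.fields(con, sep = "")
close(con)

# Same approach as above: strip the T/G markers, then re-read.
Lines <- readLines(textConnection(Lines.raw))
DF <- read.table(textConnection(gsub("[TG]", "", Lines)), na.strings = "--")

# The wanted values are in V3; the first row is "--" and so NA.
col4 <- DF$V3[!is.na(DF$V3)]
```

If the positions of the missing values matter, use DF$V3 directly and keep the NA rather than dropping it.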
On Jan 9, 2008 10:18 AM, Abi Ghanem josephine
<josephine.abighanem at ibpc.fr> wrote:
> Hi,
> I am having a problem reading a file.
> The file looks like this:
> T 3 0 -- -- -- T -- -- -- 18.98
> 3 1 6.75 4.39 39 -- -- -- 18.58
> 3 2 6.90 4.90 43 -- -- -- 18.63
> 3 3 7.07 5.39 48 -- -- -- 18.78
> G 4 0 7.41 5.54 47 G -- -- -- 18.90
> 4 1 7.44 5.99 30 10.93 5.30 23 18.95
> 4 2 7.27 6.05 23 11.16 5.74 19 18.96
> 4 3 7.27 5.54 27 11.58 5.95 18 18.97
> The first and the 7th columns contain only T and G.
> My problem is that I want to have the 4th column as a vector: 6.75, 6.90,
> 7.07, 7.41, 7.44, 7.27, 7.27.
>
> When I do a simple read.delim(data, sep="", header=FALSE), I get this:
>
> T 3 0 -- -- -- T -- -- -- 18.98
> 3 1 6.75 4.39 39 -- -- -- 18.58
> 3 2 6.90 4.90 43 -- -- -- 18.63
> 3 3 7.07 5.39 48 -- -- -- 18.78
> G 4 0 7.41 5.54 47 G -- -- -- 18.90
> 4 1 7.44 5.99 30 10.93 5.30 23 18.95
> 4 2 7.27 6.05 23 11.16 5.74 19 18.96
> 4 3 7.27 5.54 27 11.58 5.95 18 18.97
>
> with the first column containing T, 3, 3, 3, G, 4, 4, 4, so the values are
> shifted in the 1st and 5th rows.
>
>
> I tried changing sep="" to sep="\t", but then I don't get a matrix;
> I just get a single column:
> " T 3 0 -- -- -- T -- -- -- 18.98"
> " 3 1 6.75 4.39 39 -- -- -- 18.58"
> " 3 2 6.90 4.90 43 -- -- -- 18.63"
> " 3 3 7.07 5.39 48 -- -- -- 18.78"
> " G 4 0 7.41 5.54 47 G -- -- -- 18.90"
> " 4 1 7.44 5.99 30 10.93 5.30 23 18.95"
> " 4 2 7.27 6.05 23 11.16 5.74 19 18.96"
> " 4 3 7.27 5.54 27 11.58 5.95 18 18.97"
>
> My question is: is there a way to read the file while skipping the
> first and the 7th columns,
> or how can I get a vector with the 4th column?
>
> Thanks for the help,
> Josephine
>
> --
>
>
> Josephine ABI GHANEM
> IBPC, UPR 9080
> 13, rue P. et M. Curie,
> 75005 Paris, FRANCE
>
> email: josephine.abighanem at ibpc.fr
> tel: 01 58 41 51 67
> 06 28 07 25 71
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>