[R] Novice question about getting data into R
Petr PIKAL
petr.pikal at precheza.cz
Thu Jun 21 15:21:00 CEST 2012
Hi
I can read the example you provided without much problem.
> dput(head(test))
structure(list(n = 0:5, X = c(NA, NA, NA, NA, NA, NA), start = c(11185L,
39530L, 40544L, 109684L, 114629L, 118841L), X.1 = c(NA, NA, NA,
NA, NA, NA), dur = c(1L, 2L, 1L, 1L, 0L, 1L), X.2 = c(NA, NA,
NA, NA, NA, NA), pause = c(28344L, 1012L, 69139L, 4944L, 4212L,
2558L), X.3 = c(NA, NA, NA, NA, NA, NA), par = c(0, 100, 100,
100, 0, 100), X.4 = c(NA, NA, NA, NA, NA, NA), ins = c(2L, 3L,
2L, 2L, 1L, 2L), X.5 = c(NA, NA, NA, NA, NA, NA), del = c(0L,
0L, 0L, 0L, 0L, 0L), X.6 = c(NA, NA, NA, NA, NA, NA), sid =
structure(c(10L,
13L, 16L, 1L, 11L, 12L), .Label = c(" -1", " -1+11+13+15", " -1+110",
" -1+16", " -1+26+29", " -1+27+30", " -1+32", " -1+4+5", " -1+48",
" 1", " 17", " 18+19", " 2", " 20", " 28", " 3", " 36", " 37",
" 38", " 42", " 43", " 45", " 49", " 50", " 53", " 54", " 58",
" 59", " 61+64"), class = "factor"), X.7 = c(NA, NA, NA, NA,
NA, NA), tid = structure(c(1L, 6L, 20L, 30L, 38L, 39L), .Label = c(" 1",
" 10+11+12", " 13+14", " 15+16+17", " 18+19", " 2+3", " 20",
" 21", " 22", " 23", " 24+25", " 26", " 27+28+29", " 30+31+32",
" 33+34", " 35", " 36+37", " 38", " 39", " 4", " 40", " 41",
" 42", " 43", " 44+45", " 46", " 47", " 48", " 49", " 5", " 50",
" 51", " 52+93", " 53", " 54", " 55", " 56", " 6", " 7", " 8",
" 9"), class = "factor"), X.8 = c(NA, NA, NA, NA, NA, NA), str =
structure(c(5L,
6L, 5L, 5L, 4L, 5L), .Label = c(" ,", " ,_", " .", " ・", " ・・",
" ・・・", " ・・・.", " ・・・・", " ・・・・・"), class = "factor")), .Names = c("n",
"X", "start", "X.1", "dur", "X.2", "pause", "X.3", "par", "X.4",
"ins", "X.5", "del", "X.6", "sid", "X.7", "tid", "X.8", "str"
), row.names = c(NA, 6L), class = "data.frame")
Only the Chinese characters are lost (they show up as placeholder dots), and
some extra all-NA columns (X, X.1, ...) appear:
> str(test)
'data.frame': 41 obs. of 19 variables:
$ n : int 0 1 2 3 4 5 6 7 8 9 ...
$ X : logi NA NA NA NA NA NA ...
$ start: int 11185 39530 40544 109684 114629 118841 121400 128201 129793 131852 ...
$ X.1 : logi NA NA NA NA NA NA ...
$ dur : int 1 2 1 1 0 1 1 1 436 608 ...
$ X.2 : logi NA NA NA NA NA NA ...
$ pause: int 28344 1012 69139 4944 4212 2558 6800 1591 1623 3573 ...
$ X.3 : logi NA NA NA NA NA NA ...
$ par : num 0 100 100 100 0 100 100 100 0 100 ...
$ X.4 : logi NA NA NA NA NA NA ...
$ ins : int 2 3 2 2 1 2 2 2 3 3 ...
$ X.5 : logi NA NA NA NA NA NA ...
$ del : int 0 0 0 0 0 0 0 0 0 0 ...
$ X.6 : logi NA NA NA NA NA NA ...
$ sid : Factor w/ 29 levels " -1"," -1+11+13+15",..: 10 13 16 1 11 12 1 1 2 4 ...
$ X.7 : logi NA NA NA NA NA NA ...
$ tid : Factor w/ 41 levels " 1"," 10+11+12",..: 1 6 20 30 38 39 40 41 2 3 ...
$ X.8 : logi NA NA NA NA NA NA ...
$ str : Factor w/ 9 levels " ,"," ,_"," .",..: 5 6 5 5 4 5 5 5 6 6 ...
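The X, X.1, ... columns are what read.table() produces when it is given a single-character separator but the fields are actually separated by two of them: the empty field in between becomes an all-NA column. I cannot tell from here exactly how your file is laid out, so the following is only a sketch on a made-up three-column sample (the doubled tabs are my assumption, not something I checked in your file):

```r
# Hypothetical sample: every field is separated by TWO tabs (an assumption).
txt <- c("n\t\tstart\t\tdur",
         "0\t\t11185\t\t1",
         "1\t\t39530\t\t2")

# With sep = "\t" the empty field between each pair of tabs becomes a column,
# which read.table() names X, X.1, ... and fills with NA.
with_gaps <- read.table(text = txt, header = TRUE, sep = "\t")
names(with_gaps)

# With the default sep = "" any run of whitespace is a single separator,
# so no empty columns are created.
clean <- read.table(text = txt, header = TRUE)
names(clean)
```

If the real file does contain free text with spaces, sep = "" is not safe either; in that case the empty columns can simply be dropped after reading.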
> sessionInfo()
R Under development (unstable) (2012-03-03 r58569)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=Czech_Czech Republic.1250  LC_CTYPE=Czech_Czech Republic.1250
[3] LC_MONETARY=Czech_Czech Republic.1250 LC_NUMERIC=C
[5] LC_TIME=Czech_Czech Republic.1250
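As for the "line 38 did not have 10 elements" error: scan() reports it when a line splits into a different number of fields than read.table() expects. count.fields() shows which lines are affected, and reading with an explicit separator, with quoting and comment characters switched off, and with a declared encoding often helps when one column holds free text. This is a sketch only: the temporary file, its contents, the tab separator and the UTF-8 encoding are all my assumptions, not facts about your .pu file.

```r
# Hypothetical stand-in for the .pu file: tab-separated, with one text
# value that contains a space (made up for illustration).
path <- tempfile(fileext = ".pu")
writeLines(c("n\tstart\tstr",
             "0\t11185\tab",
             "1\t39530\tcd ef",
             "2\t40544\tgh"),
           path)

# Number of fields per line when splitting on any whitespace.
n_ws <- count.fields(path, sep = "", quote = "", comment.char = "")
# Lines whose field count differs from the header line.
bad_lines <- which(n_ws != n_ws[1])
bad_lines

# The same check when splitting on tabs only.
n_tab <- count.fields(path, sep = "\t", quote = "", comment.char = "")

# Read with an explicit separator, no quote or comment characters,
# and an explicit encoding (UTF-8 is assumed here).
pu <- read.table(path, header = TRUE, sep = "\t", quote = "",
                 comment.char = "", fileEncoding = "UTF-8",
                 stringsAsFactors = FALSE)
str(pu)
```

Running count.fields() on the real file with the separator that ReadData() uses should point at the offending lines directly, which is quicker than deleting lines one at a time.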
Regards
Petr
> Dear Professor Dalgaard,
>
> I am beginning to take part in a research project on statistical modelling
> of translators' activity data. I recently installed R and tried to generate
> a Translation Progress Graph, as my colleagues do successfully, but on my
> Windows platform I got the error below. According to the R FAQ this seems
> to be a very common error, but as I am not familiar with R or with ProGra,
> could you please help me?
>
> Note: the Translation Progress Graph is composed of the data quintuple
> {S, T, A, F, K}: Source Text, Target Text, Alignment, Fixation and
> Keyboard data, respectively.
>
>
> >ReadData("C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2")
> Reading Fixation Units:
> C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2 .fu
> Reading Production Units:
> C:/Users/schmaltz/Dropbox/EN-CH/proGra/EN-ZH_P2_T4_T2 .pu
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
> line 38 did not have 10 elements
>
> Note: We tried deleting line 38, but the program then failed with an error
> on another line. Even after deleting all the following lines, the same
> error occurs. I think it is not an encoding error, given that my
> colleague uses Linux and I use Windows.
>
> Sample of file above:
> n start dur pause par ins del sid tid str
> 0 11185 1 28344 0 2 0 1 1 尽管
> 1 39530 2 1012 100.00 3 0 2 2+3 发展中
> 2 40544 1 69139 100.00 2 0 3 4 国家
> 3 109684 1 4944 100.00 2 0 -1 5 关于
> 4 114629 0 4212 0 1 0 17 6 为
> 5 118841 1 2558 100.00 2 0 18+19 7 贫困
> 6 121400 1 6800 100.00 2 0 -1 8 人民
> 7 128201 1 1591 100.00 2 0 -1 9 争取
> 8 129793 436 1623 0 3 0 -1+11+13+15 10+11+12 更好的
> 9 131852 608 3573 100.00 3 0 -1+16 13+14 生活的
> 10 136033 1202 1309 100.00 5 0 -1+4+5 15+16+17 说辞是可以
> 11 138544 468 3682 100.00 3 0 -1 18+19 理解的
> 12 142694 359 10811 0 2 0 20 20 ,_
> 13 153864 0 2121 0 1 0 -1 21 但
> 14 155985 1 2838 100.00 2 0 -1 22 其实
> 15 158824 1 1435 100.00 2 0 -1 23 保护
> 16 160260 421 3619 87.65 3 0 -1 24+25 环境和
> 17 164300 1 1075 100.00 2 0 28 26 经济
> 18 165376 1108 1030 100.00 4 0 -1+26+29 27+28+29 发展是不
> 19 167514 1466 8440 54.98 4 0 -1+27+30 30+31+32 冲突的.
> 20 177420 906 4023 100.00 4 0 -1+32 33+34 我们必须
> 21 182349 1 1622 100.00 2 0 36 35 鼓励
> 22 183972 2 1573 100.00 3 0 37 36+37 发展中
> 23 185547 1 15381 100.00 2 0 38 38 国家
> 24 200929 1 1934 100.00 2 0 42 39 扩展
> 25 202864 1 5864 100.00 2 0 43 40 绿色
> 26 208729 1 4383 100.00 2 0 -1 41 植被
> 27 213113 0 1497 0 1 0 45 42 ,
> 28 214610 1 2963 100.00 2 0 -1 43 发展
> 29 217574 906 5085 100.00 4 0 -1+48 44+45 节能科技
> 30 223565 0 1575 0 1 0 49 46 ,
> 31 225140 1 2683 100.00 2 0 50 47 并且
> 32 227824 1 2136 100.00 2 0 53 48 帮助
> 33 229961 1 6613 100.00 2 0 -1 49 它们
> 34 236575 1 6068 100.00 2 0 54 50 减少
> 35 242644 1 2635 100.00 2 0 -1 51 环境
> 36 245280 343 8315 100.00 3 0 -1+110 52+93 污染和
> 37 253938 1 1653 100.00 2 0 -1 53 破坏
> 38 255592 0 25381 0 1 0 58 54 .
> 39 280973 1 1809 100.00 2 0 59 55 一些
> 40 282783 1 16878 100.00 2 0 61+64 56 国家
>
> Thank you very much!
>
> Marcia
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Novice-question-about-getting-data-into-R-tp866806p4633954.html
> Sent from the R help mailing list archive at Nabble.com.