[R] Help with
arun
smartpink111 at yahoo.com
Thu Oct 18 18:37:20 CEST 2012
Hi,
You can also try this:
dat1<-read.table(text="
1 1 3
1 2 54
1 3 11
1 4 17
2 1 5
2 4 78
2 5 20
",sep="",header=FALSE)
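library(reshape2)
# reshape2 provides dcast(); cast() belongs to the older reshape package
dat2<-dcast(dat1,V1~V2,value.var="V3")
dat2<-dat2[,-1]
dat2[is.na(dat2)]<-0
# drop the dimnames so the result prints with [,1], [,2], ... as requested
dat3<-unname(as.matrix(dat2))
dat3
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    3   54   11   17    0
#[2,]    5    0    0   78   20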
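The same result can probably be had in base R without any reshaping package. What follows is only a sketch, assuming docIDs and wordIDs are positive integers that can be used directly as row and column positions; the column names doc, word and count and the object name m are made up for this illustration.

```r
# Same toy triples as above: docID, wordID, count
dat1 <- read.table(text = "
1 1 3
1 2 54
1 3 11
1 4 17
2 1 5
2 4 78
2 5 20
", header = FALSE, col.names = c("doc", "word", "count"))

# Pre-allocate a dense zero matrix: one row per docID, one column per wordID
# (assumes the IDs are positive integers usable as positions)
m <- matrix(0L, nrow = max(dat1$doc), ncol = max(dat1$word))

# A two-column (row, col) index matrix fills every listed cell in one step;
# cells that never appear in the file keep their initial 0
m[cbind(dat1$doc, dat1$word)] <- dat1$count
m
```

Note that a dense matrix stores every docID/wordID cell, including the zeros, so for a large collection it may not fit in memory.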
A.K.
----- Original Message -----
From: Rui Esteves <ruimaximo at gmail.com>
To: r-help at r-project.org
Cc:
Sent: Thursday, October 18, 2012 8:44 AM
Subject: [R] Help with
Hi,
I downloaded a dataset named Bag of Words from the UCI repository:
http://archive.ics.uci.edu/ml/machine-learning-databases/bag-of-words/readme.txt
The dataset is in a text file with the following structure:
---
docID1 wordID1 count
docID1 wordID2 count
docID1 wordID3 count
docID1 wordID4 count
...
docID2 wordID2 count
docID2 wordID5 count
docID2 wordID6 count
---
Here docIDx is an integer that identifies document x, wordIDy is
an integer that identifies word y, and count is an integer giving
the number of times wordIDy appears in docIDx.
Example:
---
1 1 3
1 2 54
1 3 11
1 4 17
2 1 5
2 4 78
2 5 20
---
I would like to import the file into a matrix (not sparse) where:
the wordIDy would correspond to the column [,y]
the docIDx would correspond to the row [x,]
the value in [x,y] would be the count of wordIDy in the docIDx
So, for the previous example it would be like:
     [,1] [,2] [,3] [,4] [,5]
[1,]    3   54   11   17    0
[2,]    5    0    0   78   20
I don't have a clue about how to do this.
Can someone please help me?
Thank you
Rui