[R] creating a vector from a file
Matt Shotwell
matt at biostatmatt.com
Tue May 31 16:44:40 CEST 2011
On Tue, 2011-05-31 at 16:19 +0200, heimat los wrote:
> On Tue, May 31, 2011 at 4:12 PM, Matt Shotwell <matt at biostatmatt.com>
> wrote:
> On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
> > Hello all,
> > I am new to R and my question should be trivial. I need to create a word
> > cloud from a txt file containing the words and their occurrence number. For
> > that purpose I am using the snippets package [1].
> > As can be seen at the bottom of the link, first I have to create a vector
> > (is it right that words is a vector?) like below.
> >
> > > words <- c(apple=10, pie=14, orange=5, fruit=4)
> >
> > My problem is to do the same thing but create the vector from a file which
> > would contain words and their occurrence number. I would be very happy if you
> > could give me some hints.
>
>
> How is the file formatted? Can you provide a small example?
>
>
>
> The file format is
>
> "video tape"=8
> "object recognition"=45
> "object detection"=23
> "vhs tape"=2
>
> But I can change it if needed with bash scripting.
A CSV might be more universal, but this will do.
> Regards
>
OK. Save the above as 'words.txt', then from the R prompt:
words.df <- read.table("words.txt", sep="=")   # columns V1 (words), V2 (counts)
words.vec <- words.df$V2                       # the counts
names(words.vec) <- words.df$V1                # label each count with its word
Then use words.vec with the snippets::cloud function. I wasn't able to
install the snippets package and test the cloud function, because I am
still using R 2.13.0-alpha.
read.table returns what R calls a 'data frame': a collection of records
(rows) over some number of fields (columns). It resembles a matrix, except
that different fields may hold values of different types. In the example
above, the data frame returned by read.table has two fields, given the
default names 'V1' and 'V2' because the file has no header line. The R
expression 'words.df$V2' extracts the 'V2' field of words.df, which is a
vector. The last expression sets the names of words.vec from the 'V1'
field of words.df.
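
To make the data frame idea concrete, here is a minimal self-contained
sketch. It assumes the four sample lines you posted and writes them to a
temporary file (a stand-in for 'words.txt'); stringsAsFactors=FALSE is my
addition so that V1 is read as plain character strings.

```r
# Sample data in the "word"=count format from the original post,
# written to a temporary file used here only for illustration
tmp <- tempfile(fileext = ".txt")
writeLines(c('"video tape"=8',
             '"object recognition"=45',
             '"object detection"=23',
             '"vhs tape"=2'), tmp)

# Split each line on '='; the double quotes keep "video tape" as one field
words.df <- read.table(tmp, sep = "=", stringsAsFactors = FALSE)
str(words.df)   # one row per line, fields V1 (character) and V2 (integer)

# Build the named vector: counts as values, words as names
words.vec <- words.df$V2
names(words.vec) <- words.df$V1
print(words.vec)
```

An element can then be looked up by name, e.g. words.vec["vhs tape"].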
>
> > Moreover, to understand the format of the file to be inserted I write the
> > vector words to a file.
> >
> > > write(words, file="words.txt")
> >
> > However, the file words.txt contains only the values but not the
> > names (apple, pie etc.).
> >
> > $ cat words.txt
> > 10 14 5 4
> >
> > It seems that I have to understand more about the data types in R.
> >
> > Thanks.
> > PH
> >
> > http://www.rforge.net/doc/packages/snippets/cloud.html
> >
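
On the write() question: write() only outputs the values of the vector, so
the names are dropped. A minimal sketch of one way to keep them, assuming
you want the same "name"=value layout as your input file; the temporary
output path is only for illustration.

```r
words <- c(apple = 10, pie = 14, orange = 5, fruit = 4)
out <- tempfile(fileext = ".txt")   # illustrative output path

# Put names and values side by side, then write one name=value line each.
# Character (or factor) columns are quoted by default; numbers are not.
write.table(data.frame(names(words), words), file = out, sep = "=",
            row.names = FALSE, col.names = FALSE)

readLines(out)   # inspect what was written
```

A file written this way can be read back with read.table(out, sep="=").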