[R] creating a vector from a file

Matt Shotwell matt at biostatmatt.com
Tue May 31 16:44:40 CEST 2011


On Tue, 2011-05-31 at 16:19 +0200, heimat los wrote:
> On Tue, May 31, 2011 at 4:12 PM, Matt Shotwell <matt at biostatmatt.com>
> wrote:
>         On Tue, 2011-05-31 at 15:36 +0200, heimat los wrote:
>         > Hello all,
>         > I am new to R and my question should be trivial. I need to
>         > create a word cloud from a txt file containing the words and
>         > their occurrence number. For that purpose I am using the
>         > snippets package [1].
>         > As can be seen at the bottom of the link, first I have to
>         > create a vector (is it right that words is a vector?) like
>         > below.
>         >
>         > > words <- c(apple=10, pie=14, orange=5, fruit=4)
>         >
>         > My problem is to do the same thing but create the vector
>         > from a file which would contain words and their occurrence
>         > number. I would be very happy if you could give me some
>         > hints.
>         
>         
>         How is the file formatted? Can you provide a small example?
>         
>         
> 
> The file format is
> 
> "video tape"=8
> "object recognition"=45
> "object detection"=23
> "vhs tape"=2
> 
> But I can change it if needed with bash scripting.

A CSV might be more universal, but this will do.
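
For comparison, a CSV version might look like the sketch below. The
header names 'word' and 'count' and the temporary file are my own
assumptions for illustration, not something your file has to use.

```r
# Hypothetical CSV layout of the same data, with an assumed header row.
csv.path <- tempfile(fileext = ".csv")
writeLines(c('word,count',
             '"video tape",8',
             '"object recognition",45',
             '"object detection",23',
             '"vhs tape",2'), csv.path)

# read.csv uses "," as the separator and takes column names from the header.
words.csv <- read.csv(csv.path, stringsAsFactors = FALSE)

# setNames attaches the phrases as names to the counts in one step.
csv.vec <- setNames(words.csv$count, words.csv$word)
```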

> Regards
> 

OK. Save the above as 'words.txt', then from the R prompt:

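# Read the file as two columns, splitting each line at '='
words.df <- read.table("words.txt", sep="=")
# The counts become the vector values; the phrases become its names
words.vec <- words.df$V2
names(words.vec) <- as.character(words.df$V1)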

Then use words.vec with the snippets::cloud function. I wasn't able to
install the snippets package and test the cloud function, because I am
still using R 2.13.0-alpha.

read.table returns what R calls a 'data frame': a collection of records
(rows) over some number of fields (columns). It's like a matrix, except
that each field may hold values of a different type. In the example
above, the data frame returned by read.table has two fields, named 'V1'
and 'V2' by default. The R expression 'words.df$V2' extracts the 'V2'
field of words.df, which is a vector of the counts. The last expression
sets the names of words.vec from the 'V1' field of words.df, so that
each count is labelled with its phrase.
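
If you want to try this without creating words.txt by hand, here is a
self-contained sketch of the same steps. It writes your four sample
lines to a temporary file first; the temporary file and the
stringsAsFactors=FALSE argument are my additions, the rest is as above.

```r
# Write the four sample lines to a temporary file so the sketch stands alone.
path <- tempfile(fileext = ".txt")
writeLines(c('"video tape"=8',
             '"object recognition"=45',
             '"object detection"=23',
             '"vhs tape"=2'), path)

# sep="=" splits each line at the equals sign; quoted phrases stay whole.
words.df <- read.table(path, sep = "=", stringsAsFactors = FALSE)

# Inspect the structure: V1 holds the phrases, V2 holds the counts.
str(words.df)

# Build the named vector: counts as values, phrases as names.
words.vec <- words.df$V2
names(words.vec) <- words.df$V1
print(words.vec)
```

The result should have the same shape as your c(apple=10, pie=14, ...)
example: a numeric vector whose names are the phrases.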

>  
>         > Moreover, to understand the format of the file to be
>         > inserted I write the vector words to a file.
>         >
>         > > write(words, file="words.txt")
>         >
>         > However, the file words.txt contains only the values but not
>         > the names (apple, pie etc.).
>         >
>         > $ cat words.txt
>         > 10 14 5 4
>         >
>         > It seems that I have to understand more about the data types
>         > in R.
>         >
>         > Thanks.
>         > PH
>         >
>         > [1] http://www.rforge.net/doc/packages/snippets/cloud.html
