[R] Basic data question
David Winsemius
dwinsemius at comcast.net
Thu Oct 14 06:00:26 CEST 2010
On Oct 13, 2010, at 11:52 PM, Santosh Srinivas wrote:
> I have a question about the output given below after running a few
> lines of code. Surely a 101 query!
>
> MF_Data <- read.csv("MF_Data_F.txt", header = F, sep="|")
> temp <- head(MF_Data) #Get the sample Data
> # V1, V4, V6 are the col names .. to get the relevant data
> temp1 <- subset(temp, select = c(V1, V4, V6))
> names(temp1) <- c('Ticker', 'Price','Date') #Adjusted column names
>
> Now as expected, I get:
>> temp1
> Ticker Price Date
> 1 106270 10.3287 01-Apr-2008
> 2 106269 10.3287 01-Apr-2008
> 3 102767 12.6832 01-Apr-2008
> 4 102766 10.5396 01-Apr-2008
> 5 102855 9.7833 01-Apr-2008
> 6 102856 12.1485 01-Apr-2008
>
> BUT, for the below:
> temp1$Price
> [1] 10.3287 10.3287 12.6832 10.5396 9.7833 12.1485
> 439500 Levels: -101.2358 -102.622 -2171.1276 -6796.4926 -969.5193 ... Repurchase Price
>
> What is this line? "439500 Levels: -101.2358 -102.622 -2171.1276
> -6796.4926 -969.5193 ... Repurchase Price"??
>
It tells you that the Price column was constructed as a factor. At
least one item in that column of the input data couldn't be coerced to
numeric, so the column was read as character, and the default
stringsAsFactors setting of TRUE then turned it into a factor rather
than leaving it as numeric (or character). The "Repurchase Price"
level in your output suggests a stray text entry, possibly a header
line, somewhere in the file. Your Date column is surely a factor
variable as well.
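
To see which entries prevented the numeric conversion, something along
these lines may help. It is only a sketch: the `price` vector below is a
made-up stand-in for your Price column, not your actual data.

```r
# Hypothetical stand-in for the Price column: numbers plus one stray text entry
price <- factor(c("10.3287", "12.6832", "Repurchase Price", "9.7833"))

# as.numeric() applied directly to a factor returns the internal level codes,
# not the values, so go through as.character() first
price_chr <- as.character(price)
price_num <- suppressWarnings(as.numeric(price_chr))

# Entries that became NA are the ones that could not be read as numbers
bad <- price_chr[is.na(price_num)]
print(bad)
```

Once you know what the offending entries are, you can decide whether to
drop those rows or fix the input file.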
You may want to look at colClasses in the read.table help page.
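
As a rough illustration of the colClasses idea, the sketch below reads
every column as character and converts Price afterwards. The three-column
sample text and the embedded header line are assumptions made up for the
example; your real file has more columns and may be dirty in other ways.

```r
# Hypothetical pipe-delimited sample; the second line mimics a stray header row
sample_text <- paste(
  "106270|10.3287|01-Apr-2008",
  "Scheme Code|Repurchase Price|Date",
  "102767|12.6832|01-Apr-2008",
  sep = "\n"
)

# colClasses = "character" keeps every column as plain text, so no factors
mf <- read.csv(text = sample_text, header = FALSE, sep = "|",
               colClasses = "character",
               col.names = c("Ticker", "Price", "Date"))

# Convert Price to numeric and drop the rows that failed to convert
mf$Price <- suppressWarnings(as.numeric(mf$Price))
mf <- mf[!is.na(mf$Price), ]
str(mf)
```

Asking for colClasses = "numeric" on a column that still contains text
would make the read fail with an error, which is why this sketch reads
as character first and converts afterwards.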
The read.zoo function in the zoo package may have better behavior for
this sort of data input task.
> Many thanks for the help.
>
> Santosh
--
David.