[R] Two last questions: about output
Ted Byers
r.ted.byers at gmail.com
Thu Oct 16 06:44:37 CEST 2008
Thanks Gabor,
To be clear, would something like testframe$est[[i]] <- fp$estimate be
valid within my loop, as in (assuming I created testframe before the
loop):
for (i in 1:length(V4) ) {
x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="");
y = x[,1];
fp = fitdistr(y,"exponential");
print(c(V1[[i]],V2[[i]],V3[[i]],fp$estimate,fp$sd))
testframe$est[[i]] <- fp$estimate
testframe$sd[[i]] <- fp$sd
}
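
For comparison, here is a minimal sketch of an alternative: fill two preallocated vectors inside the loop and attach them as columns afterwards. It rests on assumptions: the data are simulated with rexp() instead of being read from the files named in V4, and the names samples, est and sdv are invented for illustration only.

```r
library(MASS)  # provides fitdistr()

set.seed(1)
# Hypothetical stand-in for the per-file data: simulated samples replace the
# CSV files named in V4, so the sketch runs on its own.
samples <- list(rexp(100, rate = 0.07),
                rexp(150, rate = 0.06),
                rexp(120, rate = 0.09))

# First three columns, kept as integers (note the L suffix).
testframe <- data.frame(mid  = c(251L, 251L, 251L),
                        year = c(2008L, 2008L, 2008L),
                        week = c(18L, 19L, 20L))

n   <- length(samples)
est <- numeric(n)  # one preallocated slot per fit
sdv <- numeric(n)

for (i in seq_len(n)) {
  fp <- fitdistr(samples[[i]], "exponential")
  est[i] <- fp$estimate[["rate"]]  # fitted rate
  sdv[i] <- fp$sd[["rate"]]        # its standard error
}

# Attach the results as new columns once the loop has finished.
testframe$rate <- est
testframe$sd   <- sdv
print(testframe)
```

Since the length of V4 is known before the loop starts, the vectors can be sized up front, so nothing like push_back is needed here.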
Thanks
Ted
On Thu, Oct 16, 2008 at 12:08 AM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> testframe$newvar <- ...whatever...
> (or see ?transform for another way)
> adds a new column to the data frame. The table does not
> have to pre-exist in your MySQL database and you don't need
> a create statement. However, if the table does pre-exist, the columns
> of your data frame and those of the database table should have the
> same names in the same order, and you should use dbWriteTable(..., append = TRUE).
>
>
> On Wed, Oct 15, 2008 at 11:54 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
>> Thanks Gabor,
>>
>> I get how to make a frame using existing vectors. In my example, the
>> following puts my first three columns into a frame (and displays it):
>>
>>> testframe <- data.frame(mid=V1,year=V2,week=V3)
>>> testframe
>> mid year week
>> 1 251 2008 18
>> 2 251 2008 19
>> 3 251 2008 20
>> 4 251 2008 22
>> 5 251 2008 23
>> 6 251 2008 24
>> 7 251 2008 25
>>
>> I show the first seven of about 60 rows, and I am pleased that these values
>> appear as integers.
>>
>> But what I don't see is how to add the fp$estimate,fp$sd values
>> obtained from my analyses to vectors to form the last two columns in
>> the data frame. Is there something like a vector type, analogous to
>> the vector class std::vector from C++, that has a push_back function
>> allowing a vector to grow as new values are generated?
>>
>> And suppose I have the following table in MySQL (ignoring for the
>> moment keys and indexes):
>>
>> CREATE TABLE (
>> id INTEGER UNSIGNED NOT NULL auto_increment,
>> mid INTEGER NOT NULL,
>> y INTEGER NOT NULL,
>> w INTEGER NOT NULL,
>> rate DOUBLE NOT NULL,
>> sd DOUBLE NOT NULL,
>> process_date DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
>> ) ENGINE=InnoDB;
>>
>> How would I tell dbWriteTable() that my frame's five columns
>> correspond to mid,y,w,rate and sd in that order, and that the fields
>> id and process_date will take the appropriate default values? Or do I
>> need a temporary table, in memory, that has only the five columns, and
>> use a stored procedure to move the data to its final home?
>>
>> Thanks again,
>>
>> Ted
>>
>>
>> On Wed, Oct 15, 2008 at 9:57 PM, Gabor Grothendieck
>> <ggrothendieck at gmail.com> wrote:
>>> Put the data in an R data frame and use dbWriteTable() to
>>> write it to your MySQL database directly.
>>>
>>> On Wed, Oct 15, 2008 at 9:34 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
>>>>
>>>> Here is my little scriptlet:
>>>>
>>>> optdata =
>>>> read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
>>>> header = FALSE, na.strings="")
>>>> attach(optdata)
>>>> library(MASS)
>>>> setwd("K:\\MerchantData\\RiskModel\\AutomatedRiskModel")
>>>> for (i in 1:length(V4) ) {
>>>> x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="");
>>>> y = x[,1];
>>>> fp = fitdistr(y,"exponential");
>>>> print(c(V1[[i]],V2[[i]],V3[[i]],fp$estimate,fp$sd))
>>>> }
>>>>
>>>>
>>>> And here are the first few lines of output:
>>>>
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 1.800000e+01 6.869301e-02 6.462095e-03
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 1.900000e+01 5.958023e-02 4.491029e-03
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 2.000000e+01 8.631714e-02 7.428996e-03
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 2.200000e+01 1.261538e-01 1.137491e-02
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 2.300000e+01 1.339523e-01 1.332875e-02
>>>> rate rate
>>>> 2.510000e+02 2.008000e+03 2.400000e+01 8.916084e-02 1.248501e-02
>>>>
>>>> There are only two things wrong, here.
>>>>
>>>> 1) the first three columns are integers, and are output variously as
>>>> integers, floating point numbers and, as shown here, in scientific notation.
>>>> 2) this output isn't going to a file or to my DB. This second issue isn't
>>>> much of a problem, as I think I know now how to deal with it.
>>>>
>>>> This output data is, in one sense, perfectly organized, and there is a table
>>>> with a nearly identical structure: these five columns, plus one to hold the
>>>> date on which the analysis is performed (which, of course, has a
>>>> default value of the current timestamp, handled in MySQL). If I can get
>>>> the data written to a CSV file, with the first three columns provided as
>>>> integers, I can use the DB's bulk load utility to get the data into the DB,
>>>> and this may be faster than having this scriptlet connecting directly to the
>>>> DB to insert the data (unless the DBI has a function for a bulk load that
>>>> helps here).
>>>>
>>>> Any idea how best to handle my formatting problem here?
>>>>
>>>> Thanks
>>>>
>>>> Ted
>>>
>>
>
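
On the formatting question in the original post quoted above: c() builds a single atomic vector, so the integer values are coerced to double and the whole vector is printed in one common format. The sketch below, which reuses the first row of the quoted output, shows that a data frame keeps each column's own type and that write.csv() then writes the integer columns as integers. The names printed, result, out and csv_lines are hypothetical, and the temporary file merely stands in for a real output path.

```r
# c() coerces everything to double, which is why print() picks one common
# (here scientific) format for all five values.
printed <- c(251L, 2008L, 18L, 6.869301e-02, 6.462095e-03)
is.double(printed)

# A data frame stores each column with its own type.
result <- data.frame(mid = 251L, year = 2008L, week = 18L,
                     rate = 6.869301e-02, sd = 6.462095e-03)
sapply(result, class)

# Write to a CSV file; a temporary file is an assumption made for the sketch.
out <- tempfile(fileext = ".csv")
write.csv(result, out, row.names = FALSE)

csv_lines <- readLines(out)
print(csv_lines)
```

A file written this way could then be handed to the database's bulk loader, or the same data frame could be passed to dbWriteTable() directly.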