[R] Problems with Merge
Prof Brian Ripley
ripley at stats.ox.ac.uk
Tue Jun 5 14:38:24 CEST 2007
Take a look at the help for merge(): in all the examples by.x is a
character string, not a one-column data frame which is what
rhistory["Date"] would appear to be.
Please note the trailer of this messsage.
On Tue, 5 Jun 2007, Patnaik, Tirthankar wrote:
> Hi,
> I have a history dataset, a matrix with about 1590 obs, and 242 cols,
> and I need to update this matrix with an 'update' matrix that has about
> 30 rows, and roughly similar number of columns as the history ds (but
> not necessarily equal). The update dataset is read from an Excel ODBC
> connection. When I try and merge these datasets, I get counter-intuitive
> results.
>
> library(RODBC)
> chn <- odbcConnectExcel(UpdateFile)
> sqlTables(chn)
> UpdateData <- sqlFetch(chn,"MCap243")
> colnames(UpdateData) <- gsub("#",".",colnames(UpdateData))
> close(chn)
> # specify just how many rows we need from the Update file. We'd only
> read five
> # rows at a time.
> UpdateRows = 20
> UpdateData <- UpdateData[1:UpdateRows,]
>
> # Delete Unwanted stocks.
> UpdateData <- UpdateData[,!names(UpdateData) %in% ToBeDeleted]
> x <- tail(UpdateData[c("Date","ABAN.BO")],n=50)
> print(x)
>
>
> Gives x as:
>
> Date ABAN.BO
> 1 2007-04-30 96448.40
> 2 2007-05-01 96448.40
> 3 2007-05-02 96448.40
> 4 2007-05-03 96300.44
> 5 2007-05-04 93718.52
> 6 2007-05-05 93718.52
> 7 2007-05-06 93718.52
> 8 2007-05-07 92743.82
> 9 2007-05-08 90374.60
> 10 2007-05-09 89126.18
> 11 2007-05-10 87082.47
> 12 2007-05-11 85493.73
> 13 2007-05-12 85493.73
> 14 2007-05-13 85493.73
> 15 2007-05-14 85033.21
> 16 2007-05-15 89209.41
> 17 2007-05-16 89089.19
> 18 2007-05-17 90472.62
> 19 2007-05-18 90326.51
> 20 2007-05-19 90326.51
>
> But when I merge this file with the history dataset, I get the dates
> misaligned by one row.
>
> whistory <- merge(rhistory,UpdateData,by.x=rhistory["Date"],all=TRUE)
>
> tail(WHist4[c("Date","ABAN.BO")],n=30)
>
> Date ABAN.BO
> 1581 2007-04-19 83632.60
> 1582 2007-04-20 85942.00
> 1583 2007-04-23 88244.00
> 1584 2007-04-24 90309.50
> 1585 2007-04-25 92048.00
> 1586 2007-04-26 92051.70
> 1587 2007-04-27 95863.10
> 1588 2007-04-29 96448.40
> 1589 2007-04-30 96448.40
> 1590 2007-04-30 96343.40
> 1591 2007-05-01 96448.40
> 1592 2007-05-02 96300.44
> 1593 2007-05-03 93718.52
> 1594 2007-05-03 96195.60
> 1595 2007-05-04 93718.52
> 1596 2007-05-04 93616.50
> 1597 2007-05-05 93718.52
> 1598 2007-05-06 92743.82
> 1599 2007-05-07 90374.60
> 1600 2007-05-08 89126.18
> 1601 2007-05-09 87082.47
> 1602 2007-05-10 85493.73
> 1603 2007-05-11 85493.73
> 1604 2007-05-12 85493.73
> 1605 2007-05-13 85033.21
> 1606 2007-05-14 89209.41
> 1607 2007-05-15 89089.19
> 1608 2007-05-16 90472.62
> 1609 2007-05-17 90326.51
> 1610 2007-05-18 90326.51
>
> Any reasons why the dates are shifted by one date? Am I missing some
> parameters in the merge statement?
>
> TIA and best,
> -Tir
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
More information about the R-help
mailing list