[R] Accelerating binRead
Philippe de Rochambeau
phiroc at free.fr
Sat Sep 17 20:45:52 CEST 2016
Hi Jim,
this is exactly the answer I was looking for. Many thanks. I didn’t know R had a pack function, as in Perl.
To answer your earlier question, I am trying to update legacy code that reads a binary file of unknown size over a network, slices it up into rows, each containing an integer, an integer, a long, a short, a float and a float, and stuffs the rows into a matrix.
Best regards,
Philippe
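
For readers following the thread, the record layout described above can also be decoded in base R without a per-record loop. This is a minimal sketch, not the code used in the thread: `decode_records` and `rec_len` are made-up names, and the layout (little-endian, 26 bytes per record, the "long" stored as two 32-bit words with the low word first) is an assumption taken from the quoted function further down.

```r
# Sketch only: `decode_records` and `rec_len` are illustrative names.
# Assumed little-endian layout, 26 bytes per record:
#   int32, int32, 64-bit unsigned (low word first), int16, float32, float32
decode_records <- function(bytes, rec_len = 26L) {
  n <- length(bytes) %/% rec_len                  # complete records only
  m <- matrix(bytes[seq_len(n * rec_len)], nrow = rec_len)  # one column per record
  field <- function(rows) as.vector(m[rows, , drop = FALSE])
  int32 <- function(rows)
    readBin(field(rows), integer(), n = n, size = 4L, endian = "little")
  # An unsigned 32-bit value does not fit in an R integer, so read it as two
  # unsigned 16-bit halves and recombine them as doubles.
  uint32 <- function(rows) {
    h <- readBin(field(rows), integer(), n = 2L * n, size = 2L,
                 signed = FALSE, endian = "little")
    h[c(TRUE, FALSE)] + h[c(FALSE, TRUE)] * 65536
  }
  float32 <- function(rows)
    readBin(field(rows), numeric(), n = n, size = 4L, endian = "little")

  cbind(
    periodIndex = int32(1:4),
    eventId     = int32(5:8),
    eventDate   = (uint32(13:16) * 2^32 + uint32(9:12)) / 1000,
    repNum      = readBin(field(17:18), integer(), n = n, size = 2L,
                          endian = "little"),
    exp         = float32(19:22),
    loss        = float32(23:26)
  )
}

# Usage: build two identical records by hand and decode them.
rec <- c(
  writeBin(7L,   raw(), size = 4, endian = "little"),
  writeBin(42L,  raw(), size = 4, endian = "little"),
  as.raw(c(0x00, 0x00, 0x00, 0x80)),   # low word  = 2^31
  as.raw(c(0x01, 0x00, 0x00, 0x00)),   # high word = 1
  writeBin(3L,   raw(), size = 2, endian = "little"),
  writeBin(1.5,  raw(), size = 4, endian = "little"),
  writeBin(0.25, raw(), size = 4, endian = "little")
)
out <- decode_records(c(rec, rec))
```

Each field is decoded by a single readBin call over all records, so the cost of calling readBin and rbind once per row is avoided. Trailing bytes that do not make up a full record are ignored in this sketch.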
> On 17 Sep 2016, at 20:38, jim holtman <jholtman at gmail.com> wrote:
>
> Here is an example of how to do it:
>
> x <- 1:10 # integer values
> xf <- seq(1.0, 2, by = 0.1) # floating point
>
> setwd("d:/temp")
>
> # create file to write to
> output <- file('integer.bin', 'wb')
> writeBin(x, output) # write integer
> writeBin(xf, output) # write reals
> close(output)
>
>
> library(pack)
> library(readr)
>
> # read all the data at once
> allbin <- read_file_raw('integer.bin')
>
> # decode the data into a list
> (result <- unpack("V V V V V V V V V V d d d d d d d d d d", allbin))
>
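
The same round trip can be checked with base R alone, which may help if the pack and readr packages are not installed. This is a sketch under a few assumptions: it writes to a temporary file instead of d:/temp, and it relies on the default sizes used by writeBin (4 bytes per integer, 8 bytes per double). Note that seq(1.0, 2, by = 0.1) has 11 elements, so the file holds 40 + 88 = 128 bytes.

```r
x  <- 1:10                         # 10 integers, 4 bytes each
xf <- seq(1.0, 2, by = 0.1)        # 11 doubles, 8 bytes each

tmp <- tempfile(fileext = ".bin")  # temporary file, an assumption of this sketch
output <- file(tmp, "wb")
writeBin(x, output)
writeBin(xf, output)
close(output)

# Read all the bytes at once, then decode each block from the raw vector.
allbin <- readBin(tmp, raw(), n = file.size(tmp))
n_int  <- length(x)
ints   <- readBin(allbin[seq_len(4 * n_int)],  integer(), n = n_int,      size = 4)
reals  <- readBin(allbin[-seq_len(4 * n_int)], numeric(), n = length(xf), size = 8)
unlink(tmp)
```

Because readBin accepts a raw vector in place of a connection, the file is touched only once and the decoding happens in memory.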
>
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenismail at gmail.com> wrote:
> I noticed the same issue but didn't care much :)
>
> On Sat, Sep 17, 2016, 18:01 jim holtman <jholtman at gmail.com> wrote:
> Your example was not reproducible. Also how do you "break" out of the
> "while" loop?
>
>
>
> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phiroc at free.fr> wrote:
>
> > Hello,
> > the following function, which stores numeric values extracted from a
> > binary file, into an R matrix, is very slow, especially when the said file
> > is several MB in size.
> > Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
> > latter case is true, how do you « readBin » in Rcpp (I’m a total Rcpp
> > newbie)?
> > Many thanks.
> > Best regards,
> > phiroc
> >
> >
> > -------------
> >
> > # inputPath is something like
> > # http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
> >
> > PLTreader <- function(inputPath){
> >     URL <- file(inputPath, "rb")
> >     PLT <- matrix(nrow=0, ncol=6)
> >     compteurDePrints = 0
> >     compteurDeLignes <- 0
> >     maxiPrints = 5
> >     displayData <- FALSE
> >     while (TRUE) {
> >         periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >         eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
> >         dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >         dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
> >         if (dword1 < 0) {
> >             dword1 = dword1 + 2^32 # map the signed 32-bit value to unsigned
> >         }
> >         eventDate = (dword2*2^32 + dword1)/1000
> >         repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
> >         exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
> >         loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
> >         PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
> >     } # end while
> >     close(URL)
> >     return(PLT)
> > }
> >
> > ----------------
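
The quoted loop has no exit condition. One possible way to stop at the end of the stream, sketched here under the assumption that the connection is opened in blocking binary mode, is to read raw bytes in chunks until readBin returns a zero-length vector. `read_all_raw` and `chunk` are hypothetical names, not part of the original code.

```r
# Sketch only: read every byte from an open binary connection of unknown
# length. readBin() returns a zero-length vector once no data is left.
read_all_raw <- function(con, chunk = 65536L) {
  parts <- list()
  repeat {
    buf <- readBin(con, raw(), n = chunk)
    if (length(buf) == 0L) break              # end of stream reached
    parts[[length(parts) + 1L]] <- buf        # keep the chunk, combine later
  }
  if (length(parts) == 0L) raw(0) else unlist(parts, use.names = FALSE)
}

# Usage with a local temporary file standing in for the network resource.
tmp <- tempfile()
bytes <- as.raw(sample(0:255, 150000, replace = TRUE))
writeBin(bytes, tmp)
con <- file(tmp, "rb")
got <- read_all_raw(con)
close(con)
unlink(tmp)
```

Collecting the chunks in a list and combining them once at the end avoids the repeated copying that growing a matrix with rbind inside the loop causes. The resulting raw vector can then be decoded in one pass.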
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.