[R] Accelerating binRead
jim holtman
jholtman at gmail.com
Sat Sep 17 20:38:23 CEST 2016
Here is an example of how to do it:
x <- 1:10 # integer values
xf <- seq(1.0, 2, by = 0.1) # floating point
setwd("d:/temp")
# create file to write to
output <- file('integer.bin', 'wb')
writeBin(x, output) # write integer
writeBin(xf, output) # write reals
close(output)
library(pack)
library(readr)
# read all the data at once
allbin <- read_file_raw('integer.bin')
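# decode the data into a list: "V" is an unsigned 32-bit little-endian
# integer, "d" is a double; xf holds 11 values, so the template needs 11 "d"s
(result <- unpack("V V V V V V V V V V d d d d d d d d d d d", allbin))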
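If you would rather not depend on extra packages, the same read-everything-then-decode idea can probably be done with base R alone, because readBin() also accepts a raw vector as its input. The following is only a sketch under that assumption; the temporary file name is made up for the example, and the layout (10 integers followed by 11 doubles) mirrors the file written above.

```r
# Same layout as in the example above: 10 integers followed by 11 doubles.
x  <- 1:10
xf <- seq(1.0, 2, by = 0.1)

path <- tempfile(fileext = ".bin")   # hypothetical scratch file
con <- file(path, "wb")
writeBin(x, con, endian = "little")
writeBin(xf, con, endian = "little")
close(con)

# Read the whole file as raw bytes in a single call.
bytes <- readBin(path, what = "raw", n = file.size(path))

# Decode each block with one readBin() call on the raw vector.
int_bytes <- 4L * length(x)
ints <- readBin(bytes[seq_len(int_bytes)], what = "integer",
                n = length(x), size = 4, endian = "little")
dbls <- readBin(bytes[-seq_len(int_bytes)], what = "double",
                n = length(xf), size = 8, endian = "little")

# The decoded values should match what was written.
stopifnot(identical(ints, x), identical(dbls, xf))
```

The point is that the n argument lets one call decode a whole block, so no loop over single values is needed.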
Jim Holtman
Data Munger Guru
What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
On Sat, Sep 17, 2016 at 11:04 AM, Ismail SEZEN <sezenismail at gmail.com>
wrote:
> I noticed the same issue but didn't care much :)
>
> On Sat, Sep 17, 2016, 18:01 jim holtman <jholtman at gmail.com> wrote:
>
>> Your example was not reproducible. Also how do you "break" out of the
>> "while" loop?
>>
>> On Sat, Sep 17, 2016 at 8:05 AM, Philippe de Rochambeau <phiroc at free.fr>
>> wrote:
>>
>> > Hello,
>> > the following function, which stores numeric values extracted from a
>> > binary file in an R matrix, is very slow, especially when the file
>> > is several MB in size.
>> > Should I rewrite the function in inline C or in C/C++ using Rcpp? If the
>> > latter, how do you "readBin" in Rcpp (I’m a total Rcpp newbie)?
>> > Many thanks.
>> > Best regards,
>> > phiroc
>> >
>> >
>> > -------------
>> >
>> > # inputPath is something like
>> > # http://myintranet/getData?pathToFile=/usr/lib/xxx/yyy/data.bin
>> >
>> > PLTreader <- function(inputPath){
>> >   URL <- file(inputPath, "rb")
>> >   PLT <- matrix(nrow=0, ncol=6)
>> >   compteurDePrints = 0
>> >   compteurDeLignes <- 0
>> >   maxiPrints = 5
>> >   displayData <- FALSE
>> >   while (TRUE) {
>> >     periodIndex <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>> >     eventId <- readBin(URL, integer(), size=4, n=1, endian="little") # int (4 bytes)
>> >     dword1 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>> >     dword2 <- readBin(URL, integer(), size=4, signed=FALSE, n=1, endian="little") # int
>> >     if (dword1 < 0) {
>> >       dword1 = dword1 + 2^32-1;
>> >     }
>> >     eventDate = (dword2*2^32 + dword1)/1000
>> >     repNum <- readBin(URL, integer(), size=2, n=1, endian="little") # short (2 bytes)
>> >     exp <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes, strangely enough, would expect 8)
>> >     loss <- readBin(URL, numeric(), size=4, n=1, endian="little") # float (4 bytes)
>> >     PLT <- rbind(PLT, c(periodIndex, eventId, eventDate, repNum, exp, loss))
>> >   } # end while
>> >   return(PLT)
>> >   close(URL)
>> > }
>> >
>> > ----------------
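For the records read by the function quoted above (4 + 4 + 4 + 4 + 2 + 4 + 4 = 26 bytes each), the read-once-then-decode approach might look like the sketch below. It is an illustration under assumptions, not a tested replacement: it assumes the file has no header, no padding between fields, and a whole number of little-endian records. The names decode_plt, field, uint32, u32 and make_rec are made up for this example. Since readBin() has no unsigned 4-byte mode, the two dwords are read as unsigned 16-bit halves and combined as doubles.

```r
# Decode fixed-width 26-byte records without a per-record loop.
# Assumed layout (little-endian, no padding):
#   int32 periodIndex | int32 eventId | uint32 dword1 | uint32 dword2 |
#   int16 repNum | float32 exp | float32 loss
decode_plt <- function(bytes, rec_size = 26L) {
  stopifnot(is.raw(bytes), length(bytes) %% rec_size == 0)
  n <- length(bytes) %/% rec_size
  m <- matrix(bytes, nrow = rec_size)   # column j holds the bytes of record j

  # Decode one field (a range of byte rows) for all records in a single call.
  field <- function(rows, what, size, per_rec = 1L, ...) {
    readBin(as.vector(m[rows, , drop = FALSE]), what = what,
            n = n * per_rec, size = size, endian = "little", ...)
  }

  # Unsigned 32-bit value built from two unsigned 16-bit halves (low, high).
  uint32 <- function(rows) {
    h <- matrix(field(rows, "integer", 2L, per_rec = 2L, signed = FALSE),
                nrow = 2L)
    h[1L, ] + h[2L, ] * 65536
  }

  cbind(
    periodIndex = field(1:4,   "integer", 4L),
    eventId     = field(5:8,   "integer", 4L),
    eventDate   = (uint32(13:16) * 2^32 + uint32(9:12)) / 1000,
    repNum      = field(17:18, "integer", 2L),
    exp         = field(19:22, "numeric", 4L),
    loss        = field(23:26, "numeric", 4L)
  )
}

# Build two synthetic records in memory to exercise the decoder.
u32 <- function(v) as.raw((v %/% 256^(0:3)) %% 256)  # little-endian bytes
make_rec <- function(period, event, ms, rep, e, l) c(
  writeBin(as.integer(period), raw(), size = 4, endian = "little"),
  writeBin(as.integer(event),  raw(), size = 4, endian = "little"),
  u32(ms %% 2^32), u32(ms %/% 2^32),
  writeBin(as.integer(rep), raw(), size = 2, endian = "little"),
  writeBin(e, raw(), size = 4, endian = "little"),
  writeBin(l, raw(), size = 4, endian = "little")
)
bytes <- c(make_rec(1, 10, 1474137503000, 3, 0.5, 2.25),
           make_rec(2, 11, 1474137504000, 4, 1.5, 8))
plt <- decode_plt(bytes)
print(plt)
```

For a real file, the raw vector could come from something like readBin(path, "raw", n = file.size(path)). This also avoids growing the matrix with rbind() on every record, which is likely a large part of the slowness.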