[R] Parsing "back" to API structure

jim holtman jholtman at gmail.com
Thu Sep 13 02:54:15 CEST 2012


This is close. It does quote the header names, but it produces
the same data frame when read back in:

> RAW.API <- structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\",\"John\",\"2012-09-02\",\"abc\",\"123\",1\n\"01\",\"event_3_arm_1\",\"John\",\"2012-09-10\",\"\",\"\",2\n\"02\",\"event_1_arm_1\",\"Mary\",\"1951-09-10\",\"def\",\"456\",2\n\"02\",\"event_2_arm_1\",\"Mary\",\"1978-09-12\",\"\",\"\",2\n", "`Content-Type`" = structure(c("text/html", "utf-8"), .Names = c("", "charset")))
> x <- read.csv(textConnection(RAW.API), as.is = TRUE)
> x
  id     event_arm name        dob pushed_text pushed_calc complete
1  1 event_1_arm_1 John 1979-05-01                      NA        2
2  1 event_2_arm_1 John 2012-09-02         abc         123        1
3  1 event_3_arm_1 John 2012-09-10                      NA        2
4  2 event_1_arm_1 Mary 1951-09-10         def         456        2
5  2 event_2_arm_1 Mary 1978-09-12                      NA        2
>
> # now put it back into a string; write.csv quotes character fields and the header
> write.csv(x, textConnection('output', 'w'), row.names = FALSE)
> x.out <- paste(output, collapse = '\n')
> # read it back in to show it is the same
> x.in <- read.csv(textConnection(x.out), as.is = TRUE)
> x.in
  id     event_arm name        dob pushed_text pushed_calc complete
1  1 event_1_arm_1 John 1979-05-01                      NA        2
2  1 event_2_arm_1 John 2012-09-02         abc         123        1
3  1 event_3_arm_1 John 2012-09-10                      NA        2
4  2 event_1_arm_1 Mary 1951-09-10         def         456        2
5  2 event_2_arm_1 Mary 1978-09-12                      NA        2
>
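
If an exact round trip is needed, here is one possible sketch. It is not
tested against the real API, only against the example string. The version
above is not identical() to RAW.API because read.csv turns id "01" into 1
and the empty pushed_calc into NA, and write.csv quotes the header. The
sketch assumes the first six columns can stay character and that the
header should be written unquoted; the name API.back is taken from the
original question, the other variable names are made up for illustration.

```r
# Example input, same content as RAW.API in the thread
RAW.API <- structure(
  paste0(
    'id,event_arm,name,dob,pushed_text,pushed_calc,complete\n',
    '"01","event_1_arm_1","John","1979-05-01","","",2\n',
    '"01","event_2_arm_1","John","2012-09-02","abc","123",1\n',
    '"01","event_3_arm_1","John","2012-09-10","","",2\n',
    '"02","event_1_arm_1","Mary","1951-09-10","def","456",2\n',
    '"02","event_2_arm_1","Mary","1978-09-12","","",2\n'
  ),
  "`Content-Type`" = structure(c("text/html", "utf-8"), .Names = c("", "charset"))
)

# Assumption: keep the six quoted columns as character so "01" and ""
# survive unchanged; only 'complete' is read as integer
df <- read.csv(text = RAW.API,
               colClasses = c(rep("character", 6), "integer"))

# Write the data rows only; quote = TRUE quotes character columns
# and leaves the integer column unquoted
con <- textConnection(NULL, "w")
write.table(df, con, sep = ",", quote = TRUE,
            row.names = FALSE, col.names = FALSE)
body_lines <- textConnectionValue(con)
close(con)

# Build the unquoted header by hand and restore the trailing newline
header <- paste(names(df), collapse = ",")
API.back <- paste0(paste(c(header, body_lines), collapse = "\n"), "\n")

# Copy the Content-Type attribute so identical() can succeed
attributes(API.back) <- attributes(RAW.API)

identical(RAW.API, API.back)
```

Any values written to pushed_text or pushed_calc before the write step
would need to be character for the quoting to match.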


On Wed, Sep 12, 2012 at 8:21 PM, Eric Fail <eric.fail at gmx.us> wrote:
> Dear R experts,
>
> I'm reading data from an online database via an API, and it is delivered in this messy comma-separated structure,
>
>> RAW.API <- structure("id,event_arm,name,dob,pushed_text,pushed_calc,complete\n\"01\",\"event_1_arm_1\",\"John\",\"1979-05-01\",\"\",\"\",2\n\"01\",\"event_2_arm_1\",\"John\",\"2012-09-02\",\"abc\",\"123\",1\n\"01\",\"event_3_arm_1\",\"John\",\"2012-09-10\",\"\",\"\",2\n\"02\",\"event_1_arm_1\",\"Mary\",\"1951-09-10\",\"def\",\"456\",2\n\"02\",\"event_2_arm_1\",\"Mary\",\"1978-09-12\",\"\",\"\",2\n", "`Content-Type`" = structure(c("text/html", "utf-8"), .Names = c("", "charset")))
>
> I have this script that nicely parses it into a data frame,
>
>> (df <- read.table(file = textConnection(RAW.API), header = TRUE,
>>     sep = ",", na.strings = "", stringsAsFactors = FALSE))
>>   id     event_arm name        dob pushed_text pushed_calc complete
>> 1  1 event_1_arm_1 John 1979-05-01        <NA>          NA        2
>> 2  1 event_2_arm_1 John 2012-09-02         abc         123        1
>> 3  1 event_3_arm_1 John 2012-09-10        <NA>          NA        2
>> 4  2 event_1_arm_1 Mary 1951-09-10         def         456        2
>> 5  2 event_2_arm_1 Mary 1978-09-12        <NA>          NA        2
>
> I then do some calculations and write the results to pushed_text and pushed_calc, after which I need to format the data back into the messy comma-separated structure it came in.
>
> I imagine something like this,
>
>> API.back <- `some magic command`(df, ...)
>
>> identical(RAW.API, API.back)
>> [1] TRUE
>
> That is, I need some command that can convert the data frame I made, df, back to the structure the raw API object, RAW.API, came in.
>
> Any help would be appreciated.
>
> Thanks for reading.
>
> Eric



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



