[R] Merge the data from multiple text files

Jeff Newmiller jdnewmil at dcn.davis.ca.us
Mon Jan 7 18:04:11 CET 2019


I think it is rather presumptuous of you to think that anyone is going to write an expression optimizer for some unspecified language on the R-help mailing list. I am sure that such tasks can be handled in R, but it is non-trivial and the background needed would be very off-topic here.

On January 7, 2019 4:49:04 AM PST, Priya Arasu via R-help <r-help at r-project.org> wrote:
>Thank you, David Winsemius and David L Carlson.
>@David L Carlson, thank you for the code. I have one more issue while
>merging the files. Please advise. For example:
>In text file 1:
>A = not(B or C)
>B = A and C
>C = D
>In text file 2:
>A = not(C or D) and (D and E)
>
>So when I merge using your code, it merges A as A = not(B or C) and (D and
>E). How do I merge A as A = not(B or C or D) and (D and E)? I also
>have duplicates like A = not(B or C) and not(C or D) instead of A =
>not(B or C or D).
>Thanks,
>Priya
>
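
For the narrow case where every term being combined has the exact shape not(X or Y ...), De Morgan's law gives not(B or C) and not(C or D) = not(B or C or D), so the operands can simply be pooled and de-duplicated. The following is only a minimal sketch under that assumption: merge_not_or is a hypothetical helper (not part of base R or of the code quoted below), and it makes no attempt to handle nested parentheses, mixed operators, or anything else a general expression simplifier would need.

```r
# Hypothetical helper: merge terms of the exact form "not(X or Y or ...)".
# Assumes flat operand lists with no nested parentheses.
merge_not_or <- function(terms) {
  # Strip the surrounding "not( ... )" to get the operand list
  inner <- sub("^not[[:space:]]*\\((.*)\\)$", "\\1", trimws(terms))
  # Split each operand list on the word "or" and pool the operands
  operands <- unlist(strsplit(inner, "[[:space:]]+or[[:space:]]+"))
  operands <- unique(trimws(operands))
  paste0("not(", paste(operands, collapse = " or "), ")")
}

merge_not_or(c("not(B or C)", "not (C or D)"))
# [1] "not(B or C or D)"
```

Terms of any other shape, such as (D and E), would have to be left alone and joined with "and" as before.
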
>On Sunday, 6 January 2019 4:39 AM, David L Carlson <dcarlson at tamu.edu>
>wrote:
> 
>
>To expand on David W's answer, here is an approach to your example. If
>you have many text files, you would want to process them together
>rather than individually. You gave us two examples so I'll use those
>and read them from the console using readLines(), but you would use the
>same function to open the files on your computer:
>
>> TF1 <- readLines(n=3)
>A = not(B or C)
>B = A and C
>C = D
>> 
>> TF2 <- readLines(n=2)
>A = D and E
>B = not(D)
>> 
>> TF <- sort(c(TF1, TF2))
>> TF
>[1] "A = D and E"    "A = not(B or C)" "B = A and C"    "B = not(D)"
>[5] "C = D"
>
>Now we have combined the files into a single character vector called TF
>and sorted them. Next we need to parse them into the left and right
>hand sides. We will replace " = " with "\t" (tab) to do that:
>
>> TF.delim <- gsub(" = ", "\t", TF)
>> TF.data <- read.delim(text=TF.delim, header=FALSE, as.is=TRUE)
>> colnames(TF.data) <- c("LHS", "RHS")
>> print(TF.data, right=FALSE)
>  LHS RHS
>1 A  D and E
>2 A  not(B or C)
>3 B  A and C
>4 B  not(D)
>5 C  D
>
>TF.data is a data frame with two columns. The tricky part is to add
>surrounding parentheses to rows 1 and 3 to get your example output:
>
>> paren1 <- grepl("and", TF.data$RHS)
>> paren2 <- !grepl("\\(.*\\)", TF.data$RHS)
>> paren <- apply(cbind(paren1, paren2), 1, all)
>> TF.data$RHS[paren] <- paste0("(", TF.data$RHS[paren], ")")
>> print(TF.data, right=FALSE)
>  LHS RHS
>1 A  (D and E)
>2 A  not(B or C)
>3 B  (A and C)
>4 B  not(D)
>5 C  D
>
>The first three lines identify the rows that have the word "and" but do
>not already have parentheses. The fourth line adds the surrounding
>parentheses. Finally we will combine the rows that belong to the same
>LHS value with split and create a list:
>
>> TF.list <- split(TF.data$RHS, TF.data$LHS)
>> TF.list
>$`A`
>[1] "(D and E)"  "not(B or C)"
>
>$B
>[1] "(A and C)" "not(D)"  
>
>$C
>[1] "D"
>
>> TF.and <- lapply(TF.list, paste, collapse=" and ")
>> TF.final <- lapply(names(TF.and), function(x) paste(x, "=", TF.and[[x]]))
>> TF.final <- do.call(rbind, TF.final)
>> TF.final
>    [,1]                          
>[1,] "A = (D and E) and not(B or C)"
>[2,] "B = (A and C) and not(D)"
>[3,] "C = D"
>> write(TF.final, file="TF.output.txt")
>
>The text file "TF.output.txt" contains the three lines.
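
The steps above can be collected into a single function that reads the files directly. This is a minimal sketch, not a replacement for the walkthrough: merge_rules is a hypothetical name, the temporary files stand in for the real text files, and the function assumes every non-blank line has the form "LHS = RHS" and that rules sharing a left-hand side should be joined with "and".

```r
# Hypothetical wrapper around the steps above; assumes "LHS = RHS" lines.
merge_rules <- function(files) {
  rules <- sort(unlist(lapply(files, readLines)))
  rules <- rules[nzchar(trimws(rules))]        # drop blank lines
  lhs <- trimws(sub("=.*$", "", rules))        # text before the "="
  rhs <- trimws(sub("^[^=]*=", "", rules))     # text after the "="
  # Parenthesize right-hand sides that contain "and" but no parentheses
  bare <- grepl("\\band\\b", rhs, perl = TRUE) & !grepl("[()]", rhs)
  rhs[bare] <- paste0("(", rhs[bare], ")")
  # Join all right-hand sides that share a left-hand side
  merged <- tapply(rhs, lhs, paste, collapse = " and ")
  paste(names(merged), "=", merged)
}

# Usage with two temporary files holding the example rules
f1 <- tempfile(); f2 <- tempfile()
writeLines(c("A = not(B or C)", "B = A and C", "C = D"), f1)
writeLines(c("A = D and E", "B = not(D)"), f2)
result <- merge_rules(c(f1, f2))
result
```

With the example input this should give the same three rules as TF.final above; writeLines(result, "TF.output.txt") would then save them.
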
>
>----------------------------------------------
>David L. Carlson
>Department of Anthropology
>Texas A&M University
>
>-----Original Message-----
>From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of David Winsemius
>Sent: Saturday, January 5, 2019 1:12 PM
>
>Subject: Re: [R] Merge the data from multiple text files
>
>
>On 1/5/19 7:28 AM, Priya Arasu via R-help wrote:
>> I have multiple text files, where each file has Boolean rules.
>> Example of my text file 1 and 2
>> Text file 1:
>> A = not(B or C)
>> B = A and C
>> C = D
>> Text file 2:
>> A = D and E
>> B = not(D)
>>
>> I want to merge the contents in text file as follows
>> A = not(B or C) and (D and E)
>> B = not(D) and (A and C)
>> C = D
>> Is there a code in R to merge the data from multiple text files?
>
>
>There is a `merge` function. For this use case you would need to first 
>parse your expressions so that the LHS was in one character column and 
>the RHS was in another character column in each of 2 dataframes. Then 
>merge on the LHS columns and `paste` matching values from the two 
>columns. You will probably need to learn how to use `ifelse` and
>`is.na`.
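
The merge-based approach described above might look roughly like the following. This is a sketch under assumptions: parse_rules is a hypothetical helper, the rules are typed in directly instead of being read from files, and both sides are wrapped in parentheses when joined, which is redundant in places but keeps the grouping unambiguous.

```r
# Hypothetical helper: split "LHS = RHS" lines into a two-column data frame.
parse_rules <- function(lines) {
  data.frame(LHS = trimws(sub("=.*$", "", lines)),
             RHS = trimws(sub("^[^=]*=", "", lines)),
             stringsAsFactors = FALSE)
}

d1 <- parse_rules(c("A = not(B or C)", "B = A and C", "C = D"))
d2 <- parse_rules(c("A = D and E", "B = not(D)"))

# all = TRUE keeps rules that appear in only one file (an outer join);
# the side with no match is filled with NA.
m <- merge(d1, d2, by = "LHS", all = TRUE, suffixes = c(".1", ".2"))

# Use whichever side is present, or join both sides with "and".
m$RHS <- ifelse(is.na(m$RHS.1), m$RHS.2,
         ifelse(is.na(m$RHS.2), m$RHS.1,
                paste0("(", m$RHS.1, ") and (", m$RHS.2, ")")))
out <- paste(m$LHS, "=", m$RHS)
out
# [1] "A = (not(B or C)) and (D and E)" "B = (A and C) and (not(D))"
# [3] "C = D"
```

merge() handles only two data frames at a time, so with many files the pairwise merge would have to be repeated, for example with Reduce().
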
>
>> Thank you
>> Priya
>>
>>     [[alternative HTML version deleted]]
>
>
>You also need to learn that R-help is a plain-text mailing list and that
>each mail client has its own method for composing mail in plain text.

-- 
Sent from my phone. Please excuse my brevity.


