[Rd] binary string conversion to a vector (PR#14120)

tplate at acm.org
Sat Dec 12 20:00:28 CET 2009


Just responding to some of the issues in this long post:

(1) Don't rely on the printed form of objects to decide whether or not 
they are identical.  The function str() is very useful in this regard, 
and sometimes also unclass().  To see whether two objects are identical, 
use the function identical():

 > qvector <- c("0", "0", "0", "1", "1", "0", "1")
 > qvector[1]
[1] "0"
 > noquote(qvector[1])
[1] 0
 > str(noquote(qvector[1]))
Class 'noquote'  chr "0"
 > as.integer(qvector[1])
[1] 0
 > str(as.integer(qvector[1]))
 int 0
 > identical(noquote(qvector[1]), as.integer(qvector[1]))
[1] FALSE
 >

Does this alleviate the concern as to the possibility of a bug in 
noquote/as.integer? Or were there deeper issues?
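
One guess at why the two appeared to diverge inside the function, assuming 
the function accumulated its results with c() the way string2vector does 
below: noquote() only changes how a value prints, so the accumulated values 
are still character data, whereas as.integer() really converts them.  As far 
as I can tell the "noquote" class does not survive being combined onto NULL, 
so the quotes reappear when the result is printed.  A minimal sketch (the 
names acc_noquote and acc_integer are made up for illustration):

```r
qvector <- c("0", "0", "1")

# Accumulate the same elements twice: once via noquote(), once via as.integer().
acc_noquote <- NULL
acc_integer <- NULL
for (q in qvector) {
  acc_noquote <- c(acc_noquote, noquote(q))     # still character data
  acc_integer <- c(acc_integer, as.integer(q))  # genuinely integer data
}

# str() shows the underlying types rather than the printed form.
str(acc_noquote)
str(acc_integer)

identical(acc_noquote, acc_integer)  # FALSE: the types differ
```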

(2) To see how some other users of R have packaged up miscellaneous 
functions that might be of use to other people, look for packages on 
CRAN with "misc" in their names -- I see almost 10 of them.  The problem 
with just posting snippets of code is that they get lost in all the 
other posts here, and many long-term R users have dozens if not hundreds 
of their own functions that are streamlined for their own frequent tasks 
and style of programming.

(3) sounds like a great idea to use R to bring statistical rigor into 
the analysis of the performance of combinatorial optimization algorithms!

(4) install.packages("stringr") works fine for me.  Maybe it was a 
temporary glitch?  Have you checked whether you have a valid repository 
selected?  E.g., I have in my .Rprofile:
options(repos=c(CRAN="http://cran.cnr.Berkeley.edu" , 
CRANextra="http://www.stats.ox.ac.uk/pub/RWin"))
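
A quick way to check this from within R is sketched below.  It assumes that 
a session in which no mirror has been chosen reports either NULL or the 
placeholder "@CRAN@" for the CRAN entry; the mirror URL is only an example, 
so substitute one near you.

```r
# Show the repositories that install.packages() will consult.
repos <- getOption("repos")
print(repos)

# Assumption: an unset mirror shows up as NULL or as the "@CRAN@" placeholder.
cran <- unname(repos["CRAN"])
if (is.null(repos) || identical(cran, "@CRAN@")) {
  # Example mirror only; any valid CRAN mirror URL can be used here.
  options(repos = c(CRAN = "https://cloud.r-project.org"))
}
```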

Enjoy learning R!

-- Tony Plate

Franc Brglez wrote:
> Hello!
>  
> Please accept my sincere apologies for annoying the R development team with my post this week. If I were required to register as "a developer" before submission, this would not have happened. To rehabilitate myself, please find at the bottom of this mail two R-functions, 'string2vector' and 'vector2string', with "comments and tests". Both functions may go a long way towards assisting a number of R-users to make their R-programming more productive. I am a novice R-programmer: I started dabbling in R less than two months ago, heavily influenced by examples of code I see, including within the R.org documents (monkey does what monkey sees). Before posting the two functions, I would really appreciate constructive edits where they may be needed as well as their posting by someone-in-the-know so they will be conveniently accessible for R users.
>
> I am very impressed with the potential of R and the community supporting it. I just wish I got to R sooner: I am looking to R to better support my work in "designed experiments to assess the statistically significant performance of combinatorial optimization algorithms on instance isomorphs of NP-hard problems" -- for better context of this mouthful, see the few postings under
>   http://www.cbl.ncsu.edu:16080/xBed/publications/
> I am working on a tutorial paper where I expect R to play a significant role in better explaining and illustrating, code-wise and graphically, the concepts discussed in the publications above. I would welcome a co-author with experience in R-programming as well as statistics and interests in the experimental methods addressed in these publications.
>
> As I elaborate in notes that follow, I was looking at a variety of "R-documents" before my "bug" submission. I would appreciate very much if some of you could take the time to scan through these notes and respond briefly with useful pointers. Here are the headlines:
>
>     (1) why I still think there may be a bug with 'noquote' vs 'as.integer'
>
>     (2) search on "split string" and "join string"; the missing package "stringr"
>
>     (3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'
>
>     (4) a take on "R" functions 'string2vector' and 'vector2string'
>
>     (5) code and comments for "R" functions 'string2vector' and 'vector2string'
>
> (1) why I still think there may be a bug with 'noquote' vs 'as.integer'
> --------------------------------------------------------------------------------
>   
>> # MacOSX 10.6.2, R 2.9.1 GUI 1.28 Tiger build 32-bit (5444)
>> qvector
>>     
> [1] "0" "0" "0" "1" "1" "0" "1"
>   
>> qvector[1]
>>     
> [1] "0"
>   
>> tmp = noquote(qvector[1])
>> tmp
>>     
> [1] 0
>   
>> tmp = as.integer(qvector[1])
>> tmp
>>     
> [1] 0
>   
> When embedded in the function as per my "bug" report, 'noquote' and 'as.integer' are no longer equivalent whereas in the example above they appear to be equivalent!! I submitted the "function" with print/cat statements for the sake of illustration.
>
> (2) search on "split string" and "join string"; the missing package "stringr"
> --------------------------------------------------------------------------------
> http://search.r-project.org/ reveals
>    on the order of 850 messages for search on "split string"
>    on the order of 160 messages for search on "join string"
>
> http://finzi.psych.upenn.edu/search.html reveals
>     for search on "split string"
>         • Rhelp08:   [ split: 890 ] [ string: 1676 ] [ TOTAL: 77 ]
>         • functions: [ split: 954 ] [ string: 6453 ] [ TOTAL: 204 ]
>     for search on "join string"
>         • Rhelp08:   [ join: 176 ] [ string: 1676 ] [ TOTAL: 8 ]
>         • functions: [ join: 192 ] [ string: 6453 ] [ TOTAL: 36 ]
>     This site also provides a link to the package "stringr"
>     http://finzi.psych.upenn.edu/R/library/stringr/html/00Index.html
> However, the download does not deliver ...
>   
>> install.packages("stringr")
>>     
>   ....
>    package ‘stringr’ is not available
>
> There are a lot of hard-to-understand and not-so-relevant code snippets in all these 1000's of postings. I would argue that had robust functions such as 'string2vector' and 'vector2string' been included in the R-package, many R-programmers could take longer vacations, spend their time more productively,
> and significantly reduce duplication of coding efforts on basically the same
> problems.
>
> Since the vector is such an important "primitive" in R, I argue that functions such as 'string2vector' and 'vector2string' should be made to play a role similar to commands 'split', 'join', 'string', and 'append' that support programmers in Tcl. See my take on Tcl in the section below.
>
> (3) a take on "Tcl" commands 'split', 'join', 'string', 'append', 'foreach'
> --------------------------------------------------------------------------------
> I have been using Tcl to "wrap" a number of combinatorial solvers and automate workflows that implement and execute a number of my experiments on instance isomorphs. I even used Tcl to prototype a few combinatorial optimization algorithms and write code for statistical analysis -- a task for which I now find R much better suited.
>
> I intend to alert my Tcl colleagues in-the-know about the wonderful infrastructure provided in R when it comes to the R-shell (at least under MacOSX), the ability to name and initialize function variable defaults explicitly, and the ability to install new packages so transparently. Before coming across R, I already took the trouble to create Tcl wrapper programs with command lines that feature the same order-independent syntax as the syntax used in R. This being said, what I miss about R is gathering all commands on a single page such as
>    http://www.tcl.tk/man/tcl8.5/TclCmd/contents.htm
> Note that once you click on any of the commands, a number of classes that extend each command become visible, including the example section(s). 
>
> Here I illustrate my use of just five Tcl commands that subsequently guided my "design" of the functions 'string2vector' and 'vector2string' in "R"
>
> # few "Tcl" examples before designing the function 'string2vector' in "R"
> % set binS "10011"
> % join [split $binS ""] ", "
> 1, 0, 0, 1, 1
> %
> % set strS "I \t am\tdone" 
> % foreach item [split $strS "\t"] {append strSQ \"$item\",}
> % set strSQ [string trimright $strSQ ,]
> "I "," am","done"
> # 
> # few "Tcl" examples before designing the function 'vector2string' in "R"
> % set strV "1,0,0,1"
> 1,0,0,1
> % split $strV ","
> 1 0 0 1
> % join [split $strV ","] ":"
> 1:0:0:1
>
> (4) a take on "R" functions 'string2vector' and 'vector2string'
> --------------------------------------------------------------------------------
>   
>> # few tests of the function 'string2vector' in "R"
>> binS = "10011"
>> binV = string2vector(binS, SS="", type="int")
>> binV[2] ; binV[5]
>>     
> [1] 0
> [1] 1
>   
>> strS = "I am done" 
>> vecS = string2vector(strS, SS=" ", type="char")
>> vecS[1] ; vecS[3]
>>     
> [1] "I"
> [1] "done"
>   
>> # few tests of the function 'vector2string' in "R"
>> binV = c(1,0,0,1) 
>> vector2string(binV, type="int")
>>     
> [1] "1001"
>   
>> vector2string(binV, SS=" ", type="char")
>>     
> [1] "1 0 0 1"
>   
>> subsV = c("I", "am", "done")  
>> vector2string(subsV, SS=":", type="char")
>>     
> [1] "I:am:done"
>   
>
> (5) code and comments for "R" functions 'string2vector' and 'vector2string'
> --------------------------------------------------------------------------------
>
> string2vector = function(string="ch-2 \t sec-7\tex-5", SS="\t", type="char")
> #
> # This procedure splits a string and assigns substrings to an R-vector.
> # The split is controlled by the string separator SS (default value:  SS="\t").
> # Here we convert  a binary string into a binary vector:
> #   let  binS = "10011"  
> #   then binV = string2vector(binS, SS="", type="int")
> # Here we convert a string into a vector of substrings:
> #   let  strS = "I am done" 
> #   then vecS = string2vector(strS, SS=" ", type="char")
> #
> # LIMITATION: The function interprets all substrings either as of type 
> #             "int" or "char".  A function that interprets the type of each
> #             substring dynamically may one day be written by an R-guru.
> #              
> # Franc Brglez, Wed Dec  9 14:19:16 EST 2009
> {   
>     qlist   = strsplit(string, SS) ; qvector = qlist[[1]]
>     n = length(qvector) ; xvector = NULL
>     for (i in seq_along(qvector)) {  # seq_along() is safe when qvector is empty
>         if (type == "int") {
>             tmp = as.integer(qvector[i])
>         } else {
>             tmp = qvector[i]
>         }
>         xvector = c(xvector, tmp)
>     }
>     }
>     return(xvector)
> } # string2vector
>
> vector2string = function(vector=c("ch-2", "sec-7", "ex-5"), SS="_", type="char") 
> #
> # This procedure converts values from a vector to a concatenation of substrings 
> # separated by user-specified string separator SS (default value:  SS="_").
> # Each substring represents a vector component value, either as a numerical 
> # value or as an alphanumeric string. 
> # Here we convert a binary vector to a binary string representing an integer:
> #   let  binV = c(1,0,0,1)  
> #   then strS = vector2string(binV, type="int")
> # Here we convert a binary vector to string representing a binary sequence:
> #   let  binV = c(1,0,0,1)  
> #   then seqS = vector2string(binV, SS=" ", type="char")
> # Here we convert a vector of substrings to colon-separated string:
> #   let subsV = c("I", "am", "done")  
> #   then strS = vector2string(subsV, SS=":", type="char")
> #
> # LIMITATION: The function interprets all substrings in the vector either as of 
> #             type "int" or "char".  A function that interprets the type of each
> #             substring dynamically may one day be written by an R-guru.
> #
> # Franc Brglez, Wed Dec  9 15:43:59 EST 2009
> {   
>     if (type == "int") {
>         string = paste(strsplit(paste(vector), " "), collapse="")
>     } else {
>         n = length(vector) ; nm1 = n-1 ; string = ""
>         for (i in seq_len(nm1)) {  # seq_len() skips the loop when n == 1
>             tmp    = noquote(vector[i])
>             string = paste(string, tmp, SS, sep="")
>         }
>         tmp    = noquote(vector[n])
>         string = paste(string, tmp, sep="")     
>     }
>     return(string)
> } # vector2string
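
For comparison, much of what the two functions above do is already covered 
by vectorized base R calls: strsplit() to split, as.integer() to convert, 
and paste(collapse=) to join.  The following is only a sketch under that 
assumption, not a drop-in replacement: the names split_string and 
join_vector are hypothetical, and fixed = TRUE makes the separator a literal 
string rather than a regular expression, which differs from the code above.

```r
# Hypothetical helper names; strsplit(), as.integer(), paste() are base R.
split_string <- function(string, sep = "\t", type = c("char", "int")) {
  type <- match.arg(type)
  # fixed = TRUE: treat sep literally; sep = "" splits into single characters.
  parts <- strsplit(string, sep, fixed = TRUE)[[1]]
  if (type == "int") as.integer(parts) else parts
}

join_vector <- function(vector, sep = "_") {
  # paste() converts each element to character before collapsing.
  paste(vector, collapse = sep)
}

split_string("10011", sep = "", type = "int")
split_string("I am done", sep = " ")
join_vector(c(1, 0, 0, 1), sep = "")
join_vector(c("I", "am", "done"), sep = ":")
```

Because both helpers operate on the whole vector at once, no explicit loop 
or special handling of the last element is needed.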
>
> ----------------
> Dr. Franc Brglez                                        email: brglez at ncsu.edu 
> Department of Computer Science, Box 8206     http://sitta.csc.ncsu.edu/~brglez
> North Carolina State University                            TEL: (919) 515-9675
> Raleigh NC 27695-8206 USA  
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>