[R] Blank spaces are replaced by a period in read.csv, I want to replace blanks with an underline

John Sorkin jsorkin at grecc.umaryland.edu
Mon Jun 8 17:42:05 CEST 2015


Sarah, 
Many, many thanks.
John

> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)


> On Jun 8, 2015, at 11:04 AM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> 
> I've taken the liberty of copying this back to the list, so that others can participate in or benefit from the discussion.
> 
>> On Mon, Jun 8, 2015 at 10:49 AM, John Sorkin <jsorkin at grecc.umaryland.edu> wrote:
>> Sarah,
>> I am not sure how I use check.names to replace every space in the names of my variables with an underline. Can you show me how to do this? My current code is as follows:
> 
> check.names=FALSE just tells R not to reformat your column names. If they aren't already what you want, you'll need to do something else. 
>  
>> data <- read.csv("C:\\Users\\john\\Dropbox (Personal)\\HanlonMatt\\fullgenus3.csv")
>> 
>> The problem I have is that my column names are not unique, e.g., I have multiple columns whose column names are (in CSV format):
>> X Y, X Y, X Y, X Y
>> R reads the names as follows:
>> X.Y, X.Y.1, X.Y.2, X.Y.3
>> I need to have the names look like:
>> X_Y, X_Y.1, X_Y.2, X_Y.3
> 
> You've been saying that you want to replace every space with an underscore, but that's not what your example shows. Instead, you want to let R import the names and add the identifying number (though if you do it yourself you can get the number to match the column number, which is neater), then change the FIRST period to an underscore.
> 
> I'd import them with check.names=FALSE, then modify them explicitly:
> 
> 
> > mynames <- c("x y", "x y", "x y", "x y")
> > mynames
> [1] "x y" "x y" "x y" "x y"
> > mynames <- sub(" ", "_", mynames)
> > mynames
> [1] "x_y" "x_y" "x_y" "x_y"
> > mynames <- paste(mynames, seq_along(mynames), sep=".")
> > mynames
> [1] "x_y.1" "x_y.2" "x_y.3" "x_y.4"
> 
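For reference, a minimal sketch of the import-then-rename approach applied to an actual read.csv() call. The csv_text string and its values are made up for illustration; with a real file you would pass the path instead of text=. It also uses base R's make.unique(), which is not mentioned in this thread, so treat it as one possible way to get the X_Y, X_Y.1, ... form rather than the method suggested above.

```r
# Hypothetical stand-in for the real file: four columns all named "X Y"
csv_text <- "X Y,X Y,X Y,X Y\n1,2,3,4\n5,6,7,8"

# check.names = FALSE keeps the header exactly as written in the file
dat <- read.csv(text = csv_text, check.names = FALSE)

# Replace every space with an underscore, then append .1, .2, ... to repeats
names(dat) <- make.unique(gsub(" ", "_", names(dat), fixed = TRUE))

print(names(dat))
# [1] "X_Y"   "X_Y.1" "X_Y.2" "X_Y.3"
```

Unlike the paste() version, the first column keeps its plain name and the suffixes start at .1, which matches the names John listed.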
> 
> You could also let R modify them, then use sub() to change the first underscore to a period and leave the rest alone.
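A small sketch of this second route, again on a made-up in-memory CSV (csv_text is hypothetical). It assumes each original name contains exactly one space, so that the first period in the name R generates is the one that replaced the space.

```r
# Hypothetical stand-in for the real file
csv_text <- "X Y,X Y,X Y,X Y\n1,2,3,4"

# Default check.names = TRUE: R imports the names as X.Y, X.Y.1, X.Y.2, X.Y.3
dat <- read.csv(text = csv_text)

# sub() changes only the first match; fixed = TRUE treats "." as a literal period
names(dat) <- sub(".", "_", names(dat), fixed = TRUE)

print(names(dat))
# [1] "X_Y"   "X_Y.1" "X_Y.2" "X_Y.3"
```

If a name held more than one space, only the first would become an underscore here, so the check.names=FALSE route is probably safer in that case.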
> 
> Sarah
