[R] how to factor in the ID of the imported subtable to R table?
David Winsemius
dk@w|n@em|u@ @end|ng |rom gm@||@com
Thu May 21 22:00:37 CEST 2020
On 5/21/20 9:24 AM, YANJUN CHEN via R-help wrote:
> Dear R community,
>
> I am new to R—did some online tutorials and exercises in R playground. I was wondering if I could seek guidance on the following matter.
>
> I have a set of 403 .csv files. Each .csv file has the same layout, and the files are distinguished by the subject ID and date in the file name. The dataset looks like this:
>
> Sub1-20170305.csv
> Sub2-20180214.csv
> …
> Sub403-20191109.csv
Something along the lines of:
?regex ; ?sub
?read.table
?data.frame
?do.call
?rbind
myfiles <- lapply( list.files(your_path, pattern = "[.]csv$"),  # each file name is passed to the anonymous function
    function(nm) data.frame(
        subID = sub("-.+", "", nm),                    # remove chars from "-" onward
        date  = sub("^.+-(.{8})[.]csv$", "\\1", nm),   # extract the 8-character date as a capture group
        # assuming all files have the same number of columns, comma-separated, with no headers
        read.table( file.path(your_path, nm), sep = "," ) ) )
big_file <- do.call(rbind, myfiles)                    # stack the per-file data frames
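To try the idea end to end without touching real data, here is a minimal sketch that writes two small made-up files to a temporary directory and stacks them. The file contents, the helper name read_one, and the conversion of the date string with as.Date are assumptions added for illustration; they are not part of the original question.

```r
# Hypothetical demo data: two tiny header-less CSV files in a temporary directory
your_path <- file.path(tempdir(), "stack_demo")
dir.create(your_path, showWarnings = FALSE)
write.table(data.frame(a = 1:2, b = c(0.5, 1.5)),
            file.path(your_path, "Sub1-20170305.csv"),
            sep = ",", row.names = FALSE, col.names = FALSE)
write.table(data.frame(a = 3:5, b = c(2.5, 3.5, 4.5)),
            file.path(your_path, "Sub2-20180214.csv"),
            sep = ",", row.names = FALSE, col.names = FALSE)

files <- list.files(your_path, pattern = "[.]csv$")

# read_one is a hypothetical helper: read one file and prepend the
# subject ID and date parsed from its name
read_one <- function(nm) {
  dat <- read.table(file.path(your_path, nm), sep = ",")
  data.frame(subID = sub("-.+", "", nm),
             date  = as.Date(sub("^.+-(.{8})[.]csv$", "\\1", nm),
                             format = "%Y%m%d"),
             dat)
}

big_file <- do.call(rbind, lapply(files, read_one))
str(big_file)
```

The single-value subID and date are recycled by data.frame to the number of rows in each file, so no separate mutate step is needed.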
>
> I will use the rbind function to combine the 403 csv files into a single file (myFile). I will create two new variables (using the mutate function) in myFile (subject ID and date). Is there a way to extract the subject ID (shown as “Sub1, 2, ..., 403”) and the date from the name of each csv file and then place them in “subject ID” and “date” in myFile?
>
> Any info on the issue itself or where to look for will be appreciated.
If you search StackOverflow or Rseek with topic terms "stacking
multiple data files" you should find many worked examples.
> Thanks,
>
> CJ
>
> [[alternative HTML version deleted]]
You should now read the Posting Guide which will explain why you should
NOT post in HTML.
Best;
David.
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.