[R] Fwd: Merging multiple csv files to new file

Avi Gross @v|gro@@ @end|ng |rom ver|zon@net
Wed Nov 3 16:55:39 CET 2021


Gabrielle,

Why would you expect that to work?

rbind() binds rows of internal R data structures that are some variety of data.frame with exactly the same columns in the same order into a larger object of that type.

You are not providing rbind() with the names of variables holding the data, but with the names of comma-separated values (CSV) files.
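As a minimal sketch of that distinction, the following uses two small hypothetical CSV files (the names a.csv and b.csv and their contents are invented for illustration and written to a temporary directory). Given file names, rbind() simply stacks the strings; the files have to be read first, here with read.csv(). Separately, an "unexpected input" message comes from R's parser before rbind() is ever called, so it likely points to a stray character in the call, such as the typographic quote in front of minuteStepsWide_merged.csv in the quoted message.

```r
# Hypothetical files, created only so the example is self-contained.
tmp <- tempdir()
f1 <- file.path(tmp, "a.csv")
f2 <- file.path(tmp, "b.csv")
write.csv(data.frame(Id = 1:2, Steps = c(100, 200)), f1, row.names = FALSE)
write.csv(data.frame(Id = 3:4, Steps = c(300, 400)), f2, row.names = FALSE)

# rbind() on the file names stacks the character strings into a matrix;
# no file is opened or read.
names_only <- rbind(f1, f2)

# Reading each file first gives data.frames, which rbind() can stack by row.
d1 <- read.csv(f1)
d2 <- read.csv(f2)
stacked <- rbind(d1, d2)
```

Here names_only is a 2 x 1 character matrix of paths, while stacked is a data.frame with the four rows from both files.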

If you have many files with different numbers of columns of data with some overlap, you need to decide on quite a few things first. If a file has say 4 columns out of a possible 20 unique columns across the files, do you want to add 16 columns to the contents of the file, after reading it in, and re-arrange it into a specific order by column? What will you fill in the new columns with? NA is a popular choice but you need to decide.

You then need to repeat the same thing with each of the other files: for example, read in one with 6 columns, add the 14 missing columns filled as you wish, and rearrange the columns into the same order.

When done, you have an assortment of variables of class data.frame (or other similar ones) and you can use rbind() on those variables to get a result.
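The steps above can be sketched roughly as follows, assuming NA is the chosen fill value. The two small data.frames stand in for files already read with read.csv(); their column names and values, and the helper pad(), are hypothetical and only for illustration.

```r
# Hypothetical stand-ins for two files that were already read in.
daily  <- data.frame(Id = c(1, 2), Steps = c(100, 200))
weight <- data.frame(Id = c(1, 3), WeightKg = c(70.5, 82.0))

frames <- list(daily, weight)

# The full set of column names across all frames, in one fixed order.
all_cols <- unique(unlist(lapply(frames, names)))

# Hypothetical helper: add each missing column filled with NA,
# then return the columns in the common order.
pad <- function(df, cols) {
  for (m in setdiff(cols, names(df))) {
    df[[m]] <- NA
  }
  df[cols]
}

# Every padded frame now has identical columns, so rbind() can stack them.
combined <- do.call(rbind, lapply(frames, pad, cols = all_cols))
```

With 18 files, the same pattern would apply to a list built with lapply() over the file names and read.csv(), instead of the two hand-made frames.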

But it may not be what you want. You may actually want something more like a database merge, which combines the columns from each file by matching rows on a shared key such as a userID field. rbind() is not the function to do that with, and I won't go on to give a long tutorial.
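For comparison, a minimal sketch of such a merge using base R's merge(), assuming every table carries a userID column. The data and the other column names are hypothetical.

```r
# Hypothetical tables sharing a userID key.
daily <- data.frame(userID = c(1, 2, 3), Steps = c(100, 200, 300))
sleep <- data.frame(userID = c(2, 3, 4), MinutesAsleep = c(400, 380, 420))

# all = TRUE keeps users found in either table (a full outer join),
# filling NA where a user is absent from one of them.
merged <- merge(daily, sleep, by = "userID", all = TRUE)

# For more than two tables, Reduce() applies merge() pairwise down the list.
more <- data.frame(userID = c(1, 4), WeightKg = c(70.5, 82.0))
merged_all <- Reduce(
  function(x, y) merge(x, y, by = "userID", all = TRUE),
  list(daily, sleep, more)
)
```

Unlike the rbind() approach, this yields one row per userID, with the columns from each table side by side.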

My main point is that what you are doing is at the wrong level. You need to read all the files into variables before doing additional calculations in R.

-----Original Message-----
From: R-help <r-help-bounces using r-project.org> On Behalf Of gabrielle aban steinberg
Sent: Tuesday, November 2, 2021 6:31 PM
To: r-help using r-project.org
Subject: [R] Fwd: Merging multiple csv files to new file

Hello, I would like to merge 18 csv files into a master data csv file, but each file has a different number of columns (mostly found in one or more of the other csv files) and a different number of rows.

I have tried something like the following in R Studio (cloud):

all_data_fit_files <- rbind("dailyActivity_merged.csv", "dailyCalories_merged.csv", "dailyIntensities_merged.csv", "dailySteps_merged.csv", "heartrate_seconds_merged.csv", "hourlyCalories_merged.csv", "hourlyIntensities_merged.csv", "hourlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv",
"minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv", "minuteSleep_merged.csv", "minuteStepsNarrow_merged.csv", “minuteStepsWide_merged.csv", "sleepDay_merged.csv", "minuteStepsWide_merged.csv", "sleepDay_merged.csv",
"weightLogInfo_merged.csv")



But I am getting the following error:

Error: unexpected input in "rlySteps_merged.csv", "minuteCaloriesNarrow_merged.csv", "minuteCaloriesWide_merged.csv", "minuteIntensitiesNarrow_merged.csv",
"minuteIntensitiesWide_merged.csv", "minuteMETsNarrow_merged.csv"


(Maybe the R Studio free trial/usage is underpowered for my project?)


