[R] Split strings based on multiple patterns (plain text)
joeceradini at gmail.com
Sat Oct 15 03:53:08 CEST 2016
Hopefully this looks better. I did not realize Gmail's default was HTML.
I have a dataframe with a column that has many fields smashed together.
I need to split the strings in the column into separate columns based
Example of a string that needs to be split:
ugly <- c("Water temp:14: F Waterbody type:Permanent Lake/Pond: Water
pH:Unkwn: Conductivity:Unkwn: Water color: Clear: Water turbidity:
clear: Manmade:no Permanence:permanent: Max water depth: <3: Primary
substrate: Silt/Mud: Evidence of cattle grazing: none: Shoreline
Emergent Veg(%): 1-25: Fish present: yes: Fish species: unkwn: no
As far as I can tell, there is not a single pattern that would work for
splitting. Splitting on ":" is close, but not quite right. Each of the
below attributes should be in a separate column, and are present in
the string (above) that needs to be split:
attributes <- c("Water temp", "Waterbody type", "Water pH",
"Conductivity", "Water color", "Water turbidity", "Manmade",
"Permanence", "Max water depth", "Primary substrate", "Evidence of
cattle grazing", "Shoreline Emergent Veg(%)", "Fish present", "Fish
Conceptually, I want to use the vector of attributes to split the
string. However, strsplit only uses the 1st value of the attributes
Should I loop through the values of "attributes"?
Is there an argument in strsplit I'm missing that will do what I want?
Different approach altogether?
Thanks! Happy Friday.
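
One possible approach to the question above, offered as a sketch and not a definitive answer. strsplit() recycles its split argument along x, so a vector of patterns is not applied all at once to a single string. The attribute names can instead be pasted into one alternation pattern and located with gregexpr(); regmatches() then returns both the matched names and the text between them. The helper name split_by_attributes() is hypothetical, the sample string is a shortened stand-in assembled from fragments of the example above, and the code assumes every attribute name is immediately followed by a colon.

```r
# Shortened stand-in for one value of the column (assumption: not the full string)
ugly <- paste("Water temp:14: F Waterbody type:Permanent Lake/Pond:",
              "Water pH:Unkwn: Conductivity:Unkwn: Water color: Clear:",
              "Manmade:no Permanence:permanent:",
              "Shoreline Emergent Veg(%): 1-25:")

# Subset of the attribute names from the post
attributes <- c("Water temp", "Waterbody type", "Water pH", "Conductivity",
                "Water color", "Manmade", "Permanence",
                "Shoreline Emergent Veg(%)")

# Hypothetical helper: returns a named character vector, one entry per attribute
split_by_attributes <- function(x, attributes) {
  # \Q...\E quotes each name literally (needed for "Veg(%)"); requires perl = TRUE
  pattern <- paste0("(", paste0("\\Q", attributes, "\\E", collapse = "|"), "):")
  m <- gregexpr(pattern, x, perl = TRUE)

  keys   <- regmatches(x, m)[[1]]                 # e.g. "Water temp:"
  pieces <- regmatches(x, m, invert = TRUE)[[1]]  # text around the matches
  values <- pieces[-1]                            # drop text before the first key

  keys   <- sub(":$", "", keys)                   # strip the colon after the name
  values <- trimws(sub(":\\s*$", "", values))     # strip trailing colon and spaces

  # Attributes that are absent from the string stay NA
  out <- setNames(rep(NA_character_, length(attributes)), attributes)
  out[keys] <- values
  out
}

res <- split_by_attributes(ugly, attributes)
print(res)
```

For a whole column, the same helper could be applied row by row, for example with t(sapply(df$col, split_by_attributes, attributes = attributes)), where df$col is a placeholder for the real column name. If one attribute name were a prefix of another, the longer name would need to come first in the alternation.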