[R] extract fixed width fields from a string

Sam Steingold sds at gnu.org
Fri Jan 20 21:25:47 CET 2012


On Fri, Jan 20, 2012 at 14:05, Sarah Goslee <sarah.goslee at gmail.com> wrote:
> Reproducible example, please. This doesn't make a whole lot of sense
> otherwise.

here is the string:
"1288915200|00000704000000905a00000A118"

I want the following data extracted from it:
1. the decimal number before "|": 1288915200
2. the string after "|" split into 3 parts, each 9 bytes long,
with each part then split into 3 fields:
id: the first 6 bytes, int, base 36;
count: the next 2 bytes, int, base 10;
offset: the last 1 byte, int, base 64 (0-9a-zA-Z-_)
i.e., the above line is:
id=7; count=4; offset=0
id=9; count=5; offset=10
id=10; count=11; offset=8
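
The decoding described above could be sketched in R roughly as follows.
This is only a minimal sketch, not a solution taken from the thread: the
function name parse_record and the column names are hypothetical, and it
assumes that the base-64 digits are ordered exactly as listed (0-9, a-z,
A-Z, "-", "_") and that the part after "|" has a length that is a
multiple of 9.

```r
# Hypothetical helper: decode one "timestamp|payload" string.
parse_record <- function(x) {
  parts <- strsplit(x, "|", fixed = TRUE)[[1]]
  timestamp <- as.numeric(parts[1])

  # Cut the payload into consecutive 9-character records.
  payload <- parts[2]
  starts <- seq(1, nchar(payload), by = 9)
  chunks <- substring(payload, starts, starts + 8)

  # Assumed digit order for the base-64 field: 0-9, a-z, A-Z, "-", "_".
  alphabet <- c(as.character(0:9), letters, LETTERS, "-", "_")

  data.frame(
    timestamp = timestamp,
    id     = strtoi(substr(chunks, 1, 6), base = 36L),   # base 36
    count  = as.integer(substr(chunks, 7, 8)),           # base 10
    offset = match(substr(chunks, 9, 9), alphabet) - 1L  # base 64, 1 digit
  )
}

rec <- parse_record("1288915200|00000704000000905a00000A118")
print(rec)
```

For the sample string this gives one row per 9-byte part, with the same
id/count/offset values as listed above.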

thanks.

> On Fri, Jan 20, 2012 at 1:52 PM, Sam Steingold <sds at gnu.org> wrote:
>> Hi,
>> I have a data frame with one column containing strings of the form "ABC...|XYZ..."
>> where ABC etc are fields of 6 alphanumeric characters each
>> and XYZ etc are fields of 8 alphanumeric characters each;
>> "|" is a mandatory separator;
>> I do not know in advance how many fields of each kind each row will contain.
>> I need to extract these fields from the string.
>
> This is already a data frame, so you don't need to import it into R,
> just process it?

yes.

> I don't know. Save them as a list, most likely.

can a column contain lists?
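
For reference, base R does allow a data frame column that is itself a
list, with one element per row. A minimal sketch, using made-up data
(the names df, row and items are hypothetical):

```r
# Hypothetical data: four rows, each holding zero or more items.
df <- data.frame(row = 1:4)

# Assigning a list with one element per row creates a list column.
df$items <- list(c("A", "B"), "A", "C", character(0))

# Each cell is now a vector of its own length.
n_items <- lengths(df$items)
print(n_items)
```

Many functions that expect atomic columns will not handle such a column
gracefully, so this is a storage option rather than a full answer.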

>> First thing I want to do is to have a count table of them.
>> Then I thought of adding an extra column for each field value and
>> putting 0/1 there, e.g., frame
>> 1,AB
>> 2,BCD
>
> I thought we had integers at this point?

yes, A..D are placeholders for integers

>> What do people do?
>> Can I have a column of "sets" in a data frame?
>> Does R support the "set" data type?
>
> factor() seems to be what you're looking for.

no, a column of factors will contain a single factor item in each row.
e.g.:
1 A
2 B
3 A
4 C
I want each row to contain a set of factor items:
1 A&B
2 A
3 C
4 <void>
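
One possible way to represent the rows above, sketched under the
assumption that each "set" is stored as a plain character vector in a
list (base R has no dedicated set type, but union(), intersect() and
%in% treat vectors as sets). The variable names are hypothetical:

```r
# Hypothetical data mirroring the four rows above; row 4 is empty.
items <- list(c("A", "B"), "A", "C", character(0))

# Count table over all items in all rows.
counts <- table(unlist(items))

# One 0/1 indicator column per distinct item.
levels_seen <- sort(unique(unlist(items)))
indicator <- do.call(rbind, lapply(items, function(s) {
  as.integer(levels_seen %in% s)
}))
colnames(indicator) <- levels_seen

# Set operations work directly on the stored vectors.
both <- union(items[[1]], items[[3]])

print(counts)
print(indicator)
```

The indicator matrix can be attached to the data frame with cbind() if
separate 0/1 columns are wanted.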


-- 
Sam Steingold <http://sds.podval.org>