[R] analyzing a .txt file with R or another program

Daniel Malter daniel at umd.edu
Sat Jul 23 21:51:39 CEST 2011


Hi,

The blunt answer is: by learning R. In particular, you will need pattern
matching techniques (see ?grep) and a working knowledge of R (somewhat
advanced, though some would call it basic). If you aren't familiar with
either, I would suggest starting with an introductory manual or one of the
many tutorials available online, and then digging deeper into pattern matching.
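
As a rough starting point only, here is a minimal sketch of what pattern
matching with grepl() looks like on a few rows modelled on the posted data.
The vector `rows`, the shortened data rows, and the assumption that marker
rows begin with the keywords SBLINK, EBLINK, or MSG (with the asterisks and
the trailing x in the post being annotations rather than file content) are
all guesses, not something checked against the real file.

```r
# A few rows modelled on the posted data (hypothetical, shortened sample).
rows <- c(
  "SBLINK R 5261507",
  "5261439\t516.4\t364.3",
  "MSG\t5261445 Prime 11_fe_ha",
  "5261445\t516.7\t363.8",
  "EBLINK R 5261507\t5261645\t140",
  "5261509\t.\t.",
  "MSG\t5261512 Mask 8_ma_ma"
)

# grepl() returns TRUE/FALSE per row. "^" anchors the match at the start of
# the row; "\\*?" tolerates an optional literal asterisk before the keyword.
is_sblink <- grepl("^\\*?SBLINK", rows)
is_eblink <- grepl("^\\*?EBLINK", rows)
is_msg    <- grepl("^\\*?MSG", rows)

# which() turns the logical flags into row numbers.
sblink_at <- which(is_sblink)
eblink_at <- which(is_eblink)
msg_at    <- which(is_msg)

# sub() removes the leading keyword and whitespace, leaving the message text.
msg_text <- sub("^\\*?MSG[[:space:]]+", "", rows[is_msg])
```

With the row numbers of the markers in hand, the question of whether one
marker lies between two others becomes a comparison of integers. For a file
with about a million rows, readLines() followed by vectorised calls like
these is likely to be much faster than looping over rows one at a time.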

Generally, please adhere to the posting guide (provide a self-contained,
i.e., copy/paste-able, example of code/data for people to work with). Also,
you will be much more likely to receive a response if you show your own
coding effort (contributors are willing to help solve problems but unwilling
to do other people's work).
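
To illustrate the shape of such a post, here is a hypothetical sketch: a
small data sample typed in directly, followed by a first attempt. The object
names, the shortened rows, the file name in the comment, and the assumption
that SBLINK...EBLINK blocks never nest are all made up for illustration.

```r
# Small sample anyone can paste into R; no external file needed.
# (Hypothetical rows, shortened from the posted format.)
x <- c(
  "SBLINK R 5261507",
  "5261439\t516.4\t364.3",
  "MSG\t5261445 Prime 11_fe_ha",
  "EBLINK R 5261507\t5261645\t140",
  "5261509\t.\t.",
  "MSG\t5261512 Mask 8_ma_ma"
)

# dput() prints an object as R code, which makes it easy to share the first
# rows of a real file, e.g. dput(readLines("myfile.txt", n = 20)), where
# "myfile.txt" is a placeholder name.
dput(head(x, 3))

# Attempt so far: is each row inside an open SBLINK...EBLINK block?
# cumsum() counts the markers seen up to each row; the difference is 1
# while a block is open (assuming blocks do not nest).
opened   <- cumsum(grepl("^SBLINK", x))
closed   <- cumsum(grepl("^EBLINK", x))
in_blink <- (opened - closed) == 1

# MSG rows that fall inside a blink block.
msg_in_blink <- x[grepl("^MSG", x) & in_blink]
```

A post along these lines, together with a sentence stating the expected
output, gives readers something concrete to run and correct.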

Best,
Daniel



aRe wrote:
> 
> Hello everyone,
> 
> I have a .txt file with about 1 million rows.
> 
> Sometimes the rows are in the following order (the number of rows
> between the rows marked with an x differs):
> 
> ...
> *SBLINK R 5261507*x
> 5261439	  516.4	  364.3	 9148.0	... 	  816.0	-1133.0	   48.4 MA.C.TB...BL.
> 5261441	  516.4	  364.0	 9145.0	... 	  799.0	-1135.0	   48.7 MA.C.TB...B..
> 5261443	  516.4	  363.9	 9140.0	... 	  817.0	-1171.0	   49.3 MA.C.TB.....R
> *MSG	5261445 Prime 11_fe_ha*x
> 5261445	  516.7	  363.8	 9133.0	... 	  813.0	-1097.0	   49.3 MA.C.TB......
> 5261447	  517.0	  363.8	 9127.0	... 	  818.0	-1144.0	   49.9 MA.C.T.LRTB..
> *EBLINK R 5261507	5261645	140*x
> 5261509	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261511	   .	   .	    0.0	... 	   .	   .	   . .............
> *MSG	5261512 Mask 8_ma_ma*x
> 5261513	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261515	   .	   .	    0.0	... 	   .	   .	   . .............
> ...
> 
> Here I would like to generate an output that gives me the two parts
> "...Prime 11_fe_ha" and "...Mask 8_ma_ma" if and only if "...Prime
> 11_fe_ha" is located between "SBLINK..." and "EBLINK...".
> 
> 
> 
> 
> Sometimes the rows are in the following order (the number of rows
> between the rows marked with an x differs):
> 
> ...
> *MSG	5261445 Prime 11_fe_ha*x
> 5261439	  516.4	  364.3	 9148.0	... 	  816.0	-1133.0	   48.4 MA.C.TB...BL.
> 5261441	  516.4	  364.0	 9145.0	... 	  799.0	-1135.0	   48.7 MA.C.TB...B..
> 5261443	  516.4	  363.9	 9140.0	... 	  817.0	-1171.0	   49.3 MA.C.TB.....R
> *SBLINK R 5261507*x
> 5261445	  516.7	  363.8	 9133.0	... 	  813.0	-1097.0	   49.3 MA.C.TB......
> 5261447	  517.0	  363.8	 9127.0	... 	  818.0	-1144.0	   49.9 MA.C.T.LRTB..
> *EBLINK R 5261507	5261645	140*x
> 5261509	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261511	   .	   .	    0.0	... 	   .	   .	   . .............
> *MSG	5261512 Mask 8_ma_ma*x
> 5261513	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261515	   .	   .	    0.0	... 	   .	   .	   . .............
> ...
> 
> Here I would like to generate an output that consists of the two parts
> "...Prime 11_fe_ha" and "...Mask 8_ma_ma" if and only if "SBLINK..." is
> located between "... Prime 11_fe_ha" and "...Mask 8_ma_ma". The position of
> the "EBLINK..." is not important. That means the following structure
> should also lead to the same output:
> 
> ...
> *MSG	5261445 Prime 11_fe_ha*x
> 5261439	  516.4	  364.3	 9148.0	... 	  816.0	-1133.0	   48.4 MA.C.TB...BL.
> 5261441	  516.4	  364.0	 9145.0	... 	  799.0	-1135.0	   48.7 MA.C.TB...B..
> 5261443	  516.4	  363.9	 9140.0	... 	  817.0	-1171.0	   49.3 MA.C.TB.....R
> *SBLINK R 5261507*x
> 5261445	  516.7	  363.8	 9133.0	... 	  813.0	-1097.0
> 5261447	  517.0	  363.8	 9127.0	... 	  818.0	-1144.0	   49.9 MA.C.T.LRTB..
> 5261509	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261511	   .	   .	    0.0	... 	   .	   .	   . .............
> *MSG	5261512 Mask 8_ma_ma*x
> 5261513	   .	   .	    0.0	... 	   .	   .	   . .............
> 5261515	   .	   .	    0.0	... 	   .	   .	   . .............
> *EBLINK R 5261507	5261645	140*x
> ...
> 
> 
> Can someone give me advice on how I could manage this task?
> 
> thanks
> 
> best
> 

--
View this message in context: http://r.789695.n4.nabble.com/analizing-txt-file-with-R-or-an-other-program-tp3689025p3689393.html
Sent from the R help mailing list archive at Nabble.com.


