[R] Stacking several vectors from the list

astarodo at uci.edu astarodo at uci.edu
Tue Jun 29 01:30:46 CEST 2010


Hi everybody,

I'm working with some very messy data. I tried to clean it up in SAS and
SAS/IML, but there is not enough information on how to handle certain
things in SAS, so I have turned to R. The task itself should be rather
simple, so I was wondering if someone could help me out.

The original .csv has dimensions ([1] 7138 6338). It holds funds with their dates and the observations for each date, covering around 10 years and 4000+ funds. The fund blocks sit side by side, so COL5 holds the next fund's name, and so on.

COL1             COL2       COL3        COL4
HBNNF US Equity  Date       EQY_SH_OUT  PX_VOLUME
                 #NAME?     #N/A N/A    135000
                 7/7/2008   #N/A N/A    105000
                 7/17/2008  #N/A N/A    590000
                 7/22/2008  #N/A N/A    40000


In R this .csv is somehow read as a list (according to typeof) and not as a data frame, and many things, such as regexpr searches over the whole file, do not work or behave strangely. I want to stack the fund data and create a long dataset with fund name, date, EQY_SH_OUT and PX_VOLUME, with the fund name present for each date.
That should look like this:

Fund_name		Date		EQY_SH_OUT	PX_VOLUME
HBNNF US Equity	7/7/2008	#N/A N/A	105000
HBNNF US Equity	7/17/2008	#N/A N/A	590000
HBNNF US Equity	7/22/2008	#N/A N/A	40000
HBNNF US Equity	7/24/2008	#N/A N/A	3000
HBNNF US Equity	7/31/2008	#N/A N/A	1000
HBNNF US Equity	8/20/2008	#N/A N/A	1000
HBNNF US Equity	8/26/2008	#N/A N/A	2000
HBNNF US Equity	8/27/2008	#N/A N/A	2000
HBNNF US Equity	9/2/2008	#N/A N/A	5000
HND CN Equity		1/17/2008	#N/A N/A	28000
HND CN Equity		1/18/2008	#N/A N/A	25000
HND CN Equity		1/21/2008	#N/A N/A	5000
HND CN Equity		1/22/2008	#N/A N/A	101000
HND CN Equity		1/23/2008	#N/A N/A	122000


Is there any way to accomplish this? There should be an easy way, but I have never worked with lists, and somehow the file is not read as a data frame, which gives strange results.
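
One possible way to do the stacking is sketched below on a toy data frame. It is only a sketch and rests on assumptions that are not confirmed by the post: the file is read with header = FALSE and stringsAsFactors = FALSE, every fund occupies exactly four adjacent columns, the first row of each block holds the fund name followed by the field labels, and shorter funds are padded with empty strings. The names raw, block_width, stack_one and long are hypothetical.

```r
# Toy stand-in for the wide file: two funds, four columns each (assumed layout).
raw <- data.frame(
  V1 = c("HBNNF US Equity", "", ""),
  V2 = c("Date", "7/7/2008", "7/17/2008"),
  V3 = c("EQY_SH_OUT", "#N/A N/A", "#N/A N/A"),
  V4 = c("PX_VOLUME", "105000", "590000"),
  V5 = c("HND CN Equity", "", ""),
  V6 = c("Date", "1/17/2008", "1/18/2008"),
  V7 = c("EQY_SH_OUT", "#N/A N/A", "#N/A N/A"),
  V8 = c("PX_VOLUME", "28000", "25000"),
  stringsAsFactors = FALSE
)

block_width <- 4                                   # assumption: 4 columns per fund
starts <- seq(1, ncol(raw), by = block_width)      # first column of each fund block

# Turn one fund block into long form, repeating the fund name on every row.
stack_one <- function(s) {
  block <- raw[-1, s:(s + block_width - 1)]        # drop the label row
  out <- data.frame(
    Fund_name  = raw[1, s],
    Date       = block[[2]],
    EQY_SH_OUT = block[[3]],
    PX_VOLUME  = block[[4]],
    stringsAsFactors = FALSE
  )
  # Drop padding rows where this fund has no date.
  out[!is.na(out$Date) & out$Date != "", ]
}

long <- do.call(rbind, lapply(starts, stack_one))

# Optional type conversion; unparseable dates such as "#NAME?" become NA.
long$Date <- as.Date(long$Date, format = "%m/%d/%Y")
long$PX_VOLUME <- as.numeric(long$PX_VOLUME)
```

On the real file, raw would come from something like read.csv(file, header = FALSE, stringsAsFactors = FALSE), possibly with na.strings = "#N/A N/A" so that the missing EQY_SH_OUT values are read as NA. If the blocks are not all four columns wide, the starts vector would have to be derived differently, for example from the positions of the fund names in the first row.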

> small_raw[1,1]
[1] HBNNF US Equity
Levels:  0.26 0.46 COL1 HBNNF US Equity

> grep("Equity",as.character(small_raw))
integer(0)
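
A likely explanation for the two symptoms above, sketched on a toy object: a data frame is stored internally as a list of columns, so typeof reports "list" even when the object is a perfectly good data frame, and as.character applied to the whole data frame works on deparsed columns rather than on the cell values. Searching column by column avoids this. The names small, found and pos are hypothetical, and the factor columns only mimic what read.csv produced by default at the time.

```r
# Toy data frame with factor columns, similar in spirit to small_raw.
small <- data.frame(
  COL1 = factor(c("HBNNF US Equity", "", "")),
  COL2 = factor(c("Date", "7/7/2008", "7/17/2008"))
)

typeof(small)          # "list": the storage type of every data frame
is.data.frame(small)   # TRUE: the check that actually answers the question

# Search the values of each column, converting factors to character first.
found <- sapply(small, function(col) grepl("Equity", as.character(col)))

# Row and column positions of the matching cells.
pos <- which(found, arr.ind = TRUE)
```

Here pos has one row per matching cell, with the row index in the "row" column and the column index in the "col" column.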

> small_raw[[1]]
  [1] HBNNF US Equity                                                
  [5]                                                                
  [9]                                                                
 [13]                                                                
 [17]                                                                
 [21]                                                                
 [25]                                                                
 [29]                                                                
 [33]                                                                
 [37]                                                                
 [41]                                                                
 [45]                                                                
 [49]                                                                
 [53]                                                                
 [57]                                                                
 [61]                                                                
 [65]                                                                
 [69]                                                                
 [73]                                                                
 [77]                                                                
 [81]                                                                
 [85]                                                                
 [89]                                                                
 [93]                                                                
 [97] 0.46                            0.46                           
[101] 0.46                            0.26                           
[105] 0.26                            0.26                           
[109] 0.26                            0.26                           
[113] 0.26                            0.26                           
[117] 0.26                            0.26                           
[121] 0.26                            0.26                           
[125] 0.26                            0.26                           
[129] 0.26                            0.26                           
[133] 0.26                            0.26                           
[137] 0.26                            0.26                           
[141] 0.26                            0.26                           
[145] 0.26                            0.26                           
[149] 0.26                            0.26                           
[153] 0.26                            0.26                           
[157] 0.26                            0.26                           
[161] 0.26                            0.26                           
[165] 0.26                                                           
[169]                                                                
[173]                                                                
[177]                                                                
[181]                                                                
[185]                                                                
[189]                                                                
[193]                                                                
[197]                                                
Levels:  0.26 0.46 COL1 HBNNF US Equity

I have been on this for a while. Thank you in advance! 

Arsenio


