[R] subsetting large data frames.

hesicaia dboyce at dal.ca
Sun Dec 7 22:25:35 CET 2008


It worked wonderfully - Thanks very much!



hesicaia wrote:
> 
> Hi all,
>   I have a question regarding subsetting of large data frames. I have two
> data frames, “catches” and “tows”, and they both have the same 30 variables
> (columns). I would like to select the rows in the data frame “tows” where
> the combination of 6 specific variables is NOT matched in “catches”. That
> is to say, the combination of these 6 variables is unique. One or more of
> the variables could be the same, but the combination would be unique. This
> is confusing to explain, so here is a short example of what I am trying to
> do:
> 
> Example data catches:
> 
> Row  Cruise  Order  Townumber  Towtype  Ship  Netlocation  Var1  Var2
> 1    22      1      4          A        B     S            X1    X2
> 2    22      1      4          A        B     S            X1    X2
> 3    22      1      4          BL       AM    S            X1    X2
> 4    22      1      4          BL       AM    S            X1    X2
> 5    260     1      4          BL       B     S            X1    X2
> 6    260     1      4          BL       B     S            X1    X2
>  
> Example data tows:
> 
> Row  Cruise  Order  Townumber  Towtype  Ship  Netlocation  Var1  Var2
> 1    22      1      4          A        B     S            X1    X2
> 2    400     1      4          BL       AM    S            X1    X2
> 3    260     1      4          BL       B     S            X1    X2
> 4    260     10     10         BL       B     S            X1    X2
> 5    22      99     4          BL       B     S            X1    X2
> 
> I would want to select rows 2, 4, and 5 from “tows” because the same
> combination of “cruise”, “order”, “townumber”, “towtype”, “ship”, and
> “netlocation” is not found in “catches”. All rows in data set “tows” are
> unique. Clear as mud? Sorry I couldn’t provide real data, but these
> datasets are quite large.
> 
> So far I have tried:
> 
> New<-tows[(tows$cruise != catches$cruise) & (tows$order != catches$order)
> & (tows$townumber !=  catches$townumber) & (tows$towtype !=
> catches$towtype) & (tows$ship != catches$ship) & (tows$netlocation !=
> catches$netlocation),]
>  
> But this didn’t work. 
> Thanks for your time and help (in advance).
> Dan.
> 
> 
> 
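
For readers of the archive: the selection described in the quoted message amounts to an anti-join on the key columns. Below is a minimal sketch in base R, not necessarily the answer that was given on the list. It assumes the capitalised column names from the example tables (the attempt above uses lower-case names, so the real data may differ), leaves out Var1 and Var2, and assumes no key value contains the "|" separator. The helper make_key is a hypothetical name introduced only for illustration.

```r
# Toy copies of the example tables, key columns only (Var1/Var2 omitted).
catches <- data.frame(
  Cruise      = c(22, 22, 22, 22, 260, 260),
  Order       = c(1, 1, 1, 1, 1, 1),
  Townumber   = c(4, 4, 4, 4, 4, 4),
  Towtype     = c("A", "A", "BL", "BL", "BL", "BL"),
  Ship        = c("B", "B", "AM", "AM", "B", "B"),
  Netlocation = c("S", "S", "S", "S", "S", "S"),
  stringsAsFactors = FALSE
)
tows <- data.frame(
  Cruise      = c(22, 400, 260, 260, 22),
  Order       = c(1, 1, 1, 10, 99),
  Townumber   = c(4, 4, 4, 10, 4),
  Towtype     = c("A", "BL", "BL", "BL", "BL"),
  Ship        = c("B", "AM", "B", "B", "B"),
  Netlocation = c("S", "S", "S", "S", "S"),
  stringsAsFactors = FALSE
)

# Columns whose combination identifies a tow (assumed names).
keys <- c("Cruise", "Order", "Townumber", "Towtype", "Ship", "Netlocation")

# Hypothetical helper: paste the key columns of each row into one string.
# Assumes "|" never occurs inside a key value.
make_key <- function(df, cols) {
  do.call(paste, c(as.list(df[cols]), sep = "|"))
}

# Keep the rows of tows whose composite key does not occur in catches.
unmatched <- tows[!(make_key(tows, keys) %in% make_key(catches, keys)), ]
print(unmatched)
```

On the toy data this keeps rows 2, 4, and 5 of tows, as requested. The attempt with != compares row i of tows against row i of catches (recycling the shorter vector when the lengths differ) rather than testing whether a combination occurs anywhere in catches, and it also requires every column to differ at once, which is stricter than what was asked. If the dplyr package is available, dplyr::anti_join(tows, catches, by = keys) should express the same selection.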



