[R-SIG-Finance] Distribution fitting to loss data - Operational Risk

Thu Jul 23 13:51:55 CEST 2015

Amelia:
You are following the correct procedure. Unfortunately, you are also experiencing a very common problem: the loss data for operational risk do not follow any simple distribution. Fitting that data to a particular distribution is usually difficult or meaningless from a statistician's point of view.
In operational risk, the loss events come from multiple sources (fraud, legal events, IT events, natural disasters, etc.), and each source has its own quirks. I suggest grouping the loss events according to source, then determining whether each source follows a reasonable distribution. That could lead to a mixture model.
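A minimal sketch of the grouping idea, under stated assumptions: the posted data carry no source labels, so the data frame loss_df, its source names and all parameter values below are hypothetical stand-ins, and the lognormal is only one candidate severity distribution. It fits each source separately with fitdistr() from MASS, then simulates from the resulting mixture using the observed share of events per source as weights.

```r
library(MASS)  # fitdistr(); MASS ships with standard R installations

set.seed(7)
# Hypothetical labelled losses (an assumption: the posted data have no labels)
loss_df <- data.frame(
  source = rep(c("fraud", "legal", "it"), times = c(300, 50, 150)),
  amount = c(rlnorm(300, 7.5, 1.0), rlnorm(50, 11, 1.2), rlnorm(150, 8.5, 0.8))
)

# Fit a lognormal to each source separately (maximum likelihood)
fits <- lapply(split(loss_df$amount, loss_df$source), function(x) {
  fitdistr(x, "lognormal")$estimate
})

# Mixture weights: share of loss events per source
weights <- prop.table(table(loss_df$source))

# Simulate from the mixture: pick a source, then draw from its fitted law
n_sim <- 1e5
src <- sample(names(weights), n_sim, replace = TRUE, prob = as.numeric(weights))
pars <- do.call(rbind, fits)[src, , drop = FALSE]
sim <- rlnorm(n_sim, meanlog = pars[, "meanlog"], sdlog = pars[, "sdlog"])

q999 <- quantile(sim, 0.999, names = FALSE)
```

With real data, each per-source fit would still need its own goodness-of-fit check before being used in the mixture.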
There are several excellent books on modeling loss data. Don't forget to review one or two for ideas on appropriate distributions.
You could try a statistical bootstrap. I'll warn you, however, that the estimates desired by the regulators will be unstable, due to the extreme tail probability that they require.
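A rough sketch of such a bootstrap, assuming a stand-in sample (the vector losses below is simulated for illustration and should be replaced by the real loss data). It resamples the losses with replacement, recomputes the 99.9% quantile each time, and compares the spread of that estimate with the spread of the bootstrapped median.

```r
set.seed(123)
# Stand-in data (an assumption): replace with the real loss vector
losses <- rlnorm(500, meanlog = 8.5, sdlog = 1.6)

n_boot <- 2000

# Each replicate resamples the observed losses with replacement and
# recomputes the 99.9th percentile
boot_q999 <- replicate(n_boot, {
  resample <- sample(losses, size = length(losses), replace = TRUE)
  quantile(resample, 0.999, names = FALSE)
})

# Percentile interval for the 99.9% quantile
ci_q999 <- quantile(boot_q999, c(0.025, 0.975), names = FALSE)

# For comparison: the same kind of interval for the median
boot_med <- replicate(n_boot, median(sample(losses, replace = TRUE)))
ci_med <- quantile(boot_med, c(0.025, 0.975), names = FALSE)

# Interval widths relative to the point estimates
rel_width_q999 <- diff(ci_q999) / quantile(losses, 0.999, names = FALSE)
rel_width_med <- diff(ci_med) / median(losses)
```

With only a few hundred observations, the 99.9% quantile is determined by the largest one or two losses, so one would typically expect rel_width_q999 to be much larger than rel_width_med. Note also that a plain bootstrap can never produce a loss larger than the largest one observed.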
The only good news is that regulators are aware of the problem. I suggest that you work with your Risk Management committee to determine which compromise will satisfy both them and the regulators. There is no good, simple solution here until the regulatory framework catches up with the reality.
Good luck.
Paul Teetor, Elgin, IL USA
http://quantdevel.com/public
      From: Amelia Marsh via R-SIG-Finance <r-sig-finance at r-project.org>
 To: "r-sig-finance at r-project.org" <r-sig-finance at r-project.org> 
 Sent: Wednesday, July 22, 2015 4:07 AM
 Subject: [R-SIG-Finance] Distribution fitting to loss data - Operational Risk

Hello!

I work in risk management and deal with operational risk. Under the Basel II guidelines, we need to arrive at the capital charge that banks must set aside to cover any operational risk losses, should they occur. As part of the Loss Distribution Approach (LDA), we need to collate past loss events and use these loss amounts. The usual process practised in the industry is as follows:

Using these historical loss amounts and various statistical tests and diagnostics (KS test, AD test, PP plot, QQ plot, etc.), we try to identify the continuous distribution that best fits the historical loss data. Then, using the estimated parameters of that distribution, we simulate, say, 1 million loss amounts and take an appropriate percentile (say 99.9%) to arrive at the capital charge.
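The procedure described above can be sketched in R roughly as follows. This is an illustration under assumptions, not a reference implementation: the lognormal is just one candidate distribution, and the vector losses is simulated stand-in data to be replaced by the real loss amounts. fitdistr() comes from the MASS package; ks.test(), rlnorm() and quantile() are in base R.

```r
library(MASS)  # fitdistr(); MASS ships with standard R installations

set.seed(42)
# Stand-in data (an assumption): replace with the real loss vector
losses <- rlnorm(500, meanlog = 8.5, sdlog = 1.6)

# Step 1: fit a candidate severity distribution by maximum likelihood
fit <- fitdistr(losses, "lognormal")
mu <- fit$estimate[["meanlog"]]
sigma <- fit$estimate[["sdlog"]]

# Step 2: goodness-of-fit check; the p-value is optimistic because the
# parameters were estimated from the same data
ks <- ks.test(losses, "plnorm", meanlog = mu, sdlog = sigma)

# Step 3: simulate 1 million losses and read off the 99.9th percentile
sim <- rlnorm(1e6, meanlog = mu, sdlog = sigma)
q999 <- quantile(sim, 0.999, names = FALSE)
```

On real loss data with many repeated round amounts, ks.test() will warn about ties, and a QQ plot of the log-losses is often more informative than the p-value.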

However, the loss data are often such that no distribution can be fitted. The data may be multimodal or have significant variability, making a fit impossible. Can someone guide me on how to deal with such data, and on what can be done to simulate losses from this historical loss data in R?

My data is as follows - 

mydat <- c(829.53,4000,6000,1000,1063904,102400,22000,4000,4200,2000,10000,400, 459006, 7276,4000,100,4000,10000,613803.36, 825,1000,5000,4000,3000,84500,200, 2000,68000,97400,6267.8, 49500,27000,2100,10489.92,2200,2000,2000,1000,1900, 6000,5600,100,4000,14300,100,94100,1200,7000,2000,3000,1100,6900,1000,18500,6000,2000,4000,8400,11200,1000,15100,23300,4000,13100,4500,200,2000,50000,3900,3200,2000,2000,67000,2000,500,2000,1000,1900,10400,1900,2000,3200,6500,10000,2900,1000,14300,1000,2700,1500,12000,40000,25000,2800,5000,15000,4000,1000,21000,15000,16000,54000,1500,19200,2000,2000,1000,39000,5000,1100,18000,10000,3500,1000,10000,5000,14000,1800,4000,1000,300,4000,1000,100,1000,4400,2000,2000,12000,200,100,1000,1000,2000,1600,2000,4000,14000,4000,13500,1000,200,200,1000,18000,23000,41400,60000,500,3000,21000,6900,14600,1900,4000,4500,1000,2000,2000,1000,4100,2000,1000,2000,8000,3000,1500,2000,2000,3500,2000,2000,1000,3800,30000,55000,500,1000,1000,2000,62400,2000,3000,200,2000,
3500,2000,2000,500,3000,4500,1000,10000,2000,3000,3600,1000,2000,2000,5000,23000,2000,1900,2000,60000,2000,60000,20000,2000,2000,4600,1000,2000,1000,18000,6000,62000,68000,26800,50000,45900,16900,21500,2000,22700,2000,2000,32000,10000,5000,138000,159700,13000,2000,17619,2000,1000,4000,2000,1500,4000,20000,158900,74100,6000,24900,60000,500,1000,40000,10000,50000,800,4000,4900,6500,5000,400,500,3000,32300,24000,300,11500,2000,5000,1000,500,5000,5500,17450,56800,2000,1000,21400,22000,60000,3000,7500,3000,1000,1000,2000,1500,83700,2000,4000,170005,70000,6700,1500,3500,2000,10563.97,1500,25000,2000,2000,2267.57,1100,3100,2000,3500,10000,2000,6000,1500,200,20000,4000,46400,296900,150000,3700,7500,20000,48500,3500,12000,2500,4000,8500,1000,14500,1000,11000,2000,2000,120000,20000,7600,3000,2000,8000,1600,40000,2000,5000,34187.67,279100,9900,31300,814000,43500,5100,49500,4500,6262.38,100,10400,2400,1500,5000,2500,15000,40000,32500,41100,358600,109600,514300,258200,225900,402700,274300,
75000,1000,56000,10000,4100,1000,15000,100,40000,7900,5000,105000,
15100,2000,1100,2900,1500,600,500,1300,100,5000,5000,10000,10100,7000,40000,10500,5000,9500,1000,15200,2000,10000,10000,100,7800,3500,189900,58000,345000,151700,11000,6000,7000,15700,6000,3000,5000,10000,2000,1000,36000,1000,500,8000,9000,6000,2000,26500,6000,5000,97200,2000,5100,17000,2500,25500,24000,5400,90000,41500,6200,7500,5000,7000,41000,25000,1500,40000,5000,10000,21500,100,32000,32500,70000,500,66400,21000,5000,5000,12600,3000,6200,38900,10000,1000,60000,41100,1200,31300,2500,58000,4100,58000,42500)

Sorry for the inconvenience. I do understand that fitting a distribution to such data is not a foolproof method, but this is the procedure that has been followed in the risk management industry. Please note that my question is not specific to operational risk. My question is: if no distribution fits a particular data set, how do we proceed to simulate data based on it?
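One distribution-free way to simulate from historical data, offered here only as a hedged sketch, is a smoothed bootstrap on the log scale: resample the observed log-losses and add Gaussian kernel noise, which amounts to drawing from a kernel density estimate. The technique is an illustrative choice, not something prescribed in this thread; the bandwidth rule bw.nrd0() and the stand-in vector losses are assumptions.

```r
set.seed(99)
# Stand-in data (an assumption): replace with the real loss vector
losses <- rlnorm(500, meanlog = 8.5, sdlog = 1.6)

# Work on the log scale so simulated losses stay positive
log_l <- log(losses)

# Kernel bandwidth from the default rule of thumb used by density()
h <- bw.nrd0(log_l)

# Smoothed bootstrap: resample log-losses, add Gaussian kernel noise,
# then transform back to the original scale
n_sim <- 1e5
sim <- exp(sample(log_l, n_sim, replace = TRUE) + rnorm(n_sim, mean = 0, sd = h))

q999 <- quantile(sim, 0.999, names = FALSE)
```

Simulated values can exceed the largest observed loss only by a margin governed by the bandwidth, so an extreme percentile such as 99.9% remains driven by the few largest observations.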

Regards 

Amelia Marsh
