Importance Sampling Examples
from qmcpy import *
from numpy import *
import pandas as pd
pd.options.display.float_format = '{:.2e}'.format
from matplotlib import pyplot as plt
import matplotlib
%matplotlib inline
Game Example
Consider a game where $x_1, x_2 \sim \mathcal{U}[0,1]$ are drawn IID, with a payoff of $10$ when $x_1 + x_2 > 1.7$ and $0$ otherwise, so the expected payoff is $\mu = 10\,\mathbb{P}(x_1 + x_2 > 1.7) = 10 \cdot 0.3^2/2 = 0.45$.
payoff = lambda x: 10*(x.sum(1)>1.7)
abs_tol = 1e-3
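Only a small corner of the unit square pays off, so most plain samples contribute nothing to the estimate. A quick check with plain NumPy (a sketch, independent of the QMCPy pipeline used below) shows the issue that importance sampling is meant to fix.

import numpy as np

# Rough check: how often does a uniform draw actually pay off?
# The exact probability is the area of the triangle {x1 + x2 > 1.7},
# i.e. 0.3**2 / 2 = 0.045, so roughly 95% of vanilla samples are wasted.
rng = np.random.default_rng(7)
x = rng.random((100_000, 2))
print('fraction of positive payoffs: %.4f (exact 0.045)' % (payoff(x) > 0).mean())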
Vanilla Monte Carlo
With ordinary Monte Carlo we do the following:
d = 2
integral = CustomFun(
true_measure = Uniform(Lattice(d)),
g = payoff)
solution1,data1 = CubQMCLatticeG(integral, abs_tol).integrate()
data1
LDTransformData (AccumulateData Object)
solution 0.450
error_bound 6.70e-04
n_total 2^(16)
time_integrate 0.058
CubQMCLatticeG (StoppingCriterion Object)
abs_tol 0.001
rel_tol 0
n_init 2^(10)
n_max 2^(35)
CustomFun (Integrand Object)
Uniform (TrueMeasure Object)
lower_bound 0
upper_bound 1
Lattice (DiscreteDistribution Object)
d 2^(1)
dvec [0 1]
randomize 1
order natural
entropy 277709943434788460192509560078836861709
spawn_key ()
Monte Carlo with Importance Sampling
We may add importance sampling to increase the proportion of samples with positive payoffs. Let $Y_j = X_j^{1/(p+1)}$ for $j = 1, 2$ and some $p \ge 0$, where $X_1, X_2 \overset{\mathrm{IID}}{\sim} \mathcal{U}[0,1]$. This means that $Y_1$ and $Y_2$ are IID with common CDF $F(y) = y^{p+1}$ and common PDF $\varrho(y) = (p+1)y^p$ on $[0,1]$. Thus,

$$\mu = \int_{[0,1]^2} \mathrm{payoff}(y_1,y_2)\,\mathrm{d}y_1\,\mathrm{d}y_2 = \int_{[0,1]^2} \frac{\mathrm{payoff}(y_1,y_2)}{(p+1)^2 (y_1 y_2)^p}\,\varrho(y_1)\varrho(y_2)\,\mathrm{d}y_1\,\mathrm{d}y_2 = \mathbb{E}\!\left[\frac{\mathrm{payoff}\bigl(X_1^{1/(p+1)}, X_2^{1/(p+1)}\bigr)}{(p+1)^2 (X_1 X_2)^{p/(p+1)}}\right],$$

which is exactly the integrand coded below with uniform samples.
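Before handing this to QMCPy, a crude IID check with plain NumPy (a sketch, not part of the original workflow) can confirm that the transformed integrand keeps the same mean while producing many more nonzero evaluations.

import numpy as np

# Sanity check of the change of variables above with plain IID sampling.
p = 1
rng = np.random.default_rng(11)
x = rng.random((1_000_000, 2))
plain = payoff(x)                                                    # raw payoff at uniform points
weighted = payoff(x**(1/(p+1))) / ((p+1)**2 * x.prod(1)**(p/(p+1)))  # importance-sampled integrand
print('plain mean    %.4f' % plain.mean())
print('weighted mean %.4f' % weighted.mean())
print('fraction of nonzero evaluations: plain %.3f, weighted %.3f'
      % ((plain > 0).mean(), (weighted > 0).mean()))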
p = 1
d = 2
integral = CustomFun(
true_measure = Uniform(Lattice(d)),
g = lambda x: payoff(x**(1/(p+1))) / ((p+1)**2 * (x.prod(1))**(p/(p+1))))
solution2,data2 = CubQMCLatticeG(integral, abs_tol).integrate()
data2
LDTransformData (AccumulateData Object)
solution 0.450
error_bound 6.59e-04
n_total 2^(14)
time_integrate 0.023
CubQMCLatticeG (StoppingCriterion Object)
abs_tol 0.001
rel_tol 0
n_init 2^(10)
n_max 2^(35)
CustomFun (Integrand Object)
Uniform (TrueMeasure Object)
lower_bound 0
upper_bound 1
Lattice (DiscreteDistribution Object)
d 2^(1)
dvec [0 1]
randomize 1
order natural
entropy 173672388111166095812144815858861052514
spawn_key ()
print('Importance Sampling takes %.3f of the time and %.3f of the samples'%\
    (data2.time_integrate/data1.time_integrate,data2.n_total/data1.n_total))
Importance Sampling takes 0.397 of the time and 0.250 of the samples
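The exponent $p = 1$ used above is just one choice. A small sweep (a sketch reusing the QMCPy classes already imported; exact sample counts will vary with the randomized lattice) shows how the required sample size responds to the strength of the importance sampling, with $p = 0$ recovering the vanilla estimator.

def make_g(p):
    # importance-sampled integrand for a given exponent p (p = 0 is vanilla)
    return lambda x: payoff(x**(1/(p+1))) / ((p+1)**2 * x.prod(1)**(p/(p+1)))

for p in [0, 1, 2, 3]:
    integral = CustomFun(true_measure=Uniform(Lattice(2)), g=make_g(p))
    solution, data = CubQMCLatticeG(integral, abs_tol).integrate()
    print('p = %d: solution %.3f using %d samples' % (p, solution, data.n_total))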
Asian Call Option Example
The stock price must rise significantly for the payoff to be positive, so we will give an upward drift to the Brownian motion that defines the stock price path. We can think of the option price as the multidimensional integral

$$\mu = \int_{\mathbb{R}^d} \mathrm{payoff}(\boldsymbol{x}) \, \frac{\exp\bigl(-\tfrac{1}{2}\boldsymbol{x}^T\mathsf{\Sigma}^{-1}\boldsymbol{x}\bigr)}{\sqrt{(2\pi)^d \det(\mathsf{\Sigma})}} \, \mathrm{d}\boldsymbol{x},$$

where $\mathsf{\Sigma} = \bigl(\min(t_j, t_k)\bigr)_{j,k=1}^d$ is the covariance matrix of the Brownian motion at the monitoring times $\boldsymbol{t} = (T/d, 2T/d, \ldots, T)^T$.

We will replace $\boldsymbol{x}$ by $\boldsymbol{x} + a\boldsymbol{t}$, where a positive $a$ will create more positive payoffs. This corresponds to giving our Brownian motion a drift. To do this we re-write the integral as

$$\mu = \int_{\mathbb{R}^d} \mathrm{payoff}(\boldsymbol{x} + a\boldsymbol{t}) \, \frac{\exp\bigl(-\tfrac{1}{2}(\boldsymbol{x}+a\boldsymbol{t})^T\mathsf{\Sigma}^{-1}(\boldsymbol{x}+a\boldsymbol{t})\bigr)}{\exp\bigl(-\tfrac{1}{2}\boldsymbol{x}^T\mathsf{\Sigma}^{-1}\boldsymbol{x}\bigr)} \, \frac{\exp\bigl(-\tfrac{1}{2}\boldsymbol{x}^T\mathsf{\Sigma}^{-1}\boldsymbol{x}\bigr)}{\sqrt{(2\pi)^d \det(\mathsf{\Sigma})}} \, \mathrm{d}\boldsymbol{x}.$$

Finally note that $\mathsf{\Sigma}^{-1}\boldsymbol{t} = (0, \ldots, 0, 1)^T$, so the likelihood ratio simplifies to

$$\frac{\exp\bigl(-\tfrac{1}{2}(\boldsymbol{x}+a\boldsymbol{t})^T\mathsf{\Sigma}^{-1}(\boldsymbol{x}+a\boldsymbol{t})\bigr)}{\exp\bigl(-\tfrac{1}{2}\boldsymbol{x}^T\mathsf{\Sigma}^{-1}\boldsymbol{x}\bigr)} = \exp\bigl(-a x_d - \tfrac{1}{2}a^2 T\bigr).$$

This drift in the Brownian motion may be implemented by changing the drift input to the BrownianMotion object.
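The simplification above relies on the special structure of the Brownian-motion covariance. A short NumPy check (a sketch with an assumed discretization of $d = 8$ equally spaced monitoring times on $[0, 1]$, not part of the original notebook) confirms both identities before relying on them.

import numpy as np

# Check the simplification above: for Sigma[j,k] = min(t_j, t_k), the last
# column of Sigma is t itself, so Sigma^{-1} t = (0, ..., 0, 1)^T, and the
# likelihood ratio collapses to exp(-a*x_d - a^2*T/2).
d, T, a = 8, 1.0, 1.0
t = np.arange(1, d + 1) * T / d                 # monitoring times T/d, 2T/d, ..., T
Sigma = np.minimum.outer(t, t)                  # Brownian motion covariance matrix
print(np.round(np.linalg.solve(Sigma, t), 12))  # -> [0. 0. ... 0. 1.]

rng = np.random.default_rng(3)
x = np.linalg.cholesky(Sigma) @ rng.standard_normal(d)  # one discretized Brownian path
Si = np.linalg.inv(Sigma)
lhs = np.exp(-.5 * ((x + a*t) @ Si @ (x + a*t) - x @ Si @ x))
rhs = np.exp(-a * x[-1] - .5 * a**2 * T)
print(lhs, rhs)                                         # the two expressions agree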
abs_tol = 1e-2
dimension = 32
Vanilla Monte Carlo
integrand = AsianOption(Sobol(dimension))
solution1,data1 = CubQMCSobolG(integrand, abs_tol).integrate()
data1
LDTransformData (AccumulateData Object)
solution 1.784
error_bound 0.007
n_total 2^(12)
time_integrate 0.012
CubQMCSobolG (StoppingCriterion Object)
abs_tol 0.010
rel_tol 0
n_init 2^(10)
n_max 2^(35)
AsianOption (Integrand Object)
volatility 2^(-1)
call_put call
start_price 30
strike_price 35
interest_rate 0
mean_type arithmetic
dim_frac 0
BrownianMotion (TrueMeasure Object)
time_vec [0.031 0.062 0.094 ... 0.938 0.969 1. ]
drift 0
mean [0. 0. 0. ... 0. 0. 0.]
covariance [[0.031 0.031 0.031 ... 0.031 0.031 0.031]
[0.031 0.062 0.062 ... 0.062 0.062 0.062]
[0.031 0.062 0.094 ... 0.094 0.094 0.094]
...
[0.031 0.062 0.094 ... 0.938 0.938 0.938]
[0.031 0.062 0.094 ... 0.938 0.969 0.969]
[0.031 0.062 0.094 ... 0.938 0.969 1. ]]
decomp_type PCA
Sobol (DiscreteDistribution Object)
d 2^(5)
dvec [ 0 1 2 ... 29 30 31]
randomize LMS_DS
graycode 0
entropy 269848504287400991447340062704076098907
spawn_key ()
Monte Carlo with Importance Sampling
drift = 1
integrand = AsianOption(BrownianMotion(Sobol(dimension),drift=drift))
solution2,data2 = CubQMCSobolG(integrand, abs_tol).integrate()
data2
LDTransformData (AccumulateData Object)
solution 1.785
error_bound 0.007
n_total 2^(11)
time_integrate 0.010
CubQMCSobolG (StoppingCriterion Object)
abs_tol 0.010
rel_tol 0
n_init 2^(10)
n_max 2^(35)
AsianOption (Integrand Object)
volatility 2^(-1)
call_put call
start_price 30
strike_price 35
interest_rate 0
mean_type arithmetic
dim_frac 0
BrownianMotion (TrueMeasure Object)
time_vec [0.031 0.062 0.094 ... 0.938 0.969 1. ]
drift 0
mean [0. 0. 0. ... 0. 0. 0.]
covariance [[0.031 0.031 0.031 ... 0.031 0.031 0.031]
[0.031 0.062 0.062 ... 0.062 0.062 0.062]
[0.031 0.062 0.094 ... 0.094 0.094 0.094]
...
[0.031 0.062 0.094 ... 0.938 0.938 0.938]
[0.031 0.062 0.094 ... 0.938 0.969 0.969]
[0.031 0.062 0.094 ... 0.938 0.969 1. ]]
decomp_type PCA
transform BrownianMotion (TrueMeasure Object)
time_vec [0.031 0.062 0.094 ... 0.938 0.969 1. ]
drift 1
mean [0.031 0.062 0.094 ... 0.938 0.969 1. ]
covariance [[0.031 0.031 0.031 ... 0.031 0.031 0.031]
[0.031 0.062 0.062 ... 0.062 0.062 0.062]
[0.031 0.062 0.094 ... 0.094 0.094 0.094]
...
[0.031 0.062 0.094 ... 0.938 0.938 0.938]
[0.031 0.062 0.094 ... 0.938 0.969 0.969]
[0.031 0.062 0.094 ... 0.938 0.969 1. ]]
decomp_type PCA
Sobol (DiscreteDistribution Object)
d 2^(5)
dvec [ 0 1 2 ... 29 30 31]
randomize LMS_DS
graycode 0
entropy 235852111986323844533130741538557337093
spawn_key ()
print('Importance Sampling takes %.3f of the time and %.3f of the samples'%\
    (data2.time_integrate/data1.time_integrate,data2.n_total/data1.n_total))
Importance Sampling takes 0.829 of the time and 0.500 of the samples
Importance Sampling MC vs QMC
Test Parameters
dimension = 16
abs_tol = .025
trials = 3
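The CSV read in the next cell comes from a separate workout script under ../workouts/mc_vs_qmc/. As a hedged sketch (the loop below is an assumed reconstruction, not the workout's actual code, and it skips averaging over trials), data of the same shape could be produced directly with the classes used earlier in this notebook; runtimes and sample counts will differ from the stored results.

import pandas as pd

shifts = [0., .5, 1., 1.5, 2.]
rows = []
for ms in shifts:
    # a few of the stopping criteria compared in the stored results
    problems = {
        'CubMCG IIDStdUniform (MC)':
            CubMCG(AsianOption(BrownianMotion(IIDStdUniform(dimension), drift=ms)), abs_tol),
        'CubQMCLatticeG Lattice (QMC)':
            CubQMCLatticeG(AsianOption(BrownianMotion(Lattice(dimension), drift=ms)), abs_tol),
        'CubQMCSobolG Sobol (QMC)':
            CubQMCSobolG(AsianOption(BrownianMotion(Sobol(dimension), drift=ms)), abs_tol)}
    for name, sc in problems.items():
        solution, data = sc.integrate()
        rows.append({'Problem': name, 'mean_shift': ms, 'solution': solution,
                     'n_samples': data.n_total, 'time': data.time_integrate})
pd.DataFrame(rows)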
df = pd.read_csv('../workouts/mc_vs_qmc/out/importance_sampling.csv')
df['Problem'] = df['Stopping Criterion'] + ' ' + df['Distribution'] + ' (' + df['MC/QMC'] + ')'
df = df.drop(['Stopping Criterion','Distribution','MC/QMC'],axis=1)
problems = ['CubMCCLT IIDStdUniform (MC)',
'CubMCG IIDStdUniform (MC)',
'CubQMCCLT Sobol (QMC)',
'CubQMCLatticeG Lattice (QMC)',
'CubQMCSobolG Sobol (QMC)']
df = df[df['Problem'].isin(problems)]
mean_shifts = df.mean_shift.unique()
df_samples = df.groupby(['Problem'])['n_samples'].apply(list).reset_index(name='n')
df_times = df.groupby(['Problem'])['time'].apply(list).reset_index(name='time')
df.loc[(df.mean_shift==0) | (df.mean_shift==1)].set_index('Problem')
# Note: mean_shift==0 --> NOT using importance sampling
| Problem | mean_shift | solution | n_samples | time |
|---|---|---|---|---|
| CubMCCLT IIDStdUniform (MC) | 0.00e+00 | 1.80e+00 | 3.20e+05 | 4.90e-01 |
| CubMCCLT IIDStdUniform (MC) | 1.00e+00 | 1.77e+00 | 8.15e+04 | 1.30e-01 |
| CubMCG IIDStdUniform (MC) | 0.00e+00 | 1.80e+00 | 4.18e+05 | 6.43e-01 |
| CubMCG IIDStdUniform (MC) | 1.00e+00 | 1.77e+00 | 1.20e+05 | 1.91e-01 |
| CubQMCCLT Sobol (QMC) | 0.00e+00 | 1.78e+00 | 4.10e+03 | 7.69e-03 |
| CubQMCCLT Sobol (QMC) | 1.00e+00 | 1.78e+00 | 4.10e+03 | 7.79e-03 |
| CubQMCLatticeG Lattice (QMC) | 0.00e+00 | 1.78e+00 | 2.05e+03 | 8.56e-03 |
| CubQMCLatticeG Lattice (QMC) | 1.00e+00 | 1.78e+00 | 1.02e+03 | 4.98e-03 |
| CubQMCSobolG Sobol (QMC) | 0.00e+00 | 1.78e+00 | 1.02e+03 | 2.06e-03 |
| CubQMCSobolG Sobol (QMC) | 1.00e+00 | 1.78e+00 | 1.02e+03 | 1.99e-03 |
fig,ax = plt.subplots(nrows=1, ncols=2, figsize=(20, 6))
idx = arange(len(problems))
width = .35
ax[0].barh(idx+width,log(df.loc[df.mean_shift==0]['n_samples'].values),width)
ax[0].barh(idx,log(df.loc[df.mean_shift==1]['n_samples'].values),width)
ax[1].barh(idx+width,df.loc[df.mean_shift==0]['time'].values,width)
ax[1].barh(idx,df.loc[df.mean_shift==1]['time'].values,width)
fig.suptitle('Importance Sampling Comparison by Stopping Criterion on Asian Call Option')
xlabs = ['log(Samples)','Time']
for i in range(len(ax)):
    ax[i].set_xlabel(xlabs[i])
    ax[i].spines['top'].set_visible(False)
    ax[i].spines['bottom'].set_visible(False)
    ax[i].spines['right'].set_visible(False)
    ax[i].spines['left'].set_visible(False)
ax[1].legend(['Vanilla Monte Carlo','Importance Sampling\nMean Shift=1'],loc='upper right',frameon=False)
ax[1].get_yaxis().set_ticks([])
ax[0].set_yticks(idx)
ax[0].set_yticklabels(problems)
plt.tight_layout();

fig,ax = plt.subplots(nrows=1, ncols=2, figsize=(22, 8))
df_samples.apply(lambda row: ax[0].plot(mean_shifts,row.n,label=row['Problem']),axis=1)
df_times.apply(lambda row: ax[1].plot(mean_shifts,row.time,label=row['Problem']),axis=1)
ax[1].legend(frameon=False, loc=(-.85,.7),ncol=1)
ax[0].set_ylabel('Total Samples')
ax[0].set_yscale('log')
ax[1].set_yscale('log')
ax[1].set_ylabel('Runtime')
for i in range(len(ax)):
    ax[i].set_xlabel('mean shift')
    ax[i].spines['top'].set_visible(False)
    ax[i].spines['right'].set_visible(False)
fig.suptitle('Comparing Mean Shift Across Problems');
