## GEKPLS Surrogate Tutorial

Gradient Enhanced Kriging with Partial Least Squares Method (GEKPLS) is a surrogate modeling technique that reduces computation time and improves accuracy for high-dimensional problems. The Julia implementation of GEKPLS is adapted from the Python version by [SMT](https://github.com/SMTorg), which is based on this [paper](https://arxiv.org/pdf/1708.02663.pdf).

The following inputs are required when building a GEKPLS surrogate:

1. `X` - The matrix containing the training points
2. `y` - The vector containing the training outputs associated with each of the training points
3. `grads` - The gradients at each of the input `X` training points
4. `n_comp` - The number of components to retain for the partial least squares (PLS) regression
5. `delta_x` - The step size to use for the first-order Taylor approximation
6. `xlimits` - The lower and upper bounds for the training points
7. `extra_points` - The number of additional points to use for the PLS regression
8. `theta` - The hyperparameter to use for the correlation model

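To make the expected shapes of the first few inputs concrete, here is a minimal sketch using a hypothetical two-dimensional function `f` with its gradient `∇f` written out analytically; these names and the toy bounds are illustrative only and are not part of the Surrogates.jl API. `X`, `y`, and `grads` all share the same row count, one row per training point:

```julia
# Hypothetical toy function of two variables and its analytic gradient
f(x) = x[1]^2 + 2 * x[2]
∇f(x) = [2 * x[1], 2.0]

# Three training points, stored one per row
X = [0.0 0.0;
     1.0 2.0;
     2.0 1.0]

# Training outputs: an n×1 column, one value per training point
y = reshape([f(X[i, :]) for i in 1:size(X, 1)], (size(X, 1), 1))

# Gradients: an n×d matrix, one row per point, one column per dimension
grads = vcat([∇f(X[i, :])' for i in 1:size(X, 1)]...)

# xlimits: lower bounds in the first column, upper bounds in the second
xlimits = hcat([0.0, 0.0], [2.0, 2.0])
```

The full example below builds these same objects for a real test function, using the package's sampling helpers and Zygote to generate the training data and gradients.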
The following example illustrates how to use GEKPLS:

```@example gekpls_water_flow
using Surrogates
using Zygote

function vector_of_tuples_to_matrix(v)
    # helper function to convert training data generated by surrogate sampling
    # into a matrix suitable for GEKPLS
    num_rows = length(v)
    num_cols = length(first(v))
    K = zeros(num_rows, num_cols)
    for row in 1:num_rows
        for col in 1:num_cols
            K[row, col] = v[row][col]
        end
    end
    return K
end

function vector_of_tuples_to_matrix2(v)
    # helper function to convert gradients into matrix form
    num_rows = length(v)
    num_cols = length(first(first(v)))
    K = zeros(num_rows, num_cols)
    for row in 1:num_rows
        for col in 1:num_cols
            K[row, col] = v[row][1][col]
        end
    end
    return K
end

function water_flow(x)
    r_w = x[1]
    r = x[2]
    T_u = x[3]
    H_u = x[4]
    T_l = x[5]
    H_l = x[6]
    L = x[7]
    K_w = x[8]
    log_val = log(r / r_w)
    return (2 * pi * T_u * (H_u - H_l)) /
           (log_val * (1 + (2 * L * T_u / (log_val * r_w^2 * K_w)) + T_u / T_l))
end

n = 1000
d = 8
lb = [0.05, 100, 63070, 990, 63.1, 700, 1120, 9855]
ub = [0.15, 50000, 115600, 1110, 116, 820, 1680, 12045]
x = sample(n, lb, ub, SobolSample())
X = vector_of_tuples_to_matrix(x)
grads = vector_of_tuples_to_matrix2(gradient.(water_flow, x))
y = reshape(water_flow.(x), (size(x, 1), 1))
xlimits = hcat(lb, ub)
n_test = 100
x_test = sample(n_test, lb, ub, GoldenSample())
X_test = vector_of_tuples_to_matrix(x_test)
y_true = water_flow.(x_test)
n_comp = 2
delta_x = 0.0001
extra_points = 2
initial_theta = 0.01
g = GEKPLS(X, y, grads, n_comp, delta_x, xlimits, extra_points, initial_theta)
y_pred = g(X_test)
rmse = sqrt(sum(((y_pred - y_true) .^ 2) / n_test)) # root mean squared error
println(rmse)
```