performance - Improving a MATLAB function within a big simulation
I have a big MATLAB simulation project on my hands that I want to optimize, since I'm running it many times to tune parameters and the like.
Using MATLAB's profiler I identified one function that is eating up most of the time, specifically the line output(i,1) = max(mean(dens(i+1:a,1)),dens(i+1,1));
This function is called a lot. The input is a 10x1 double passed as an argument, and the output is also a 10x1 vector.
    function output = my_function(input)

    a = size(input,1);
    output = input*0;
    dens = density(input);

    % for each i, output(i) is the maximum between dens(i+1) and mean(dens(i+1:end))
    for i = 1:a-1
        output(i,1) = max(mean(dens(i+1:a,1)),dens(i+1,1));
    end
    output(a,1) = dens(a,1);

    end

My ideas:
- I think vectorization could maybe get rid of the loop (?), but I'm not familiar with the technique.
- Is there a faster/alternative way to calculate the mean (maybe without MATLAB's built-in function call)?
Edit: I tried to vectorize the function and got the following alternative, which performs the same operations:
    function output = my_function_vectorized(input)

    a = size(input,1);
    rho_ref = zeros(size(input));   % not used below
    dens = density(input);

    % mean of dens(i:end) for every i: flip, cumulative sum, divide by count, flip back
    temp_cumsum = flip(cumsum(flip(dens))./(1:1:a)');
    output = [max(temp_cumsum(2:end),dens(2:a));dens(a)];

    end

I tried testing both functions in the following way:
    ts = random('unif',40,80,10,1000);
    results_original = zeros(size(ts));
    results_vectorized = zeros(size(ts));
    times_original = zeros(size(ts,2),1);
    times_vectorized = zeros(size(ts,2),1);

    for ii = 1:size(ts,2)
        tic;
        results_original(:,ii) = my_function(ts(:,ii));
        times_original(ii) = toc;
    end

    for ii = 1:size(ts,2)
        tic;
        results_vectorized(:,ii) = my_function_vectorized(ts(:,ii));
        times_vectorized(ii) = toc;
    end

    % residual between the two result matrices
    res = norm(results_original - results_vectorized);
    mtimes_original = mean(times_original);
    mtimes_vectorized = mean(times_vectorized);

for which I get:
    res =
        3.1815e-12

    mtimes_original/mtimes_vectorized =
        3.0279

- Should the residual be concerning me?
- Is it correct that I have sped up the computation by a factor of 3?
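On the residual: the vectorized version adds the same numbers in a different order than mean does, and floating-point addition is not associative, so tiny differences are expected. The following is only a minimal sketch of that effect. It uses a hypothetical random 10x1 sample (generated with rand rather than the toolbox function random) instead of the real density output, and compares the difference against eps at the magnitude of the result:

```matlab
% Hypothetical 10x1 sample in the same 40..80 range as the question's test data.
x = 40 + 40*rand(10,1);

% The same mean, accumulated in two different orders.
m_forward = sum(x) / numel(x);
m_reverse = sum(flipud(x)) / numel(x);

% The two results may agree exactly or differ by a few units of rounding.
abs_diff = abs(m_forward - m_reverse);
fprintf('difference: %g, eps at this magnitude: %g\n', abs_diff, eps(m_forward));
```

If the difference stays within a small multiple of eps, it is rounding noise rather than a logic error. A norm taken over all 10x1000 entries accumulates many such small differences.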
Vectorize it.
The re-read of dens is what is killing you, not the mean. The mean is about as optimized as Donald Knuth could make it.
I don't know your density function, so I can't be sure about the indexing.
Pseudocode snippets:
    %(1) faster predeclaration that shows intent
    output = zeros(size(input))

    %(2) vectorize "mean between here and end" (assumes dens is an a-by-1 column)
    b = flipud(cumsum(flipud(dens(2:a)))./(1:a-1)')

    %(3) assemble the nx2 matrix (last row padded with dens(a))
    c = [[b; dens(a)], [dens(2:a); dens(a)]]

    %(4) vectorized max, I think
    output = max(c,[],2)

(1) It is hard to beat the built-ins for speed and efficiency. It is also nice to be able to figure out, a year from now, what the code does. Over time I find myself trying to be more and more of a literate programmer (link), because that is less time-expensive in the long run than coming back in a year or ten and trying to reverse engineer my own work.
(2) The idea here is to flip the density vector around, take the cumulative sum, divide each element of the reversed cumulative sum by how many points fed into it, then flip it around again. When you divide a sum by the count, it becomes the mean. I read the description (link), and there is an internal switch, so you can restate this without the flips and make it even faster.
    b = cumsum(dens(2:a),'reverse')./(a-1:-1:1)' %this might work (assumes dens is a column)

(3) In theory, when you are done you should have a matrix that is 2 columns wide and has as many rows as "dens" does. Resizing and predeclaring can be expensive; if you are changing sizes you might want to pre-declare as in (1).
(4) The "max" function is going to be screaming fast too. Neither you nor Mr. Knuth is going to make it faster. I think it takes 1 compare (a silicon op) for each element of the array plus a few shuffles (fewer than 1 per element).
This is the element-wise max. (I forgot to add the buffer in the middle.) It is made to be fast and to output an array. You may need a 1 instead of a 2 for the dimension, but you know what you are doing there and can figure it out.
Let me know if this works for you. I'm guessing it might give no more than a 5x improvement.
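To make the steps above concrete, here is a self-contained sketch that runs the loop and the flip/cumsum form side by side and times both over many repetitions. The density function is unknown, so dens here is a hypothetical 10x1 column of random values, and the repetition count n_rep is an arbitrary choice. Timings depend on machine and release, so treat the printed ratio as indicative only.

```matlab
% Hypothetical stand-in for the output of density(): a 10x1 column.
dens = 40 + 40*rand(10,1);
a = size(dens,1);
n_rep = 20000;   % arbitrary repetition count to smooth out timer noise

% Loop version, as in the question.
tic;
for r = 1:n_rep
    out_loop = zeros(a,1);
    for i = 1:a-1
        out_loop(i) = max(mean(dens(i+1:a)), dens(i+1));
    end
    out_loop(a) = dens(a);
end
t_loop = toc;

% Vectorized version: means of dens(i+1:a) via flip, cumsum, divide, flip.
tic;
for r = 1:n_rep
    suffix_mean = flipud(cumsum(flipud(dens(2:a))) ./ (1:a-1)');
    out_vec = [max(suffix_mean, dens(2:a)); dens(a)];
end
t_vec = toc;

% Agreement between the two versions, and the measured speed ratio.
max_diff = max(abs(out_loop - out_vec));
fprintf('max difference: %g\n', max_diff);
fprintf('loop: %.3g s, vectorized: %.3g s, ratio: %.2f\n', t_loop, t_vec, t_loop/t_vec);
```

Timing many repetitions inside one tic/toc pair is used here because a single call on a 10x1 input is short enough that timer overhead can dominate the measurement.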
I was stunned to find that LabVIEW can do the fundamentals 100x faster than MATLAB because it is (always) compiled. When compiling in MATLAB one must impose many new constraints on types and values, but in LV compiling is pain-free because the constraining is part of the initial program creation. If you find that the heart of your MATLAB program isn't fast enough, you can make a wrapper for LV and run it (much, much) faster there with little heartache. LV doesn't elaborate well, though: there is a reason why we use textbooks instead of pictures (or individualized renderings of the topic by da Vinci, to use a more correct metaphor).
Edit: (about speed)
It looks like it is ~3x faster.
Edit: (about the code; note I'm using 2014a)
    clc; format short g;
    a = 1:15
    mu = fliplr(cumsum(fliplr(a))./(1:length(a)))

which gives:
    a =
         1     2     3     4     5     6     7     8     9    10    11    12    13    14    15

    mu =
      Columns 1 through 9
                8          8.5            9          9.5           10         10.5           11         11.5           12
      Columns 10 through 15
             12.5           13         13.5           14         14.5           15

So I make "a", a vector starting at 1 and going to 15. The last value is 15. The average of the 2nd-to-last value and the last is 14.5. The average of the last 3 values is 14. The math seems to be working here.
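The same check can be done against a brute-force reference instead of by eye. This is a small sketch: mu_ref and max_err are names introduced here for illustration, and the reference simply calls mean on a(i:end) for every i.

```matlab
a = 1:15;

% Flip, cumulative sum, divide by the running count, flip back.
mu = fliplr(cumsum(fliplr(a)) ./ (1:length(a)));

% Brute-force reference: mean of a(i:end) for every i.
mu_ref = zeros(size(a));
for i = 1:length(a)
    mu_ref(i) = mean(a(i:end));
end

% Largest deviation between the vectorized result and the reference.
max_err = max(abs(mu - mu_ref));
disp(max_err)
```

Note that a is a row vector here, which is why fliplr works. For a column vector such as the 10x1 dens, flipud (or flip) is the matching call.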
Edit:
One great speedup is to switch off of the current Java-based system. I have seen code get a large (better than 3x) speed boost when running in version 2010a. Code runs substantially slower when run through Java than when run through Fortran or C-based compiled libraries.