python - Trouble calculating with large binomial coefficients -


intro.

i'm trying plot hypergeometric distribution ipython. probability function of distribution contains 3 binomial coefficients.

since values putting in coefficients large eg. 1e28, i've decided make own function calculate binomial coefficients use second order of stirling's approximation.

as binomial coefficients big put in variable , multiply directly, decided calculate logarithms instead , added them each other. obtain final probability put result in exp function. probability of relatively "normal" size (its maximum @ 2.7e24), there should no more problems there... except there are.

problem.

the 'result' referred to, log of probability ended way bigger should values ranging -6.2e24 1.3e14. in comparison, log of mathematical maximum 56.

another problem curve on plot jagged when zoomed in. looks fine when zoomed out. curve smooth, maximum @ mean of distribution:

plot on entire domain.

but when zoom in on mean of peak of probability function since standard deviation small this:

zoomed in on mean.

the red line marks average , black lines average +/- standard deviation. while looks pretty it's not need smooth curve.

question.

can explain why (1) values big , (2) why curve jagged , how can fix them?

code.

import matplotlib.pyplot plt import numpy np %matplotlib inline  #returns log of "n choose k" calculated 2nd order stirling's approximation def l_cb(n, k):     if (k > n) or (n < 0) or (n < 0):         print "invalid values binomial coefficient:", "n =", n, ", k =", k, "."         return 0.0     if (k == n) or (k == 0) or (n == 0):         return 0.0     = (n + 0.5) * np.log(n) - (k + 0.5) * np.log(k) - (n - k + 0.5) * np.log(n - k)     b = np.log(1 + 1 / (12.0 * n)) - np.log(1 + 1 / (12.0 * k)) - np.log(1 + 1 / (12.0 * (n - k)))     return - 1/2 * np.log(2 * np.pi) + + b  k = 2.24e28 k = 2.24e27 n = 2.7e25  #mathematical maximum of p. np.log(max) 56. l_p way big. max = (k + 1) * (n + 1) / (k + 2) #mathematical average of n avg = n * k / k #mathematical standard deviation of p. sd = np.sqrt(n * k * (k - n) * (k - k) / k ** 2 / (k - 1))  n = np.linspace(avg - 50e12, avg + 50e12, 1001) l_p = np.zeros(len(n))  #calculating log(p). in xrange(len(l_p)):     l_p[i] = l_cb(n, n[i]) + l_cb(k - n, k - n[i]) - l_cb(k, k)  #marking avg, avg - sd, avg + sd y = np.linspace(-4e14, 5e14, len(n)) x_avg = np.ones(len(n)) x_sd_l = np.ones(len(n)) - sd / avg x_sd_r = np.ones(len(n)) + sd / avg  plt.plot(n / avg, l_p, x_avg, y, 'r', x_sd_l, y, 'k', x_sd_r, y, 'k') plt.xlabel('n / avg') plt.ylabel('log(p)') 


Comments

Popular posts from this blog

python - mat is not a numerical tuple : openCV error -

c# - MSAA finds controls UI Automation doesn't -

wordpress - .htaccess: RewriteRule: bad flag delimiters -