java - Calculate probability of bivariate normal distribution over area / polygon

September 15, 2013

I'm trying to calculate the probability of a bivariate normal distribution over a specific area, more precisely a specific polygon, in Java.

Mathematically, this means integrating the probability density function (PDF) of the bivariate normal distribution over a specific, complex area.

My first approach was to use two NormalDistribution objects with the aid of the Apache Commons Math library. Given a dataset X for dimension 1 and a dataset Y for dimension 2, I computed the mean and standard deviation for each NormalDistribution.

With the method public double probability(double x0, double x1) of org.apache.commons.math3.distribution.NormalDistribution I'm able to set an individual interval for each dimension, which means I can define a rectangular area and get its probability:

    NormalDistribution normalX = new NormalDistribution(means[0], stdDeviation_x);
    NormalDistribution normalY = new NormalDistribution(means[1], stdDeviation_y);

    double probabilityOfRect = normalX.probability(x1, x2) * normalY.probability(y1, y2);

If the standard deviations are small enough and the defined region is large enough, the probability approaches 1.0 (0.99999999999), as expected.

As I've said, I need to compute the probability over a specific area, so this first approach won't work, because I'm only able to define rectangular areas.

So my second approach was to use the class MultivariateNormalDistribution, which is also implemented in Apache Commons Math.

By using a MultivariateNormalDistribution with a vector of means and a covariance matrix, I'm able to get the PDF at a specific point x with public double density(double[] vals), whose description says:

Returns the probability density function (PDF) of this distribution evaluated at the specified point x.

http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math3/distribution/multivariatenormaldistribution.html#density(double[])

In this approach I'm converting the complex area into an ArrayList of points and then summing the densities by iterating over the ArrayList, like this:

    MultivariateNormalDistribution mnd = new MultivariateNormalDistribution(means, covariances);
    double sum = 0.0;
    for (Point p : complexArea) {
        double[] pos = {p.x, p.y};
        sum += mnd.density(pos);
    }
    return sum;

But I've encountered a problem with precision: when the standard deviations are set to low values, the PDF contains peaks > 1 at the positions where I'm calling mnd.density(pos), so the sum accumulates values > 1.

To avoid these peaks, I'm trying to replace the density at each point with the average of the densities at surrounding points, sampled in double precision around the current point:

    MultivariateNormalDistribution mnd = new MultivariateNormalDistribution(means, covariances);
    double sum = 0.0;
    for (Point p : surfacePoints) {
        double tmpRes = 0.0;
        // Integer counters guarantee exactly 10 x 10 samples per point; a double
        // loop variable stepped by 0.1 can drift and run an extra iteration.
        for (int i = 0; i < 10; i++) {
            for (int j = 0; j < 10; j++) {
                double[] pos = {p.x - 0.5 + i * 0.1, p.y - 0.5 + j * 0.1};
                tmpRes += mnd.density(pos);
            }
        }
        sum += tmpRes / 100.0; // average of the 100 samples
    }
    return sum;

This works.

All in all, I'm not quite sure whether these approaches are fundamentally correct. Another approach would be to compute the probability by numerical integration, but I'm clueless about how to achieve that in Java.

Are there other possibilities to achieve this?

Edit: Besides the lack of accuracy, my main question is: is the second approach, "summing densities", a valid method to obtain the probability over an area of a bivariate normal distribution? Thinking of one-dimensional normal distributions, the probability of one specific point is 0. How does the public double density(double[] vals) method in the Apache Commons Math library obtain a valid value?

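Your current approach is to perform a numerical integral by sampling at the points with integer coordinates and assigning the value at each point to the whole unit square around it. This has two main sources of error. One is that the function may vary a lot within a square. The other is at the boundary, where you integrate over squares that are not entirely contained in the region. A third source of error is roundoff, but it is usually not significant, since the other sources of error are much larger.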
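To make the link between density and probability explicit, the relation below may help. It is a standard identity rather than anything specific to Apache Commons Math; the symbols A (the region), f (the bivariate normal PDF), sigma_x, sigma_y and rho (standard deviations and correlation) are notation introduced here for illustration.

```latex
P\big((X,Y)\in A\big) \;=\; \iint_A f(x,y)\,dx\,dy
\;\approx\; \sum_{(x_i,y_i)\in A} f(x_i,y_i)\,\Delta x\,\Delta y,
\qquad
f(\mu_x,\mu_y) \;=\; \frac{1}{2\pi\,\sigma_x\sigma_y\sqrt{1-\rho^2}} .
```

A density value is a probability per unit area, not a probability, so it may exceed 1: with sigma_x = sigma_y = 0.1 and rho = 0 the peak is 1 / (2 * pi * 0.01), roughly 15.9. Summing densities at integer points corresponds to the sum above with Delta x = Delta y = 1, which is only a good approximation when f changes little across a unit square.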

One simple way to reduce the error is to use a finer grid. If you sample at points whose coordinates are integers divided by n (and multiply by the area n^-2 of the 1/n by 1/n squares), you reduce both sources of error. The problem is that you then sample at n^2 times as many points.
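A minimal sketch of the finer-grid idea in plain Java follows. It does not use Apache Commons Math; the class GridDensitySum, the Region interface and the parameter names are hypothetical, the density is written out by hand for a bivariate normal with correlation rho, and sampling at cell centers is a choice made here rather than something stated above.

```java
class GridDensitySum {

    /** Hypothetical membership test for the area of interest. */
    interface Region {
        boolean contains(double x, double y);
    }

    /** Bivariate normal PDF with means (mx, my), std devs (sx, sy), correlation rho. */
    static double density(double x, double y,
                          double mx, double my, double sx, double sy, double rho) {
        double zx = (x - mx) / sx;
        double zy = (y - my) / sy;
        double oneMinusRho2 = 1.0 - rho * rho;
        double q = (zx * zx - 2.0 * rho * zx * zy + zy * zy) / oneMinusRho2;
        return Math.exp(-0.5 * q) / (2.0 * Math.PI * sx * sy * Math.sqrt(oneMinusRho2));
    }

    /**
     * Riemann sum over cells of side 1/n inside the bounding box
     * [xMin, xMax] x [yMin, yMax]. A cell is counted when its center
     * lies in the region, and each sample is weighted by the cell area.
     */
    static double probability(Region region,
                              double xMin, double xMax, double yMin, double yMax, int n,
                              double mx, double my, double sx, double sy, double rho) {
        double h = 1.0 / n;
        long nx = (long) Math.ceil((xMax - xMin) * n);
        long ny = (long) Math.ceil((yMax - yMin) * n);
        double sum = 0.0;
        for (long i = 0; i < nx; i++) {
            double x = xMin + (i + 0.5) * h;
            for (long j = 0; j < ny; j++) {
                double y = yMin + (j + 0.5) * h;
                if (region.contains(x, y)) {
                    sum += density(x, y, mx, my, sx, sy, rho);
                }
            }
        }
        return sum * h * h; // multiply by the cell area n^-2
    }
}
```

For example, GridDensitySum.probability((x, y) -> Math.abs(x) <= 1 && Math.abs(y) <= 1, -2, 2, -2, 2, 20, 0, 0, 1, 1, 0) estimates the standard normal mass of the square [-1, 1] x [-1, 1]. The cost grows with n^2, which is the drawback noted above.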

I suggest writing the double integral over the region as an integral of integrals.

The inner integral (say, with respect to x) is an integral of a one-dimensional Gaussian over an interval if the region is convex, or at worst over a finite list of intervals. You integrate the PDF restricted to a particular y-coordinate y0 along the intersection of the polygon with the horizontal line y = y0. You can evaluate the inner integrals using functions such as erf, which are numerically approximated in many libraries, or you can use a one-dimensional numerical integral.

The outer integral (say, with respect to y) naturally breaks into pieces. Wherever there is a vertex of the polygon, the function inside the outer integral might not be smooth. So, break up the outer integral at the y-coordinates of the vertices of the polygon, and use a numerical integration rule such as the trapezoid rule or Simpson's rule on each of the intervals. These rules require you to evaluate the inner integral at a few points in each interval and weight the values appropriately.

This should produce more accurate results in a given amount of time than refining the grid.
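The iterated-integral idea can be sketched in plain Java as follows. This is one possible reading of the description above, not a definitive implementation: the class PolygonGaussianIntegral and all method and parameter names are hypothetical, the polygon is assumed to be simple (non-self-intersecting) and given as an array of {x, y} vertices, erf is approximated with the Abramowitz-Stegun formula 7.1.26 (absolute error around 1.5e-7), and the outer integral uses composite Simpson with an even number m of subintervals per slab. With Apache Commons Math on the classpath, org.apache.commons.math3.special.Erf.erf could replace the hand-written erf.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

class PolygonGaussianIntegral {

    /** Abramowitz-Stegun 7.1.26 approximation of erf. */
    static double erf(double x) {
        double t = 1.0 / (1.0 + 0.3275911 * Math.abs(x));
        double poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741
                + t * (-1.453152027 + t * 1.061405429))));
        double r = 1.0 - poly * Math.exp(-x * x);
        return x >= 0 ? r : -r;
    }

    /** Standard normal CDF. */
    static double cdf(double z) {
        return 0.5 * (1.0 + erf(z / Math.sqrt(2.0)));
    }

    /**
     * Inner integral over x at height y, for a slab [ya, yb] that has no
     * polygon vertex strictly inside. Only edges spanning the whole slab
     * can cross the line, so each one is evaluated by linear interpolation.
     */
    static double inner(double[][] p, double y, double ya, double yb,
                        double mx, double my, double sx, double sy, double rho) {
        List<Double> xs = new ArrayList<>();
        for (int i = 0; i < p.length; i++) {
            double[] a = p[i];
            double[] b = p[(i + 1) % p.length];
            if (Math.min(a[1], b[1]) <= ya && Math.max(a[1], b[1]) >= yb) {
                xs.add(a[0] + (y - a[1]) * (b[0] - a[0]) / (b[1] - a[1]));
            }
        }
        Collections.sort(xs);
        // Conditional distribution of X given Y = y is normal with these parameters.
        double m = mx + rho * sx / sy * (y - my);
        double s = sx * Math.sqrt(1.0 - rho * rho);
        double sum = 0.0;
        for (int k = 0; k + 1 < xs.size(); k += 2) { // consecutive crossings bound an interval
            sum += cdf((xs.get(k + 1) - m) / s) - cdf((xs.get(k) - m) / s);
        }
        double zy = (y - my) / sy;
        double marginalY = Math.exp(-0.5 * zy * zy) / (sy * Math.sqrt(2.0 * Math.PI));
        return sum * marginalY;
    }

    /** Outer integral: composite Simpson (m even) on each slab between vertex y-values. */
    static double probability(double[][] p, double mx, double my,
                              double sx, double sy, double rho, int m) {
        TreeSet<Double> ys = new TreeSet<>();
        for (double[] v : p) {
            ys.add(v[1]);
        }
        Double[] cuts = ys.toArray(new Double[0]);
        double total = 0.0;
        for (int j = 0; j + 1 < cuts.length; j++) {
            double ya = cuts[j];
            double yb = cuts[j + 1];
            double h = (yb - ya) / m;
            double acc = 0.0;
            for (int i = 0; i <= m; i++) {
                double w = (i == 0 || i == m) ? 1.0 : (i % 2 == 1 ? 4.0 : 2.0);
                acc += w * inner(p, ya + i * h, ya, yb, mx, my, sx, sy, rho);
            }
            total += acc * h / 3.0;
        }
        return total;
    }
}
```

Restricting each inner evaluation to the edges that span the current slab keeps the integrand well defined at the slab endpoints, where Simpson's rule samples exactly at vertex heights. For a standard normal and the diamond with vertices (±sqrt(2), 0) and (0, ±sqrt(2)), which is the square [-1, 1] x [-1, 1] rotated by 45 degrees, the result should be close to 0.4661.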
