HYPGEOM.DIST function

The HYPGEOM.DIST function in Excel calculates the hypergeometric distribution probability. The hypergeometric distribution is used to model situations where you are drawing a sample from a finite population without replacement. It is often used in cases like quality control, lottery draws, or card games.

Syntax:

HYPGEOM.DIST(sample_s, number_sample, population_s, number_pop, cumulative)

Arguments:

  • sample_s: The number of successes in the sample. This is the actual number of successes you are interested in.
  • number_sample: The size of the sample. This is the number of items or individuals you are drawing from the population.
  • population_s: The number of successes in the population. This is the total number of successes in the population from which the sample is drawn.
  • number_pop: The total size of the population. This is the total number of items or individuals in the population.
  • cumulative: A logical value (TRUE or FALSE) that specifies the form of the function:
    • TRUE: Returns the cumulative distribution function (CDF), which gives the probability of having x or fewer successes in the sample.
    • FALSE: Returns the probability mass function (PMF), which gives the probability of having exactly x successes in the sample.

How It Works:

The hypergeometric distribution applies when you sample without replacement from a finite population. This differs from the binomial distribution, where every draw has the same success probability (as when sampling with replacement). In the hypergeometric case, each draw removes an item from the population, which changes the probabilities for subsequent draws.
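To make the effect of sampling without replacement concrete, here is a minimal Python sketch. It compares the chance of drawing three red balls in a row from 20 balls of which 7 are red, with and without replacement. The numbers and the function name `prob_all_red` are chosen purely for illustration.

```python
from fractions import Fraction

def prob_all_red(red, total, draws, replace):
    """Probability that every one of `draws` consecutive draws is red."""
    p = Fraction(1)
    for i in range(draws):
        if replace:
            # With replacement the population is restored, so the odds never change.
            p *= Fraction(red, total)
        else:
            # Without replacement both the red count and the total shrink by one per draw.
            p *= Fraction(red - i, total - i)
    return p

with_repl = prob_all_red(7, 20, 3, replace=True)
without_repl = prob_all_red(7, 20, 3, replace=False)
print(with_repl, float(with_repl))        # constant odds on every draw
print(without_repl, float(without_repl))  # odds fall as red balls are removed
```

The second value is smaller because every red ball drawn leaves fewer red balls for the next draw.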

The probability mass function of the hypergeometric distribution is:

$$P(X = x) = \frac{\binom{K}{x} \binom{N - K}{n - x}}{\binom{N}{n}}$$

Where:

  • x is the number of successes in the sample, n is the size of the sample, K is the number of successes in the population, and N is the total size of the population.
  • $\binom{a}{b}$ is the binomial coefficient (the number of ways to choose b items from a items).
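As a rough cross-check outside Excel, the formula can be written directly with Python's `math.comb`. In this sketch, `x` is the number of successes in the sample, `n` the sample size, `K` the number of successes in the population, and `N` the population size. These symbol names and the function name `hypgeom_pmf` are assumptions made for the example, not part of Excel.

```python
from math import comb

def hypgeom_pmf(x, n, K, N):
    """P(X = x): exactly x successes in a sample of n drawn without
    replacement from a population of N items containing K successes."""
    ways_to_pick_successes = comb(K, x)
    ways_to_pick_failures = comb(N - K, n - x)
    ways_to_pick_any_sample = comb(N, n)
    return ways_to_pick_successes * ways_to_pick_failures / ways_to_pick_any_sample

# Illustrative numbers: 3 successes in a sample of 5,
# from a population of 20 that contains 7 successes.
print(hypgeom_pmf(3, 5, 7, 20))
```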

Examples:

  1. Example 1: Exact Probability of Drawing 3 Red Balls
    Suppose you are drawing 5 balls from a total of 20 balls, where 7 of them are red. To calculate the probability of drawing exactly 3 red balls, you can use:
    =HYPGEOM.DIST(3, 5, 7, 20, FALSE)

    Here:

    • 3 is the number of red balls you’re interested in drawing (sample_s).
    • 5 is the number of balls you’re drawing (number_sample).
    • 7 is the number of red balls in the population (population_s).
    • 20 is the total number of balls (number_pop).
    • FALSE specifies that you want the exact probability (PMF).

    The result is the probability of drawing exactly 3 red balls in a sample of 5, approximately 0.1761.

  2. Example 2: Cumulative Probability of Drawing 3 or Fewer Red Balls
    If you want to find the cumulative probability of drawing 3 or fewer red balls from the same setup (5 balls drawn from 20, of which 7 are red), use:
    =HYPGEOM.DIST(3, 5, 7, 20, TRUE)

    This returns the cumulative probability of drawing 0, 1, 2, or 3 red balls, approximately 0.9693.

  3. Example 3: Drawing 2 Defective Items
    In a factory, there are 15 defective items and 50 non-defective items, for a population of 65. You are drawing a sample of 10 items without replacement, and you want to find the probability of drawing exactly 2 defective items:
    =HYPGEOM.DIST(2, 10, 15, 65, FALSE)

    This calculates the probability of drawing exactly 2 defective items in a sample of 10 from a population of 65 items, 15 of which are defective.
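The three scenarios above can be checked independently of Excel. The Python sketch below builds the cumulative probability by summing exact probabilities. The function names `pmf` and `cdf` and their parameter names are hypothetical, and the arguments are passed in the order successes in sample, sample size, successes in population, population size.

```python
from math import comb

def pmf(successes_in_sample, sample_size, successes_in_pop, pop_size):
    """Probability of exactly `successes_in_sample` successes."""
    return (comb(successes_in_pop, successes_in_sample)
            * comb(pop_size - successes_in_pop, sample_size - successes_in_sample)
            / comb(pop_size, sample_size))

def cdf(successes_in_sample, sample_size, successes_in_pop, pop_size):
    """Probability of `successes_in_sample` or fewer successes:
    the sum of the exact probabilities for 0, 1, ..., successes_in_sample."""
    return sum(pmf(k, sample_size, successes_in_pop, pop_size)
               for k in range(successes_in_sample + 1))

# Red balls: 5 drawn from 20, of which 7 are red.
print("exactly 3 red:", pmf(3, 5, 7, 20))
print("3 or fewer red:", cdf(3, 5, 7, 20))

# Factory: 10 drawn from 65 items, of which 15 are defective.
print("exactly 2 defective:", pmf(2, 10, 15, 65))
```

Because the cumulative value is a sum of exact probabilities, it is never smaller than the exact probability for the same count.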

Key Points:

  • The hypergeometric distribution is useful when sampling without replacement, such as in situations where the population is finite, and the probability changes with each draw.
  • Use the PMF (with cumulative = FALSE) to find the exact probability of a specific number of successes in your sample.
  • Use the CDF (with cumulative = TRUE) to find the probability of getting up to a certain number of successes in your sample.
  • The HYPGEOM.DIST function helps with problems that involve drawing samples from a population, where the outcomes are dependent on previous draws.

Use Cases:

  • Quality Control: The hypergeometric distribution can model the probability of finding a certain number of defective items in a sample taken from a batch of products.
  • Card Games and Lotteries: It can be used to calculate the likelihood of drawing a specific combination of cards or lottery numbers from a limited set.
  • Biology: It can be used in genetic studies to determine the likelihood of drawing a specific number of a certain type of gene from a population.
  • Polls and Surveys: It can be used to calculate probabilities in surveys when the responses are not independent, such as when sampling is done without replacement.

Notes:

  • Unlike the binomial distribution, which assumes replacement, the hypergeometric distribution accounts for the fact that the population size changes as items are selected.
  • Ensure that all arguments make sense for a sampling scenario. For instance, the number of successes in the sample (sample_s) should not exceed the smaller of the sample size (number_sample) or the number of successes in the population (population_s); otherwise Excel returns a #NUM! error.
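A sanity check of this kind can be sketched in Python before the values are typed into a worksheet. The helper below tests whether a success count is attainable at all. The function name `valid_hypgeom_args` and its parameters are hypothetical, and the checks follow the mathematical range of the distribution rather than reproducing Excel's own error handling.

```python
def valid_hypgeom_args(x, n, K, N):
    """Return True if x successes can occur in a sample of n drawn
    from a population of N items that contains K successes."""
    if N < 0 or not (0 <= K <= N) or not (0 <= n <= N):
        return False
    # The sample cannot hold more failures than the population has (N - K),
    # so at least n - (N - K) of the drawn items must be successes.
    lowest = max(0, n - (N - K))
    # The sample cannot hold more successes than draws or than exist in the population.
    highest = min(n, K)
    return lowest <= x <= highest

print(valid_hypgeom_args(3, 5, 7, 20))   # attainable count
print(valid_hypgeom_args(6, 5, 7, 20))   # more successes than items drawn
print(valid_hypgeom_args(0, 18, 7, 20))  # 18 draws from only 13 failures
```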