How To Do Hardy Weinberg Problems Step By Step

The Hardy‑Weinberg principle is a cornerstone of population genetics that lets us predict how allele and genotype frequencies should behave in an ideal, non‑evolving population. Mastering the step‑by‑step method for solving Hardy‑Weinberg (HW) problems is essential for biology students, because it trains you to translate real‑world data into the mathematical language of evolution and to test whether a population is actually evolving. Below you will find a complete guide that walks you through the theory, the calculations, practical examples, and common pitfalls—all written in clear, beginner‑friendly language.

Detailed Explanation

At its core, the Hardy‑Weinberg equilibrium (HWE) states that, in the absence of evolutionary forces (mutation, selection, gene flow, genetic drift, and non‑random mating), the frequencies of alleles and genotypes in a population remain constant from one generation to the next. The principle is expressed with two simple equations:

1. Allele frequency equation: (p + q = 1)
- (p) = frequency of the dominant allele (A)
- (q) = frequency of the recessive allele (a)

2. Genotype frequency equation: (p^{2} + 2pq + q^{2} = 1)
- (p^{2}) = frequency of homozygous dominant genotype (AA)
- (2pq) = frequency of heterozygous genotype (Aa)
- (q^{2}) = frequency of homozygous recessive genotype (aa)

If a population truly meets the HW assumptions, the observed genotype frequencies should match those predicted by the equations above. When they do not, we infer that at least one evolutionary force is acting on the population.

Understanding why the equations look the way they do helps avoid rote memorization. Imagine randomly drawing two alleles from a large gene pool to form a zygote. The probability of getting an A allele is (p); the probability of getting an a allele is (q). The chance of drawing two A’s (AA) is (p \times p = p^{2}). The chance of drawing an A then an a (or a then A) is (p \times q + q \times p = 2pq). Finally, the chance of drawing two a’s (aa) is (q \times q = q^{2}). Because these three possibilities exhaust all outcomes, their probabilities must sum to 1.
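The drawing argument above can be checked numerically. What follows is a minimal Python sketch rather than part of the original method: the function name `genotype_probabilities` and the value p = 0.4 are illustrative assumptions. It enumerates the four ordered draws (A then A, A then a, a then A, a then a) and pools them by genotype.

```python
from itertools import product


def genotype_probabilities(p):
    """Pool the four ordered two-allele draws by genotype.

    Assumes two alleles, A with frequency p and a with frequency q = 1 - p,
    drawn independently of each other (random mating).
    """
    q = 1.0 - p
    allele_prob = {"A": p, "a": q}
    totals = {"AA": 0.0, "Aa": 0.0, "aa": 0.0}
    for first, second in product(allele_prob, repeat=2):
        # Sorting makes draw order irrelevant: "aA" is stored as "Aa".
        genotype = "".join(sorted(first + second))
        totals[genotype] += allele_prob[first] * allele_prob[second]
    return totals


probs = genotype_probabilities(0.4)  # illustrative value of p
for genotype, value in probs.items():
    print(f"{genotype}: {value:.2f}")
print(f"sum: {sum(probs.values()):.2f}")
```

The heterozygote class collects two of the four draws, which is where the factor 2 in (2pq) comes from.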

Step‑by‑Step Breakdown

Solving a Hardy‑Weinberg problem typically follows a logical sequence. Below is a detailed workflow you can apply to virtually any HW question, whether you are given allele frequencies, genotype frequencies, or a mixture of both.

1. Identify What Is Given

Allele frequencies (p or q) directly.
Genotype frequencies (observed numbers or percentages of AA, Aa, aa).
Phenotype data (e.g., number of individuals showing a recessive trait).

2. Convert All Data to Frequencies (if needed)

If you receive raw counts, divide each count by the total number of individuals (N) to get a proportion. [ \text{Frequency} = \frac{\text{Number of individuals with a genotype}}{\text{Total individuals}} ]
Express the result as a decimal (e.g., 0.36) or keep it as a percentage (36 %). Remember that (p) and (q) must be expressed as decimals when you plug them into the equations.
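As a small illustration of this conversion, here is a Python sketch. The helper name `frequencies_from_counts` and the genotype counts are hypothetical, chosen only to show the arithmetic.

```python
def frequencies_from_counts(counts):
    """Convert a mapping of category -> count into category -> proportion."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {name: count / total for name, count in counts.items()}


# Hypothetical genotype counts, for illustration only.
observed = {"AA": 50, "Aa": 30, "aa": 20}
freqs = frequencies_from_counts(observed)
print(freqs)  # proportions as decimals, ready for the HW equations

# Percentages are fine for reporting, but not for plugging into the equations.
percent = {name: 100 * value for name, value in freqs.items()}
print(percent)
```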

3. Determine p and q

If you are given p or q directly: simply use the value and compute the other with (q = 1 - p) (or (p = 1 - q)).
If you are given the frequency of the recessive phenotype (aa): because aa = (q^{2}), take the square root:
[ q = \sqrt{\text{frequency of aa}} \quad \text{and} \quad p = 1 - q ]
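If you are given both homozygous genotype frequencies: you can compute p from (\sqrt{f_{AA}}) and q from (\sqrt{f_{aa}}). Under HW equilibrium the two values should sum to 1; if (p + q) is clearly different from 1, the population is not in HW proportions. When all three genotype frequencies are known, you can also count alleles directly, (p = f_{AA} + \tfrac{1}{2} f_{Aa}), which does not assume equilibrium.
If you are given heterozygote frequency (Aa): use the relationship (2pq = f_{Aa}). Substituting (p = 1 - q) gives the quadratic (q^{2} - q + f_{Aa}/2 = 0). Its two roots are (p) and (q), so you need another piece of data (for example, which allele is rarer) to tell them apart.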
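The routes to (p) and (q) listed in this step can be sketched in Python as follows. The function names and input values are hypothetical. The allele-counting helper is a standard alternative for the case where all three genotype counts are available; it is included here as an addition, on the assumption that the reader has such counts.

```python
import math


def q_from_recessive_frequency(f_aa):
    """q = sqrt(f_aa). Valid only if HW proportions are assumed."""
    if not 0.0 <= f_aa <= 1.0:
        raise ValueError("f_aa must lie between 0 and 1")
    return math.sqrt(f_aa)


def p_from_genotype_counts(n_AA, n_Aa, n_aa):
    """Allele counting: each AA carries two A alleles, each Aa carries one."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    return (2 * n_AA + n_Aa) / total_alleles


def allele_frequencies_from_heterozygotes(f_Aa):
    """Solve 2q(1 - q) = f_Aa, i.e. q^2 - q + f_Aa / 2 = 0.

    Returns (smaller root, larger root). The roots are p and q in some
    order, so extra information is needed to assign them to alleles.
    """
    discriminant = 1.0 - 2.0 * f_Aa
    if discriminant < 0:
        raise ValueError("2pq cannot exceed 0.5 under HW proportions")
    root = math.sqrt(discriminant)
    return (1.0 - root) / 2.0, (1.0 + root) / 2.0


# Hypothetical inputs, for illustration only.
q = q_from_recessive_frequency(0.09)
print(1.0 - q, q)
print(p_from_genotype_counts(49, 42, 9))
print(allele_frequencies_from_heterozygotes(0.42))
```

Only the allele-counting route is free of the equilibrium assumption, which matters when the goal is to test for equilibrium rather than assume it.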

4. Calculate Expected Genotype Frequencies

Plug your p and q into the HW equation:

Expected (AA = p^{2})
Expected (Aa = 2pq)
Expected (aa = q^{2})

If you need expected numbers rather than frequencies, multiply each expected frequency by the total population size (N).
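A minimal Python sketch of this step, assuming a two-allele locus; the function name `expected_genotype_counts` and the inputs (p = 0.7, N = 200) are hypothetical.

```python
def expected_genotype_counts(p, n):
    """Return expected AA, Aa and aa counts for allele frequency p and sample size n."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p
    return {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}


# Hypothetical inputs: p = 0.7 in a sample of 200 individuals.
expected = expected_genotype_counts(0.7, 200)
for genotype, count in expected.items():
    print(f"{genotype}: {count:.1f}")
```

Expected counts need not be whole numbers. It is usually better to leave them unrounded if they will be used in a chi‑square test.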

5. Compare Observed vs. Expected (Optional but Common)

Many HW problems ask you to test whether the population is in equilibrium. Compute a chi‑square ((\chi^{2})) statistic:
[ \chi^{2} = \sum \frac{(O - E)^{2}}{E} ] where O = observed count, E = expected count for each genotype. Compare the result to the critical value from a chi‑square distribution with 1 degree of freedom (df = number of genotypes – number of alleles = 3 – 2 = 1) at your chosen significance level (usually 0.05). If (\chi^{2}) exceeds the critical value, reject the HW assumption; otherwise, you fail to reject it (the data are consistent with equilibrium).
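The test described above can be sketched with the Python standard library alone. The counts are hypothetical and the function names are illustrative. The p-value helper relies on the identity that, for 1 degree of freedom, the upper-tail probability of a chi‑square value x equals erfc(sqrt(x / 2)); it is not valid for other degrees of freedom.

```python
import math

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841


def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over genotype classes; both arguments are dicts of counts."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in expected)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts (N = 100). Expected values use p = 0.65 from allele counting.
observed = {"AA": 45, "Aa": 40, "aa": 15}
expected = {"AA": 42.25, "Aa": 45.5, "aa": 12.25}

chi2 = chi_square_statistic(observed, expected)
print(f"chi-square = {chi2:.3f}, p = {p_value_df1(chi2):.3f}")
if chi2 > CRITICAL_VALUE_DF1_ALPHA_05:
    print("reject HW equilibrium")
else:
    print("fail to reject HW equilibrium")
```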

6. State Your Conclusion Clearly

Summarize what the calculations tell you: allele frequencies, expected genotype distribution, and whether the population appears to be evolving.

Real Examples

Example 1: Flower Color in a Plant Population

A population of 500 plants shows the following flower colors:

Red (dominant, AA or Aa): 320 plants
White (recessive, aa): 180 plants

Step 1: Convert to frequencies.
[ f_{aa} = \frac{180}{500} = 0.36 ]

Step 2: Find q from the recessive genotype.
[q = \sqrt{0.36} = 0.60 ]
[ p = 1 - q = 0.40 ]

Step 3: Expected genotype frequencies.
[ AA = p^{2} = 0.40^{2} = 0.16 \;(16\%) ]
[ Aa = 2pq = 2(0.40)(0.60) = 0.48 \;(48\%) ]
[ aa = q^{2} = 0.60^{2} = 0.36 \;(36\%) ]

Step 4 – Expected Numbers for the Flower‑Color Data
The total population size is (N = 500). Multiplying the expected frequencies by (N) gives the numbers we would anticipate if the population were perfectly in Hardy–Weinberg equilibrium:

Expected (AA) (red, homozygous dominant):
[ E_{AA} = p^{2} \times N = 0.16 \times 500 = 80 ]

Expected (Aa) (red, heterozygous): first compute (2pq),
[ 2pq = 2(0.40)(0.60) = 0.48 ]
then
[ E_{Aa} = 0.48 \times 500 = 240 ]

Expected (aa) (white, homozygous recessive):
[ E_{aa} = q^{2} \times N = 0.36 \times 500 = 180 ]

Notice that the expected count for the recessive class equals the observed count (180). This is not a coincidence: (q) was estimated from the white plants in Step 2, so (E_{aa}) matches by construction and contributes nothing to the chi‑square test.

Step 5 – Chi‑Square Goodness‑of‑Fit Test
A chi‑square test needs observed genotype counts, but this problem gives only phenotype counts: the 320 red plants are an unknown mix of (AA) and (Aa). For illustration only, assume that genotyping the red plants found 140 (AA) and 180 (Aa). The observed counts are then:

(O_{AA} = 140)
(O_{Aa} = 180)
(O_{aa} = 180)

These genotype counts are an assumption added for the example, not data from the problem statement. With full genotype counts available, allele frequencies would normally be re‑estimated by counting alleles, giving (p = (2 \times 140 + 180)/1000 = 0.46); the test here keeps (p = 0.40) from Step 2 so that the expected counts from Step 4 carry over. The chi‑square contributions are:

[ \chi^{2}_{AA} = \frac{(O_{AA}-E_{AA})^{2}}{E_{AA}} = \frac{(140-80)^{2}}{80} = \frac{60^{2}}{80} = \frac{3600}{80} = 45.0 ]

[ \chi^{2}_{Aa} = \frac{(O_{Aa}-E_{Aa})^{2}}{E_{Aa}} = \frac{(180-240)^{2}}{240} = \frac{(-60)^{2}}{240} = \frac{3600}{240} = 15.0 ]

[ \chi^{2}_{aa} = \frac{(O_{aa}-E_{aa})^{2}}{E_{aa}} = \frac{(180-180)^{2}}{180} = 0 ]

Summing the three components:

[ \chi^{2}_{\text{total}} = 45.0 + 15.0 + 0 = 60.0 ]

With three genotype classes and two alleles, the degrees of freedom are (3-2 = 1). The critical value of (\chi^{2}) at (\alpha = 0.05) for 1 df is 3.84. Because (60.0 \gg 3.84), we reject the null hypothesis of Hardy–Weinberg equilibrium. The population shows a significant excess of the homozygous dominant genotype relative to the expected distribution.
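The arithmetic of this test can be re-run in a few lines of Python. The sketch below uses the illustrative genotype counts from this example (140, 180, 180). As a comparison that goes beyond the worked example, it also estimates (p) by allele counting instead of from the square root of the recessive frequency; the helper names are hypothetical.

```python
def hw_expected(p, n):
    """Expected AA, Aa, aa counts under HW proportions."""
    q = 1.0 - p
    return {"AA": p * p * n, "Aa": 2 * p * q * n, "aa": q * q * n}


def chi_square(observed, expected):
    """Sum (O - E)^2 / E over the genotype classes."""
    return sum((observed[g] - expected[g]) ** 2 / expected[g] for g in observed)


# Illustrative genotype counts assumed in the example (not given in the problem).
observed = {"AA": 140, "Aa": 180, "aa": 180}
n = sum(observed.values())

# Route 1 (as in the worked example): q from the square root of f_aa.
q_sqrt = (observed["aa"] / n) ** 0.5
chi2_sqrt = chi_square(observed, hw_expected(1.0 - q_sqrt, n))

# Route 2 (comparison only): p from allele counting.
p_count = (2 * observed["AA"] + observed["Aa"]) / (2 * n)
chi2_count = chi_square(observed, hw_expected(p_count, n))

print(f"square-root route: p = {1.0 - q_sqrt:.2f}, chi-square = {chi2_sqrt:.1f}")
print(f"allele counting:   p = {p_count:.2f}, chi-square = {chi2_count:.1f}")
```

Both routes give a statistic far above 3.84, so the conclusion does not depend on which estimate of (p) is used for these counts.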

Step 6 – Interpretation
The allele frequencies derived from the observed recessive phenotype are (p = 0.40) and (q = 0.60). Under equilibrium, we would predict 80 homozygous dominant, 240 heterozygous, and 180 homozygous recessive individuals. The genotype counts assumed for this illustration contain far more homozygous dominant plants (140 observed versus 80 expected) and far fewer heterozygotes (180 observed versus 240 expected). The chi‑square test therefore indicates that at least one Hardy–Weinberg assumption is violated. It cannot say which one: selection, non‑random mating, mutation, migration, or genetic drift could each be responsible, although a deficit of heterozygotes is the pattern typically produced by non‑random mating such as inbreeding.

Example 2: Blood‑Group Frequencies in Humans

Suppose a sample of 1,000 individuals has been typed for blood groups, with the following observed counts:

Type O: 600 individuals
Type A: 360 individuals
Type B: 32 individuals
Type AB: 8 individuals

Blood groups are determined by three alleles: (I^O), (I^A), and (I^B), with (I^A) and (I^B) codominant and both dominant to (I^O). To test for Hardy–Weinberg equilibrium, we first estimate the allele frequencies from the phenotype data. Let (p = \text{freq}(I^O)), (q = \text{freq}(I^A)), and (r = \text{freq}(I^B)), with (p + q + r = 1).

Alleles cannot be counted directly here, because a type A individual may be (I^A I^A) or (I^A I^O), and a type B individual may be (I^B I^B) or (I^O I^B). Instead, write each phenotype frequency in terms of (p), (q), and (r).

From the phenotypes:

Type O corresponds to genotype (I^O I^O), so (p^2 = 600/1000 = 0.6) → (p = \sqrt{0.6} \approx 0.7746).
Type AB corresponds to genotype (I^A I^B), so (2qr = 8/1000 = 0.008).
Type B corresponds to genotypes (I^O I^B) and (I^B I^B). The frequency of (I^B I^B) is (r^2), and the frequency of (I^O I^B) is (2pr). Thus, (r^2 + 2pr = 32/1000 = 0.032).
Type A corresponds to genotypes (I^A I^A) and (I^A I^O). The frequency of (I^A I^A) is (q^2), and the frequency of (I^A I^O) is (2pq). Thus, (q^2 + 2pq = 360/1000 = 0.36).

We now have five equations:

1. (p^2 = 0.6)
2. (2qr = 0.008)
3. (r^2 + 2pr = 0.032)
4. (q^2 + 2pq = 0.36)
5. (p + q + r = 1)

From equation 1, (p = \sqrt{0.6} \approx 0.7746). Using equation 5, (q + r = 1 - 0.7746 = 0.2254). From equation 2, (qr = 0.008/2 = 0.004). We can solve for (q) and (r) using these two equations:

(q = 0.2254 - r)
((0.2254 - r)r = 0.004)
(0.2254r - r^2 = 0.004)
(r^2 - 0.2254r + 0.004 = 0)

Solving this quadratic equation gives the roots 0.0194 and 0.2060. Because type B is much rarer than type A, the smaller root belongs to (I^B): (r \approx 0.0194) and (q \approx 0.2060). Equations 3 and 4 were not used to obtain these values, so they serve as a check. Equation 3: (r^2 + 2pr = (0.0194)^2 + 2(0.7746)(0.0194) \approx 0.0004 + 0.0301 = 0.0305), against an observed 0.032. Equation 4: (q^2 + 2pq = (0.2060)^2 + 2(0.7746)(0.2060) \approx 0.0424 + 0.3191 = 0.3615), against an observed 0.36.

In counts, Hardy–Weinberg proportions predict about 30.5 type B and 361.5 type A individuals, compared with 32 and 360 observed. These discrepancies are small, so the data are consistent with Hardy–Weinberg proportions at this locus. Large discrepancies would instead suggest that factors such as selection, non-random mating, mutation, migration, or genetic drift are acting on the population.
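The quadratic step used in this example can be sketched in Python. The function names are hypothetical, and the phenotype frequencies in the usage lines are made-up values generated from (p = 0.6), (q = 0.3), (r = 0.1), not the sample above. The estimator mirrors the hand calculation (square root of the type O frequency, then the quadratic from type AB); more refined estimators exist, such as Bernstein's method or maximum likelihood, and are not shown here.

```python
import math


def abo_allele_frequencies(freq_o, freq_a, freq_b, freq_ab):
    """Estimate (p, q, r) = frequencies of (I^O, I^A, I^B) from phenotype frequencies.

    Uses p = sqrt(freq_o), then q + r = 1 - p and q * r = freq_ab / 2.
    freq_a and freq_b are used only to decide which root is q and which is r.
    """
    p = math.sqrt(freq_o)
    s = 1.0 - p            # q + r
    qr = freq_ab / 2.0     # q * r
    discriminant = s * s - 4.0 * qr
    if discriminant < 0:
        raise ValueError("no real solution for q and r with these frequencies")
    root = math.sqrt(discriminant)
    small, large = (s - root) / 2.0, (s + root) / 2.0
    # The more common phenotype (A or B) gets the larger allele frequency.
    q, r = (large, small) if freq_a >= freq_b else (small, large)
    return p, q, r


def predicted_phenotype_frequencies(p, q, r):
    """Phenotype frequencies expected under HW proportions."""
    return {
        "O": p * p,
        "A": q * q + 2 * p * q,
        "B": r * r + 2 * p * r,
        "AB": 2 * q * r,
    }


# Made-up phenotype frequencies (O, A, B, AB), for illustration only.
p, q, r = abo_allele_frequencies(0.36, 0.45, 0.13, 0.06)
print(f"p = {p:.3f}, q = {q:.3f}, r = {r:.3f}")
print(predicted_phenotype_frequencies(p, q, r))
```

Comparing the predicted A and B frequencies with the observed ones is the same check performed with equations 3 and 4 in the hand calculation.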
