Factorial Designs: Higher-Order and Fractional Designs

Extending Factorials Beyond 2×2 — Week 7

Dr. Samuel B Fernandes

2026-01-01

Learning Objectives

By the end of this lecture, you should be able to:

Extend the 2×2 factorial workflow to higher-order designs (3×3, 2×2×2)
Analyze and interpret interactions with more than two levels or three factors
Recognize three-way interactions and decide when they matter practically
Understand the concept and motivation for fractional factorial designs (\(2^{k-p}\))
Make informed design decisions about factorial complexity, sample size, and power tradeoffs

3×3 Factorial Designs

Expanding to 3 Levels

3×3 Factorial:

Two factors, each with 3 levels
9 treatment combinations (3×3)
More detailed dose-response information

Example: Nitrogen (0, 50, 100 kg/ha) × Irrigation (Low, Medium, High)

Treatment combinations table:

Irrigation	N = 0	N = 50	N = 100
Low	L0	L50	L100
Medium	M0	M50	M100
High	H0	H50	H100

When to prefer 3 levels: Use 3 levels when you need to detect a dose-response curve or find an optimal rate (fertilizer, herbicide). Two levels only test high vs. low; three levels reveal whether the response is linear or quadratic.

ANOVA for 3×3 Factorial

ANOVA structure:

\[\text{SS}_{\text{Total}} = \text{SS}_A + \text{SS}_B + \text{SS}_{A \times B} + \text{SS}_{\text{Error}}\]

Degrees of freedom:

Factor A (3 levels): \(a - 1 = 2\) df
Factor B (3 levels): \(b - 1 = 2\) df
Interaction A×B: \((a-1)(b-1) = 4\) df
Error: \(ab(r-1) = 9(r-1)\) df

ANOVA table for 3×3 with 8 reps:

Source	df	MS	F	p
Nitrogen	2	\(MS_N\)	\(F_N\)	\(p_N\)
Irrigation	2	\(MS_I\)	\(F_I\)	\(p_I\)
N × I	4	\(MS_{N \times I}\)	\(F_{N \times I}\)	\(p_{N \times I}\)
Error	63	\(MS_E\)	—	—
Total	71	—	—	—

Example: 3×3 Nitrogen × Irrigation

Code

library(dplyr)
library(ggplot2)
library(car)
library(emmeans)

# Simulate 3×3 factorial data
set.seed(2026)
data_3x3 <- expand.grid(
  Nitrogen   = factor(c("0", "50", "100"), levels = c("0", "50", "100")),
  Irrigation = factor(c("Low", "Medium", "High"), levels = c("Low", "Medium", "High")),
  Rep        = 1:8
) |>
  mutate(
    mu = case_when(
      Nitrogen == "0"   & Irrigation == "Low"    ~ 2800,
      Nitrogen == "50"  & Irrigation == "Low"    ~ 2900,
      Nitrogen == "100" & Irrigation == "Low"    ~ 3000,
      Nitrogen == "0"   & Irrigation == "Medium" ~ 3200,
      Nitrogen == "50"  & Irrigation == "Medium" ~ 3500,
      Nitrogen == "100" & Irrigation == "Medium" ~ 3800,
      Nitrogen == "0"   & Irrigation == "High"   ~ 3500,
      Nitrogen == "50"  & Irrigation == "High"   ~ 4200,
      Nitrogen == "100" & Irrigation == "High"   ~ 4800
    ),
    Yield = mu + rnorm(n(), mean = 0, sd = 150)
  )

# Treatment mean table
data_3x3 |>
  group_by(Nitrogen, Irrigation) |>
  summarise(Mean_Yield = round(mean(Yield), 0), .groups = "drop") |>
  tidyr::pivot_wider(names_from = Irrigation, values_from = Mean_Yield)

#> # A tibble: 3 × 4
#>   Nitrogen   Low Medium  High
#>   <fct>    <dbl>  <dbl> <dbl>
#> 1 0         2807   3129  3492
#> 2 50        2857   3471  4233
#> 3 100       2990   3683  4850

Visualize 3×3 Interaction

Code

summary_3x3 <- data_3x3 |>
  group_by(Nitrogen, Irrigation) |>
  summarise(mean_yield = mean(Yield), se_yield = sd(Yield) / sqrt(n()), .groups = "drop")

ggplot(summary_3x3, aes(x = Nitrogen, y = mean_yield, color = Irrigation, group = Irrigation)) +
  geom_line(linewidth = 2) +
  geom_point(size = 6) +
  geom_errorbar(aes(ymin = mean_yield - se_yield, ymax = mean_yield + se_yield),
                width = 0.15, linewidth = 1) +
  labs(title = "3×3 Factorial: Nitrogen Response Increases with Irrigation",
       subtitle = "Non-parallel lines indicate interaction",
       x = "Nitrogen Level (kg/ha)", y = "Corn Yield (kg/ha)", color = "Irrigation") +
  scale_color_manual(values = c("Low" = "#9D2235", "Medium" = "#FF8C42", "High" = "#2E8B57")) +
  scale_y_continuous(limits = c(2500, 5500), breaks = seq(2500, 5500, 500)) +
  theme_minimal(base_size = 18) +
  theme(plot.title = element_text(color = "#2E8B57", face = "bold", size = 20),
        plot.subtitle = element_text(size = 16), legend.position = "right")

Figure 1: Interaction plot: Nitrogen × Irrigation (3×3 factorial)

Observation: Lines are not parallel — slopes steepen from Low → Medium → High irrigation. The N × Irrigation interaction is strong: nitrogen investment pays off most under high water availability.

ANOVA: 3×3 Factorial

Code

# Sum-to-zero contrasts required for correct Type III SS
options(contrasts = c("contr.sum", "contr.poly"))
model_3x3 <- lm(Yield ~ Nitrogen * Irrigation, data = data_3x3)
Anova(model_3x3, type = "III")

#> Anova Table (Type III tests)
#> 
#> Response: Yield
#>                        Sum Sq Df   F value    Pr(>F)    
#> (Intercept)         882639207  1 46405.764 < 2.2e-16 ***
#> Nitrogen              5860899  2   154.072 < 2.2e-16 ***
#> Irrigation           20691962  2   543.952 < 2.2e-16 ***
#> Nitrogen:Irrigation   2922096  4    38.408 2.965e-16 ***
#> Residuals             1198262 63                        
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

# Restore default contrasts
options(contrasts = c("contr.treatment", "contr.poly"))

ANOVA results: The N × Irrigation interaction is significant (p < 0.001) → do not interpret main effects in isolation. Conduct simple effects analysis.

Simple Effects: 3×3 Analysis

Code

# Simple effects: nitrogen comparisons within each irrigation level
emm_3x3 <- emmeans(model_3x3, ~ Nitrogen | Irrigation)
pairs(emm_3x3, adjust = "tukey")

#> Irrigation = Low:
#>  contrast                 estimate SE df t.ratio p.value
#>  Nitrogen0 - Nitrogen50      -49.9 69 63  -0.723  0.7509
#>  Nitrogen0 - Nitrogen100    -183.0 69 63  -2.654  0.0268
#>  Nitrogen50 - Nitrogen100   -133.2 69 63  -1.931  0.1383
#> 
#> Irrigation = Medium:
#>  contrast                 estimate SE df t.ratio p.value
#>  Nitrogen0 - Nitrogen50     -341.7 69 63  -4.955  <.0001
#>  Nitrogen0 - Nitrogen100    -553.9 69 63  -8.032  <.0001
#>  Nitrogen50 - Nitrogen100   -212.2 69 63  -3.077  0.0086
#> 
#> Irrigation = High:
#>  contrast                 estimate SE df t.ratio p.value
#>  Nitrogen0 - Nitrogen50     -740.5 69 63 -10.739  <.0001
#>  Nitrogen0 - Nitrogen100   -1357.4 69 63 -19.684  <.0001
#>  Nitrogen50 - Nitrogen100   -616.9 69 63  -8.946  <.0001
#> 
#> P value adjustment: tukey method for comparing a family of 3 estimates

Summary of nitrogen effect by irrigation level:

Irrigation	N effect (0 → 100 kg/ha)	Practical recommendation
Low	~+200 kg/ha (modest)	Marginal ROI
Medium	~+600 kg/ha (moderate)	Worthwhile
High	~+1300 kg/ha (large)	Strongly recommended

Conclusion: Do not invest in high N rates under low irrigation — water availability limits the crop’s ability to respond to nitrogen.

2×2×2 Factorial Designs

Adding a Third Factor

2×2×2 Factorial:

Three factors, each with 2 levels
\(2^3 =\) 8 treatment combinations
Tests 7 effects: 3 main effects + 3 two-way interactions + 1 three-way interaction

Example: Herbicide timing (PRE, POST) × Adjuvant (None, NIS) × Crop Residue (Low, High)

Biological rationale:

PRE herbicides must reach the soil — residue can intercept them
NIS enhances foliar uptake — relevant mainly for POST applications
H×R interaction is biologically expected

Treatment combinations:

Herbicide	Adjuvant	Residue
PRE	None	Low
PRE	None	High
PRE	NIS	Low
PRE	NIS	High
POST	None	Low
POST	None	High
POST	NIS	Low
POST	NIS	High

ANOVA for 2×2×2 Factorial

7 sources of variation:

Main effect: Herbicide timing (H) — 1 df
Main effect: Adjuvant (A) — 1 df
Main effect: Residue (R) — 1 df
Two-way: H×A — 1 df
Two-way: H×R — 1 df
Two-way: A×R — 1 df
Three-way: H×A×R — 1 df
Error — 40 df (8 combos × 6 reps − 8)

ANOVA table (2×2×2, 6 reps, N = 48):

Source	df	F	p
H	1	\(F_H\)	\(p_H\)
A	1	\(F_A\)	\(p_A\)
R	1	\(F_R\)	\(p_R\)
H×A	1	\(F_{HA}\)	\(p_{HA}\)
H×R	1	\(F_{HR}\)	\(p_{HR}\)
A×R	1	\(F_{AR}\)	\(p_{AR}\)
H×A×R	1	\(F_{HAR}\)	\(p_{HAR}\)
Error	40	—	—

Interpretation rule: Always test the highest-order interaction first. If H×A×R is significant, the two-way interactions cannot be interpreted independently.

Three-Way Interaction: What Does It Mean?

Three-way interaction (H×A×R):

The Herbicide × Adjuvant two-way interaction changes depending on the level of Crop Residue.

Under Low Residue:

PRE: ~78% control (adjuvant irrelevant — soil-applied)
POST: ~72% without NIS → ~90% with NIS
H×A interaction is moderate

Under High Residue:

PRE: drops to ~55% (residue intercepts herbicide)
POST: ~70% without NIS → ~92% with NIS
H×A interaction is larger (PRE is now strongly penalized)

Good news: Three-way interactions are rare in practice. When they occur, use faceted interaction plots — one panel per level of the third factor — to visualize and interpret.

Example: 2×2×2 Herbicide Factorial

Code

# Simulate 2×2×2 herbicide factorial — response: % weed control
set.seed(2026)
data_2x2x2 <- expand.grid(
  Herbicide = factor(c("PRE", "POST"), levels = c("PRE", "POST")),
  Adjuvant  = factor(c("None", "NIS"), levels = c("None", "NIS")),
  Residue   = factor(c("Low", "High"), levels = c("Low", "High")),
  Rep       = 1:6
) |>
  mutate(
    mu = case_when(
      Herbicide == "PRE"  & Adjuvant == "None" & Residue == "Low"  ~ 78,
      Herbicide == "PRE"  & Adjuvant == "NIS"  & Residue == "Low"  ~ 80,
      Herbicide == "PRE"  & Adjuvant == "None" & Residue == "High" ~ 55,
      Herbicide == "PRE"  & Adjuvant == "NIS"  & Residue == "High" ~ 58,
      Herbicide == "POST" & Adjuvant == "None" & Residue == "Low"  ~ 72,
      Herbicide == "POST" & Adjuvant == "NIS"  & Residue == "Low"  ~ 90,
      Herbicide == "POST" & Adjuvant == "None" & Residue == "High" ~ 70,
      Herbicide == "POST" & Adjuvant == "NIS"  & Residue == "High" ~ 92
    ),
    Weed_Control = mu + rnorm(n(), mean = 0, sd = 5)
  )

data_2x2x2 |>
  group_by(Herbicide, Adjuvant, Residue) |>
  summarise(Mean_Control = round(mean(Weed_Control), 1), .groups = "drop")

#> # A tibble: 8 × 4
#>   Herbicide Adjuvant Residue Mean_Control
#>   <fct>     <fct>    <fct>          <dbl>
#> 1 PRE       None     Low             77.9
#> 2 PRE       None     High            54.3
#> 3 PRE       NIS      Low             79.7
#> 4 PRE       NIS      High            54.7
#> 5 POST      None     Low             73.4
#> 6 POST      None     High            69.5
#> 7 POST      NIS      Low             91.2
#> 8 POST      NIS      High            92.6

Visualize 2×2×2: Faceted Interaction Plot

Code

data_2x2x2 |>
  group_by(Herbicide, Adjuvant, Residue) |>
  summarise(mean_control = mean(Weed_Control), .groups = "drop") |>
  ggplot(aes(x = Adjuvant, y = mean_control, color = Herbicide, group = Herbicide)) +
  geom_line(linewidth = 1.8) + geom_point(size = 5) +
  facet_wrap(~Residue, labeller = label_both) +
  labs(title = "2×2×2 Herbicide Factorial: Timing × Adjuvant by Residue",
       subtitle = "NIS boosts POST control; PRE is suppressed by high residue",
       x = "Adjuvant", y = "Weed Control (%)", color = "Herbicide\nTiming") +
  scale_color_manual(values = c("PRE" = "#9D2235", "POST" = "#2E8B57")) +
  scale_y_continuous(limits = c(40, 100), breaks = seq(40, 100, 10)) +
  theme_minimal(base_size = 18) +
  theme(plot.title = element_text(color = "#2E8B57", face = "bold", size = 18),
        strip.text = element_text(size = 16, face = "bold"), legend.position = "right")

Figure 2: Herbicide × Adjuvant interaction by Residue level

Key message: PRE drops to ~55% under high residue (soil interception). POST + NIS maintains ~92% regardless of residue. The H×R interaction drives the recommendation: use POST + NIS in no-till (high-residue) systems.

ANOVA: 2×2×2 Herbicide Factorial

Code

# Sum-to-zero contrasts required for correct Type III SS
options(contrasts = c("contr.sum", "contr.poly"))
model_2x2x2 <- lm(Weed_Control ~ Herbicide * Adjuvant * Residue, data = data_2x2x2)
Anova(model_2x2x2, type = "III")

#> Anova Table (Type III tests)
#> 
#> Response: Weed_Control
#>                            Sum Sq Df    F value    Pr(>F)    
#> (Intercept)                264004  1 10005.8416 < 2.2e-16 ***
#> Herbicide                    2698  1   102.2382 1.402e-12 ***
#> Adjuvant                     1395  1    52.8717 7.880e-09 ***
#> Residue                      1954  1    74.0735 1.214e-10 ***
#> Herbicide:Adjuvant           1124  1    42.6108 8.574e-08 ***
#> Herbicide:Residue            1589  1    60.2305 1.672e-09 ***
#> Adjuvant:Residue               12  1     0.4575    0.5027    
#> Herbicide:Adjuvant:Residue     34  1     1.3003    0.2609    
#> Residuals                    1055 40                         
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Code

# Restore default contrasts
options(contrasts = c("contr.treatment", "contr.poly"))

Interpretation workflow:

Check H×A×R first — if p ≥ 0.05, no three-way interaction → proceed to two-way
Check H×R and H×A — biologically expected to be significant
Agronomic threshold: identify combinations achieving ≥ 85% weed control

Fractional Factorial Designs

Motivation: Too Many Factors!

Problem: Treatment combinations grow exponentially with each added factor.

Factors (k)	Full factorial (\(2^k\))	Total plots
2	4	16
3	8	32
4	16	64
5	32	128
6	64	256
7	128	512

Practical constraint: Most field experiments in weed science and agronomy accommodate 50–150 plots. Full factorials with 5+ factors quickly exceed this capacity.

Solution: Test a strategically chosen fraction of the combinations.

Fractional Factorials: The Concept

Fractional factorial (\(2^{k-p}\)):

k factors, test only \(2^{k-p}\) combinations
p = number of times to halve the design
\(2^{5-1}\) = 5 factors, only \(2^4 = 16\) runs (half of 32)
\(2^{6-2}\) = 6 factors, only \(2^4 = 16\) runs (quarter of 64)

Trade-off:

Gain: Feasible plot count
Cost: Some effects are aliased (confounded with higher-order interactions)

Strategy: Screen many factors → identify the 2–3 that matter → follow up with full factorial on those.

Common designs:

Design	Factors	Runs	Fraction
\(2^{4-1}\)	4	8	1/2
\(2^{5-1}\)	5	16	1/2
\(2^{5-2}\)	5	8	1/4
\(2^{6-2}\)	6	16	1/4
\(2^{7-3}\)	7	16	1/8

Aliasing principle: Main effects are not confounded with each other — only with high-order interactions, which are usually negligible in agricultural systems.

Example: Screening Experiment

Scenario: Optimize an integrated weed management (IWM) program — 6 factors to screen:

Factor	Levels
Herbicide timing	PRE, POST
Herbicide rate	1×, 0.5× labeled rate
Cover crop	None, Cereal rye
Row spacing	15”, 30”
Cultivation	None, Inter-row
Adjuvant	None, NIS

Full factorial: \(2^6 = 64\) combinations × 4 reps = 256 plots (infeasible)

Fractional factorial (\(2^{6-2}\)): 16 combinations × 4 reps = 64 plots ✓

Two-phase research strategy:

Phase 1 — Screen: Run \(2^{6-2}\) to identify which factors have large main effects on weed control
Phase 2 — Optimize: Run full \(2^3\) factorial on the 3 most important factors

Fractional Factorial: R Package

FrF2 package generates fractional factorial designs with correct aliasing structure:

library(FrF2)

# Generate 2^(6-2): 16 runs, 6 IWM factors, randomized
design_frac <- FrF2(nruns = 16, nfactors = 6, randomize = TRUE,
                    factor.names = list(Timing = c("PRE", "POST"),
                                        Rate   = c("Half", "Full"),
                                        Cover  = c("None", "Rye"),
                                        Rows   = c("15in", "30in"),
                                        Cult   = c("None", "Inter"),
                                        Adj    = c("None", "NIS")))
summary(design_frac)

What the output shows: - The 16 treatment combinations to run - The alias structure — which effects are confounded

Fractional factorial analysis will not be discussed here check TextBook. The key takeaway: these designs exist and are powerful tools for multi-factor screening in applied weed science research.

Design Decision Framework

Choosing Factorial Structure

How many factors?

2–3 factors: Full factorial usually feasible
4–5 factors: Full factorial if resources allow; otherwise fractional
6+ factors: Fractional factorial for initial screening

How many levels per factor?

2 levels: Compare two conditions (yes/no, low/high)
3 levels: Detect dose-response, find optimum
4+ levels: Detailed curve (many plots)

Are interactions biologically plausible?

Ask before designing:

“Does Factor A change how Factor B works mechanistically?”
If yes → full factorial (need to estimate interactions)
If no → separate experiment may be more efficient

Examples where interactions are expected:

Herbicide rate × adjuvant (foliar uptake)
N fertilizer × irrigation (nutrient uptake needs water)
Cover crop × herbicide timing (additive or substitutive?)

Sample Size Considerations

Total plots = Combinations × Reps:

Design	Combos	Reps	Total
2×2	4	8	32
3×3	9	8	72
2×2×2	8	6	48
3×2×2	12	6	72
3×3×3	27	6	162

Power analysis with pwr:

library(pwr)

# 2×2 factorial: 4 groups, 8 reps per group
# Effect size f = 0.35 (moderate)
pwr.anova.test(k         = 4,
               n         = 8,
               f         = 0.35,
               sig.level = 0.05)

Always run power analysis during design, not after data collection. Use pilot data or published literature to estimate the expected effect size.

Design Decision Checklist

Before finalizing your factorial design, answer:

✓ How many factors? (2–3 ideal; ≥ 4 consider fractional)
✓ How many levels per factor? (2 for presence/absence; 3 for dose-response)
✓ Are interactions biologically plausible? (Justify each two-way)
✓ Total plots feasible? (Combinations × Reps ≤ capacity)
✓ Power adequate? (Run pwr.anova.test() — target ≥ 80%)
✓ CRD or RCBD? (Block if spatial/temporal variation is expected)
✓ Can I fully randomize? (If not → split-plot may be required)
✓ Planned analysis? (If interaction expected: emmeans(~ B | A) for simple effects)

Your Turn: Choose a Design

Activity (3 minutes)

Scenario:

You want to study factors affecting Palmer amaranth control in soybean:

Herbicide program (PRE-only, POST-only, PRE+POST) — 3 levels
Row spacing (15-inch, 30-inch) — 2 levels
Cover crop (None, Cereal rye) — 2 levels

You have space for 72 plots.

Questions:

What factorial structure is this? (e.g., 3×2×2)
How many treatment combinations?
How many reps with 72 plots?
Full factorial or simplify? Why?
What interactions are most biologically relevant for weed management?

Discuss with a partner (2 min), then share answers.

Activity: Solutions

Answers:

3×2×2 factorial (Herbicide × Row spacing × Cover crop)
12 treatment combinations (3 × 2 × 2)
6 reps per combination (72 ÷ 12 = 6)
Full factorial recommended — only 12 combinations; 6 reps provides solid power

Design recommendation: Run full 3×2×2 RCBD with 6 reps. Block by field position to control spatial variability.

5. Biologically relevant interactions:

Herbicide × Cover crop: Cereal rye suppresses early weed cohorts — does a POST-only program become sufficient when combined with rye?
Herbicide × Row spacing: Narrow rows close canopy faster — does this complement or substitute certain herbicide programs?

Herbicide × Cover crop is the most publishable: evidence that cover crops can reduce herbicide dependency has direct IWM application.

Summary

Key Takeaways

3×3 factorials — 2 factors × 3 levels = 9 combinations; useful for dose-response and detecting quadratic trends
2×2×2 factorials — 3 factors × 2 levels = 8 combinations; produces 7 F-tests (3 main + 3 two-way + 1 three-way)
Test highest-order interaction first — if three-way is significant, two-way interactions cannot be interpreted independently
Contrast requirement — always set options(contrasts = c("contr.sum", "contr.poly")) before fitting any model where Anova(..., type = "III") will be used; restore with c("contr.treatment", "contr.poly") afterward
Fractional factorials (\(2^{k-p}\)) — efficient for screening ≥ 4 factors when full factorial is infeasible
Plan sample size before the experiment — total plots = combinations × reps; target ≥ 80% power

Factorial Design Summary Table

Design	Factors	Levels	Combos	Reps	Total plots	Best use
2×2	2	2, 2	4	8	32	Simple comparisons
3×3	2	3, 3	9	8	72	Dose-response
2×3	2	2, 3	6	8	48	Binary + dose
2×2×2	3	2, 2, 2	8	6	48	Three binary factors
3×2×2	3	3, 2, 2	12	6	72	One dose + two binary
3×3×3	3	3, 3, 3	27	6	162	Response surface (large)
\(2^{5-1}\)	5	2 each	16	4	64	Screening (fractional)

Rule of thumb: Keep total plots ≤ 100 when possible. Prioritize designs with strong biological motivation for each factor and interaction included.

Next Lecture: Split-Plot Designs

Coming Up: Split-Plot and Strip-Plot

Motivation:

Factorial designs assume all factors can be fully randomized. But is that realistic?

Irrigation is applied to large field sections — can’t randomize to individual rows
Tillage requires large equipment passes across whole plots
Fumigation zones cover entire blocks

When some factors cannot be randomized at the same scale as others, the design is a split-plot, not a standard factorial.

Key concepts for next lecture:

Whole-plot factor: Applied to large units (hard to randomize)
Subplot factor: Applied within whole plots (easy to randomize)
Two error terms: One for whole-plot, one for subplot comparisons
Analyzing a split-plot as a simple factorial inflates Type I error for whole-plot effects

Prepare: Review the factorial model \(y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}\)

Think about: Which factors in your own research are difficult to randomize at a small scale?

Resources

Required reading:

Oehlert (2010), Ch. 5 (Sections 5.1–5.5) — Higher-order factorials
Oehlert (2010), Ch. 6 (Sections 6.1–6.3) — Fractional factorials

R packages:

car — Anova(..., type = "III") with contr.sum
emmeans — marginal means and simple effects
ggplot2 — interaction and faceted plots
FrF2 — fractional factorial design generation (optional)
pwr — power analysis for factorial designs

Key R patterns:

# Always set before lm() when using Anova type III
options(contrasts = c("contr.sum", "contr.poly"))
model <- lm(response ~ A * B, data = dat)
Anova(model, type = "III")
options(contrasts = c("contr.treatment", "contr.poly"))

# Simple effects (when A×B is significant)
emm <- emmeans(model, ~ B | A)
pairs(emm, adjust = "tukey")

Additional references:

Montgomery (2017), Design and Analysis of Experiments, Ch. 5–8
Box, Hunter & Hunter (2005), Statistics for Experimenters, Ch. 5