AI-Assisted Teaching
Research Plan

A multi-course study on how AI-generated content impacts student engagement, confidence, and learning across disciplines

Spring 2026  •  IRB Approved  •  Recruiting Collaborators

PI: Weihao Qu  •  Co-PI: Ling Zheng  •  Monmouth University

The Vision: Two Tiers of Evidence

TIER 1: DEEP STUDY (CS 205: Data Structures)
┌─────────────────────────────────────────────────────────────┐
│  Treatment Section            Control Section               │
│  (Prof. Weihao)               (Prof. Rolf)                  │
│  AI-generated slides,         Traditional lectures,         │
│  demos, exercises             textbook, whiteboard          │
│                                                             │
│  ✓ Full-semester RCT with control group                     │
│  ✓ Pre-test → 6 biweekly quizzes → Post-survey              │
│  ✓ Shared assessments, identical content                    │
│  ✓ Rigorous causal evidence                                 │
└─────────────────────────────────────────────────────────────┘

TIER 2: BREADTH STUDY (Multiple Courses, Multiple Disciplines)
┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
│  CS 336  │ │  CS 305  │ │  CS 310  │ │  SE 641  │  + more
└──────────┘ └──────────┘ └──────────┘ └──────────┘
┌─────────────────────────────────────────────────────────────┐
│  Each course: professor uses AI-generated content for       │
│  selected topics. One retrospective survey after the unit.  │
│                                                             │
│  ✓ Cross-discipline generalizability                        │
│  ✓ Minimal burden on collaborating faculty                  │
│  ✓ "Across N courses, students consistently reported..."    │
└─────────────────────────────────────────────────────────────┘

Why both tiers?

Tier 1 (CS 205) gives rigorous causal evidence with a control group. Tier 2 gives breadth and generalizability across disciplines. Together, they make a much stronger paper than either tier alone.

Research Questions

RQ1: Learning Outcomes

Does AI-assisted teaching improve student performance on assessments?

Tier 1: Quiz scores, pre/post concept test gains (treatment vs control)

RQ2: Student Experience

How does it affect engagement, confidence, and AI perceptions?

Tier 1: Pre/post surveys
Tier 2: Retrospective before/after surveys

RQ3: Generalizability

Do these effects hold across disciplines and course types?

Tier 2: Cross-course analysis of N courses spanning CS, SE, and beyond

Why RQ3 matters

Most AI-in-education studies cover a single course, and reviewers routinely ask: "Does this generalize?" Data from multiple courses across multiple disciplines answers that question directly, which is a major differentiator.

Tier 1: CS 205 Deep Study

Full-semester controlled experiment — already in progress.

Treatment (Prof. Weihao)

  • AI-generated slides and presentations
  • Interactive demos and exercises
  • AI-powered hints and guidance
  • Custom web app used during lectures

Control (Prof. Rolf)

  • Traditional lectures and whiteboard
  • Standard textbook and materials
  • Same content, same assessments
  • No AI-generated content

Data Collection

Pre-Test (Week 5) [Done]

20 MCQs + 16 Likert items (engagement, self-efficacy, AI perceptions)

6 Biweekly Quizzes (Weeks 7–15) [In Progress]

8 MCQs each, identical across sections, instant feedback

Post-Survey (Week 16)

Same 16 Likert items + 8 AI experience reflection items

Tier 2: Multi-Course Expansion

Professors integrate AI-generated content into selected topics. Students take a paired pre/post survey — one before the AI unit, one after.

┌──────────────────┐     Professor teaches     ┌──────────────────┐
│  PRE-SURVEY      │      with AI content      │  POST-SURVEY     │
│  (~3 min)        │    ┌────────────────┐     │  (~6 min)        │
│                  │    │  AI slides     │     │                  │
│  "Right now,     │───▶│  AI demos      │────▶│  Retrospective   │
│  I feel [1-5]"   │    │  AI exercises  │     │  before/after    │
│                  │    │  2+ weeks      │     │  + AI experience │
│  R1–R6 baseline  │    └────────────────┘     │  + preferences   │
└──────────────────┘                           └──────────────────┘

Why both pre-survey AND retrospective?

The pre-survey captures a true baseline before any AI content. The post-survey still asks students to rate "before" and "after" retrospectively, giving us three data points per student. The gap between the actual pre rating and the retrospective "before" rating measures response shift bias, which is itself a publishable methodological finding.

What Collaborators Need to Do

We keep the burden minimal. We do the heavy lifting.

What WE Do For You

  • Generate your AI content — slides, demos, exercises, explanations tailored to your topics
  • Plan which topics to integrate AI into (we'll work with you)
  • Build the survey — already built, ready to deploy
  • Handle all data collection — anonymous, Google Sheets backend
  • Run the analysis — we analyze the data and write the paper
  • IRB is already approved — covers all collaborating courses

What YOU Do

  • Tell us your topics so we can generate relevant content
  • Give 3 minutes of class time for the pre-survey (before AI unit)
  • Use the AI content in 2+ weeks of your course
  • Give 6 minutes of class time for the post-survey (after AI unit)

That's it. ~9 minutes of class time + ~30 minutes of your time total.

What You GET

  • Free AI-generated teaching materials customized for your course
  • Co-authorship on the research paper
  • Student feedback data on how AI content was received
  • A ready-made research contribution for your portfolio

Your Step-by-Step To-Do List

If you'd like to join, here's exactly what happens:

Before Using AI Content

Step 1: Tell us your topics
Pick 2+ topics (or chapters/weeks) where you'd like AI-generated content. Email Weihao or meet for 15 min.
Step 2: We generate your materials
We create AI-generated slides, demos, exercises, or explanations for your chosen topics. You review and give feedback. Takes us 1–2 weeks.
Step 3: You review & approve
Quick check: does the content match what you'd teach? Any corrections? We revise as needed.
Step 4: Administer the PRE-SURVEY
Before starting the AI unit, give students 3 minutes in class to take the pre-survey. This captures their baseline.

During & After

Step 5: Use the AI content in class
Teach your chosen topics using the AI-generated materials. Use them however works best — lectures, labs, homework.
Step 6: Administer the POST-SURVEY
After the AI unit, give students 6 minutes in class to take the post-survey. Just share the link — everything is automated.
Step 7: Done!
We handle data analysis and paper writing. You'll review the draft and be listed as a co-author.

Survey links (ready to use)

Pre-survey: https://weihaoqu.github.io/cs205aieducation/survey-pre.html

Post-survey: https://weihaoqu.github.io/cs205aieducation/survey-general.html

Students select their course from a dropdown. Responses are paired by anonymous ID.

The Survey Instruments (Tier 2)

Two paired surveys — a short pre-survey before the AI unit, and a full post-survey after.

Pre-Survey (~3 min)

IRB consent: Same consent screen
Anonymous ID + course: Links to post-survey
R1–R6 baseline: 6 Likert items (engagement, motivation, confidence), rated right now

Post-Survey (~6 min)

Part 1 (Context): What AI content was used, how long
Part 2 (Before vs. After): Same R1–R6 rated retrospectively (before + after)
Part 3 (AI Experience): 8 items on usage, helpfulness, understanding
Part 4 (Preferences): 4 items on future AI courses, trust

Three Data Points Per Student

Example: actual pre (R1=3) → retrospective "before" (R1=2) → retrospective "after" (R1=4). The gap between the actual pre rating and the retrospective "before" rating measures response shift bias.
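
The three ratings can be turned into simple difference scores. The sketch below uses the example values from this slide (3, 2, 4). The function names and the sign convention (retrospective "before" minus actual pre) are assumptions made for illustration, not the study's fixed analysis plan.

```python
def response_shift(actual_pre: int, retro_before: int) -> int:
    """Retrospective 'before' minus actual pre (assumed sign convention)."""
    return retro_before - actual_pre


def retrospective_gain(retro_before: int, retro_after: int) -> int:
    """Change as the student sees it when looking back after the unit."""
    return retro_after - retro_before


def actual_gain(actual_pre: int, retro_after: int) -> int:
    """Change measured against the rating given before the unit."""
    return retro_after - actual_pre


# Example values for item R1, taken from the slide text.
actual_pre, retro_before, retro_after = 3, 2, 4

shift = response_shift(actual_pre, retro_before)            # -1
retro_gain = retrospective_gain(retro_before, retro_after)  # 2
act_gain = actual_gain(actual_pre, retro_after)             # 1

print(shift, retro_gain, act_gain)
```

Under this convention a negative shift means the student, looking back, rates their earlier self lower than they did at the time.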

Anonymous & Paired

Students use the same anonymous ID (last 4 phone digits + birth month) on both surveys. This pairs responses without using names.
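
As a rough illustration of how such pairing could work, the sketch below builds an ID from the last 4 phone digits and the birth month, then matches pre- and post-survey rows on it. The function names, the 6-character ID layout, and the row fields (course, anon_id, R1, R1_before, R1_after) are assumptions made for this example; the deployed surveys may format and store IDs differently.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def make_anonymous_id(phone_last4: str, birth_month: int) -> str:
    """Build an ID such as '123407' (hypothetical layout: 4 digits + 2-digit month)."""
    if len(phone_last4) != 4 or not phone_last4.isdigit():
        raise ValueError("phone_last4 must be exactly 4 digits")
    if not 1 <= birth_month <= 12:
        raise ValueError("birth_month must be between 1 and 12")
    return f"{phone_last4}{birth_month:02d}"


def pair_responses(pre_rows: List[Row], post_rows: List[Row]) -> List[Tuple[Row, Row]]:
    """Match rows on (course, anon_id); rows without a partner are left out."""
    pre_by_key = {(r["course"], r["anon_id"]): r for r in pre_rows}
    pairs = []
    for post in post_rows:
        key = (post["course"], post["anon_id"])
        if key in pre_by_key:
            pairs.append((pre_by_key[key], post))
    return pairs


# Made-up rows for illustration only; these are not study data.
pre = [
    {"course": "CS 336", "anon_id": make_anonymous_id("1234", 7), "R1": 3},
    {"course": "CS 305", "anon_id": make_anonymous_id("9876", 11), "R1": 4},
]
post = [
    {"course": "CS 336", "anon_id": "123407", "R1_before": 2, "R1_after": 4},
]

pairs = pair_responses(pre, post)
print(len(pairs))  # one matched student
```

Matching on course as well as ID is a design choice in this sketch: four digits plus a month is not unique across a large population, so restricting matches to the same course lowers the chance of pairing two different students.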

How We'll Analyze the Data

Tier 1: CS 205 (Controlled)

Treatment vs control scores: Mann-Whitney U + Cohen's d
Controlling for baseline: ANCOVA (pre-test covariate)
Learning gains over time: Repeated measures (6 quizzes)
Survey attitude shifts: Paired t-test / Wilcoxon
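
To make the first comparison concrete, here is a minimal standard-library sketch of the Mann-Whitney U statistic and Cohen's d (pooled standard deviation, one common definition). The quiz scores are made-up toy values, not study data, and the function names are hypothetical. A real analysis would more likely use a statistics package such as SciPy (scipy.stats.mannwhitneyu), which also reports a p-value.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def mann_whitney_u(treatment: Sequence[float], control: Sequence[float]) -> float:
    """U for the treatment group: pairs where treatment > control, ties count 0.5."""
    return sum(
        1.0 if t > c else 0.5 if t == c else 0.0
        for t in treatment
        for c in control
    )


def cohens_d(treatment: Sequence[float], control: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Toy quiz scores out of 8 (illustrative only).
treatment_scores = [6, 7, 7, 8, 5]
control_scores = [5, 6, 4, 6, 7]

u_stat = mann_whitney_u(treatment_scores, control_scores)
d = cohens_d(treatment_scores, control_scores)
print(u_stat, round(d, 3))  # 18.5 out of a maximum of 25, d about 0.877
```

U counts how often a treatment score beats a control score, so a value well above half the maximum (here 12.5) points toward higher treatment scores. With samples this small nothing can be concluded; the example only shows the mechanics.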

Tier 2: Multi-Course (Breadth)

Before vs after (per student): Paired Wilcoxon signed-rank
Cross-course patterns: Mixed-effects model (course as random effect)
AI type → engagement: Regression (slides vs demos vs exercises)
Overall themes: Descriptive statistics + effect sizes
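
For the per-student before/after comparison, the sketch below computes the two rank sums behind the Wilcoxon signed-rank test using only the standard library. The ratings are made-up toy values, not study data, and the function name is hypothetical. Zero differences are dropped and tied absolute differences share an average rank, which is one common convention. A library routine such as scipy.stats.wilcoxon would normally be used to obtain a p-value.

```python
from typing import Sequence, Tuple


def wilcoxon_rank_sums(before: Sequence[int], after: Sequence[int]) -> Tuple[float, float]:
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    # Drop pairs with no change, then order the rest by absolute size.
    diffs = sorted((a - b for b, a in zip(before, after) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        while j < len(diffs) and abs(diffs[j]) == abs(diffs[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        i = j
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Toy Likert ratings (1-5) for six students (illustrative only).
before_ratings = [2, 3, 3, 2, 4, 3]
after_ratings = [4, 4, 3, 3, 3, 5]

w_plus, w_minus = wilcoxon_rank_sums(before_ratings, after_ratings)
print(w_plus, w_minus)  # 13.0 2.0
```

A large W_plus relative to W_minus indicates that most students rated "after" higher than "before". The test statistic usually reported is the smaller of the two sums.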

The story we can tell

Tier 1: "In a controlled experiment, AI-assisted teaching improved [scores/engagement/confidence] with effect size d = X."
Tier 2: "This finding generalizes across N courses in M disciplines, where students consistently reported higher engagement after AI-integrated units (placeholder values: average before 2.8 → after 3.9)."

Timeline: Where We Are Now

Spring 2026

Week 5 (Feb) [Done]
  • Pre-test & pre-survey administered (CS 205)
Weeks 7–15 (Mar–May) [Now]
  • 6 biweekly quizzes (CS 205)
  • Recruiting Tier 2 collaborators
  • Generating AI content for partner courses
Weeks 12–15
  • Tier 2 courses administer retrospective survey
Week 16 (May)
  • CS 205 post-survey

Summer–Fall 2026

May–June
  • Data cleaning & analysis
July
  • Draft paper with all collaborators
August [Target]
  • Submit to SIGCSE 2027 or similar venue

For Tier 2 collaborators

If you join now, we generate your AI content in the next 2–3 weeks, you use it, and students take the two short surveys (3 minutes before the unit, 6 minutes after) before the semester ends. That's it.

What's Already Built & Ready

CS 205 Instruments

  • Pre-test (20 MCQs + survey)
  • 6 quizzes (48 questions)
  • Post-survey (16 + 8 items)
  • All deployed on GitHub Pages
  • Google Sheets data collection

Live & collecting data

Paired Surveys

  • Pre-survey (6 baseline items, ~3 min)
  • Post-survey (retrospective + AI experience, ~6 min)
  • IRB consent on both
  • Course selector (CS 336, CS 305, CS 310, SE 641, BF 422, CS 414)
  • Anonymous ID pairing across surveys

Both deployed

Infrastructure

  • IRB approval covers all courses
  • Anonymous ID system
  • Google Sheets backend
  • AI content generation pipeline (Claude Code)
  • Slide/demo generation tools

All set up

Everything is ready — we just need your topics

Tell us which topics you'd like AI content for, and we'll generate slides, demos, and exercises within days. Both surveys are already deployed. You just use the materials and give about 9 minutes of class time (3 for the pre-survey, 6 for the post-survey).

Why This Study Stands Out

Most "AI in education" studies have major weaknesses. Here's how ours avoids them:

Common Weakness | How Our Study Avoids It
"We just let students use ChatGPT" | Purpose-built AI-generated content tailored to each course
Single course, single discipline | Multi-course, multi-discipline (CS, SE, and more)
No control group | CS 205 has a true control section with shared assessments
Only test scores, no student voice | Scores + surveys + retrospective reflections across courses
"Does this generalize?" | Tier 2 directly answers this with cross-course data
No baseline measurement | CS 205: pre/post tests. Others: paired pre-survey + retrospective post-survey

The narrative for reviewers

"We conducted a controlled experiment in one course and replicated the survey across N additional courses spanning M disciplines. In the controlled study, AI-assisted teaching improved [X]. Across all courses, students consistently reported [Y]. This multi-tier design provides both causal rigor and cross-discipline generalizability."

Join the Study

We're looking for professors who are interested in using AI-generated content in their teaching this semester.

~30 min of your time total

9 min of class time (3 min pre + 6 min post)

2+ weeks of AI content in your course

What you get

Free AI teaching materials + co-authorship + student feedback data

Contact

Weihao Qu
wqu@monmouth.edu
732-263-5396