A Classroom Project for Generation Z

The Guidelines for Assessment and Instruction in Statistics Education (GAISE): College Report recommends using real data in introductory statistics courses to provide context and authenticity to research and analysis, as well as incorporating projects to assess statistical thinking. While individually tailored projects with real data would be ideal to truly motivate and engage students, instructors may not have the time or resources to manage the scope of that endeavor.

In the age of selfies, classroom-generated data can provide a happy marriage of a manageable and flexible project with real data that caters to Generation Z’s self-awareness.

Academic misconduct can serve as an intriguing topic for classroom-generated data. It is relevant to students’ daily lives, and it affects how fairly their performance is assessed. Moreover, reports of academic misconduct have been prominently featured in the news. For example, Kentucky Senator Rand Paul was reported to have plagiarized part of a speech from Wikipedia in 2013, and in 2014 John Walsh withdrew from Montana’s Senate race when it was revealed that a good portion of his master’s thesis was plagiarized.

Also, collaboration with your institution’s honor council can transform your class project into a service learning opportunity. Of course, we cannot claim that all of Generation Z will have their curiosity piqued by this topic. However, this particular combination of mischievousness and drama can ignite strong passions regarding academic integrity and defensiveness against preconceived notions of who commits academic misconduct. It’s relevant, it’s controversial, and it’s about them.

Getting Started

Ask students what they think constitutes academic misconduct, then show them the corresponding documentation from your institution. For example, Emory College lists more than 30 examples of academic misconduct classified in four areas: writing, collaboration, exams, and other. Consider inviting a representative from your institution’s honor council to your class to participate in the discussion and elaborate on what gaps exist in the council’s knowledge of student behavior. Discuss how academic misconduct can be measured and on what time scale. Last, hypothesize about what sorts of factors may be related to academic misconduct, as well as the direction of each association.

Study Design

The instructor, students, and honor council representative can design a survey together to address their specific questions. The group can discuss the most effective way of collecting the data, such as interviews or online surveys. In a large class with hundreds of students, data can be collected in the form of a convenience sample by requesting that enrolled students participate in the survey. Motivation can be provided by offering extra credit for a specified student response rate.

Alternatively, smaller classes could come up with their own scheme to randomly sample the student body. Assurances should be provided that any admission of violations would not result in sanctions or punishments of any kind, and that the data will be thoroughly de-identified.

As an example, Emory’s survey was administered anonymously via SurveyMonkey to 273 students enrolled in two sections of a first-semester introduction to statistical inference course. It included 46 questions that generated 132 variables, concentrated in the areas of personal information, academic misconduct, and understanding of policies, as well as other information specifically of interest to the honor council. Personal information included demographics (race, gender), academics (major type, GPA), and extracurricular activities (participation in Greek life, hours worked per week).

Questions regarding academic misconduct were phrased as, “For each item listed, please indicate how many times you have engaged in that form of misconduct while in college,” with possible responses of “0 times,” “1 time,” “2–4 times,” and “>4 times.” Understanding of policies asked about awareness of specific violations and typical sanctions.
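Because the four response options are ordered categories rather than exact counts, they usually need to be coded before analysis. The following is a minimal Python sketch of one possible coding, not the scheme used for Emory’s data. The names RANK and LOWER_BOUND, the helper functions, and the example answers are all hypothetical.

```python
# Ordered response categories as worded in the survey question.
RESPONSE_LEVELS = ["0 times", "1 time", "2–4 times", ">4 times"]

# Hypothetical coding choices (not from the article): an ordinal rank for
# each category and the smallest count consistent with it.
RANK = {label: i for i, label in enumerate(RESPONSE_LEVELS)}
LOWER_BOUND = {"0 times": 0, "1 time": 1, "2–4 times": 2, ">4 times": 5}


def ever_committed(response):
    """Return True if the response reports at least one occurrence."""
    return LOWER_BOUND[response] > 0


def min_total_offenses(responses):
    """Smallest total number of offenses consistent with the responses."""
    return sum(LOWER_BOUND[r] for r in responses)


# Made-up answers from one respondent to three misconduct items.
example = ["0 times", "2–4 times", ">4 times"]
print([RANK[r] for r in example])
print(min_total_offenses(example))
```

Using the lower bound of each range gives a conservative count. Students could argue for midpoints instead, which is itself a useful discussion about measurement.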

Data Manipulation

Depending on the nature of the data collection, some data cleaning will likely be required due to implausible values. Students may be interested in categorizing quantitative variables, or reclassifying categorical variables. More interestingly, because academic misconduct can be assessed by multiple variables, students can classify survey respondents as different types of offenders. This can be somewhat straightforward, like classifying respondents as ever having committed misconduct in writing.

This also can be more open-ended and require creativity and critical thinking by having students create their own definition of what constitutes a serious offender. Defining such variables requires students to write logical expressions with “and” and “or” statements, which can be particularly challenging to think through. For example, suppose a student investigates whether the survey respondents have committed any academic misconduct in the realm of collaboration. This is based on four variables:

  • Collaboration_A: Copying any part of an assignment
  • Collaboration_B: Allowing another student to copy any part of an assignment
  • Collaboration_C: Sharing your assignment with another student without the professor’s permission
  • Collaboration_D: Including someone’s name on a project for credit when she didn’t contribute to the work

The simplest approach to defining a variable for whether a respondent has committed any misconduct in collaboration is to check all four variables at once: if Collaboration_A equals “0 times” and Collaboration_B equals “0 times” and Collaboration_C equals “0 times” and Collaboration_D equals “0 times,” then the respondent is classified as “no”; otherwise, the respondent is classified as “yes.”

Students often attempt to use a more complex route by examining combinations using “or” statements, which usually do not correspond to the desired result. It is a good idea to provide benchmark information such as, “You should have 146 survey respondents who have committed at least one form of academic misconduct in collaboration,” so students can check their work. This also will save you a headache when it’s time to grade.
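The logic above can be sketched in a few lines of Python. This is an illustration only: the variable names follow the four collaboration items, but the three respondents, the function names, and the benchmark count of 2 are invented for this toy example and are not taken from the survey data.

```python
# Hypothetical respondents; every response here is invented for illustration.
respondents = [
    {"Collaboration_A": "0 times", "Collaboration_B": "0 times",
     "Collaboration_C": "0 times", "Collaboration_D": "0 times"},
    {"Collaboration_A": "0 times", "Collaboration_B": "2–4 times",
     "Collaboration_C": "0 times", "Collaboration_D": "0 times"},
    {"Collaboration_A": "1 time", "Collaboration_B": "0 times",
     "Collaboration_C": ">4 times", "Collaboration_D": "0 times"},
]

ITEMS = ["Collaboration_A", "Collaboration_B",
         "Collaboration_C", "Collaboration_D"]


def any_collaboration_misconduct(row):
    """'no' only if every item is '0 times'; otherwise 'yes'."""
    all_zero = all(row[item] == "0 times" for item in ITEMS)
    return "no" if all_zero else "yes"


def mistaken_or_version(row):
    """A common slip: 'no' as soon as ANY single item is '0 times'."""
    return "no" if any(row[item] == "0 times" for item in ITEMS) else "yes"


labels = [any_collaboration_misconduct(r) for r in respondents]
mistaken = [mistaken_or_version(r) for r in respondents]
print(labels)
print(mistaken)

# Benchmark check in the spirit of the text: compare the number of "yes"
# respondents with a count supplied by the instructor (2 for this toy data).
EXPECTED_YES = 2
assert labels.count("yes") == EXPECTED_YES
```

In this toy data the mistaken version labels every respondent “no,” because each one has at least one item at “0 times.” A benchmark count exposes that kind of error immediately.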


Analysis

Analysis of the data can be guided by your specifications or the honor council’s interests, or it can be more open-ended by letting students explore relationships between variables of their choosing. Techniques from elementary to advanced can be used for estimation, testing, and modeling.

Estimation can be used to gauge the prevalence of various forms of misconduct. Tests taught in a first-semester statistics course, like the two-sample t-test or chi-squared test, could be used to compare characteristics of offenders and non-offenders. Linear regression can be applied to the number of offenses committed, which tends to be right-skewed. This can further serve as a platform for discussing regression assumptions, variable transformation, and other techniques like Poisson regression. Logistic regression can be used to model the probability of a student committing a particular offense.
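As a rough illustration of the estimation and testing steps, the sketch below computes an approximate 95% interval for a prevalence and a chi-squared test for a 2×2 table using only Python’s standard library. The Wilson score interval is one of several reasonable interval choices and is an assumption here, as are the function names and all counts, which are made up.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value (1 df, no correction)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row = [a + b, c + d]
    col = [a + c, b + d]
    stat = 0.0
    for i, observed_row in enumerate(table):
        for j, observed in enumerate(observed_row):
            expected = row[i] * col[j] / n
            stat += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Made-up example: 40 of 100 respondents report some form of misconduct.
low, high = wilson_interval(40, 100)
print(f"prevalence 0.40, interval ({low:.3f}, {high:.3f})")

# Made-up 2x2 table: rows are two student groups, columns are
# offender / non-offender.
table = [[30, 20], [10, 40]]
stat, p_value = chi_squared_2x2(table)
print(f"chi-squared = {stat:.2f}, p = {p_value:.5f}")
```

In a course that uses statistical software, students would normally call a built-in routine instead. Writing the computation out once can help them see what the test statistic measures.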

Last, students need to check the assumptions of their chosen statistical methods, as they are likely to encounter groups with small sample sizes.
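One way to make this check concrete is to compute the expected cell counts for a table and, when they are small, fall back on an exact method. The sketch below assumes the common rule of thumb that every expected count should be at least 5 and uses Fisher’s exact test as the fallback. Neither choice comes from the article, and the function names and the table are hypothetical.

```python
from math import comb


def expected_counts(table):
    """Expected cell counts under independence for a two-way table."""
    row_totals = [sum(r) for r in table]
    col_totals = [sum(c) for c in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2x2 table."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Made-up small table: a subgroup of only eight respondents.
small_table = [[3, 1], [1, 3]]
expected = expected_counts(small_table)
too_small = any(cell < 5 for row in expected for cell in row)
print(expected, too_small)
if too_small:
    print(f"Fisher exact p = {fisher_exact_two_sided(small_table):.4f}")
```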


Presentation of Results

Data regarding academic misconduct can lend itself easily to a full report in the style of a scientific article, including a literature review. Simpler routes also can be taken, where students are limited to one page to explain their findings. Students also could present results during in-class presentations.

Regardless of the length or format of the assignment, a multistep submission process is recommended to encourage students to work on the assignment over the course of the semester. Waiting until the last minute to do the project would be disastrous with the amount of coding required.

The process could include a project proposal in which students identify their variables of interest and hypothesize about the directions of the associations. Moreover, a draft submission that requires descriptive statistics compels students to begin coding and working with the data long before the due date.

Pedagogical Strengths


Inquiring about academic misconduct has some attractive pedagogical strengths. First, it can be used throughout the semester to discuss topics such as random sampling, biases in observational studies, and implementation of various statistical methods. It can provide students with the opportunity to create complex variables that require creativity and critical thinking. It also can encourage students to think deeply about more nuanced topics, such as the difference between statistical significance and practical significance.

The project can range in difficulty regarding study design, analysis, and presentation of results, which can be specifically tailored to the learning objectives of your curriculum. On a more holistic note, it promotes enhanced student awareness of what constitutes academic misconduct and your institution’s policies. Last, both you and Generation Z will likely be surprised by some of the results.


Ethical Considerations

The data collected for this class project were not intended for research to contribute to the body of literature regarding academic misconduct at higher education institutions, but rather for instructional purposes.

Utmost consideration was given to protecting student identity. For example, categorical demographic variables with fewer than 20 respondents in a given category were either excluded from the data set or reclassified. In gathering student-generated data, instructors should be aware of what necessitates institutional review board (IRB) approval. Depending on the nature of the project, it may be excluded, exempted, or reviewed.
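The small-category rule mentioned above (fewer than 20 respondents) can be sketched as follows. This is a hypothetical illustration rather than the procedure actually applied to the Emory data: the function name, the “Other” label, and the category values are all assumptions.

```python
from collections import Counter

MIN_CELL_SIZE = 20  # threshold stated in the text


def collapse_small_categories(values, min_size=MIN_CELL_SIZE,
                              other_label="Other"):
    """Recode categories with fewer than min_size respondents as other_label."""
    counts = Counter(values)
    small = {cat for cat, k in counts.items() if k < min_size}
    return [other_label if v in small else v for v in values]


# Made-up demographic variable with two sparse categories.
values = ["A"] * 50 + ["B"] * 25 + ["C"] * 12 + ["D"] * 9
recoded = collapse_small_categories(values)
recoded_counts = Counter(recoded)
print(recoded_counts)

# If the combined category is still below the threshold, the variable
# may need to be excluded from the released data set instead.
still_too_small = [cat for cat, k in recoded_counts.items()
                   if k < MIN_CELL_SIZE]
print(still_too_small)
```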

At a minimum, any student survey data collected should involve informed consent. We strongly recommend consulting with your institution’s IRB office before proceeding. As discussed by Sam Wilcock at JSM 2014, while avoiding the IRB process may seem desirable, it could be considered a valuable and integral part of the students’ statistical learning process.

Further Reading

Aliaga, M., G. Cobb, C. Cuff, J. Garfield, B. Gould, R. Lock, T. Moore, A. Rossman, B. Stephenson, J. Utts, P. Velleman, and J. Witmer. 2012. Guidelines for assessment and instruction in statistics education: College report. American Statistical Association: Alexandria, VA.

Blake, Aaron. 2013. Rand Paul’s plagiarism allegations, and why they matter. The Washington Post.

Martin, Jonathan. 2014. Senator quits Montana race after charge of plagiarism. The New York Times.

Emory University: Examples of Academic Dishonesty and Misconduct

About the Authors

Shannon McClintock Pileggi is a lecturer at Emory University. She has worked in the Division of Parasitic Diseases at the Centers for Disease Control and Prevention, and her research interests include spatial disease modeling, statistics pedagogy, and incorporating technology into the classroom. She is also a contributing member of the OpenIntro project.

Mine Çetinkaya-Rundel is an assistant professor of the practice at Duke University. Her research interests include statistics pedagogy, spatial statistics, small-area estimation, and survey and public health data. She is a co-author of OpenIntro Statistics and a contributing member of the OpenIntro project, whose mission is to make educational products that are open-licensed and transparent and that help lower barriers to education.

Dalene Stangl is professor of the practice of statistical science and public policy and associate chair of the department of statistical science at Duke University in North Carolina. She has served in editorial positions for the Journal of the American Statistical Association, The American Statistician, and Bayesian Analysis and has co-edited two books with Donald Berry, Bayesian Biostatistics and Meta-Analysis in Medicine and Health Policy. Her primary interest is promoting Bayesian ideas in the reform of statistics education and statistical practice.

In Taking a Chance in the Classroom, column editors Dalene Stangl, Mine Çetinkaya-Rundel, and Kari Lock Morgan focus on pedagogical approaches to communicating the fundamental ideas of statistical thinking in a classroom using data sets from CHANCE and elsewhere.
