2  Statistics 1

The course Statistics 1 covers the basics of statistics and data analysis. This GitBook contains all relevant information about this course. It is assumed that every student reads it carefully. If you have any questions, first consult this GitBook, then ask a fellow student, and only if your question is still not answered, then contact the course coordinator.

Communication about the course occurs through Canvas (log in with your student ID and password).

2.1 Course Description

In the course, the following techniques will be discussed:

  • Use of SPSS and interpretation of its output
  • Descriptive statistics
  • Normal distribution; standard scores
  • Sampling distributions; Z and t distributions
  • Hypothesis tests and confidence intervals for the mean
  • The power of a statistical test
  • One-way between-subjects analysis of variance (ANOVA)
  • Linear regression analysis

2.2 Learning goals

After taking this course, students will be able to…

  1. compute and interpret commonly used descriptive statistics such as the sample mean, the median, the mode, variance and standard deviation, the standard error, and the correlation coefficient.
  2. recognize different probability distributions such as the normal distribution, and make computations for these probability distributions.
  3. explain the essential aspects of null-hypothesis significance testing, including sampling distributions, Type I and Type II errors, one-tailed versus two-tailed testing, and statistical power.
  4. apply different statistical tests such as the Z-test, the one-sample t-test, the one-way between-subjects analysis of variance, and statistical tests related to (multiple) linear regression analysis with continuous and categorical predictors; and clarify the statistical and/or methodological assumptions that apply to the techniques discussed in this course.
  5. explain basic concepts in regression analysis, including linear association, least-squares estimation, explained variance, Multiple R, multiple correlation, adjusted R-square, raw and standardized regression coefficients, model-comparison tests, predicted scores, residuals, and the assumptions.
  6. choose the appropriate analysis technique for answering a specific research problem from the range of techniques that are covered in the course.
  7. use the software package SPSS to perform several statistical data analyses and be able to correctly interpret and report the output to an informed audience (e.g., liberal arts students, researchers from the social sciences/business and economics/cognitive neuroscience).
  8. draw valid conclusions from the results of empirical data analyses in light of specific research questions.

2.3 Course Schedule

The official course schedule is available on TimeEdit. The information below might go out of date. For a general overview of the content, see below:

2.4 Attendance

Attendance is mandatory. In our experience, students who actively participate tend to pass the course, whereas those who do not tend to drop out or fail. All lectures and practicals build on each other, so if you have to miss one, make sure you have caught up with the material before the next session.

2.5 Study Load

Below is a breakdown of the expected study load:

Activity              Duration (hours)   Times   Total (hours)
Synchronous
  Lectures                   2.0           14          28
  Tutorials                  2.0           14          28
Asynchronous
  Knowledge clips            1.0           14          14
  Formative tests            1.0           14          14
  Reading time               1.0           14          14
  Studying                   1.0            1          37
Assessment type(s)
  Portfolio                 15.0            2          30
  Exam                       1.5            2           3
Total                                                 168
ECTS                                                    6

2.6 Staff

Coordinator:

dr. Caspar J. van Lissa

Lab sessions:

Amirali Rezazadeh

2.7 Teaching Philosophy

  1. Student-paced learning: instead of having traditional lectures where you sit and listen for two hours, you will watch relatively short (~45 minutes) lecture videos to prepare for class. In class, we use the material from these videos to guide discussions, make exam questions, and work on your portfolios.
  2. Challenge-based learning: a substantial part of your grade is based on your ability to apply the techniques you’ve learned to a real research question in several portfolio assignments. You can choose your own research question, can find your own dataset (or use a default dataset), and work on a topic that actually interests you.
  3. Throughout the course, you will be working in small learning teams to promote interaction among students, peer support, and accountability. Learning to work effectively in groups is an important skill; we will focus on group skills in the first lecture.

2.7.1 Why group assignments?

Contact with fellow students is a key aspect of the university experience. We want to stimulate you to engage with the material and with one another. Therefore, the portfolio assignments are made in groups. There are also aspects of learning in groups that can really improve your knowledge, like peer feedback. To ensure that every group member pulls their weight, the final exam tests each student’s individual comprehension of all material covered in the portfolios.

Groups comprise 3-5 members and are assigned randomly when the course starts. However, you may switch places with a consenting member of another group, or join or merge with another small group if your group has become smaller than 3 members. There are three portfolio registration deadlines. Before each of these deadlines, one group member must submit the definitive group composition via a Google form.

2.7.2 Why use portfolio assessment?

Portfolio assignments are well-suited for a skills-based course like Statistics 1. They also take a lot of the pressure off because you can work at your own pace, and keep improving the work until it is good enough. We entrust you with the responsibility of making these portfolio assignments in good faith, without instrumental assistance from outside your group or plagiarism, so I kindly ask you to make good on this trust, and hand in original work to show what you’ve learned.

2.8 Grading

Your grade is based on two components:

  1. A portfolio composed of two assignments made in groups, and
  2. An individual exam, split into two sessions, to test comprehension of the material covered in the portfolios.

A grade of 5.5 or higher is required for both components to pass the course.

The first occasion for the exam is split into two sessions, administered throughout the semester, for the following reasons:

  • To reduce study load by administering small tests shortly after the material is taught
  • To ensure continued engagement with the course
  • To give students feedback on their current level of understanding

While you do receive an informal grade for each session, the final exam grade is calculated from your correct answers across all sessions. If that grade falls below 5.5, you can take a resit, which covers the material of the entire exam (all sessions).

2.8.1 Portfolios 40% (2 x 20%)

You work on the portfolio assignments with your group, both during the lab sessions and outside of class. For each assignment, register your group membership before the set deadline at http://tiny.cc/stats12_portfolio. You hand in your group’s portfolio assignment before the set deadline, at which point it is graded. If your grade is below the passing level of 5.5, your group will have the opportunity to revise the portfolio based on teacher feedback to receive a maximum grade of 6.

Groups should distribute the workload for the portfolio assignments equally. If doubts are raised about the equal distribution of labor in a particular group, the portfolio assignment in question will be supplemented with an individual oral examination and an individual grade, which cannot exceed the original grade for the group assignment. In other words, failing to distribute the work properly cannot have positive effects, but it can have negative effects on your grade. To prevent this, make clear agreements about the distribution of work with your group mates.

2.8.2 Exam 60%

To make sure that all students are equally involved in making the portfolio assignments, an individual exam assesses comprehension of the material covered therein. It is a digital multiple-choice exam, split into two sessions.

2.8.2.1 Exam 1

Covers Week 35 (Introduction to Statistics) up to Week 41 (Hypothesis Testing)

2.8.2.2 Exam 2

Covers Week 43 (General Linear Model (GLM) I: Bivariate regression) up to Week 47 (Open science and questionable research practices)
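To illustrate how the two components combine, here is a minimal sketch in Python. It is not the official grade calculation. It assumes that the portfolio grade is the equally weighted average of the assignment grades (matching "2 x 20%"), that the exam grade is already expressed on the 1-10 scale, and that no rounding is applied. The function name `final_grade` is made up for this illustration.

```python
PASS_MARK = 5.5          # minimum required for BOTH components
PORTFOLIO_WEIGHT = 0.40  # "Portfolios 40%"
EXAM_WEIGHT = 0.60       # "Exam 60%"


def final_grade(assignment_grades, exam_grade):
    """Return (weighted grade, passed) under the assumptions stated above."""
    # Assumption: each assignment counts equally towards the portfolio grade.
    portfolio_grade = sum(assignment_grades) / len(assignment_grades)
    weighted = PORTFOLIO_WEIGHT * portfolio_grade + EXAM_WEIGHT * exam_grade
    # Both components must reach the pass mark, regardless of the weighted grade.
    passed = portfolio_grade >= PASS_MARK and exam_grade >= PASS_MARK
    return weighted, passed


# Hypothetical students, invented for illustration only.
for assignments, exam in [([7.0, 8.0], 6.0), ([9.0, 9.0], 5.0)]:
    grade, passed = final_grade(assignments, exam)
    print(f"assignments={assignments}, exam={exam}: "
          f"weighted={grade:.1f}, passed={passed}")
```

The second hypothetical student shows why both components matter: a strong portfolio cannot compensate for an exam grade below 5.5.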

2.9 Assignments

Below is a description of the assignments. For each assignment, every element labeled with a lower case letter is graded fail (0 points), pass (1 point), or excellent (1.5 points). Points are summed for each assignment and rescaled to a grade from 1 to 10. The final portfolio grade is the average of the rescaled grades across assignments. Note the stated word limit for each section. If you can write a good report with fewer words, that’s fine. If you exceed the word limit, however, your grade for that section cannot exceed a pass (1 point).

The focus of the assignments should be on motivating, reporting, interpreting, and discussing your analyses. You will get a good grade for well-reasoned and well-discussed analyses.

See the Appendices section to access data sources for the assignment.

2.9.1 Assignment 1

Descriptive statistics and statistical inference

  1. Select at least three variables for further analysis, and motivate your selection based on theory, using at least one reference to explain why you are interested in the properties of the selected variables (150 words)
    1. Include one continuous variable
    2. Include one nominal variable
    3. Include one ordinal variable
  2. Describe the dataset (200 words + tables/figures)
    1. Use appropriate univariate descriptive statistics for all variables
    2. Plot data using appropriate plots
    3. Include at least one frequency table or cross-table
  3. For a continuous variable:
    1. Select one or more values with clinical/societal/statistical relevance (i.e., provide some justification for the choice of value)
    2. Using probability calculus, calculate and report the probability of observing values that fall below/between/exceed the chosen value(s)
  4. For a continuous variable:
    1. Formulate a specific null- and alternative hypothesis
    2. Report a one-sample t-test or Z-test for the specific null-hypothesis
    3. Calculate the probability of committing a Type II error
  5. Discuss your analyses (300 words)
    1. Explain your rationale for important modeling decisions
    2. Motivate your choice for the type of statistics and analyses
    3. Discuss assumptions
    4. Discuss what you have learned from it and how you might improve it
  6. Use APA style throughout your report
  7. Reflect on the group process (300 words). Note: I will grade your reflection, not your process. So: if your group’s process is not working well, but you reflect on it properly, you can still get full marks for this component. Use Gibbs’ Reflective Cycle:
    1. Describe what happened during the group work
    2. Explain how you felt during the group work
    3. Look at the good and bad aspects of the group work
    4. What were the obstacles you experienced? What factors contributed to success?
    5. What could you have done differently to improve the situation?
    6. What are your intentions to make the next group assignment work (even) better?
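The calculations asked for in steps 3 and 4 can be sketched as follows. The course itself uses SPSS; this Python sketch (standard library only) is just meant to show the arithmetic. All numbers are hypothetical, the population standard deviation is assumed to be known (so a Z-test applies rather than a t-test), and the test is assumed to be one-tailed with the alternative hypothesis that the mean is larger than the null value. The function names are made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def prob_between(lower, upper, mean, sd):
    """P(lower < X < upper) for a normally distributed variable X."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)


def z_test_upper(sample_mean, mu0, sd, n):
    """One-tailed Z-test of H0: mu = mu0 against H1: mu > mu0 (sd known)."""
    se = sd / sqrt(n)                      # standard error of the mean
    z = (sample_mean - mu0) / se
    p_value = 1 - STANDARD_NORMAL.cdf(z)   # area above the observed z
    return z, p_value


def type_ii_error(mu0, mu1, sd, n, alpha=0.05):
    """P(not rejecting H0 | true mean is mu1) for the upper-tailed Z-test."""
    se = sd / sqrt(n)
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha)  # critical value under H0
    # Shift the critical value by the standardized true effect.
    return STANDARD_NORMAL.cdf(z_crit - (mu1 - mu0) / se)


# Hypothetical example: scores with population mean 100 and SD 15.
print(f"P(85 < X < 115) = {prob_between(85, 115, mean=100, sd=15):.3f}")

z, p = z_test_upper(sample_mean=104, mu0=100, sd=15, n=36)
print(f"z = {z:.2f}, one-tailed p = {p:.3f}")

beta = type_ii_error(mu0=100, mu1=105, sd=15, n=36)
print(f"Type II error = {beta:.3f}, power = {1 - beta:.3f}")
```

Note that the Type II error depends on the specific alternative mean you choose (here the hypothetical value 105), so your report should justify that choice.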

2.9.2 Assignment 2

General linear model

  1. Select at least three variables for further analysis, and using at least one reference, explain what research questions you will investigate and what hypotheses you will test (150 words)
    1. Include one continuous outcome variable
    2. Include one continuous predictor
    3. Include one nominal or ordinal predictor
  2. Construct a model with only the continuous predictor (200 words)
    1. Report and interpret the different sums of squares
    2. Report and interpret the explained variance
    3. Conduct a separate correlation analysis. Compare the results with the regression analysis.
  3. Construct a model with only the categorical predictor (200 words)
    1. Report and interpret the model results
    2. Conduct a separate ANOVA or t-test with the same variables, whichever one is suitable. Compare the results with the regression analysis.
  4. Construct a model with both the continuous and categorical predictor (200 words)
    1. Report and interpret the model results
    2. Conduct and report a nested model test
  5. Discuss your analyses (300 words)
    1. Explain your rationale for important modeling decisions
    2. Motivate your choice for the type of statistics and analyses
    3. Discuss assumptions
    4. Discuss what you have learned from it and how you might improve it
  6. Use APA style throughout your report
  7. Reflect on the group process (300 words). Note: I will grade your reflection, not your process. So: if your group’s process is not working well, but you reflect on it properly, you can still get full marks for this component. Use Gibbs’ Reflective Cycle:
    1. Describe what happened during the group work
    2. Explain how you felt during the group work
    3. Look at the good and bad aspects of the group work
    4. What were the obstacles you experienced? What factors contributed to success?
    5. What could you have done differently to improve the situation?
    6. What are your intentions to make the next group assignment work (even) better?
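For step 2, the relation between the sums of squares, the explained variance, and the correlation can be sketched as follows. The course itself uses SPSS; this Python sketch (standard library only) uses a tiny hypothetical dataset, and the function name `simple_regression` is made up for this illustration. The F statistic shown is the nested model test that compares an intercept-only model with the one-predictor model; its p-value requires the F distribution with (1, n - 2) degrees of freedom, which statistical software reports for you.

```python
from math import sqrt


def simple_regression(x, y):
    """Least-squares regression of y on one continuous predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]

    # Decomposition: SS_total = SS_regression + SS_residual
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)
    ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))

    r = sxy / sqrt(sxx * ss_total)  # Pearson correlation
    # Nested model test: intercept-only model versus one-predictor model.
    f_stat = (ss_regression / 1) / (ss_residual / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "ss_total": ss_total,
        "ss_regression": ss_regression,
        "ss_residual": ss_residual,
        "r_squared": ss_regression / ss_total,
        "r": r,
        "f": f_stat,
    }


# Hypothetical data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
for name, value in simple_regression(x, y).items():
    print(f"{name:>14}: {value:.3f}")
```

With a single predictor, the explained variance equals the squared correlation, which is the comparison that step 2.3 asks you to make.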

2.10 Use of Large Language Models (LLMs)

Honestly: I advise against using ChatGPT and similar LLMs for this course. Here’s why: LLMs are trained on large amounts of text from the internet, including a lot of text posted by people who do not understand statistics. As a result, in my experience, LLMs produce a lot of plausible-sounding nonsense for statistics assignments.

If you’re worried about the quality of your writing: that is not graded here. I’d rather have a simple and clear report in imperfect English than a beautifully written AI-fluff piece full of hallucinated nonsense.

If you decide to use LLMs, it is your responsibility to thoroughly check their output for logical consistency and correctness. You may not yet have the level of expertise required to know when ChatGPT generates irrelevant nonsense - but the teacher who grades your work does. Consider this carefully when deciding what makes more sense: doing your work manually, making sure each step is correct - or outsourcing it to AI, and then checking its work before submitting.