Programming and Statistics for the Aquatic Sciences (2016)

ECTS credits: 4


Course parameters:
Language: English
Level of course: PhD course
Time of year: Q1 2016
No. of contact hours/hours in total incl. preparation, assignment(s) or the like: 21 hrs preparation; 42 contact hrs; 35 hrs assignment
Capacity limits: 18


Objectives of the course:
Hypothesis testing in the aquatic sciences involves developing and analyzing statistical models, often with data that are complex (e.g. non-parametric) and big (e.g. large volume, highly variable and evolving). To meet this challenge, many aquatic scientists are choosing to learn a programming language to handle all aspects of data analysis (exploring, summarizing, analyzing, visualizing), as well as mathematical modelling tasks. In this course, you will learn how to use an open-source programming language (R) for effective data analysis and statistical modelling to address research questions, with an emphasis on those in the aquatic sciences.


Learning outcomes and competences:
At the end of the course, the student should be able to:


  • List motivations for learning a programming language
  • Access online resources for R (particularly those related to the aquatic sciences) and import new function packages into the R workspace
  • Import, review, manipulate and summarize data-sets in R
  • Explore data-sets relevant to research questions to create testable hypotheses and identify appropriate statistical tests
  • Perform appropriate statistical tests using R
  • Communicate model choices and results for publication
  • Create and edit visualizations with R for publication
  • Document analysis and results in a manner that supports transparency and reproducible research


Compulsory programme:
Active participation in all aspects of the course.


Course contents:
The course will be divided into 3 parts:

2.1.) Course preparation: Before the in-class portion of the course begins (see 2.2 below), students will be given basic programming tutorials to get them started with the R programming language.  Students will complete a worksheet submitted before the in-class portion of the course begins as evidence of their preparations and to guide the instructor to topic areas needing review.  In addition, students will be asked if they have a research hypothesis that they intend to analyze for their course project or would like to request one from the instructor (see 2.3. below).

2.2.) In-class: Students will gather for 5 days (~ 6 hours per day) of in-class combined lecture-computer tutorial sessions. Students are expected to bring their own laptop (Mac, PC or Unix-based) to each session (all software is free; links will be provided before the class). Topics covered will include:

  • Day 1
    • Programming Basics (Including a review of pre-class preparation)
    • Data structures (creating, importing and manipulating numeric, character
  • Day 2
    • Research hypothesis to statistical model (identifying response and predictor variables)
    • Probability distributions
    • From research hypothesis to statistical model (choosing a starting model, fitting your model, assessing model fit, model selection, communicating results for publications, using models to make predictions) 
    • Examples of linear modelling/ANOVAs, multiple linear modelling, non-linear modelling, etc.
  • Day 3
    • How data commonly violate assumptions and what to do about it (generalized linear models, generalized additive models, temporal/spatial autocorrelation, mixed models) 
  • Day 4
    • Visualizing data and results for publication (including maps and charts)
  • Day 5
    • Pulling it all together (summarizing research hypothesis strategies, using R to support open science, reproducible research and publication)

In addition, the instructor will meet with each student during the week to clarify, define and strategize their research hypothesis for the course project (see 2.3 below).

2.3.) Course project: After the in-class portion, students will work independently to develop and analyze a statistical model to test a research hypothesis of their choice. Students may also request a research hypothesis and data set from the instructor. Students will submit a report that outlines their hypothesis and statistical model, and presents model assessment and results (i.e. the bulk of a manuscript/thesis chapter’s methods and results sections).


This course is aimed at PhD students but would also be applicable to senior undergraduates, faculty, postdoctoral fellows and researchers. The course expects students to

  • have no previous programming experience (but GREAT if you do!)
  • have some basic statistical knowledge and a desire to learn more
  • be motivated enough to work through the learning curve associated with learning any programming language.

Name of lecturer:
Anna Neuheimer, University of Hawaii


Type of course/teaching methods:
In-class exercises


To be announced


Course homepage:
To be announced


Course assessment:
Course grade will be Credit/No-credit (Pass/Fail) based on participation in i) course preparation exercises (Section 2.1), ii) lecture/lab exercises (Section 2.2), and iii) course project (Section 2.3). 


Department of Bioscience, Aarhus University


Special comments on this course:


2.1 Course preparation (22/8-1/9 2016)
2.2 In class (5, 6, 8, 12 & 13/9)
2.3 Assignment (13-20/9)


Campus Aarhus – details to be announced.


Registration will be via the Aarhus University Web-shop (for AU students) or by direct application to Peter Grønkjær for all others. Direct applications to Peter Grønkjær should include a 1-page CV.

Deadline for registration is Monday, 20 June 2016. Information regarding admission will be sent out no later than Friday, 1 July 2016.

If you have any questions, please contact Peter Grønkjær.

Comments on content: 
Revised 20.06.2016