
Statistical models for genomic prediction in animals and plants (2015)

Name of course: Statistical models for genomic prediction in animals and plants


ECTS credits: 5


Course parameters:

Language: English

Level of course: PhD

Time of year: Q4 (15-19 June 2015)

No. of contact hours/hours in total incl. preparation, assignment(s) or the like: 35 / 80.

Capacity limits: 25


Objectives of the course:

The course covers the quantitative genetic and statistical background of different genomic prediction models, including estimation of variance components, theory on genomic heritabilities, Bayesian statistics, estimation of hyperparameters in Bayesian models, multitrait models and simple genomic feature models. All models are practised in computer practicals, with the objective that students understand the statistical principles of the different models and can analyse data with a critical assessment of the results from different statistical approaches.

 

Learning outcomes and competences:

At the end of the course, the student should be able to:

- analyze the statistical problems arising in settings with large sets of predictors

- structure and explain strengths and weaknesses of various statistical and computational tools to build prediction models from high-dimensional data

- apply popular software tools for mixed models, ridge regression, LASSO and Bayesian MCMC methods

- perform cross-validation studies and assess predictive ability of models by prediction correlation

- explain and evaluate consequences of the data and population factors affecting predictive ability

- apply prediction tools in an empirical data set
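As an illustration of the cross-validation outcome listed above, the following is a minimal sketch, not course material. It runs k-fold cross-validation for a ridge regression on simulated genotypes and reports predictive ability as the correlation between observed and predicted phenotypes. The simulated data, the function names `ridge_fit` and `kfold_predictive_ability`, and the choice of the penalty `lam` as the ratio of residual variance to SNP-effect variance are all assumptions made for this example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression with an unpenalised intercept (hypothetical helper)."""
    x_mean = X.mean(axis=0)
    mu = y.mean()
    Xc = X - x_mean  # centred predictors, so the intercept is the mean of y
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - mu))
    return mu, x_mean, beta

def kfold_predictive_ability(X, y, lam, k=5, seed=0):
    """Correlation between observed y and out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    y_hat = np.empty(len(y))
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Fit on the training folds only, then predict the held-out fold
        mu, x_mean, beta = ridge_fit(X[train], y[train], lam)
        y_hat[test_idx] = mu + (X[test_idx] - x_mean) @ beta
    return float(np.corrcoef(y, y_hat)[0, 1])

# Simulated data (assumption): 300 individuals, 200 SNPs coded 0/1/2
rng = np.random.default_rng(42)
n, p = 300, 200
freq = rng.uniform(0.1, 0.9, size=p)
X = rng.binomial(2, freq, size=(n, p)).astype(float)
b_true = rng.normal(0.0, 1.0, size=p)       # SNP effects with variance 1
g = X @ b_true                              # true genetic values
e = rng.normal(0.0, np.sqrt(0.25 * g.var()), size=n)  # heritability about 0.8
y = g + e

lam = 0.25 * g.var() / 1.0  # residual variance / SNP-effect variance
r = kfold_predictive_ability(X, y, lam, k=5)
print(f"5-fold predictive correlation: {r:.2f}")
```

In real data the variance ratio is not known and would have to be estimated, and random folds may overstate predictive ability when close relatives are split across training and test sets.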


Compulsory programme:

- a set of key papers (approx. 10) is distributed, which students are expected to study as preparation

- students should actively participate in the computer practicals. After each practical, 3 students will be asked to present their results, which will be discussed with the other students

- students should hand in a course report (deadline 2 weeks after the last class) that extends the computer practicals, for instance by adding further detail or more comparisons of methods. The report should be about 6 pages long and include a short description of the data used, the reasoning behind the choice of methods, results in figures or tables, and a discussion and interpretation of the results.


Course contents:

Teaching sessions are scheduled for 5 days:

  • Day 1: background on genomic prediction and genomic selection and relevance of generation interval and accuracy in breeding programs; comparison to classical approaches (QTL mapping, MAS); simple approaches using GWAS results and polygenic risk scores. Simple mixed model (SNP-BLUP aka rrBLUP) for whole-genome prediction.
  • Day 2: cross-validation schemes: split data, x-fold, leave-one-out and cross-validation across families; bias of predictions. Building of the G-matrix and the GBLUP model; comparison of SNP-BLUP and GBLUP variance components and predictions. Multitrait GBLUP and G-REML.
  • Day 3: Different scaling methods for G-matrices (VanRaden methods 1, 2, 3 and 4), scaling and interpretation of relationships and inbreeding in the G-matrix. Single-step GBLUP: combining and scaling the A and G matrices.
  • Day 4: Bayesian shrinkage models: BayesA and LASSO and their hyperparameters; Bayesian mixture models (BayesB/C/D/R, Bayesian variable selection and variations) and their hyperparameters.
  • Day 5: Theory on genomic heritability and effects of relationships in populations; impact of relationships on predictions and comparison of methods with strong and weak relationships. Multitrait Bayesian models and simple genomic feature models with variance components and predictions by chromosome.
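To make the SNP-BLUP and GBLUP comparison of Days 1 to 3 concrete, here is a minimal sketch, not taken from the course notes. It builds a G-matrix with VanRaden method 1, G = ZZ' / (2 Σ p_j (1 - p_j)) with Z the genotypes centred by twice the allele frequency, and checks numerically that SNP-BLUP and GBLUP give the same genomic values when the variance components are matched. The simulated data, the helper name `vanraden_g`, and the variance components treated as known are assumptions for illustration.

```python
import numpy as np

def vanraden_g(M):
    """G-matrix by VanRaden method 1 from an n x m genotype matrix coded 0/1/2."""
    p = M.mean(axis=0) / 2.0           # allele frequencies estimated from the sample
    Z = M - 2.0 * p                    # centred genotypes
    c = 2.0 * np.sum(p * (1.0 - p))    # scaling factor
    return Z @ Z.T / c, Z, c

# Simulated data (assumption): 50 individuals, 300 SNPs
rng = np.random.default_rng(1)
n, m = 50, 300
freq = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freq, size=(n, m)).astype(float)
y = M @ rng.normal(0.0, 1.0, size=m) + rng.normal(0.0, 7.0, size=n)

G, Z, c = vanraden_g(M)

# Variance components treated as known here (assumption)
sigma_b2 = 1.0           # SNP-effect variance
sigma_e2 = 50.0          # residual variance
sigma_g2 = c * sigma_b2  # implied genomic variance on the GBLUP scale

# The columns of Z sum to zero, so the estimate of the overall mean is mean(y)
mu = y.mean()

# SNP-BLUP: estimate all m marker effects, then sum them per individual
lam = sigma_e2 / sigma_b2
b_hat = np.linalg.solve(Z.T @ Z + lam * np.eye(m), Z.T @ (y - mu))
g_snpblup = Z @ b_hat

# GBLUP: predict the n genomic values directly through G
g_gblup = G @ np.linalg.solve(G + (sigma_e2 / sigma_g2) * np.eye(n), y - mu)

print("max abs difference:", np.max(np.abs(g_snpblup - g_gblup)))
```

The SNP-BLUP system has one equation per marker and the GBLUP system one per individual, so which form is cheaper depends on whether there are more markers or more genotyped individuals.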

Each day consists of a 1.5-2 hour lecture followed by a 1.5-2 hour exercise in the morning, with the same scheme repeated in the afternoon. After each practical, about 3 students will be asked to present their results, which will be discussed with the other students.

After the classes students should prepare a course project (deadline for submission 2 weeks after the last class) as described under 'compulsory programme'.


Prerequisites:

Background in linear models (regression, multiple regression) and preferably in mixed models (random effects, variance components)


Name of lecturers:

Luc L. Janss and Ole F. Christensen (Department of Molecular Biology and Genetics).


Type of course/teaching methods:

Lectures, computer exercises, short presentations by students


Literature:

Approx. 10 key papers and class notes.


Course homepage:

None


Course assessment:

Assessment is based on 1) short presentations during practicals and active participation in the discussion of the exercises; 2) the course report, which should add detail, insights or comparisons beyond the course exercises.


Provider:

Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics.


Special comments on this course:

The course fee is 1500 DKK, which covers lunch and daytime refreshment during the teaching days.


Time:

15-19 June 2015, from 8:30 to 16:00.


Place:

Aarhus, Studenterhuset.


Registration:

Registration (requires payment of the course fee by credit card) on the web site auws.au.dk/default.aspx by 15 May 2015. Information regarding admission will be sent out no later than 1 June 2015.

If you have any questions, please contact Luc Janss, e-mail: luc.janss@mbg.au.dk

Revised 20.06.2016