Multivariate Statistical Methods Used in Population Genetics
Maman Laouali Adamou Ibrahim *
Department of Biology, Faculty of Sciences and Techniques, Abdou Moumouni University of Niamey, Niger.
Oumarou Zango
Department of Biology, Faculty of Sciences and Techniques, University of Zinder, Niger.
Maman Maarouhi Inoussa
Department of Biology, Faculty of Sciences and Techniques, Abdou Moumouni University of Niamey, Niger.
Soulé Moussa
Department of Biology, Faculty of Sciences and Techniques, Abdou Moumouni University of Niamey, Niger.
Yacoubou Bakasso
Department of Biology, Faculty of Sciences and Techniques, Abdou Moumouni University of Niamey, Niger.
*Author to whom correspondence should be addressed.
Abstract
Several multivariate statistical methods are used in population genetics, but very few studies have examined the strengths and weaknesses of the different methods. This study therefore aims to identify the strengths and weaknesses of the multivariate statistical methods used in population genetics throughout the world. The synthesis was carried out according to the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses" (PRISMA) methodology. The study showed that various multivariate statistical methods, alone or in combination, are used in population genetics. No method is better a priori, so the method must be chosen to suit both the data collected and the research objective. The study identified the most commonly used multivariate statistical methods in genetics. Ordination methods (52.50% of articles) summarize the information contained in the data matrix while minimizing information loss. These are principal component analysis (32.0% of articles), principal coordinates analysis (7.50% of articles), discriminant analysis of principal components, factorial correspondence analysis, factorial discriminant analysis, and factorial analysis on a distance table. Clustering methods (35%) aim to form groups of individuals that are as similar as possible; they include hierarchical ascending clustering (17.50% of articles), neighbor-joining, and Bayesian clustering models (15% of articles). Analysis of molecular variance (7.50%) studies variation within and among populations, and the Mantel test (5%) tests the correlation between the matrix of genetic distances and other distance matrices (environmental causes of genetic variability).
Keywords: Genetics, multivariate statistical methods, ordination methods, classification methods.
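
To make the Mantel test mentioned in the abstract concrete, the following is a minimal sketch in Python. It is not the authors' code. It assumes a one-sided permutation test on the Pearson correlation between the upper triangles of two square distance matrices; the function name `mantel_test` and the parameters `n_perm` and `seed` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Illustrative one-sided Mantel test (hypothetical helper).

    Returns the observed Pearson correlation between two distance
    matrices and a permutation p-value for the alternative r > 0.
    """
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("both inputs must be square matrices of equal size")

    n = a.shape[0]
    # Use only the upper triangle: distance matrices are symmetric
    # and the diagonal is zero by definition.
    upper = np.triu_indices(n, k=1)
    r_obs = np.corrcoef(a[upper], b[upper])[0, 1]

    rng = np.random.default_rng(seed)
    n_at_least_as_large = 0
    for _ in range(n_perm):
        # Permute rows and columns of one matrix together, which
        # relabels the individuals while keeping the matrix a valid
        # distance matrix.
        perm = rng.permutation(n)
        a_perm = a[perm][:, perm]
        r_perm = np.corrcoef(a_perm[upper], b[upper])[0, 1]
        if r_perm >= r_obs:
            n_at_least_as_large += 1

    # Count the observed arrangement as one of the permutations.
    p_value = (n_at_least_as_large + 1) / (n_perm + 1)
    return r_obs, p_value


if __name__ == "__main__":
    # Toy data (made up for illustration): geographic distances between
    # 10 sampling sites and a noisy "genetic" distance derived from them.
    rng = np.random.default_rng(42)
    sites = rng.random((10, 2))
    diff = sites[:, None, :] - sites[None, :, :]
    geo = np.sqrt((diff ** 2).sum(axis=-1))
    noise = rng.normal(scale=0.05, size=geo.shape)
    gen = np.abs(geo + (noise + noise.T) / 2)
    np.fill_diagonal(gen, 0.0)

    r, p = mantel_test(gen, geo)
    print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

Rows and columns are permuted jointly because the entries of a distance matrix are not independent observations, so an ordinary correlation test on the pairwise distances would overstate significance. In practice, established implementations are available in standard population genetics software.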