ARPHA Conference Abstracts : Conference Abstract
Mg-Traits pipeline: advancing functional trait-based approaches in metagenomics
Emiliano Pereira-Flores‡, Albert Barberan§, Frank Oliver Glöckner|, Antonio Fernandez-Guerra¶
‡ University of the Republic, Rocha, Uruguay
§ The University of Arizona, Arizona, United States of America
| University of Bremen, Bremen, Germany
¶ University of Copenhagen, Copenhagen, Denmark
Open Access


Microorganisms comprise immense phylogenetic and metabolic diversity, inhabit every conceivable niche on Earth, and play a fundamental role in global biogeochemical processes. Their study is highly relevant to, among other things, developing biotechnological applications, understanding ecosystem processes, and monitoring environmental systems. Functional traits (FTs), i.e., measurable properties of an organism that influence its fitness (McGill et al. 2006), provide information complementary to taxonomic composition, improving the characterization of microbial communities and the study of their ecology (Martiny et al. 2012). FT-based approaches are particularly powerful when coupled with metagenomics, a culture-independent method that allows us to obtain the genetic material of microorganisms from the environment. Metagenomic data can be used to compute functional traits at the genome level from a random sample of individuals in a microbial community, irrespective of their taxonomic affiliation (Fierer et al. 2014). Previous work using FT-based approaches in metagenomics includes studies of community assembly processes (Burke et al. 2011), responses to environmental change (Leff et al. 2015), and ecosystem functioning (Babilonia et al. 2018).

In this work, we present the Metagenomic Traits pipeline, Mg-Traits. Mg-Traits computes 25 (and counting) functional traits from short-read metagenomic data, ranging from GC content and amino acid composition to functional diversity and average genome size (see Fig. 1). As an example application, we used the Mg-Traits pipeline to process the 139 prokaryotic metagenomes of the TARA Oceans data set (Sunagawa et al. 2015). In this analysis, we observed that the computed metagenomic traits track community changes along the water column, reflecting microorganisms’ environmental adaptations.
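To make the simplest of these traits concrete, the following Python sketch computes GC content from a set of reads and amino acid composition from a set of predicted protein sequences. It is an illustration only and not the Mg-Traits implementation (which is written in AWK, BASH, and R). The function names, the decision to ignore ambiguous bases such as N, and the toy input sequences are all assumptions made for this example.

```python
from collections import Counter


def gc_content(reads):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of all reads.

    Assumption: ambiguous bases such as N are ignored rather than counted.
    """
    counts = Counter()
    for read in reads:
        counts.update(read.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    return gc / total if total else 0.0


def amino_acid_frequency(proteins):
    """Relative frequency of each amino acid across all protein sequences.

    Assumption: a trailing '*' (stop symbol) is not an amino acid and is skipped.
    """
    counts = Counter()
    for protein in proteins:
        counts.update(aa for aa in protein.upper() if aa != "*")
    total = sum(counts.values())
    return {aa: n / total for aa, n in sorted(counts.items())} if total else {}


# Toy inputs (hypothetical, for illustration only).
reads = ["ATGCGC", "GGCCTA", "ATATNN"]
proteins = ["MKKE", "MDE*"]

print(gc_content(reads))
print(amino_acid_frequency(proteins))
```

In a real workflow the reads would come from a FASTA or FASTQ file and the protein sequences from an ORF predictor; both steps are omitted here to keep the sketch self-contained.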

Figure 1.  

Mg-Traits pipeline. The 25 metagenomic traits computed by the Mg-Traits pipeline are divided into four groups. The first group includes the metagenomic traits computed at the nucleotide level: (1) GC content, (2) GC variance, and (3) tetranucleotide frequency. The second group includes the traits obtained from the open reading frame (ORF) sequence data: (4) ORFs to base pairs (BPs) ratio, (5) codon frequency, (6) amino acid frequency, and (7) acidic to basic amino acid ratio. The third group is based on the functional annotation of the ORF amino acid sequences. Its first 12 metagenomic traits (8 to 19 in the figure) comprise the composition, diversity, richness, and percentage of annotated genes for three different sets of genes: Pfam, Resfam, and Biosynthetic Gene Cluster (BGC) domains. Additionally, this group includes (20) the percentage of transcription factors (TFs) and (21) the average genome size (AGS). Lastly, the fourth group includes the taxonomy-related metagenomic traits: (22) average copy number of 16S rRNA genes (ACN), and taxonomic (23) composition, (24) diversity, and (25) richness.
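Several of the traits listed in the caption reduce to simple counting operations. The Python sketch below illustrates four of them: tetranucleotide frequency, the acidic to basic amino acid ratio, and the richness and diversity of an annotation count table. It is not the pipeline's own code. The grouping of acidic (D, E) and basic (K, R, H) residues, the use of the Shannon index as the diversity measure, the handling of ambiguous bases, and the domain names in the example are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter

ACIDIC = set("DE")   # aspartate, glutamate (assumed grouping)
BASIC = set("KRH")   # lysine, arginine, histidine (assumed grouping)


def tetranucleotide_frequency(reads):
    """Relative frequency of each 4-mer, skipping 4-mers with ambiguous bases.

    Assumption: strands are not merged (no reverse-complement canonicalization).
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - 3):
            kmer = seq[i:i + 4]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def acidic_to_basic_ratio(proteins):
    """Count of acidic residues divided by count of basic residues."""
    counts = Counter("".join(proteins).upper())
    acidic = sum(counts[aa] for aa in ACIDIC)
    basic = sum(counts[aa] for aa in BASIC)
    return acidic / basic if basic else float("nan")


def richness(annotation_counts):
    """Number of distinct annotations observed at least once."""
    return sum(1 for n in annotation_counts.values() if n > 0)


def shannon_diversity(annotation_counts):
    """Shannon index H = -sum(p * ln p) over annotations with nonzero counts."""
    total = sum(annotation_counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log(n / total)
        for n in annotation_counts.values()
        if n > 0
    )


# Toy inputs (hypothetical names and counts, for illustration only).
domain_counts = {"domain_a": 2, "domain_b": 2, "domain_c": 0}

print(tetranucleotide_frequency(["ACGTA"]))
print(acidic_to_basic_ratio(["MKKE", "MDEH"]))
print(richness(domain_counts))
print(shannon_diversity(domain_counts))
```

The same richness and diversity functions apply unchanged to a table of taxon counts, which is why the functional and taxonomic groups in the figure share these trait types.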

Mg-Traits allows the systematic computation of a comprehensive set of metagenomic functional traits, which can be used to generate a functional and taxonomic fingerprint and to reveal the predominant life-history strategies and ecological processes in a microbial community. Mg-Traits helps make fuller use of metagenomic data and facilitates comparative and quantitative studies. Considering the high genomic plasticity of microorganisms and their capacity to adapt rapidly to changing environmental conditions, Mg-Traits constitutes a valuable tool for monitoring environmental systems.

The Mg-Traits pipeline is available online. It is programmed in AWK, BASH, and R, and it was devised using a modular design to facilitate the integration of new metagenomic traits.


Bioinformatics: Metagenomics; Functional Traits; Microbial Ecology; Environmental Monitoring

Presenting author

Emiliano Pereira-Flores

Presented at

1st DNAQUA International Conference (March 9-11, 2021)