Explainers.

Learn about what Guthub can do for you and get a brief introduction to all things microbiome, data processing, and paper-ready figures.

After learning about the microbiome, read about how to investigate the composition of your microbiome samples.

Learn to create a PCoA plot to distinguish different microbiomes.

Learn to create a PCoA with added arrows to showcase directional influence from the most significant OTUs.

Read about the fundamentals behind BF Ratios.

Learn to create a series of jitterplots for LDA values with the most significant OTUs.

Guthub offers a simple, crisp interface to take your raw
**Mothur** output files and turn them into paper-ready
figures. Mothur is a data
processing server hosted by the University of Michigan.
Read more at Mothur.org. Guthub offers limited
customization, but we try to meet the needs of all
participating researchers. Expect regular updates
to all

Before we get into all of this, let’s take a step back and ask
the fundamental question: **What is the microbiome?**

The **microbiome** is the summation of all the
Bacteria and Archaea within any environmental
context. This means there is a separate

Let’s explain this in easier terms. Say you have a closet filled with drawers. These drawers are filled with a variety of clothes. For most people, these drawers are disorganized. While they may be organized by type, they are not necessarily organized by color. For purposes of this explanation, let’s assume we can organize these drawers however we want instantly. We also have an infinite number of drawers. Still with me?

Let’s start exploring our closet!

To get anything meaningful out of this exploration,
we need to decide what clothes should go into what
drawers. We can look at all of our clothes as a whole
and look for patterns. Every closet is different.
While one closet may have an abundance of pants, another
might have no pants. Regardless of the contents of
your closet, you could look at different
**types** of clothes
like pants or shirts. Additionally, you could look at
different

When we are satisfied with how we’re going to organize
our clothes, we can place them into corresponding
drawers. For starters, a drawer could be filled
with all red clothes, all pants, or all red pants.
It is completely arbitrary, but note how the
*contents* of the drawer define what that drawer
represents.

The drawer is the organizational unit of a closet;
it’s what helps us break up all our clothes into
categories we can understand and compare. An
organizational unit can also be called a **bin**.
(This is fitting considering drawers are basically bins!)
A closet is a

With any microbiome experiment, there are **samples**.
Samples are the test subjects of your experiment
(e.g. mice, humans,

Now that we have an understanding of the organization of the
closet, how can we analyze it to get meaningful results? First,
we need a hypothesis. For example, let’s hypothesize that people
with a higher percentage of *red* clothes tend to have a lower percentage

__Closet 1__

**Red:** 10

__Closet 2__

**Red:** 10

Comparing the closets above, both Closet 1 and Closet 2 have an equal number of red clothes. Had we not taken into account the size of the closets, we would have concluded the distribution of red clothes among both closets is the same!

Instead, to compare different samples, or *closets*, we need
to look at

__Closet 1__

**Red:** 10

__Closet 2__

**Red:** 10

Now we can see a clear difference between the two closets!
Of course, this data is *not* significant yet, because our number
of samples, or

So the closets scenario makes sense, but what does this have to
do with the microbiome? Well, in a sense the microbiome is a
type of *closet*. Each microbiome is filled with

Each microbiome is a **sample**. Each bacterium is referred to as

Let’s make things a little more complicated. Previously we described microbiomes from the perspective of their OTUs. Say we wanted to look at the microbiomes of our samples in terms of phyla, genera, or domains (other levels of the taxonomy name). Could we do this too? Of course! Here is the structure of taxonomy names, in case you aren’t familiar:

**Kingdom** → **Phylum** → **Class** → **Order** → **Family** → **Genus** → **Species** → **Strain**

We can look at the relative abundance of any attribute we have access to, as long as it is adjusted for the overall size of each sample.
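As a rough illustration of adjusting for sample size, the sketch below sums OTU counts at one taxonomy level and divides by the sample total. The OTU table, the `;`-separated taxonomy format, and the function name `relative_abundance` are all assumptions made for this example, not Guthub's actual data format or code.

```python
from collections import defaultdict

# Hypothetical OTU table for a single sample: (taxonomy name, read count).
# The ";"-separated format and every number here are made up for illustration.
sample_otus = [
    ("Bacteria;Firmicutes;Clostridia", 60),
    ("Bacteria;Firmicutes;Bacilli", 20),
    ("Bacteria;Bacteroidetes;Bacteroidia", 120),
]


def relative_abundance(otus, level):
    """Sum counts at one taxonomy level, then divide by the sample total.

    `level` is the position in the taxonomy name (0 is the broadest level).
    """
    totals = defaultdict(int)
    for taxonomy, count in otus:
        names = taxonomy.split(";")
        totals[names[level]] += count
    sample_size = sum(totals.values())
    if sample_size == 0:
        return {}  # an empty sample has no meaningful proportions
    return {name: count / sample_size for name, count in totals.items()}


# Position 1 holds the phylum in this made-up taxonomy format.
phylum_abundance = relative_abundance(sample_otus, level=1)
print(phylum_abundance)
```

Because every value is a fraction of its own sample, two samples of very different sizes can be compared directly, which is the same reasoning as in the closet example.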

The closet data set maps onto the microbiome data set as follows:

$$Dataset \rightarrow Bins \rightarrow Attributes$$

$$Closet \rightarrow Clothes \rightarrow Types, \space Color, \space Age$$

$$Microbiome \rightarrow OTU \rightarrow taxonomy \_ name, \space size$$

As seen previously, **Relative Abundance Plots** offer one
perspective of the microbiome. But this data plot doesn’t
tell us much. We can also look at the distribution of other
types of attributes. Since we already looked at how different
OTUs make up our samples’ microbiome, let’s dig deeper. We could
look at the distribution of different

Now that you have been introduced to the microbiome and bioinformatics, take a look at any other data analysis explainer and start making figures today!

To get things started, here's a typical workflow for making microbiome figures:

$$ PCoA \space plot \rightarrow AMOVA \rightarrow Relative \space Abundance \space Plot \rightarrow BF \space Ratio \rightarrow$$

$$Alpha \space Diversity \rightarrow LDA \space and \space LEfSe \rightarrow LDA \space Jitterplots \rightarrow Heatmap$$

After looking at relative abundances, let’s start categorizing
the microbiome a little more. One of the reasons to look at
Relative Abundance Plots is to get a feel for how different
phyla compare across samples. One example of this is the
**Bacteroidetes-to-Firmicutes Ratio**.

$$BF \space Ratio = {{Total \space number \space of \space Bacteroidetes} \over {Total \space number \space of \space Firmicutes}}$$

$${{\uparrow \space number \space of \space Bacteroidetes} \over {\downarrow \space number \space of \space Firmicutes}} = higher \space BF \space Ratio = lower \space risk \space of \space obesity$$

If we divide the total number of *Bacteroidetes* bacteria by the
total number of *Firmicutes* bacteria, we get a ratio for each sample.
These ratios can then be averaged within each experimental group
and compared across groups.
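The calculation described above can be sketched in a few lines. The sample names, group labels, and counts below are invented for illustration, and `bf_ratio` and `mean_bf_ratio_by_group` are hypothetical helper names rather than part of Guthub.

```python
from statistics import mean

# Hypothetical per-sample phylum counts; every name and number is made up.
samples = {
    "sample_1": {"group": "group_a", "Bacteroidetes": 300, "Firmicutes": 600},
    "sample_2": {"group": "group_a", "Bacteroidetes": 200, "Firmicutes": 400},
    "sample_3": {"group": "group_b", "Bacteroidetes": 900, "Firmicutes": 300},
    "sample_4": {"group": "group_b", "Bacteroidetes": 500, "Firmicutes": 500},
}


def bf_ratio(counts):
    """Bacteroidetes count divided by Firmicutes count for one sample.

    Raises ZeroDivisionError if a sample has no Firmicutes at all.
    """
    return counts["Bacteroidetes"] / counts["Firmicutes"]


def mean_bf_ratio_by_group(samples):
    """Compute one ratio per sample, then average the ratios within each group."""
    ratios = {}
    for sample in samples.values():
        ratios.setdefault(sample["group"], []).append(bf_ratio(sample))
    return {group: mean(values) for group, values in ratios.items()}


group_means = mean_bf_ratio_by_group(samples)
print(group_means)
```

Note that the ratio is taken per sample first and averaged second. Pooling all counts in a group before dividing would let the largest samples dominate the result.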

One of the first data analyses you should conduct on your
microbiome data is the **Principal Coordinates Analysis**, or

What do we mean by “two-dimensional spatial analyses”? Well, the PCoA plot is similar in structure to the XY-plots you’re familiar with from grade school math. The difference is that its axes are principal coordinates: the two directions that capture the most variation in how your samples differ from one another. This gives you a framework to compare where the samples are clustering.
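For readers curious where the two plot coordinates come from, here is a minimal sketch of classical PCoA. It starts from a matrix of pairwise distances between samples, double-centers it, and keeps the leading eigenvectors as axes. Several things here are assumptions: the Bray-Curtis distance is just one common choice, the counts are invented, and a simple power iteration stands in for the eigen-solver a numerical library would normally provide. None of this is taken from Guthub's or Mothur's code.

```python
import math


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors (0 means identical)."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def top_eigenpair(matrix, iterations=500):
    """Largest-magnitude eigenvalue and unit eigenvector via power iteration."""
    n = len(matrix)
    vec = [float(i + 1) for i in range(n)]  # fixed, non-constant starting vector
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, vec
        vec = [v / norm for v in nxt]
    value = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return value, vec


def pcoa(dist, k=2):
    """Return k coordinates per sample from a square distance matrix."""
    n = len(dist)
    a = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in a]
    grand = sum(row_mean) / n
    # Double-centering: remove row and column means (equal here, by symmetry).
    b = [[a[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
         for i in range(n)]
    coords = [[] for _ in range(n)]
    for _ in range(k):
        value, vec = top_eigenpair(b)
        # Simplification: a negative eigenvalue contributes a zero axis.
        scale = math.sqrt(max(value, 0.0))
        for i in range(n):
            coords[i].append(vec[i] * scale)
        # Deflate so the next pass finds the next axis.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical OTU counts for three samples (made-up numbers).
counts = [[10, 0, 5], [8, 1, 6], [0, 12, 3]]
distances = [[bray_curtis(p, q) for q in counts] for p in counts]
for sample_coords in pcoa(distances):
    print(sample_coords)
```

Each sample ends up with one (x, y) pair, and samples with similar OTU profiles land close together. That closeness is what the clustering questions in this explainer are about.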

These plots provide two key pieces of information:

- **How do *samples within the same group* cluster?**
- **How do *samples within different groups* cluster?**

The above paragraphs are summarized in the PCoA plots below:

If samples within different groups cluster together, it is safe to conclude these groups have similar microbiomes. If samples within different groups cluster separately from each other, it is safe to conclude these groups have different microbiomes.

As can be seen in the PCoA plot above, we are looking at two
experimental groups: **Ctrl PN** and

Looking just at the **Met PN** samples, we see we have four samples.
There is a significant spread of the

Looking just at the **Ctrl PN** samples, we see we have six samples.

When we compare both groups against each other, we see they have
similar spreads. Thus, we cannot say whether *either* group has a
large or small variation; all we can say is that they have the *same variation*.
Remember, these spreads are *relative* because we are
just comparing two groups without an outside reference.

In addition to similar variations, we see both groups overlap almost perfectly. Thus, we can conclude that these two experimental groups have the same (or similar) microbiomes!
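Spread and overlap can also be put into rough numbers. The sketch below measures spread as the mean distance from each sample to its group centroid, and overlap as the distance between the two centroids. The coordinates and group names are made up, and this is only a descriptive summary, not the formal test that the AMOVA step in the workflow provides.

```python
import math


def centroid(points):
    """Average position of a group's samples on the two plot axes."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def spread(points):
    """Mean distance from each sample to its own group centroid."""
    center = centroid(points)
    return sum(math.dist(p, center) for p in points) / len(points)


# Hypothetical PCoA coordinates for two groups (made-up numbers).
group_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
group_b = [(0.1, 0.0), (2.1, 0.0), (0.1, 2.0), (2.1, 2.0)]

spread_a = spread(group_a)
spread_b = spread(group_b)
centroid_gap = math.dist(centroid(group_a), centroid(group_b))

print(spread_a, spread_b, centroid_gap)
```

When the gap between centroids is small compared with both spreads, the groups overlap on the plot. A gap much larger than either spread corresponds to two separate clusters.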

*(Look at PCoA for a general overview of PCoA plots!)*

To explain the concept of **PCoA Biplots**, we’ll use the PCoA plot
below:

Initially, the PCoA plot suggests the two experimental groups,
**Met PN** and

To answer this, we can stay within our original PCoA plot.
These microbiomes are different by virtue of the most significant
OTUs observed. In other words, whichever OTUs differ the most
among groups are what keep these microbiomes from overlapping.
To see which OTUs are involved, we can add **biplot arrows**.
By displaying these

To better explain this, let’s see some biplot arrows in action!
Let’s dig deeper into the above plot and see which OTUs are
pushing **Met PN** and

While this plot looks different from our bare PCoA plot, it’s
more similar than it may appear. Both experimental groups are
the same, **Met PN** and

As stated above, the plots look different. There are two
reasons for this, both specific to our code. First, our PCoA
and PCoA Biplot Shiny Apps are built separately, so the
formatting isn’t exactly the same. Second, this example
plot contains 6 additional **Ctrl PN** microbiomes (the 6 dots
clustering at the bottom of the plot).

Now that we know what’s similar and different between the two plots, let’s start exploring our biplot! While it’s apparent the groups have different microbiomes in the bare plot, this biplot shows which OTUs are creating that difference.
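One common way to place biplot arrows, shown here purely as an assumption since this explainer does not say how Guthub computes them, is to correlate each OTU's abundance with the two plot axes. The arrow then points toward the samples where that OTU is most abundant. The coordinates, OTU names, and abundances below are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def biplot_arrows(coords, otu_abundance):
    """Arrow tip per OTU: its correlation with axis 1 and with axis 2."""
    axis1 = [c[0] for c in coords]
    axis2 = [c[1] for c in coords]
    return {
        otu: (pearson(values, axis1), pearson(values, axis2))
        for otu, values in otu_abundance.items()
    }


# Hypothetical sample coordinates and per-sample OTU abundances (made up).
coords = [(-1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, -1.0)]
otu_abundance = {
    "otu_a": [1, 2, 3, 2],  # rises along axis 1
    "otu_b": [2, 3, 2, 1],  # rises along axis 2
}

arrows = biplot_arrows(coords, otu_abundance)
print(arrows)
```

In this toy data `otu_a` increases from left to right across the samples, so its arrow lies along the first axis, while `otu_b` tracks the second axis. Longer arrows mark OTUs that follow the sample layout more closely.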

In the upper left-hand corner of the plot, it’s clear the
**Met PN** samples are being changed by three OTUs:

When addressing the OTUs that are changing **Ctrl PN** relative to

A hypothetical next step in data analysis would be to identify
the points in **Ctrl PN**. It seems that within