\(\LaTeX\) Guide

Typesetting Mathematical Equations in Quarto

Author: Kaitlyn Cook, edited by Ben Baumer

Affiliation: Smith College

An Introduction to \(\LaTeX\) in Quarto

\(\LaTeX\) is a markup language and document preparation system that allows for (easier) typesetting of mathematical symbols, notation, and equations. Quarto allows for integration of \(\LaTeX\) code alongside text and R code!

Getting Started

To indicate to the document compiler that we want to start typesetting math equations, we offset \(\LaTeX\) code using dollar signs: $your LaTeX code here$. For example, the code $y=mx+b$ produces \(y=mx+b\).
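As a minimal sketch of how this looks in a source document, the following sentence mixes ordinary text with inline math. The wording of the sentence is hypothetical and only meant for illustration:

```latex
The line $y=mx+b$ has slope $m$ and intercept $b$.
```

When rendered, the text between each pair of dollar signs is typeset as math, and everything else is left as ordinary prose.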

Common Math Commands

Most LaTeX math commands are relatively intuitive and obey common mathematical order. We’ll illustrate some of the most common symbols you might use in an SDS course below!

  • Superscripts and Subscripts: we can use underscores to obtain subscripts (e.g., $x_i$ yields \(x_i\)) and carets to obtain superscripts (e.g., $e_i^2$ yields \(e_i^2\)). If your superscript or subscript contains more than one character (e.g., if we wished to write down notation for a \(t\) distribution with \(n-2\) degrees of freedom), we can wrap our superscripts and subscripts in braces, { }. As a concrete example, the code $t_n-2$ produces the output \(t_n-2\), while the code $t_{n-2}$ produces the output \(t_{n-2}\).

  • Hats for Estimated and Predicted Values: we can use the \widehat{} command to place hats on top of values that are estimated or predicted from the observed data (e.g., $\widehat{y}_i$ yields \(\widehat{y}_i\)).

  • Overbars for the Sample Mean: we can use the command $\overline{}$ to denote the sample average (e.g., $\overline{x}$ yields \(\overline{x}\)).

  • Fractions: we can use the command \frac{}{} to typeset fractions; anything in the first set of braces will go in the numerator, and anything in the second set of braces will go in the denominator (e.g., $\frac{1}{2}$ yields \(\frac{1}{2}\)).

  • Summations: we can use the command $\sum_{}^{}$ to typeset the summation notation we’ve been using in class; the first set of braces sets up the indexing that goes below the summation sign, while the second set of braces specifies the upper limit on the sum, which goes above the summation sign (e.g., $\sum_{i=1}^{n}x_i$ yields \(\sum_{i=1}^{n}x_i\)).

  • Multiplication: by convention, we typically do not write a multiplication symbol at all (such as when writing \(y=mx+b\)); alternatives include indicating multiplication using parentheses (such as when writing \((4)(3)=12\)) or using the $\cdot$ or $\times$ commands (which yield \(\cdot\) and \(\times\), respectively).

  • Inequalities: we can use the commands $<$, $\leq$, $>$, and $\geq$ to typeset \(<\), \(\leq\), \(>\), and \(\geq\), respectively.
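Several of these commands can be combined in a single expression. As a sketch, the following code typesets the formulas for the sample mean and the sample variance. The symbol \(s^2\) for the sample variance is an assumption made for this illustration and is not introduced elsewhere in this guide:

```latex
$\overline{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$

$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \overline{x})^2$
```

Each command's arguments are wrapped in braces, so multi-character pieces such as i=1 and n-1 are grouped correctly.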

Note

The Spinelli Center maintains an extensive guide to LaTeX and a list of LaTeX-related resources, accessible here should you be interested in learning more!

Typesetting Probability Equations and Formulas

Here are some examples of how we might typeset probability concepts.

Probability Notation and Symbols

There are four special symbols that we saw in class to represent the notions of unions (“or”), intersections (“and”), complements (“not”), and conditioning (“given”). We can typeset each of these symbols as follows:

  • Unions: the LaTeX command for typesetting unions is \cup. As mentioned previously, all math expressions go inside two dollar signs, so the code $\cup$ produces the symbol \(\cup\).
  • Intersections: the LaTeX command for typesetting intersections is \cap. Once we place this command inside two dollar signs, we get the symbol \(\cap\).
  • Complements: the LaTeX command for typesetting the complement of an event is ^c, where the caret ^ is used to indicate that we wish to format c as a superscript. For example, if we put the code A^c inside two dollar signs, we get the output \(A^c\).
  • Conditioning: the LaTeX command for typesetting conditioning is \vert, though you can also simply use the vertical line keyboard key, |. After typesetting \vert inside two dollar signs, we get the symbol \(\vert\); after placing the vertical line keyboard key inside two dollar signs, we also get the symbol \(|\).

Please also use \Pr for probability and \mathbb{E} for expectation. Let’s see all of these symbols put into practice! Let \(A\) be the event that a student in an introductory statistics class is a STEM major and \(B\) be the event that the student prefers vanilla ice cream to chocolate.

  • The marginal probability of being a STEM major is \(\Pr(A)\), and the marginal probability of being a non-STEM major is \(\Pr(A^c)\).
  • The joint probability of being a STEM major and preferring vanilla ice cream is \(\Pr(A \cap B)\).
  • The probability of either being a STEM major or preferring vanilla ice cream is \(\Pr(A \cup B)\).
  • The probability of being a STEM major given the student prefers vanilla ice cream is \(\Pr(A | B)\).
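The source code behind these probability expressions might look like the following sketch. The spacing inside the dollar signs is a stylistic choice and does not affect the rendered output:

```latex
$\Pr(A)$ and $\Pr(A^c)$

$\Pr(A \cap B)$

$\Pr(A \cup B)$

$\Pr(A | B)$ or, equivalently, $\Pr(A \vert B)$
```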

Equations and Formulas

If we want to typeset full equations (like the Law of Total Probability or Bayes’ Rule) that are important enough (or long enough) to be offset from the main text in their own centered paragraph, we can do so using display math mode. Display mode is achieved by either (a) wrapping the \(\LaTeX\) code in two sets of double dollar signs $$code for your offset equation here$$ or (b) wrapping the LaTeX code in \begin{equation} and \end{equation}.

Here we demonstrate both of those approaches; the following code (when rendered) should produce the Law of Total Probability as a centered equation on its own line in the document twice:

\[\Pr(A) = \Pr(A | B) \Pr(B) + \Pr(A | B^c) \Pr(B^c)\]

\[\begin{equation} \Pr(A) = \Pr(A | B) \Pr(B) + \Pr(A | B^c) \Pr(B^c) \end{equation}\]

Similarly, here is an example of how we might typeset Bayes’ Rule using display math mode plus the \frac{}{} command:

\[\Pr(A | B) = \frac{\Pr(B | A) \Pr(A)}{\Pr(B)}.\]
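As a sketch of what the source for these displayed equations might look like using the double-dollar-sign approach (the line breaks and indentation inside the delimiters are a stylistic assumption, not a requirement):

```latex
$$
  \Pr(A) = \Pr(A | B) \Pr(B) + \Pr(A | B^c) \Pr(B^c)
$$

$$
  \Pr(A | B) = \frac{\Pr(B | A) \Pr(A)}{\Pr(B)}.
$$
```

Replacing each pair of double dollar signs with \begin{equation} and \end{equation} should give the same equations in numbered form.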

Typesetting Population Parameters and Population Regression Lines

In statistics, the convention is to represent all population parameters by lowercase Greek letters; here is a list of the most common letters used and the corresponding LaTeX code:

  • Alpha: \(\alpha\) (\alpha)
  • Beta: \(\beta\) (\beta)
  • Gamma: \(\gamma\) (\gamma)
  • Delta: \(\delta\) (\delta)
  • Epsilon: \(\epsilon\) (\epsilon) or \(\varepsilon\) (\varepsilon)
  • Theta: \(\theta\) (\theta)
  • Lambda: \(\lambda\) (\lambda)
  • Mu: \(\mu\) (\mu)
  • Pi: \(\pi\) (\pi)
  • Rho: \(\rho\) (\rho)
  • Sigma: \(\sigma\) (\sigma)
  • Chi: \(\chi\) (\chi)
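Greek letters combine with subscripts, superscripts, and hats in the usual way. As a small illustrative sketch, the following code typesets a few parameters as they might appear in a regression setting:

```latex
$\mu$, $\sigma^2$, $\beta_0$, $\beta_1$, $\widehat{\beta}_1$
```

This should render as \(\mu\), \(\sigma^2\), \(\beta_0\), \(\beta_1\), and \(\widehat{\beta}_1\).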

Equations and Formulas

If we want to typeset full equations that are important enough (or long enough) to be offset from the main text in their own centered paragraph (like the equations for the population or fitted linear regression line), we can do so using display math mode. Display mode is achieved by either (a) wrapping the \(\LaTeX\) code in two sets of double dollar signs $$code for your offset equation here$$ or (b) wrapping the LaTeX code in \begin{equation} and \end{equation}.

Here we demonstrate both of those approaches; the following code (when rendered) should produce the population model for the mean as a centered equation on its own line in the document twice:

$$
  \mathbb{E}[Y_i | x_i] = \beta_0 + \beta_1 x_i
$$

\[ \mathbb{E}[Y_i | x_i] = \beta_0 + \beta_1 x_i \]

\begin{equation}
  \mathbb{E}[Y_i | x_i] = \beta_0 + \beta_1 x_i.
\end{equation}

\[\begin{equation} \mathbb{E}[Y_i | x_i] = \beta_0 + \beta_1 x_i. \end{equation}\]

The equivalent representation of the population regression line (in data = model + error format), alongside the regression assumptions, can be typeset as

\begin{equation}
  Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2),
\end{equation}

\[\begin{equation} Y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \quad \epsilon_i \sim N(0, \sigma^2), \end{equation}\]

where the code $\sim$ produces \(\sim\), our notational shorthand for the phrase “follows the same distribution as” or “is distributed according to”. So we read the notation \(\epsilon_i \sim N(0, \sigma^2)\) as “the errors (i.e., the way in which the population deviates from the model) follow a Normal distribution with mean 0 and variance \(\sigma^2\)”.

The fitted regression line based on the observed values of \(y_i\) and \(x_i\) in our sample is given by \[\widehat{y}_i = \widehat{\beta}_0 + \widehat{\beta}_1 x_i.\]
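One way to write the source for this fitted line, assuming the double-dollar-sign form of display math, is sketched below. Note that each hat covers only the symbol inside the braces of \widehat{}, and the subscript is placed after the closing brace:

```latex
$$
  \widehat{y}_i = \widehat{\beta}_0 + \widehat{\beta}_1 x_i.
$$
```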

If you would like to create a multi-line display math environment where each line of your formula/text is vertically aligned at a particular location, you can alternatively use the align environment, which wraps the LaTeX code in \begin{align} and \end{align}. To insert a line break in the aligned formula, we use two backslashes, \\; the content before the two backslashes forms the first line of the formula and the content after the two backslashes forms the second line. To indicate how these two (or more) lines should be positioned on top of each other, we use the “and” sign, &; \(\LaTeX\) formats the multiple lines so that the locations of the & sign are vertically aligned. For example, the code

\begin{align}
  a &= b + c \\ 
  d + e + f &= \\ 
  g + h &= i + j
\end{align}

produces the output

\[\begin{align} a &= b + c \\ d + e + f &= \\ g + h &= i + j \end{align}\]

If you would like to turn off the equation numbering, you may do so by replacing \begin{align} \end{align} with \begin{align*} \end{align*}.
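The unnumbered align* environment is also convenient for multi-step calculations. The sketch below is illustrative rather than taken from the course materials: it starts from Bayes' Rule and expands the denominator using the Law of Total Probability (applied to \(B\), split over \(A\) and \(A^c\)), with both lines aligned at the equals sign:

```latex
\begin{align*}
  \Pr(A | B) &= \frac{\Pr(B | A) \Pr(A)}{\Pr(B)} \\
             &= \frac{\Pr(B | A) \Pr(A)}{\Pr(B | A) \Pr(A) + \Pr(B | A^c) \Pr(A^c)}
\end{align*}
```

Leaving the left-hand side of the second line empty, with only the & before the equals sign, keeps the two equals signs stacked vertically.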