About

To paraphrase one of my inspirations for this project (Kurz 2023): This is a labor of love. In 2015, Hildegarde Heymann (from here on: HGH), Distinguished Professor of Viticulture & Enology at UC-Davis, was kind enough to host me as a visiting scholar at her laboratory. Among many other formative experiences I had as a sensory scientist at UC-Davis, HGH shared with me the R Opus, a handbook she had written and compiled of common analytical approaches to multivariate (food-sensory) data in R. Like many of her other mentees, students, and postdocs, I benefited immensely from HGH’s practical insight into how to apply abstruse multivariate analyses to real problems in research, and the R Opus manifested that knowledge into a hands-on guide for how to implement those tools.

In the time since, I have passed on the R Opus to my own mentees, students, and postdocs. As R has continued to mature and become more accessible as a scripting language for data analysis (in particular, as “tidy” programming principles have become more dominant), I have found myself also passing on my own set of tips and tricks for how to translate the tools found in the original R Opus into the current vernacular. I began teaching data analytics and coding for researchers using R, and after learning how to (clumsily) transform my course notes into accessible bookdowns, I thought: why not the R Opus?

This is that thought put into some sort of action. I hope it is useful for you.

Usage

This bookdown is constructed around typical workflows and data analysis needs for sensory scientists. You know who you are.

For all others, this bookdown is a structured introduction to the analysis of multivariate data from a practical and applied perspective. Specifically, we investigate how to apply a series of common multivariate statistical analyses to a set of data derived from the tasting and rating of wines by both trained and untrained human subjects. I place almost all of the emphasis on “how to”, and much less on the statistical theory behind these approaches. As I have spent longer and longer using the statistical tools sensory scientists commonly apply (and occasionally developing and adapting new ones), I have come to believe that it is much more important to think of statistical analyses as tools that we apply than to worry about whether we understand them completely in the abstract. That abstract understanding, of course, is also extremely important, but it cannot be built without first understanding why we might choose a particular analysis and what it will look like when applied to a particular dataset.

Even if you do not work in the field of sensory science, I hope that these examples will prove useful and easily understandable. Plus: thinking about how wine tastes is interesting, especially when we combine it with complicated statistics!

R Setup

You can read this bookdown entirely online using the navigation panels. However, if you want to learn to use R to conduct analyses like these, I strongly suggest you follow along. You’ll need to install R to do so. To install R, go to https://cran.r-project.org/ and follow the appropriate instructions for your operating system.

While it is not strictly necessary, you will almost certainly find it more pleasant to use the RStudio IDE (Integrated Development Environment) for R, which can be downloaded and installed from https://posit.co/products/open-source/rstudio/.

Here’s the list of packages I used in this bookdown:

package
tidyverse
FactoMineR
patchwork
ggrepel
ggedit
ggforce
DistatisR
SensoMineR
paletteer
here
broom
skimr
factoextra
naniar
agricolae
tidytext
brms
tidybayes
simputation
missMDA
corrr
widyr
rgl
candisc
MASS
ca
pls

Once you have set up R (and RStudio), you can run the following lines of code to install the packages I use in this bookdown. This might take a minute, and you might have to restart R along the way. Go get a snack!

packages <- c(
  "tidyverse", "FactoMineR", "patchwork", "ggrepel", "ggedit", "ggforce",
  "DistatisR", "SensoMineR", "paletteer", "here", "broom", "skimr",
  "factoextra", "naniar", "agricolae", "tidytext", "brms", "tidybayes",
  "simputation", "missMDA", "corrr", "widyr", "rgl", "candisc", "MASS",
  "ca", "pls"
)

install.packages(packages, dependencies = TRUE)
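If you already have some of these packages, you may prefer not to reinstall everything. The following is a minimal sketch of one way to find out what is missing, using only base R. The helper name `missing_packages()` and the short `example_packages` vector are hypothetical, introduced here for illustration only.

```r
# Hypothetical helper (not from any package): return the names in `pkgs`
# that are not found in the current R library paths.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# A short subset of the package list, for illustration only.
example_packages <- c("tidyverse", "FactoMineR", "patchwork")

to_install <- missing_packages(example_packages)

# Uncomment to install only what is missing:
# if (length(to_install) > 0) install.packages(to_install, dependencies = TRUE)
```

Passing the full package vector instead of the subset should work the same way.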

Some of these packages install additional tools outside of R. The Stan-based packages, such as brms, are the most likely to cause trouble. If you cannot install them, consider the walkthroughs on the main Stan website. Not installing them just means you won’t be able to replicate my extended experiments with Bayesian modeling (probably not the greatest loss in the world for you).
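One way to find out whether brms is available on your machine is to check for it before trying to load it. This is a minimal sketch using only base R; the variable name `has_brms` is illustrative.

```r
# requireNamespace() returns TRUE or FALSE without attaching the package,
# and with quietly = TRUE it does not stop with an error if the package is missing.
has_brms <- requireNamespace("brms", quietly = TRUE)

if (has_brms) {
  message("brms is installed.")
} else {
  message("brms is not installed: skip the Bayesian examples or see the Stan website.")
}
```

Note that this only checks that the package is installed. It does not confirm that Stan models will compile on your system.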

What this is not

I will not be going over the basics of R coding and programming. You can pick up a fair amount by following along, or, if you are truly new to R, I recommend checking out Wickham et al.’s now-classic introduction, the Stat545 website, or any of the Carpentries workshops.

About me

I’m an associate professor of Food Science & Technology at Virginia Tech. I teach about sensory evaluation and about applied data analysis for food and ag scientists. I am not a statistician, but I interact with and consume a lot of statistics.


Session Info

At the end of each chapter, I include a sessionInfo() chunk to make it easier to reproduce the work and to diagnose any problems.

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.6.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/New_York
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets 
#> [6] methods   base     
#> 
#> other attached packages:
#>  [1] lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0  
#>  [4] dplyr_1.1.2     purrr_1.0.1     readr_2.1.4    
#>  [7] tidyr_1.3.0     tibble_3.2.1    ggplot2_3.4.3  
#> [10] tidyverse_2.0.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] gtable_0.3.4      jsonlite_1.8.7    highr_0.10       
#>  [4] compiler_4.3.1    tidyselect_1.2.0  jquerylib_0.1.4  
#>  [7] scales_1.2.1      yaml_2.3.7        fastmap_1.1.1    
#> [10] R6_2.5.1          generics_0.1.3    knitr_1.43       
#> [13] bookdown_0.37     munsell_0.5.0     bslib_0.5.1      
#> [16] pillar_1.9.0      tzdb_0.4.0        rlang_1.1.1      
#> [19] utf8_1.2.3        stringi_1.7.12    cachem_1.0.8     
#> [22] xfun_0.39         sass_0.4.7        timechange_0.2.0 
#> [25] cli_3.6.1         withr_2.5.0       magrittr_2.0.3   
#> [28] digest_0.6.33     grid_4.3.1        rstudioapi_0.15.0
#> [31] hms_1.1.3         lifecycle_1.0.3   vctrs_0.6.3      
#> [34] evaluate_0.21     glue_1.6.2        fansi_1.0.4      
#> [37] colorspace_2.1-0  rmarkdown_2.23    tools_4.3.1      
#> [40] pkgconfig_2.0.3   htmltools_0.5.6
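To compare your own setup with the output above, a sketch like the following reports whether your R version is at least the one used here. The 4.3.1 threshold is taken from the output above; the helper name `r_at_least()` is hypothetical.

```r
# Hypothetical helper: TRUE if the running R is at least version `minimum`.
r_at_least <- function(minimum) {
  getRversion() >= numeric_version(minimum)
}

r_at_least("4.3.1")

# Version of an installed package, here a base package that ships with R:
packageVersion("stats")
```

Swapping "stats" for any installed package name (for example "dplyr") lets you compare your version against the one listed above.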

References

Kurz, A. Solomon. 2023. Doing Bayesian Data Analysis in brms and the tidyverse. Version 1.1.0. https://bookdown.org/content/3686/.