In this post I present some resources for business analytics in R. Most of them are books that can be consulted freely online, and also buy the printed edition.
Tabular data manipulation
- R for data science: https://r4ds.had.co.nz/
- R for data science translated: https://es.r4ds.hadley.nz/
This work explains the functionalities of tidyverse
, it works quite well as a reference book. I also attach the Spanish translation for Spanish-speaking users.
- Data.table vignettes: https://cran.r-project.org/web/packages/data.table/vignettes/
The data.table
library covers functionality similar to dplyr
and tidyr
. It has a more complex syntax, but is more versatile and can handle larger tables.
Graphics
- ggplot2: Elegant Graphics for Data Analysis: https://ggplot2-book.org/
- A ggplot2 Tutorial for Beautiful Plotting in R: https://www.cedricscherer.com/2019/08/05/a-ggplot2-tutorial-for-beautiful-plotting-in-r/
Two resources dedicated exclusively to exploiting the functionalities of ggplot. They can be useful for creating graphs to disseminate results within the organization or for publications.
- R Graphics Cookbook, 2nd edition: https://r-graphics.org/
Book similar to the previous two, which also presents the graphical utilities of base R.
- Storytelling with data: https://www.storytellingwithdata.com/
Book on how to design graphics and presentations, based on the premise that in any presentation we are telling a story.
Text
- Text Mining with R: A Tidy Approach: https://www.tidytextmining.com/
R also includes functionalities for analyzing text. This is an introduction to exploratory text analysis within the tidyverse framework.
Maps
- Geocomputation with R: https://r.geocompx.org/
A reference for those interested in examining geographic information, for example by examining business locations in a city. This text is a comprehensive introduction to spatial analysis with R.
Supervised learning and machine learning
- Hands-On Machine Learning with R: https://bradleyboehmke.github.io/HOML/
- The caret package: https://topepo.github.io/caret/index.html
- Tidy Modeling with R: https://www.tmwr.org/
Until 2019, caret was the standard for R supervised learning workflows, and many users rely on this framework today. The first two books present how to use caret. Later, the author of caret joined the team that develops tidyverse at Posit to develop tidymodels. This set of packages is presented in the third book of this listing.
Publication
In addition to providing analysis tools, R and RStudio provide ways to present results in a reproducible way.
- Rmarkdown: https://rmarkdown.rstudio.com/
A system to generate reproducible documents, presentations and dashboards.
- Quarto: https://quarto.org/
An open-access scientific and technical publishing system to create all types of documents: Jupyter notebooks, web pages, books or blogs.
- Shiny: https://shiny.posit.co/
An interactive website production system for data analysis.
Another resources
- Tidy tuesday: https://github.com/rfordatascience/tidytuesday
Since 2018, the Tidy Tuesday community has presented a dataset to work with throughout the week. The idea is to have data to practice and share the results with the community.
- Julia Silge’s blog: https://juliasilge.com/
Julia Silge’s blog presents practical R tutorials for exploratory data analysis and machine learning.
- The Jose M Sallan Static Website: https://jmsallan.netlify.app/
When possible, I publish posts here mainly about R and data analysis. This is a work in progress, which among other things lacks a post index and a proper name. I will be grateful if you can ask me questions and problems in R to create new posts.