Description
This course introduces basic tools for doing data science with the R programming language. Because it is an introductory course, it also covers fundamental concepts of general R usage. The aim is to give students a foundation for developing their own projects on biodiversity data analysis.
Important: Only one group will be opened, in the time slot with the highest number of registered participants.
Schedules:
- Group #1: Starts Wednesday, October 9, and will be held every Wednesday from 3:00 PM to 5:00 PM (GMT-6) for 8 weeks.
- Group #2: Starts Wednesday, October 9, and will be held every Wednesday from 6:00 PM to 8:00 PM (GMT-6) for 8 weeks.
- Modality: Virtual.
- Theoretical/Practical: To complete the program, participants must attend more than 75% of virtual synchronous classes and achieve an average greater than or equal to 70 in evaluations.
- Cost: Free.
- Availability of at least 16 hours during the entire program to attend eight virtual synchronous sessions (2 hrs/class).
- Availability of at least 24 hours during the entire program to complete short assignments, labs, and a final project (3 hrs/week).
- Fill out the form to participate in Redbioma activities (previously circulated; please fill it out only once).
Registration form
Link: Registration for R Programming Language
Objectives
General
Develop programming and problem-solving skills using biodiversity data with the R programming language. The course will provide the basics of programming in R and tools for various data science processes.
Specific
- Provide the basics of programming in R.
- Offer knowledge on the use of functions and relevant packages for biodiversity data analysis.
- Identify strategies for using new packages and functions.
Course methodology
The course is based on active learning through problem-solving. Although it is theoretical/practical, its primary focus is on practical exercises. These exercises demonstrate the use of specific functions, which participants then apply in new practice sets given as assignments.
The program is theoretical/practical, allowing participants to apply theoretical knowledge through case studies, group discussions, and a research project.
Important:
- All synchronous sessions will be recorded and published on the redbioma website.
- Final research projects will be published on the redbioma website.
Program content
- Introduction to R, RStudio, Programming
  - Basics of programming in R.
  - Object types.
- Data handling
  - Data science workflow.
  - Packages.
  - Data I/O.
- Data organization
  - Preprocessing with dplyr and tidyr.
- Statistical analysis
  - Parametric and non-parametric analyses with rstatix.
- Visualization
  - Customizable plots with ggplot2.
- Diversity metrics
  - Diversity indices and species accumulation curves.
- Models
  - Model training and evaluation with tidymodels.
- Closing and final project description
  - Report creation with Quarto.
  - Recap.
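As a small preview of the "Diversity metrics" unit listed above, the sketch below computes a few common diversity indices in base R. It is only an illustration: the function name `diversity_summary` and the abundance counts are invented for this example, and the course may well use dedicated packages instead.

```r
# Minimal sketch: common diversity indices from a vector of species counts.
# `diversity_summary` is a hypothetical helper, not part of any package.
diversity_summary <- function(counts) {
  counts <- counts[counts > 0]      # drop species that were not observed
  p <- counts / sum(counts)         # relative abundance of each species
  shannon <- -sum(p * log(p))       # Shannon index H'
  c(
    richness     = length(counts),  # number of observed species
    shannon      = shannon,
    gini_simpson = 1 - sum(p^2),    # Gini-Simpson index
    hill_1       = exp(shannon)     # effective number of species (order 1)
  )
}

# Invented abundance counts for five species at one site.
site <- c(sp_a = 25, sp_b = 12, sp_c = 6, sp_d = 2, sp_e = 0)
print(round(diversity_summary(site), 3))
```

For a perfectly even community, such as five species with ten individuals each, the effective number of species equals the richness (5), which is a quick sanity check for the function.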
Evaluation
Students will complete assignments and a final project. Evaluation items are as follows:
Item | Value (%) |
---|---|
Synchronous classes | 40 |
Assignments | 30 |
Final project | 30 |
Total | 100 |
The final project involves applying the knowledge acquired in the course to a problem of interest to each student.
Assignments will be given after each class and can be submitted until 12:00 AM (GMT-6) on the day of the next class.
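The weighting in the table, together with the completion requirements (attendance above 75% and an average of at least 70), can be sketched in R as follows. This is an illustration under stated assumptions, not the official grading procedure: it assumes each item is scored from 0 to 100, that the "average" is the weighted total, and that the program has eight sessions. The function names and the example scores are hypothetical.

```r
# Weights taken from the evaluation table (as proportions).
weights <- c(classes = 0.40, assignments = 0.30, project = 0.30)

# Weighted final grade; `scores` is a named vector on a 0-100 scale (assumed).
final_grade <- function(scores) {
  stopifnot(all(names(weights) %in% names(scores)))
  sum(weights * scores[names(weights)])
}

# Completion check: attendance strictly above 75% and final grade of at least 70.
completes_program <- function(scores, sessions_attended, total_sessions = 8) {
  attendance <- sessions_attended / total_sessions
  attendance > 0.75 && final_grade(scores) >= 70
}

# Invented scores for one student.
student <- c(classes = 90, assignments = 80, project = 70)
print(final_grade(student))
print(completes_program(student, sessions_attended = 7))
```

Under these assumptions the example student's grade is 0.40 × 90 + 0.30 × 80 + 0.30 × 70 = 81. Note that with eight sessions, "more than 75%" means attending at least seven, since six sessions is exactly 75%.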
Class schedule
Class | Week |
---|---|
Introduction | 1 |
Packages, Data I/O | 2 |
Data organization | 3 |
Statistical analysis | 4 |
Customizable plots | 5 |
Diversity metrics | 6 |
Models | 7 |
Report creation & Closure | 8 |
Materials
Some supporting materials:
- R4DS
- Cheatsheets (HTML)
- Cheatsheets (PDF)
- RStudio Education
- Posit Academy
- Posit Recipes
- Biodiversity Data Science
References
- Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science: Import, tidy, transform, visualize, and model data (2nd ed.).
- Wilke, C. (2019). Fundamentals of data visualization (1st ed.).
- Kuhn, M., & Silge, J. (2023). Tidy modeling with R.
- Roswell, M., Dushoff, J., & Winfree, R. (2021). A conceptual guide to measuring species diversity. Oikos, 130(3), 321–338.
Contacts
Professors | |
---|---|
Instructor: Jonathan Solórzano Villegas | jonathanvsv@gmail.com |
María Auxiliadora Mora | maria.mora@itcr.ac.cr |