Geospatial biodiversity data processing using the Python programming language

Description

This course focuses on handling, visualizing, and analyzing geospatial biodiversity data with the Python programming language. It covers the fundamentals of Python, its geospatial modules, and their application to processing biodiversity data. The course demonstrates how data science methodologies and techniques can be applied to the geospatial component of biodiversity data.


Schedule and start date
Important: Only one group will be available.
  • Starts Monday, January 13, 2025, and will be held every Monday from 6:00 PM to 8:00 PM (GMT-6) for 8 weeks.


Course type
  • Modality: Virtual.
  • Theoretical/Practical: To complete the program, participants must attend more than 75% of the virtual synchronous classes and achieve an average of at least 70 in the evaluations.
  • Cost: Free.


Requirements
  • Availability of at least 16 hours over the entire program to attend eight virtual synchronous sessions (2 hrs/class).
  • Availability of at least 24 hours over the entire program to complete short assignments, labs, and a final project (3 hrs/week).
  • Fill out the form to participate in Redbioma activities (previously circulated; please fill it out only once).


Registration form

Link: Registration for Geospatial data processing


Objectives

General

To learn how to develop Python programs focused on processing geospatial biodiversity data.


Specific

  • Apply a data science approach to import, transform, visualize, analyze, and communicate geospatial biodiversity data.
  • Develop replicable solutions to computational problems using Python.
  • Integrate tabular, graphical, and geospatial visualizations of biodiversity data into documents and interactive applications developed in Python.


Course methodology

The course is delivered through theoretical and practical virtual synchronous classes. Theoretical concepts are explained by the instructor during class sessions and through assigned readings. Practical sessions are dedicated to programming exercises performed by students.

Lesson content is available on the course website, which includes links to bibliographic resources and other learning materials such as tutorials and videos.

Important:
  • All synchronous sessions will be recorded and published on the redbioma website.
  • Final research projects will be published on the redbioma website.


Program content

  1. Introduction to the course
    1. Course program review.
    2. Data science and reproducibility.
    3. Markdown language.
    4. Jupyter notebooks.
  2. Python programming language
    1. Basic data types.
    2. Variables.
    3. Expressions.
    4. Comments.
    5. Conditionals.
  3. Python programming language (continued)
    1. Loops.
    2. Data structures.
      1. Lists.
      2. Tuples.
      3. Sets.
      4. Dictionaries.
  4. Introduction to biodiversity data
    1. Darwin Core standard.
    2. Pygbif library.
  5. Data analysis and visualization
    1. Pandas library.
  6. Statistical plotting
    1. Matplotlib library.
    2. Plotly library.
  7. Introduction to geospatial data
    1. Data models.
    2. Geographic information systems.
  8. Vector data
    1. GeoPandas library.
  9. Raster data
    1. Rasterio library.
  10. Visualization of biodiversity geospatial data
    1. Leafmap library.
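
As a rough illustration of how units 2 to 4 fit together, the following sketch uses only the Python standard library to read a few occurrence records, keep them in lists, tuples, sets, and dictionaries, and filter them by location. The column names are Darwin Core terms (scientificName, decimalLatitude, decimalLongitude); the sample records, their coordinates, the bounding box, and the function names are assumptions made up for this example and are not course material.

```python
import csv
import io

# Illustrative records only: the coordinates are made up for this sketch.
SAMPLE_CSV = """scientificName,decimalLatitude,decimalLongitude
Bradypus variegatus,9.93,-84.08
Ara macao,8.54,-83.57
Bradypus variegatus,10.43,-84.01
Ara macao,,
Panthera onca,10.85,-85.62
"""

# Hypothetical bounding box stored as a tuple: (min_lon, min_lat, max_lon, max_lat).
BBOX = (-85.0, 9.0, -83.0, 11.0)


def load_occurrences(text):
    """Parse CSV text into a list of dictionaries, skipping rows without coordinates."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        # Conditional: ignore records that lack a latitude or a longitude.
        if not row["decimalLatitude"] or not row["decimalLongitude"]:
            continue
        records.append({
            "scientificName": row["scientificName"],
            # Tuple in (longitude, latitude) order.
            "coordinates": (float(row["decimalLongitude"]), float(row["decimalLatitude"])),
        })
    return records


def in_bbox(coordinates, bbox):
    """Return True when a (lon, lat) tuple falls inside the bounding box."""
    lon, lat = coordinates
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


def count_by_species(records):
    """Count records per scientific name using a dictionary."""
    counts = {}
    for record in records:
        name = record["scientificName"]
        counts[name] = counts.get(name, 0) + 1
    return counts


occurrences = load_occurrences(SAMPLE_CSV)                          # list of dictionaries
inside = [r for r in occurrences if in_bbox(r["coordinates"], BBOX)]
species = {r["scientificName"] for r in occurrences}                # set of unique names
counts = count_by_species(inside)

print(sorted(species))
print(counts)
```

Libraries listed later in the program, such as Pandas and GeoPandas, offer more compact ways to do the same kind of filtering and counting; the plain-Python version is shown here only because it needs no installation.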


Evaluation

Students will complete assignments and a final project. Evaluation items are as follows:


Item            Value (%)
Attendance      30
Assignments     30
Final project   40
Total           100
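
The weights above and the completion rule stated under "Course type" can be combined in a short calculation. The sketch below is only an illustration under stated assumptions: it assumes each item is scored from 0 to 100, that the required average of 70 refers to the weighted total of the three items, and that the course has eight sessions. The function names and the example scores are hypothetical.

```python
# Weights taken from the evaluation table, in percent.
WEIGHTS = {"Attendance": 30, "Assignments": 30, "Final project": 40}

MIN_ATTENDANCE_RATE = 0.75  # attendance must be strictly greater than this
MIN_AVERAGE = 70            # average must be greater than or equal to this


def weighted_average(scores):
    """Weighted average of item scores (assumed to be on a 0-100 scale)."""
    return sum(scores[item] * weight for item, weight in WEIGHTS.items()) / 100


def completes_program(scores, sessions_attended, total_sessions=8):
    """Apply both completion conditions: attendance above 75% and average of at least 70."""
    attendance_rate = sessions_attended / total_sessions
    return attendance_rate > MIN_ATTENDANCE_RATE and weighted_average(scores) >= MIN_AVERAGE


# Hypothetical scores for one participant.
example_scores = {"Attendance": 87.5, "Assignments": 80, "Final project": 75}

print(weighted_average(example_scores))      # 80.25
print(completes_program(example_scores, 7))  # True: 7 of 8 sessions is 87.5%
print(completes_program(example_scores, 6))  # False: 6 of 8 sessions is exactly 75%
```

Under these assumptions, "more than 75%" of eight sessions means attending at least seven of them.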


Class schedule

Week  Class
1     Introduction
2     Python programming language
3     Python programming language (continued)
4     Introduction to biodiversity data; Data analysis and visualization
5     Statistical plotting
6     Introduction to geospatial data; Vector data
7     Raster data
8     Visualization of biodiversity geospatial data


References

  1. Downey, A. B. (2024). Think Python: How to think like a computer scientist (3rd ed.). Green Tea Press. External link

  2. GeoPandas contributors. (n.d.). GeoPandas: Geographic pandas extensions. Retrieved January 1, 2022, from External link

  3. Kaggle. (n.d.-a). Learn Pandas. Retrieved August 1, 2024, from External link

  4. Kaggle. (n.d.-b). Learn Data Visualization. Retrieved August 1, 2024, from External link

  5. Kaggle. (n.d.-c). Learn Geospatial Analysis. Retrieved August 1, 2024, from External link

  6. Markdown Tutorial. (n.d.). Markdown Tutorial. Retrieved March 19, 2022, from External link

  7. P, C. (n.d.). Plotly: An open-source, interactive data visualization library for Python. Retrieved August 1, 2024, from External link

  8. Rey, S. J., Arribas-Bel, D., & Wolf, L. J. (2020). Geographic Data Science with Python. External link

  9. Severance, C. R. (2016). Python for everybody: Exploring data in Python 3 (S. Blumenberg & E. Hauser, Eds.). CreateSpace Independent Publishing Platform. External link

  10. The Pandas Development Team. (n.d.). Pandas: Powerful data structures for data analysis, time series, and statistics. Retrieved January 1, 2022, from External link


Contacts

Professors                   Email
Instructor: Manuel Vargas    mfvargas@gmail.com
María Auxiliadora Mora       maria.mora@itcr.ac.cr