esade

Intro to R (2215.YR.005557.1)

General information

Type: OBL

Year: 1

Period: S semester

ECTS Credits: not specified

Teaching Staff:

Teacher: Ruben Coca Marin

Department: Operaciones, Innovación y Data Sciences

Language: ENG


COURSE CONTRIBUTION TO PROGRAM


R is one of the leading software tools for data science worldwide. It is widely used in large companies and institutions such as McKinsey & Company, Banco Santander and Telefónica.
Learning R is fundamental to understanding what data science is and to learning how to put analytics and artificial intelligence to work.

Course Learning Objectives

The objective of the seminar is to introduce relevant concepts of R and data analysis. At the end of the course, students should be able to:

- Write R code to perform basic data analysis: import, summarise and visualize data, and complete simple statistical tests.

CONTENT

1. Session: Fundamentals of Data Analytics with R


Analysis of S&P500 performance

- What R and RStudio are
- Basic operations in R
- How to import files
- Basic statistical analysis (summary, mean, median, variance, etc.)
- How to select data
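
As a rough illustration of the topics listed for this session, the following is a minimal sketch in base R and is not the course material. The prices are invented for the example (they are not real S&P 500 values), the column names are hypothetical, and a temporary CSV file stands in for whatever file is used in class.

```r
# Illustrative data only: these prices are made up, not real S&P 500 values.
prices <- data.frame(
  day   = 1:6,
  close = c(100, 102, 101, 105, 107, 106)
)

# Importing a file: write the data to a temporary CSV, then read it back.
# In practice read.csv() would point at a real file instead.
csv_path <- tempfile(fileext = ".csv")
write.csv(prices, csv_path, row.names = FALSE)
prices <- read.csv(csv_path)

# Basic operations: daily percentage change of the closing price.
pct_change <- diff(prices$close) / head(prices$close, -1) * 100

# Basic statistical analysis.
print(summary(prices$close))
mean_close   <- mean(prices$close)
median_close <- median(prices$close)
var_close    <- var(prices$close)

# Selecting data: keep only the rows where the close is above the mean.
above_mean <- prices[prices$close > mean_close, ]
print(above_mean)
```

Only functions from base R are used here, so the sketch runs in a fresh R session without installing any packages.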

2. Session: Data Visualization


World Development Indicators at a glance

- Advanced grouping techniques
- Basic visualization tools for exploring one single variable
- Basic visualization tools for exploring relationships between variables

3. Session: Data correlations and hypothesis testing


Lending Club performance

- Correlation and AUC
- Statistical estimation and error
- Introduction to confidence intervals
- Introduction to hypothesis testing
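
The topics above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions and not the course material: the loans are invented (they are not real Lending Club records), the column names are hypothetical, and the AUC is computed with the rank-sum identity because base R has no dedicated AUC function.

```r
# Illustrative data only: made-up loans, not real Lending Club records.
loans <- data.frame(
  interest_rate = c(6, 7, 8, 9, 11, 12, 14, 15, 17, 19),
  defaulted     = c(0, 0, 0, 1, 0, 0, 1, 1, 0, 1)
)

# Correlation between the interest rate and the default indicator.
r <- cor(loans$interest_rate, loans$defaulted)

# AUC of the interest rate as a score for default, via the rank-sum identity:
# the share of (defaulted, non-defaulted) pairs where the defaulted loan
# has the higher rate.
ranks <- rank(loans$interest_rate)
n_pos <- sum(loans$defaulted == 1)
n_neg <- sum(loans$defaulted == 0)
auc <- (sum(ranks[loans$defaulted == 1]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)

# Confidence interval: 95% interval for the mean interest rate.
ci_mean <- t.test(loans$interest_rate)$conf.int

# Hypothesis test: Welch two-sample t-test of whether the mean rate differs
# between non-defaulted and defaulted loans.
tt <- t.test(interest_rate ~ defaulted, data = loans)

print(r)
print(auc)
print(ci_mean)
print(tt$p.value)
```

With only ten invented observations the numbers carry no meaning beyond showing which function produces which quantity.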

Methodology


The course format and methodological approach are based on a combination of explanations and hands-on practice. During the sessions, participants will be provided with the material needed to follow this course. The material includes both the theoretical content of the different subjects to be discussed and the data needed to practice the concepts learned.
Participants will be provided with real data sets for practice and will work in groups to solve different challenges by applying quantitative methods.

The course is divided into three sessions, each including practice cases drawn from real business situations.
Students may use their laptops/tablets during lectures and practice sessions ONLY for course activities. Emailing, facebooking, tweeting, chatting, skyping, internet surfing, etc. should NOT be done during classes.

Assessment criteria


100% Class attendance
R test

Bibliography


- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer.

- Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., & Byers, A. H. (2011). Big data: The next frontier for innovation, competition, and productivity. McKinsey Global Institute.

- Siegel, E. (2013). Predictive analytics: The power to predict who will click, buy, lie, or die. John Wiley & Sons.

- Google's R Style Guide: https://google.github.io/styleguide/Rguide.html

- The tidyverse style guide: https://style.tidyverse.org/

- Radziwill, N. M. (2017). End-to-end solved problems with R: A catalog of 26 examples using statistical inference. Lapis Lucera.

- Radziwill, N. M. (2019). Statistics (the Easier Way) with R, 3rd Ed: an informal text on statistics and data science. Lapis Lucera.

- Diez, D. M., Çetinkaya-Rundel, M., & Barr, C. D. (2019). OpenIntro Statistics, 4th Ed.

Timetable and sections

Teacher: Ruben Coca Marin

Department: Operaciones, Innovación y Data Sciences

Timetable

From 2022/1/10 to 2022/1/12:
From Monday to Wednesday from 14:00 to 15:30.
From Monday to Wednesday from 16:00 to 17:30.