Workload distribution
Session 1: Introduction to the course
Instructor: Eloi Puertas, PhD
Content:
- What is "the cloud"
- Intro to "Data Science"
- The Data Scientist's Toolbox
Session 2: Foundations: From the Computer to the Cloud
Instructor: David Arcos
Content:
- The Computer
- System Resources: CPU, Memory, HDD, I/O, Network
- The Internet
- Cloud Services: IaaS, PaaS, SaaS, Serverless
Session 3: Foundations: Cloud architecture
Instructor: David Arcos
Content:
- Scalability
- Distributed systems
- Distributed data stores
- The CAP Theorem
Session 4: Data Science in the Cloud: ETL
Instructor: Eloi Puertas, PhD
Content:
- Extract Data: Scraping
- Transform: Pandas
- Load: MongoDB
Session 5: Data Science in the Cloud: Data Analysis
Instructor: Eloi Puertas, PhD
Content:
- Hadoop Ecosystem
- Apache Spark
Session 6: Case Studies in Cloud Computing
Instructor: David Arcos
Content:
- Cloud vs On-Premises
- Cloud migrations
- Cloud-native companies
- Private Clouds
Session 7: Data Science in the Cloud: Data Visualization
Instructor: Eloi Puertas, PhD
Content:
- The importance of visualization
- Visualization libraries and frameworks
Session 8: Ethics, social and legal aspects
Instructor: David Arcos
Content:
- Privacy laws: legal issues when using personal data
- Software licenses
- Open Data
- Open Government
- Security and risks
Session 9: Data Science in the Cloud: Deployment
Instructor: Eloi Puertas, PhD
Content:
- Containers
- Web applications
- Cloud Compute Services
Course Learning Objectives
In the Big Data era, powerful technology infrastructures are needed more than ever. Companies need access to such technology, but owning and maintaining it can be very expensive. Cloud computing is a new model for enabling ubiquitous access to shared pools of resources (computer networks, servers, storage, applications and services) over the Internet. Through cloud computing, companies can easily access virtually unlimited infrastructure capacity. In data analytics, cloud computing is a cornerstone for deploying successful solutions.
Throughout this course, we will cover the basic concepts of cloud computing, from traditional client-server and web application solutions to more general approaches such as software as a service and platform as a service. Other aspects, such as security and ethics, will also be covered. The practical part of the course focuses on solving data science projects using basic cloud computing infrastructures.
The goal of this course is to learn about cloud computing infrastructures and security, and to be able to carry out the basic workflow of a data science project.
The contents of this course are divided into cloud computing infrastructure lessons and data science problem-solving lessons. Some case studies, as well as security and ethics issues, are also discussed.
At the end of the course, students should:
* Be familiar with basic Cloud Computing concepts.
* Understand and apply common solutions for solving Data Science projects.
* Understand ethics and security aspects in cloud computing and data science.