|Course Title||BIG DATA – Reporting and Prediction|
|Platform||Tableau Public & Google Cloud Platform|
|Prerequisites||Basic understanding of databases is preferred. Otherwise, Excel spreadsheet processing experience and an understanding of computer software systems will suffice.|
|Target Audience||Data analysts; computer analysts; individuals in any role who work with small or large amounts of data and need to analyze it and produce insightful reports and dashboards.|
|Dates||September 21, 28; October 5, 19, 26; November 2, 9. No class on October 12; last class 9 a.m.-12 p.m.|
|Instructor||Diego Perea – Ph.D.|
|Schedule||Saturday 9 a.m. – 4:30 p.m. (30-minute lunch)|
|Gouvernement du Québec fee (taxes incl.)||$90.00|
|General public fee (taxes incl.)||$750.73|
Recommended textbook: Tableau and Google Cloud Platform on-line documentation
NB: This is a non-credit course. A certificate is provided to all participants who have completed 80% of the course hours.
|This course provides an introduction to data mining and big data reporting. It gives participants the concepts and software skills needed to research, load, process and analyze data to obtain the insights that big data provides. It focuses on developing the Tableau and Google BigQuery skills required to process data and prepare presentations and dashboards that highlight the data’s added value.
The course methodology is based on lectures led by the instructor, who will present the concepts using industry examples. Each lecture is followed by a lab where participants complete specific tasks designed to reinforce the concepts introduced in the lecture.
In addition, participants are expected to complete a small assignment during the course and present it to the class. This will give them the confidence to apply the data mining skills learned in the course and to present the data insights in a clear, concise and engaging way.
In addition to the course datasets, participants are welcome to bring their own data to produce reports and dashboards in preparation for their project. Previous student projects are available at: https://public.tableau.com/views/StudentAssignmentsFall2017/StudentAssignmentsStory
|Topics Covered in this Course|
Please note that the instructor reserves the right to modify this schedule.
|Week 1||Introduction, lab preparation and Topic 1|
|Week 2||Topics 1 and 2
Data processing stages in data mining
Hardware and software systems for data mining
|Week 3||Topic 3
Extracting, Transforming and Loading data
|Week 4||Topics 4 and 5
Principles of data analysis
Effective reports and dashboards displaying the data insights
|Week 6||Topics 6 and 7
Big data distributed processing systems
Connecting analytics to big data systems
|Week 7||Topic 8
Forecasting and prediction
|Week 8||Topic 8 and concluding remarks.|
SOFTWARE TO BE USED
For the course, we will primarily use Tableau Public to load, process and analyze data, and to produce reports and dashboards. For the large data portion of the course, we will use the Google BigQuery platform. Other complementary software includes MS Excel and MS Access. Data analytics software tools besides Tableau will be addressed in the course to give participants a holistic view of data mining.
LABS and DATASETS
In the labs, participants will practice the skills needed for the different stages of the data mining process: ETL (extraction, transformation and load), analysis and reporting. During the course, we will use the following datasets:
- Uber trip data: Trip information including Uber service type, source, destination, distance, duration and paid fare.
- On-line store purchasing behavior data: Characterization of on-line purchasing behavior.
- Mobile video trending data: Characterization and trending analysis of video consumption from mobile devices.
- Google Cloud NOAA data: Worldwide meteorological information including temperature, wind and rain, covering more than 60 years.
- Google Cloud Shakespeare data: Word counts for all of Shakespeare's works.
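
To make the lab stages named above (ETL, analysis and reporting) concrete, here is a minimal sketch in Python. It is an illustration only: the labs themselves use Tableau Public and Google BigQuery, not Python. The rows and the column names `service`, `distance_km` and `fare` are hypothetical and are not taken from the course's Uber dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows for illustration; NOT the course's Uber dataset.
RAW_CSV = """service,distance_km,fare
UberX,5.0,12.50
UberX,3.0,7.50
UberPool,4.0,6.00
UberX,,9.00
"""


def extract(text):
    """Extraction: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: convert types and drop rows with missing values."""
    clean = []
    for row in rows:
        if not row["distance_km"] or not row["fare"]:
            continue  # skip incomplete records
        clean.append({
            "service": row["service"],
            "distance_km": float(row["distance_km"]),
            "fare": float(row["fare"]),
        })
    return clean


def load(rows, table):
    """Load: append the cleaned rows to an in-memory table (a plain list)."""
    table.extend(rows)
    return table


def analyze(table):
    """Analysis: compute the average fare per service type."""
    totals = defaultdict(lambda: [0.0, 0])  # service -> [fare sum, row count]
    for row in table:
        totals[row["service"]][0] += row["fare"]
        totals[row["service"]][1] += 1
    return {service: total / count for service, (total, count) in totals.items()}


def report(summary):
    """Reporting: format the summary as readable text lines."""
    return [f"{service}: average fare {avg:.2f}"
            for service, avg in sorted(summary.items())]


table = load(transform(extract(RAW_CSV)), [])
summary = analyze(table)
report_lines = report(summary)
for line in report_lines:
    print(line)
```

Each function corresponds to one stage, so a stage can be swapped out (for example, loading into a database instead of a list) without touching the others.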