|Course Title||R Programming Language – Level 1|
|Platform||R and RStudio|
|Gouvernement du Québec fee (taxes incl.)||$42.00|
|General public fee (taxes incl.)||$344.79|
|Schedule||Saturday 9:00 – 12:00 & 13:00 – 15:15|
|Dates||October 13, 20, 27; November 3|
|Prerequisites||A college-level understanding of math and linear algebra. Basic concepts of algorithm design.|
|Target audience||Software developers, data analysts, professionals in any area|
|Instructor||Diego Perea, PhD|
|Location||Brittain Hall – BH-309|
|NB: This is a non-credit course. A certificate is provided to all participants who complete 80% of the course hours.|
The latest IEEE Spectrum magazine poll ranked R as the 6th most popular programming language among readers. Traditional languages such as C, Java and Python filled the other places in the top 10, making R the only non-traditional programming language on the list. R is a programming language for statistical computing, designed by statisticians. In recent years, it has become the de facto industry standard for processing and analyzing data and for running machine learning algorithms to obtain predictions.
In this course, we introduce participants to the R language syntax, data structures and modular functionality. There are six design labs covering data loading, manipulation and graphical reporting. By the end of the course, participants will be comfortable using R in data analysis and other types of projects.
|Topics to be covered include:|
Please note that the instructor reserves the right to modify this schedule.
|Week 1||Topics 1 and 2|
|Week 2||Topics 3 and 4|
|Week 3||Topics 5 and 6|
SOFTWARE TO BE USED
In this course, we will mainly use R, which is the industry standard for statistical learning and provides functions for most of the methods covered.
LABS AND DATASETS
In the labs, participants will apply the concepts and methods seen in class to practical datasets. Among others, we will use the following datasets:
- Uber trip data: trip information including Uber service type, source, destination, distance, duration and paid fare. Regression methods for a continuous response are applied to estimate the trip fare based on trip distance, time and service demand.
- Advertisement data: dataset containing the advertising budget spent by a company in different markets. Regression methods for a continuous response help design the best advertising plan to maximize profit.
- Automobile features data: dataset containing car characteristics to develop a model that predicts whether a car gets high or low gas mileage.
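To give a flavour of the two kinds of models mentioned in the list above, here is a minimal sketch in base R. It is only an illustration under stated assumptions: it uses R's built-in `mtcars` dataset as a stand-in, not any of the course datasets, and the choice of predictors (`wt`, `hp`), the median split into high and low mileage, and the names `cars`, `high_mpg` and `new_car` are hypothetical choices made for this example.

```r
# Stand-in data: R's built-in mtcars dataset (NOT one of the course datasets).
cars <- mtcars

# Continuous regression: estimate fuel economy (mpg) from weight and horsepower.
fit_lm <- lm(mpg ~ wt + hp, data = cars)
print(coef(fit_lm))

# Classification: label a car "high mileage" when its mpg is above the median
# (an assumed threshold), then fit a logistic regression on weight alone.
cars$high_mpg <- as.integer(cars$mpg > median(cars$mpg))
fit_glm <- glm(high_mpg ~ wt, data = cars, family = binomial)

# Predicted probability of high mileage for a hypothetical car weighing
# 2,500 lb (wt is recorded in units of 1,000 lb).
new_car <- data.frame(wt = 2.5)
p_high <- predict(fit_glm, newdata = new_car, type = "response")
print(p_high)
```

The same pattern, a formula passed to `lm()` or `glm()` followed by `predict()`, would apply to a fare or advertising dataset once it is loaded into a data frame with suitably named columns.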
In addition to the previous datasets, participants are encouraged to bring in their own data. The following are some websites with good data sources.