R Programming Language – Level 1


Course Title R Programming Language – Level 1
Course number 900-086-EQ
Platform R and RStudio
Duration 21 hours
Gouvernement du Québec fee (taxes incl.) $42.00
General public fee (taxes incl.) $344.79
Schedule Saturday 9:00 – 12:00 & 13:00 – 15:15
Dates October 13, 20, 27; November 3
Prerequisites A college-level understanding of math and linear algebra. Basic concepts of algorithm design.
Target audience Software developers, data analysts, professionals in any area
Instructor Diego Perea, PhD
Location Brittain Hall – BH-210
NB: This is a non-credit course. A certificate is provided to all participants who complete 80% of the course hours.
   
Course description
The most recent IEEE Spectrum poll ranked R as the 6th most popular programming language among readers. Traditional languages such as C, Java and Python filled the other places in the top ten, making R the only non-traditional programming language on the list. R is a programming language for statistics and data analysis, and it was designed by statisticians. In recent years, it has become the de facto industry standard for processing data, analyzing it and running machine learning algorithms to obtain predictions.

In this course, we introduce participants to the R language syntax, data structures and modular functionality. Six design labs cover data loading, manipulation and graphical reporting. By the end of the course, participants will be comfortable using R in data analysis and other types of projects.

   
Topics to be covered include:
  1. Introduction to R and RStudio
  2. R syntax and data structures
  3. Loading data in R
  4. R modular functionality – using and creating functions
  5. Introduction to data analysis
  6. Basic statistics using R
   
Weekly Topics
Please note that the instructor reserves the right to modify this schedule.
Week 1 Topics 1 and 2

  • Introduction, course description and R overview
  • R syntax and data structures
Week 2 Topics 3 and 4

  • Loading data in R and modular functionality
Week 3 Topics 5 and 6

  • Introduction to data analysis and basic statistics using R: histograms, box plots and scatter plots
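
As a rough illustration of the Week 3 material, the sketch below computes two summary statistics and draws a histogram, a box plot and a scatter plot with base R graphics. It uses the built-in mtcars dataset as a stand-in; that choice is an assumption made for this example, not one of the course datasets. The plots are written to a temporary PDF file so the code also runs without a display; in RStudio the pdf() and dev.off() lines can be dropped to see the plots in the Plots pane.

```r
# Stand-in data: mtcars ships with base R (it is not a course dataset).
data(mtcars)

# Basic statistics for fuel economy (miles per gallon).
mpg_mean <- mean(mtcars$mpg)
mpg_sd <- sd(mtcars$mpg)

# Open a PDF device on a temporary file, three plots side by side.
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file, width = 9, height = 3)
par(mfrow = c(1, 3))

# Histogram: distribution of a single numeric variable.
hist(mtcars$mpg, main = "Histogram of mpg", xlab = "Miles per gallon")

# Box plot: mpg grouped by number of cylinders.
boxplot(mpg ~ cyl, data = mtcars, main = "mpg by cylinders",
        xlab = "Cylinders", ylab = "Miles per gallon")

# Scatter plot: relationship between weight and mpg.
plot(mtcars$wt, mtcars$mpg, main = "mpg vs. weight",
     xlab = "Weight (1000 lb)", ylab = "Miles per gallon")

# Close the device so the file is written to disk.
dev.off()
```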

SOFTWARE TO BE USED
For the course, we will mainly use R, which is the industry standard for statistical learning and provides built-in functions for most of the methods seen in class.

LABS and DATASETS
In the labs, participants will apply the concepts and methods seen in class to practical datasets. Among others, we will use the following datasets:

  1. Uber trip data: trip information including Uber service type, source, destination, distance, duration and paid fare. Continuous regression methods are applied to estimate the trip fare based on trip distance, time and service demand.
    https://public.tableau.com/views/Lab4-DatacharacterizationA-categoricalfields/Dashboard2
  2. Advertisement data: dataset containing the budget spent on advertising by a company in different markets. Continuous variable regression methods help design the best advertising plan to maximize profit.
    https://public.tableau.com/shared/W5KC9CF5S
  3. Automobile features data: dataset containing car characteristics to develop a model that predicts whether a car gets high or low gas mileage.
    https://public.tableau.com/views/auto-mpgdatasetboxplots/Dashboard2
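
To give a sense of the regression methods mentioned in the dataset descriptions, here is a minimal sketch that fits a linear model of fare against distance with lm(). The data are simulated for this example only: the column names distance_km and fare, and the numbers used to generate them, are hypothetical and are not taken from the Uber dataset.

```r
# Simulated stand-in data (hypothetical; not the course's Uber trip data).
set.seed(42)
n <- 200
trips <- data.frame(distance_km = runif(n, min = 1, max = 25))

# Assumed relationship for the simulation: base fare plus a per-km rate plus noise.
trips$fare <- 3 + 1.5 * trips$distance_km + rnorm(n, sd = 2)

# Fit a simple linear regression of fare on distance.
fit <- lm(fare ~ distance_km, data = trips)

# Inspect the estimated intercept and slope.
print(coef(fit))

# Estimate the fare for a new 10 km trip.
predicted <- predict(fit, newdata = data.frame(distance_km = 10))
print(predicted)
```

With a real dataset, the data frame would come from a file (for example via read.csv()), and further predictors such as trip time could be added to the model formula.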

In addition to the previous datasets, participants are encouraged to bring in their own data. The following websites are good data sources.

www.kaggle.com
www.open.canada.ca/data/en/dataset
