Programming for Business Analytics

This course introduces the basics of programming using R for business applications.

Course Description

This course leverages R to equip students with the fundamental skills required for robust business data analysis. It addresses both practical programming skills and the conceptual understanding necessary to apply data science effectively in business contexts.

The goal is to inspire a passion for data analysis and foster a community among students to deepen their learning and enhance their collaborative skills.

Git and GitHub

Statistical computing lives in the world of plain-text tools (like R, Quarto/RMarkdown, and Git). While tools like Excel and Word are familiar, they often trap us in a cycle of filenames like report_final_v2.docx and forgotten steps (“how did I make this figure?”). For a conceptual overview, see Kieran Healy’s The Plain Person’s Guide to Plain Text Social Science.

We write code that is universally portable and reproducible, allowing us to recreate entire analyses with a single command. To manage this work, we use Git—a version control system that tracks the history of our entire project folder, rather than just “tracking changes” inside a single file. This workflow helps us keep the mess in check and collaborate effectively.

Mastering these tools also allows you to build a Professional Identity on GitHub. Your profile serves as a living portfolio, showcasing your actual code and contributions to prospective employers.

We will use the following books in this class.

  • [MD] Ismay, Chester and Albert Y. Kim. 2022. Statistical Inference via Data Science: A ModernDive into R and the Tidyverse.
  • [QSS] Imai, Kosuke and Nora Webb Williams. 2022. Quantitative Social Science: An Introduction with Tidyverse. Princeton University Press.
  • [IMS] Cetinkaya-Rundel, Mine and Johanna Hardin. 2021. Introduction to Modern Statistics. OpenIntro.
  • [AAG] Lander, Jared P. 2017. R for Everyone: Advanced Analytics and Graphics, 2nd Edition. O’Reilly Media.
  • [VT] Yau, Nathan. 2011. Visualize This: The FlowingData Guide to Design, Visualization, and Statistics. John Wiley & Sons.

Final Project

A specific focus of this course is the production of a polished, portfolio-ready project. Students develop a clear research question, locate and prepare data, and apply analytical techniques to answer it. The final deliverable is a publicly accessible article that demonstrates data fluency to prospective employers and peers.

Deliverable                     Timeline
Creating a GitHub Repository    Week 4
Data and Proposal               Week 6
First Visualization             Week 9
First Analysis                  Week 11
Final Report                    Week 15

Learning Objectives

  • Data Visualization and Wrangling: Students will learn to summarize and visualize data, transforming messy data into tidy, analyzable formats.
  • Causality and Regression Analysis: The course emphasizes evaluating claims about causality and using linear regression to conduct data analysis.
  • Statistical Uncertainty: Understanding and quantifying uncertainty in data analysis are core components of the curriculum.
  • Professional Tools: Students will develop proficiency with professional tools such as R, RStudio, Git, and GitHub.

Course Info

ISS4066 / ISS5066
Fall 2025
Mon, 14:20-17:20
TSMC Bldg. R421

Course Highlights

Data Wrangling & Viz

Transforming messy data into tidy formats and communicating insights.

Causality & Regression

Evaluating causal claims and conducting rigorous regression analysis.

Statistical Uncertainty

Simulation-based (Bootstrapping) vs. Conventional (CLT) approaches to uncertainty.

Professional Workflow

Mastery of R, RStudio, Git, and GitHub for collaborative analytics.
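
The contrast between simulation-based and conventional approaches to uncertainty named above can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not course material: the sales vector is simulated, hypothetical data, and the variable names (sales, ci_clt, ci_boot) are made up for this example. It computes a 95% confidence interval for a mean twice, once with the normal approximation motivated by the CLT and once with a percentile bootstrap.

```r
# Hypothetical data: 200 simulated weekly sales figures (not from the course).
set.seed(42)
sales <- rexp(200, rate = 1 / 50)  # right-skewed values with mean near 50

n     <- length(sales)
x_bar <- mean(sales)

# Conventional (CLT) approach: sample mean +/- z * standard error.
se_clt <- sd(sales) / sqrt(n)
ci_clt <- x_bar + c(-1, 1) * qnorm(0.975) * se_clt

# Simulation-based approach (percentile bootstrap): resample the data with
# replacement, recompute the mean each time, and keep the middle 95% of
# the resampled means.
boot_means <- replicate(2000, mean(sample(sales, size = n, replace = TRUE)))
ci_boot    <- unname(quantile(boot_means, probs = c(0.025, 0.975)))

print(round(ci_clt, 1))
print(round(ci_boot, 1))
```

With a sample of this size the two intervals should usually land close to each other. The bootstrap interval does not have to be symmetric around the sample mean, which can matter for skewed data or small samples.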


Lecture Materials

  • Lecture 1: Course Intro
  • Lecture 2: R Programming Basics
  • Lecture 3: Data Types & Visualization I
  • Lecture 4: User Defined Functions & Visualization II
  • Lecture 5: Data Wrangling
  • Lecture 6: Causality
  • Lecture 7: Bivariate Relationships & Tidying Data
  • Lecture 8: Prediction and Iteration
  • Lecture 9: Regression and Model Fit
  • Lecture 10: More on Regression
  • Lecture 11: Sampling and Sampling Distributions
  • Lecture 12: The Bootstrap and CIs
  • Lecture 13: Hypothesis Testing
  • Lecture 14: Models of Uncertainty
  • Lecture 15: Inference for Regression

Video Lectures

Lecture 4: User Defined Functions & Visualization II
Lecture 5: Data Wrangling
Lecture 6: Causality