College: Harvard University
Certificate Price: $49
Length: 8 Weeks
Instructors: Rafael Irizarry

Data Science: Wrangling

In this course, part of our Professional Certificate Program in Data Science, we cover several standard steps of the data wrangling process, such as importing data into R, tidying data, string processing, HTML parsing, working with dates and times, and text mining. Rarely are all these wrangling steps necessary in a single analysis, but a data scientist will likely face them all at some point.

Very rarely is data easily accessible in a data science project. It’s more likely for the data to be in a file, a database, or extracted from documents such as web pages, tweets, or PDFs. In these cases, the first step is to import the data into R and tidy it using the tidyverse package. The steps that convert data from its raw form to the tidy form are called data wrangling.

This process is a critical step for any data scientist. Knowing how to wrangle and clean data will enable you to uncover insights that would otherwise be hidden.
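To give a rough sense of what converting raw data to tidy form can look like, here is a minimal sketch. The course itself works in R with the tidyverse; this illustration uses only the Python standard library instead, and the sample data, the `RAW` constant, and the `tidy` function are hypothetical names made up for the example.

```python
import csv
import io

# Hypothetical raw data in "wide" form: one column per year.
RAW = """country,2019,2020
A,10,12
B,20,21
"""


def tidy(raw_text):
    """Reshape wide rows so that each output row holds one observation."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_text)):
        # Pull out the identifier column; the remaining keys are years.
        country = record.pop("country")
        for year, value in record.items():
            rows.append(
                {"country": country, "year": int(year), "value": float(value)}
            )
    return rows


tidy_rows = tidy(RAW)
for row in tidy_rows:
    print(row)
```

Each row of the result describes a single country-year pair, which is the kind of layout that makes later filtering, grouping, and plotting simpler.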

Similar Courses

Computing for Data Analysis

The modern data analysis pipeline involves collection, preprocessing, storage, analysis, and interactive visualization of data. The goal of this course, part of the Analytics: Essential Tools and Methods MicroMasters program,…

Computer Hardware and Operating Systems

This is a self-paced course that provides an introduction to computer hardware and operating systems. This course will cover topics including: fundamentals of system hardware, introduction to OS concepts, OS…

C Programming: Advanced Data Types

In this course, part of the C Programming with Linux Professional Certificate program, you will define your own data types in C, and use the newly created types to more…

Saving Schools

This course seeks to answer the question: how did a school system, once the envy of the world, stumble so that the performance in math, science, and reading of U.S.…

The Quantum World

Welcome to The Quantum World! This course is an introduction to quantum chemistry: the application of quantum theory to atoms, molecules, and materials. You’ll learn about wavefunctions, probability, special notations,…

China and Communism

How did the Communists conquer China? What role does culture play? What are the successes and failures of the Chinese Communist Party after seizing power in 1949? What constitutes liberation?…

Big Data and Education

Online and software-based learning tools have been used increasingly in education. This movement has resulted in an explosion of data, which can now be used to improve educational effectiveness and…

Lending, Crowdfunding, and Modern Investing

In this course, you’ll learn the foundational theories behind robo-advising, crowdfunding, and marketplace lending, and how to apply these theories to optimize your investments. Professor David Musto of the Wharton…