4.5 Merging Data

View the pitching data.table (created in Quiz 4.3) like a spreadsheet

View(pitching)


Read in the Salaries CSV dataset directly from the following link and save it to a variable called "salaries": http://dgrtwo.github.io/pages/lahman/Salaries.csv

salaries = read.csv("http://dgrtwo.github.io/pages/lahman/Salaries.csv")


Turn the salaries data.frame into a data.table (called "salaries")

salaries = as.data.table(salaries)
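
As an aside, the read-then-convert steps can be collapsed into one call with data.table's fread, which returns a data.table directly and also accepts a URL. The sketch below is only an illustration: it writes a tiny made-up CSV to a temporary file (the values and the names toy_df, toy_dt and toy_dt2 are hypothetical, not the real Salaries data) so that it runs without a network connection.

```r
library(data.table)

# Write a small, made-up CSV to a temporary file (hypothetical values)
tmp = tempfile(fileext = ".csv")
writeLines(c("yearID,teamID,lgID,playerID,salary",
             "2001,NYA,AL,a01,500000",
             "2002,BOS,AL,c03,250000"), tmp)

# Two-step route used above: read.csv gives a data.frame, then convert it
toy_df = read.csv(tmp)
toy_dt = as.data.table(toy_df)

# One-step route: fread returns a data.table directly
toy_dt2 = fread(tmp)

print(class(toy_dt))
print(class(toy_dt2))
```

Column types may differ slightly between the two routes (for example, how strings are handled depends on the R version), so the two results are not guaranteed to be identical in every detail.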


View the salaries data.table like a spreadsheet

View(salaries)


What column or columns are shared between the pitching and salaries data.tables? (Separate column names with ",")

playerID,yearID,teamID,lgID
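
One way to find the shared columns without comparing the two spreadsheets by eye is to intersect the two vectors of column names. The sketch below uses two small made-up tables (toy_pitching and toy_salaries are hypothetical stand-ins, not the real data) so that it runs on its own.

```r
library(data.table)

# Hypothetical miniature stand-ins for the pitching and salaries tables
toy_pitching = data.table(playerID = c("a01", "b02"),
                          yearID = c(2001, 2001),
                          teamID = c("NYA", "BOS"),
                          lgID = c("AL", "AL"),
                          W = c(10, 7))
toy_salaries = data.table(yearID = 2001,
                          teamID = "NYA",
                          lgID = "AL",
                          playerID = "a01",
                          salary = 500000)

# Column names present in both tables, in the order they appear in the first
shared = intersect(names(toy_pitching), names(toy_salaries))
print(shared)
```

With the quiz data, the same idea would be intersect(names(pitching), names(salaries)).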


Merge the pitching and salaries data.tables, saving the result into a data.table called "merged", using all the columns that are shared between the tables

merged = merge(pitching, salaries, by=c("lgID", "playerID", "teamID", "yearID"))
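
To see what this default merge does to rows, here is a minimal sketch on made-up data (toy_pitching, toy_salaries and toy_merged are hypothetical names, and only two key columns are used to keep it short). A row is kept only when its key values appear in both tables.

```r
library(data.table)

# Hypothetical toy tables: player "b02" has no salary record
toy_pitching = data.table(playerID = c("a01", "b02", "c03"),
                          yearID = c(2001, 2001, 2002),
                          W = c(10, 7, 3))
toy_salaries = data.table(playerID = c("a01", "c03"),
                          yearID = c(2001, 2002),
                          salary = c(500000, 250000))

# Default merge: only rows whose key values occur in both tables survive
toy_merged = merge(toy_pitching, toy_salaries, by = c("playerID", "yearID"))
print(toy_merged)
print(nrow(toy_merged))  # 2: the "b02" row is dropped
```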


Merge the pitching and salaries data.tables, but this time keep all rows of the pitching data.table, including those that don't have salary information. Save this into a variable called "merged.all"

merged.all = merge(pitching, salaries, by=c("lgID", "playerID", "teamID", "yearID"), all.x=TRUE)
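
The effect of all.x=TRUE can be checked on a small made-up example (toy_pitching, toy_salaries and toy_merged_all are hypothetical names, not the quiz data). Rows from the first table that have no match are kept, and their columns from the second table are filled with NA.

```r
library(data.table)

# Hypothetical toy tables: player "b02" has no salary record
toy_pitching = data.table(playerID = c("a01", "b02", "c03"),
                          yearID = c(2001, 2001, 2002),
                          W = c(10, 7, 3))
toy_salaries = data.table(playerID = c("a01", "c03"),
                          yearID = c(2001, 2002),
                          salary = c(500000, 250000))

# all.x = TRUE keeps every row of the first table, matched or not
toy_merged_all = merge(toy_pitching, toy_salaries,
                       by = c("playerID", "yearID"), all.x = TRUE)
print(toy_merged_all)

# Count the rows that had no salary information
print(sum(is.na(toy_merged_all$salary)))  # 1: the "b02" row
```

On the real data, comparing nrow(merged.all) with nrow(pitching) is a quick sanity check: the merged table should have at least as many rows as pitching, and possibly more if a pitching row matches several salary rows.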