Welcome to the course!
In this course, we will use RStudio to interact with R and write our annotated R code using R Markdown.
RStudio Server is an open-source integrated development environment (IDE) that you can run in the cloud. The main advantage of RStudio over other graphical interfaces, such as R GUI (the default), is that it integrates a powerful built-in text editor as well as other tools for plotting, debugging, and workspace management. The Server version of RStudio, which we are now running on an Amazon Web Services instance, allows all of us to connect to the same server and run the code remotely.
Markdown is a simple formatting syntax for generating HTML or PDF documents. In combination with R, it generates a document that includes the comments, the R code, and the output of running that code.
You can embed R code in chunks like this one:
1 + 1
## [1] 2
You can run each chunk of code one by one by highlighting the code and clicking Run (or pressing Ctrl + Enter on Windows or Command + Enter on OS X). The output of the code appears in the console right below, inside the RStudio window.
Alternatively, you can generate (or knit) an HTML document with all the code, comments, and output in the entire .Rmd file by clicking on Knit HTML.
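Knitting can also be triggered from the console rather than the button. The sketch below writes a tiny .Rmd file and renders it with `rmarkdown::render()`. The file name `example.Rmd` and its contents are made up for illustration, and the rendering step only runs if the rmarkdown package and pandoc are available on your machine.

```r
# Hypothetical file name; a temporary directory keeps the example self-contained.
rmd_file <- file.path(tempdir(), "example.Rmd")

# Three backticks open and close a code chunk in the .Rmd source.
fence <- strrep("`", 3)

# Write a minimal R Markdown document: a header, some text, and one R chunk.
writeLines(c(
  "---",
  "title: \"Example\"",
  "output: html_document",
  "---",
  "",
  "Some text explaining the code.",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Render to HTML only if rmarkdown and pandoc are available.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_file <- rmarkdown::render(rmd_file,
                                 output_format = "html_document",
                                 quiet = TRUE)
  print(html_file)  # path to the generated HTML file
}
```

Inside RStudio the Knit HTML button is usually more convenient; a call like this is mostly useful when you want to regenerate a document from a script.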
You can also embed plots and graphics, for example:
x <- c(1, 3, 4, 5)
y <- c(2, 6, 8, 10)
plot(x, y)
If you run the chunk of code, the plot will be generated in the panel in the bottom-right corner. If you knit the entire file instead, the plot will appear in the resulting HTML document.
Using R + Markdown has several advantages: it leaves an “audit trail” of your work, including documentation explaining the steps you took. This not only helps keep your own progress organized, but also makes your work reproducible and more transparent. You can easily correct errors (just fix them and run the script again), and after you have finished, you can generate a PDF or HTML version of your work.
We will be exploring R through R Markdown over the next few modules. For more details and documentation see http://rmarkdown.rstudio.com.
And one more thing! If you want to have the same setup as I will for the guided coding sessions, you will have to make the following two changes:
1 - Go to Tools -> Global Options... -> R Markdown and unselect Show output inline for all R Markdown documents
2 - Within the same menu, go to Pane Layout and select Console on the top-right panel.
If you are running this script on your own computer, you will have to run the chunk below to install all the R packages we will use throughout the summer school. (Note that this may take a while!)
doInstall <- TRUE # Change to FALSE if you don't want packages installed.
toInstall <- c("streamR", "ggplot2", "stringr", "readr", "readtext",
               "devtools", "quanteda", "dplyr", "igraph", "topicmodels",
               "stringi", "Matrix", "doMC", "glmnet", "xgboost",
               "lsa")
if (doInstall) {
  # Packages from CRAN (this includes devtools, which the GitHub installs need)
  install.packages(toInstall)
  # Packages installed from GitHub
  devtools::install_github("pablobarbera/twitter_ideology/pkg/tweetscores")
  devtools::install_github("pablobarbera/streamR/streamR")
  devtools::install_github("mukul13/rword2vec")
}
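If some of these packages are already on your computer, you may prefer to install only the missing ones. The following is a minimal sketch of that idea, not part of the course setup: the `needed` vector is a shortened, illustrative subset of the list above, and the actual install call is left commented out so that running the chunk changes nothing.

```r
# Illustrative subset of the course packages (an assumption for this example).
needed <- c("ggplot2", "dplyr", "stringr")

# Names of all packages currently installed in your R libraries.
installed <- rownames(installed.packages())

# Packages from `needed` that are not yet installed.
to_install <- setdiff(needed, installed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to actually install them
} else {
  message("All listed packages are already installed.")
}
```

Note that this check only looks at package names, so it will not update packages that are installed but out of date.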