Unit 2: Various and Sundry about R and setting it up

Remember that there are always comments under the video below that review the content of the video.

Review of and comments on the video:

There are many references for R, perhaps because the official documentation is not very user-friendly, so lots of people have written books, webpages, and so on. It can be daunting to wade through all of that information, and often a reference does not give enough detail for you to run the program yourself. Hence this site: I hope to show you in the videos that it does work, and what you need to do it yourself.

Here are the commands we looked at in the video and some others:

From the command line in R:
To find out where R goes to get things and store them (your workspace or working directory): getwd()
To set your working directory (choose your own path): setwd("C:/Users/Tudor/R/Data and Programmes")
To load your data file (for example: bdr.txt that includes column headings): mydata <- read.table("bdr.txt", header=TRUE)
To edit a table: mydata <- edit(mydata) or fix(mydata) (edit() returns the edited copy, so assign the result; fix() changes mydata in place)
To save variables x, y to the working directory: save(x, y, file="mydata.Rdata")
To load data previously saved to the working directory: load("mydata.Rdata") (restores x and y as saved above)
To make a data frame out of the loaded table: mydata <- as.data.frame(mydata) (read.table() already returns a data frame, so this is only needed for other objects, such as a matrix)
If you have a variable, myX, in your table, to pick out the myX column and make a variable out of it: myX <- mydata$myX
Some basic statistics with myX:
Standard deviation: sd(myX)

One-sample t-test, to test if the mean of myX is 100: t.test(myX, mu=100)
Box plot of myX: boxplot(myX)
Histogram of myX: hist(myX)
QQ plot against normal (comparing quantiles to see how "normal" the data are; the straighter the lineup of points, the more normal the data are): qqnorm(myX)
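To store your images (png, jpeg, pdf, ...), FIRST open the image file:
png(file="C:/your path here/yourimagename.png")
THEN run your graphics.
Remember to finish with dev.off(), which closes the file so that the image is actually written.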
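
To see how these commands fit together, here is a minimal sketch you can paste into R. It is not the session from the video: the data are simulated, the folder is a temporary one created just for the example, and the names work_dir, old_dir, result and myhist.pdf are made up for illustration. It uses pdf() rather than png() only because pdf() is available on every R installation; png() is used in exactly the same way.

```r
# Sketch only: simulated data in a temporary folder, so none of your real files are touched.
set.seed(1)                                  # make the simulated numbers repeatable
work_dir <- tempfile("r_unit2_")             # hypothetical stand-in for your own path
dir.create(work_dir)
old_dir <- setwd(work_dir)                   # setwd() hands back the previous directory
getwd()                                      # confirm where R is now reading and writing

# Create a small stand-in for bdr.txt, with a row of column headings
write.table(data.frame(myX = rnorm(30, mean = 100, sd = 15)),
            "bdr.txt", row.names = FALSE, quote = FALSE)

mydata <- read.table("bdr.txt", header = TRUE)   # the result is already a data frame
myX <- mydata$myX                                # pick out one column

sd(myX)                                      # standard deviation
result <- t.test(myX, mu = 100)              # one-sample t-test against a mean of 100
result$p.value                               # just the p-value from the test

save(mydata, myX, file = "mydata.Rdata")     # store both objects in the working directory
rm(mydata, myX)                              # remove them from the workspace...
load("mydata.Rdata")                         # ...and restore them from the file

pdf(file = "myhist.pdf")                     # FIRST open the image file
hist(myX)                                    # THEN run your graphics
dev.off()                                    # close the device so the file is written

setwd(old_dir)                               # go back to where we started
```

Replace work_dir with your own path and bdr.txt with your own file to try it on real data.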

Just a quick note about variance and standard deviation. The standard deviation is the square root of the variance, so when you know one, you know them both. Either can be thought of as a measure of dispersion, that is to say, how tightly or loosely the data are distributed about the mean. So, for the bell curve, they would measure the "fatness" of the curve: small standard deviations correspond to narrow, peaked curves and larger standard deviations correspond to wider, flatter curves. Just for the record, the variance of a sample is essentially the average of the squares of the differences of the observations from their mean (R's var() divides the sum of those squares by n - 1 rather than by n). More later...
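
If you would like to check the relationship between variance and standard deviation for yourself, here is a small sketch. The numbers in x are made up for illustration, and the names deviations, var_n and var_nm1 are hypothetical helpers, not anything from the video.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)           # made-up sample; its mean is 5
n <- length(x)

deviations <- x - mean(x)                # differences of the observations from their mean
var_n   <- sum(deviations^2) / n         # plain average of the squared differences
var_nm1 <- sum(deviations^2) / (n - 1)   # same sum divided by n - 1, as var() does

var_n                                    # 4
var_nm1                                  # matches var(x)
var(x)
sd(x)                                    # standard deviation...
sqrt(var(x))                             # ...is the square root of the variance
```

With larger samples the difference between dividing by n and by n - 1 becomes small.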

Of course, enhancements of all these are possible. You'll have to plunge into the documentation on the CRAN site, or elsewhere.

In case you have some experience and are looking for some quick references, here are some:
CRAN reference card: https://cran.r-project.org/doc/contrib/Short-refcard.pdf
Jonathan Baron's help page: https://finzi.psych.upenn.edu/
Baron's reference card: https://www.psych.upenn.edu/~baron/refcard.pdf

Contact me at: dtudor@germinalknowledge.com

© Germinal Knowledge. All rights reserved