
When dealing with grouped data, you will often want to have various summary statistics computed within groups; for example, a table of means and standard deviations. This can be done using the tapply() command. For example, we might want to compute the mean temperature in each month of the airquality data. If there are any missing values, these can be excluded by simply adding the extra argument na.rm=T to tapply. As an exercise, compute the range and mean of Ozone levels for each month, using the tapply command.

With grouped data, it is important to be able not only to create plots for each group but also to compare the plots between groups. We have already looked at examples with histograms and boxplots. As a further example, let us consider another data set in R, esoph, relating to a case-control study of esophageal cancer in France, containing records for 88 age/alcohol/tobacco combinations. (Look up the R help on this data set to find out more about the variables.) The first 5 rows of the data can be displayed with head(esoph, 5). We can draw a boxplot of the number of cancer cases according to each level of alcohol consumption (alcgp) by passing a model formula to boxplot(). Notation of the type y ~ x can be read as "y described using x". We will encounter many more examples of model formulas later on, such as when we use R for regression analysis.
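As a minimal sketch of the grouped summaries and the grouped boxplot described above, the following uses the built-in airquality and esoph data sets. The object names (mean_temp, ozone_mean, ozone_range) and the axis labels are illustrative assumptions, and the boxplot call with data = esoph is one common way of writing the formula, not necessarily the exact command the original text used.

```r
# Mean temperature within each month (Temp has no missing values)
mean_temp <- tapply(airquality$Temp, airquality$Month, mean)
print(round(mean_temp, 1))

# Ozone contains missing values, so na.rm = TRUE is passed through
# tapply() to mean() and range()
ozone_mean  <- tapply(airquality$Ozone, airquality$Month, mean, na.rm = TRUE)
ozone_range <- tapply(airquality$Ozone, airquality$Month, range, na.rm = TRUE)
print(ozone_mean)
print(ozone_range)  # one (min, max) pair per month

# First 5 rows of the esoph data
print(head(esoph, 5))

# Boxplot of the number of cases for each alcohol consumption group:
# "ncases described using alcgp"
boxplot(ncases ~ alcgp, data = esoph,
        xlab = "Alcohol consumption (alcgp)",
        ylab = "Number of cases (ncases)")
```

Because range() returns two values per group, tapply() gives back a list-like result with one (minimum, maximum) pair per month rather than a plain numeric vector.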
Scatter plot in RStudio
One very commonly used tool in exploratory analysis of multivariate data is the scatterplot. We will look at this in more detail later when we discuss regression and correlation. The R command for drawing a scatterplot of two variables is a simple command of the form "plot(x, y)".

> plot(airquality$Temp, airquality$Ozone) # How do Ozone and temperature measurements relate?

With more than two variables, the pairs() command draws a scatterplot matrix. Try it on the airquality data and describe what you see in terms of relationships between the variables.

The default plotting symbols in R are not always pretty! You can change the plotting symbols or colors to something nicer:

> plot(airquality$Temp, airquality$Ozone, col="red", pch=19)

This repeats the scatterplot, this time with red filled circles that are nicer to look at. "col" refers to the color of the symbols plotted, and "pch" selects the plotting symbol. The built-in symbols are numbered 0 to 25; see the help on "points" to see what each of these represents.
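The scatterplot commands discussed here can be collected into one short script. This is a sketch only: the axis labels, the choice of the four columns passed to pairs(), and the helper name n_plotted are assumptions added for illustration.

```r
# Basic scatterplot: how do Ozone and temperature measurements relate?
plot(airquality$Temp, airquality$Ozone,
     xlab = "Temperature (degrees F)", ylab = "Ozone (ppb)")

# The same plot with red filled circles (pch = 19)
plot(airquality$Temp, airquality$Ozone, col = "red", pch = 19,
     xlab = "Temperature (degrees F)", ylab = "Ozone (ppb)")

# plot() silently drops pairs with a missing value, so it can help
# to check how many points are actually drawn
n_plotted <- sum(complete.cases(airquality[, c("Temp", "Ozone")]))
print(n_plotted)

# Scatterplot matrix for more than two variables
# (the selection of these four columns is an assumption)
vars <- c("Ozone", "Solar.R", "Wind", "Temp")
pairs(airquality[, vars])
```

Running example(points) in an R session displays the available plotting symbols, which is a quick way to pick a value for pch.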
