GSU Library Research Guides: R: Basic Graphing (2024)

Histograms are best suited to plotting continuous-level variables because, as the name suggests, their values lie on a continuum. They are very helpful for investigating the distribution of a continuous variable, which is important for determining whether the variable needs to be recoded.

Code

[Image: R code for the histograms of expenditure]

We can create histograms with either base R or the ggplot2 package.

  • In base R, we use the hist() function to plot the distribution of the expenditure variable.
  • In tidyverse, we use the ggplot() and geom_histogram() functions to create the same graph.
  • Compared with base R, ggplot() makes it straightforward to customize our plots layer by layer. For instance, we were able to change the number of bins, add a theme (with the theme_bw() function), and change the x-axis and y-axis labels using the labs() function.
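Because the original code is only available as an image, the following is a minimal sketch of the three histograms described above, not the guide's exact code. It assumes the ggplot2 package is installed. The data frame name `schools`, the simulated expenditure values, the choice of 20 bins, and the label text are all hypothetical stand-ins.

```r
library(ggplot2)

# Hypothetical stand-in data: made-up, right-skewed expenditure values
set.seed(42)
schools <- data.frame(
  expenditure = rlnorm(200, meanlog = log(5200), sdlog = 0.12)
)

# Base R: hist() takes the numeric vector directly
h <- hist(schools$expenditure,
          main = "Histogram of expenditure",
          xlab = "Expenditure")

# ggplot2: default histogram (ggplot2 picks 30 bins unless told otherwise)
p_basic <- ggplot(schools, aes(x = expenditure)) +
  geom_histogram()

# ggplot2: custom number of bins, black-and-white theme, custom axis labels
p_improved <- ggplot(schools, aes(x = expenditure)) +
  geom_histogram(bins = 20) +
  theme_bw() +
  labs(x = "Expenditure", y = "Number of observations")

print(p_basic)
print(p_improved)
```

Storing each ggplot in an object before printing makes it easy to add or swap layers later without retyping the whole plot.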

Output from base R

[Figure: histogram of expenditure, base R]

Output from ggplot()

[Figure: histogram of expenditure, ggplot()]

Output from ggplot() - improved version

[Figure: histogram of expenditure, ggplot() with custom bins, theme, and axis labels]

The histogram shows us the range of expenditure values among the observations and how frequently each occurs. We can also see that the distribution of expenditure does not follow a normal curve (it is close to a normal curve, but it is not normal) and is skewed to the right. This may affect the results of our earlier statistical tests.

Boxplots, often called box-and-whisker plots, are used to represent the quartiles of continuous-level variables. Boxplots display the variation in the sample with boxes that represent the quartiles and 'whiskers' that extend to observations outside the upper and lower quartiles. These plots can be drawn for a single variable or for multiple variables, as we will see below.

Code

[Image: R code for the boxplots of expenditure]

We can create boxplots with either base R or the ggplot2 package.

  • In base R, we use the boxplot() function to plot the distribution of the expenditure variable.
  • In tidyverse, we use the ggplot() and geom_boxplot() functions to create the same graph.
  • Compared with base R, ggplot() makes it straightforward to customize our plots layer by layer. For instance, we were able to add a theme (with the theme_bw() function) and change the x-axis and y-axis labels using the labs() function.
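As a rough illustration of the three boxplots described above (not the guide's exact code, which is only available as an image), a sketch might look like this. It assumes ggplot2 is installed. The `schools` data frame, its simulated values, and the label text are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in data: made-up expenditure values
set.seed(42)
schools <- data.frame(
  expenditure = rlnorm(200, meanlog = log(5200), sdlog = 0.12)
)

# Base R: boxplot() draws the plot and invisibly returns its summary statistics
b <- boxplot(schools$expenditure,
             main = "Boxplot of expenditure",
             ylab = "Expenditure")

# ggplot2: for a single variable, map it to the y aesthetic
p_basic <- ggplot(schools, aes(y = expenditure)) +
  geom_boxplot()

# ggplot2: add a black-and-white theme and custom axis labels
p_improved <- ggplot(schools, aes(y = expenditure)) +
  geom_boxplot() +
  theme_bw() +
  labs(x = "All observations", y = "Expenditure")

print(p_basic)
print(p_improved)
```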

The boxplots below show us the median (just above 5,000) of the variable expenditure as a horizontal line inside the gray box. The bottom and top edges of the gray box are the 25th (Q1) and 75th (Q3) percentiles of the distribution. Next, the whiskers extend to the most extreme values of expenditure that lie within 1.5 times the interquartile range of the box, which is the default in both boxplot() and geom_boxplot(). Observations beyond the whiskers are drawn as dots and treated as outliers.
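The numbers behind a boxplot can also be inspected directly. The sketch below uses base R's boxplot.stats() on made-up values (the `expenditure` vector here is simulated, not the guide's data) to print the whisker ends, the hinges of the box, the median, and any points that would be drawn as dots.

```r
# Hypothetical stand-in data: made-up expenditure values
set.seed(42)
expenditure <- rlnorm(200, meanlog = log(5200), sdlog = 0.12)

# boxplot.stats() returns the five values a boxplot is drawn from
# (default coef = 1.5, i.e. whiskers reach at most 1.5 * IQR beyond the box)
s <- boxplot.stats(expenditure)
names(s$stats) <- c("lower whisker", "Q1 (lower hinge)", "median",
                    "Q3 (upper hinge)", "upper whisker")

print(s$stats)  # whisker ends, box edges, and median
print(s$out)    # observations beyond the whiskers (plotted as dots)
```

The hinges reported here are computed with fivenum(), so they can differ very slightly from the values returned by quantile() with its default settings.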

Output from base R

[Figure: boxplot of expenditure, base R]

Output from ggplot()

[Figure: boxplot of expenditure, ggplot()]

Output from ggplot() - improved version

[Figure: boxplot of expenditure, ggplot() with theme and axis labels]

Code

We can also create a boxplot of the expenditure variable grouped by another variable. For instance, we can graph expenditure for two of the counties in the county variable.

[Image: R code for the boxplot of expenditure by county]

This code might look intimidating at first. However, each step configures a specific aspect of the plot:

  • The filter() function keeps only the rows for two values of the county variable: Sonoma and Merced
  • The geom_boxplot() function creates a boxplot of expenditure by county
  • The theme_bw() function applies a black-and-white theme to the plot
  • The labs() function changes the x-axis and y-axis names
  • The coord_flip() function swaps the x and y coordinates
  • The scale_x_continuous() function controls how the x scale looks (after coord_flip(), the x scale is drawn on the vertical axis)
    • The breaks argument, together with the seq() function, sets the positions of the axis ticks
    • The limits argument sets the lower and upper limits of the axis
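The steps listed above can be put together roughly as follows. This is a sketch under stated assumptions rather than the guide's exact code: it assumes ggplot2 (version 3.3.0 or later, for horizontal boxplots) and dplyr are installed, and the `schools` data frame, its simulated values, the extra "Other" county, the label text, and the tick and limit values are hypothetical.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in data: made-up expenditure values for three counties
set.seed(42)
schools <- data.frame(
  county = rep(c("Sonoma", "Merced", "Other"), each = 30),
  expenditure = c(rnorm(30, mean = 5400, sd = 300),
                  rnorm(30, mean = 5100, sd = 300),
                  rnorm(30, mean = 5300, sd = 300))
)

# Keep only the two counties of interest
two_counties <- schools %>%
  filter(county %in% c("Sonoma", "Merced"))

p <- ggplot(two_counties, aes(x = expenditure, y = county)) +
  geom_boxplot() +                       # one box per county
  theme_bw() +                           # black-and-white theme
  labs(x = "Expenditure", y = "County") +
  coord_flip() +                         # expenditure now runs up the vertical axis
  scale_x_continuous(
    breaks = seq(3500, 7500, by = 500),  # tick positions
    limits = c(3500, 7500)               # lower and upper axis limits
  )

print(p)
```

Note that values outside `limits` are dropped before the boxes are computed, so the limits here are chosen wide enough to contain all of the simulated data.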

Output

[Figure: boxplots of expenditure for Merced and Sonoma]

This boxplot is separated by the two counties (Merced and Sonoma), and expenditure is shown on the y-axis. This helps us compare the distribution of expenditure by county.

Bar plots are best used to represent ordinal-level variables and to show how the observations are distributed across the categories. We can graph a bar plot of a single variable or of multiple variables for a direct comparison.

Code

[Image: R code for the bar plots of grades]

We can create bar plots with either base R or the ggplot2 package.

  • In base R, we use the barplot() function to plot the distribution of the grades variable.
  • In tidyverse, we use the ggplot() and geom_bar() functions to create the same graph.
  • Compared with base R, ggplot() makes it straightforward to customize our plots layer by layer. For instance, we were able to add a theme (with the theme_bw() function) and change the x-axis and y-axis labels using the labs() function.
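A minimal sketch of the three bar plots described above is given here; it is not the guide's exact code. It assumes ggplot2 is installed. The `schools` data frame, the made-up category counts, and the label text are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in data: made-up counts of the two grade spans
schools <- data.frame(
  grades = rep(c("KK-06", "KK-08"), times = c(12, 48))
)

# Base R: barplot() expects counts, so tabulate the variable first
counts <- table(schools$grades)
barplot(counts,
        main = "Bar plot of grades",
        xlab = "Grades",
        ylab = "Number of observations")

# ggplot2: geom_bar() counts the observations in each category itself
p_basic <- ggplot(schools, aes(x = grades)) +
  geom_bar()

# ggplot2: add a black-and-white theme and custom axis labels
p_improved <- ggplot(schools, aes(x = grades)) +
  geom_bar() +
  theme_bw() +
  labs(x = "Grades", y = "Number of observations")

print(p_basic)
print(p_improved)
```

The main practical difference is that barplot() needs the counts computed beforehand with table(), while geom_bar() works from the raw rows.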

Output from base R

[Figure: bar plot of grades, base R]

Output from ggplot()

[Figure: bar plot of grades, ggplot()]

Output from ggplot() - improved version

[Figure: bar plot of grades, ggplot() with theme and axis labels]

The bar plots above show the raw count of observations of the variable grades, broken up by category. We can clearly see that there are more KK-08 observations than KK-06 observations in the dataset.

Code

[Image: R code for the bar plot of grades by county]

This code might look intimidating at first. However, each step configures a specific aspect of the plot:

  • The filter() function keeps only the rows for two values of the county variable: Sonoma and Merced
  • The geom_bar() function creates a bar plot of grades by county
    • The fill and color arguments fill and outline the bars according to the county variable
  • The theme_minimal() function applies a minimal theme to the plot
  • The labs() function changes the x-axis and y-axis names
  • The coord_flip() function swaps the x and y coordinates
  • The scale_y_continuous() function controls how the y scale looks (after coord_flip(), the y scale is drawn on the horizontal axis)
    • The breaks argument, together with the seq() function, sets the positions of the axis ticks
    • The limits argument sets the lower and upper limits of the axis
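The steps listed above might be combined along these lines. This is a sketch, not the guide's exact code: it assumes ggplot2 and dplyr are installed, and the `schools` data frame, its made-up counts, the extra "Other" county, the label text, and the tick and limit values are hypothetical. Placing the bars side by side with position = "dodge" is also an assumption, since the original code is not available.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in data: made-up grade spans for three counties
schools <- data.frame(
  county = rep(c("Sonoma", "Merced", "Other"), times = c(20, 10, 15)),
  grades = c(rep(c("KK-06", "KK-08"), times = c(5, 15)),  # Sonoma
             rep(c("KK-06", "KK-08"), times = c(2, 8)),   # Merced
             rep("KK-08", 15))                            # Other
)

# Keep only the two counties of interest
two_counties <- schools %>%
  filter(county %in% c("Sonoma", "Merced"))

p <- ggplot(two_counties, aes(x = grades)) +
  geom_bar(aes(fill = county, color = county),  # fill and outline bars by county
           position = "dodge") +                # side-by-side bars (assumption)
  theme_minimal() +
  labs(x = "Grades", y = "Number of observations") +
  coord_flip() +                                # counts now run along the horizontal axis
  scale_y_continuous(
    breaks = seq(0, 20, by = 5),                # tick positions
    limits = c(0, 20)                           # lower and upper axis limits
  )

print(p)
```

With stacked bars instead of dodged ones, the upper limit would need to cover the combined count per grade span, otherwise bars that exceed the limit are dropped.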

Output

[Figure: bar plot of grades (KK-06 and KK-08) by county (Merced and Sonoma)]

We have broken the observations down by grades (KK-06 and KK-08) and by county (Merced and Sonoma).

Scatter plots are best used to show graphically whether there is a relationship between two variables and what that relationship may look like.

Code

[Image: R code for the scatter plots of students by teachers]

We can create scatter plots with either base R or the ggplot2 package.

  • In base R, we use the plot() function to plot the students variable against the teachers variable.
  • In tidyverse, we use the ggplot() and geom_point() functions to create the same graph.
  • Compared with base R, ggplot() makes it straightforward to customize our plots layer by layer. For instance, we were able to add a theme (with the theme_bw() function), change the x-axis and y-axis labels using the labs() function, and even add a regression line using the geom_smooth() function.
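A sketch of the three scatter plots described above could look like this; it is not the guide's exact code. It assumes ggplot2 is installed. The `schools` data frame, its simulated values, and the label text are hypothetical, and fitting the line with method = "lm" is an assumption about how the regression line was drawn.

```r
library(ggplot2)

# Hypothetical stand-in data: made-up teacher and student counts
set.seed(42)
teachers <- round(runif(100, min = 20, max = 500))
students <- round(teachers * 20 + rnorm(100, mean = 0, sd = 80))
schools <- data.frame(teachers, students)

# Base R: plot(x, y) draws a scatter plot of two numeric vectors
plot(schools$teachers, schools$students,
     main = "Students by teachers",
     xlab = "Number of teachers",
     ylab = "Number of students")

# ggplot2: basic scatter plot
p_basic <- ggplot(schools, aes(x = teachers, y = students)) +
  geom_point()

# ggplot2: add a linear regression line, a theme, and custom axis labels
p_improved <- ggplot(schools, aes(x = teachers, y = students)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  theme_bw() +
  labs(x = "Number of teachers", y = "Number of students")

print(p_basic)
print(p_improved)
```

Setting se = FALSE hides the confidence band around the line; leaving it at the default shows how uncertain the fitted line is.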

Output from base R

[Figure: scatter plot of students by teachers, base R]

Output from ggplot()

[Figure: scatter plot of students by teachers, ggplot()]

Output from ggplot() - improved version

[Figure: scatter plot of students by teachers, ggplot() with theme, axis labels, and regression line]

Above are scatter plots of the variables students by teachers. Scatter plots are very helpful for examining continuous-level variables and whether a graphical relationship exists between them. We can see in this scatter plot that there is a positive linear relationship between the number of students and the number of teachers. After looking at this graph, we would next want to conduct statistical tests to see whether the relationship is statistically significant.
