- 1 Getting started with R
- 2 Functions
- 3 Data types and structures
- 4 Reading and manipulating data
- 5 Reading and writing files
- 6 Manipulating data
- 7 Creating Graphs using ggplot2
- 8 Getting started with R Markdown
- 8.1 Creating an R Markdown file
- 8.2 Basic parts of the file and syntax
- 8.3 YAML header
- 8.4 Global options chunk
- 8.5 Document body
- 8.6 Headings
- 8.7 Body text
- 8.8 Links
- 8.9 Code chunks
- 8.10 Generate the report
- 8.11 Useful keyboard shortcuts on windows
- 8.12 Visualizing tables with DT
- 8.13 Scatter plot
- 8.14 Exercise 2
- 9 Version control with Git, GitHub, Sourcetree and RStudio
- 9.1 Git, GitHub
- 9.2 Installing Git
- 9.3 Accessing GitHub
- 9.4 Sourcetree
- 9.5 Configure Sourcetree and GitHub
- 9.6 Enable version control in R Studio
- 9.7 Add GitHub account to Sourcetree
- 9.8 Set up GitHub to accept communication with Sourcetree
- 9.9 Create repository on GitHub
- 9.10 Create a new R Studio project
- 9.11 Add the R Studio project to Sourcetree
- 9.12 Connect R Studio Project and GitHub repository
- 9.13 Main Git operations with version control with Sourcetree
R is a scripting language known for its simple syntax. It is suitable for anyone who wants to start or advance their journey in data management, analysis and scientific research. It can be used to retrieve, clean, analyse and visualize data.
- Simple syntax
- Variety of packages to handle wide range of tasks
- Powerful RStudio IDE that simplifies code management and getting started
- A large community of users, so support is easy to find
Before we start coding, we need to set up the coding environment. We shall install R and RStudio, the graphical Integrated Development Environment (IDE) for R. RStudio makes using R much easier and more interactive.
Download the R installation file.
Run the downloaded file and accept prompts to install.
Download the RStudio installation file.
Run the downloaded file and accept prompts as well.
RStudio is divided into 4 panels
- Source (top-left)
- R Console (bottom-left)
- Environment/History (top-right)
- Files/Plots/Packages/Help/Viewer (bottom-right)
We can use the RStudio console to tell R what it should do.
Write the line below in the console and hit Enter.
print("Hello world")
We can also run simple calculations.
1 + 5
We can create new scripts to hold the commands that R should perform.
Follow the steps shown in the figure to create a new file.
Let us now add the previous code into the script and save the script.
print("Hello world")
# This is a comment
# simple calculations
1 + 5
We use the <- symbol to assign values to variables.
On Windows computers, the keyboard shortcut for this symbol is Alt + -
# assigning
greeting <- "Hello"
print(greeting)
# joining strings
my_name <- "John"
greeting_2 <- paste(greeting, ":", my_name)
print(greeting_2)
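As a small illustration of how the joining above can be controlled, the sketch below compares the default separator of paste() with the sep argument and with paste0(). The variable names default_sep, custom_sep and no_sep are only illustrative.

```r
greeting <- "Hello"
my_name <- "John"

# paste() places a single space between its arguments by default
default_sep <- paste(greeting, my_name)

# the sep argument controls what is placed between the arguments
custom_sep <- paste(greeting, my_name, sep = ", ")

# paste0() joins the arguments with no separator at all
no_sep <- paste0(greeting, my_name)

print(default_sep)
print(custom_sep)
print(no_sep)
```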
A function is a block of code that performs a specific task when instructed or called to do the task.
Programming languages have built in functions and developers can also create their own functions to perform desired tasks.
We have already used built-in functions such as print() to output the given contents and paste() to concatenate/join strings. We can use these built-in functions to perform the tasks for which they were created.
Apart from using the built-in functions, we can also create new functions to easily perform tasks. This could be to reduce duplicated code for repetitive tasks or as a way to share code with others.
To define a function, we use the function keyword. The function may or may not have arguments.
The function syntax can take the format below:
function_name <- function(argument_1, argument_2, ...){
function_body
}
- function_name. The name of the function
- arguments. The function may have several arguments/inputs depending on the needs. These could be data of different types.
- function_body. This is where we put instructions using code to perform different tasks.
# a function to join entered names
my_fullname <- function(first_name, last_name) {
combined_name <- paste("My name is: ",
first_name,
toupper(last_name))
combined_name
}
# a function to multiply arguments
my_multiplication <- function(first_num = 10, second_num = 6) {
multiply_arguments <- first_num * second_num
my_result <- paste("My result is: ", multiply_arguments)
my_result
}
We call functions by giving their names and supplying arguments in case they are required by the functions. Let us use examples from the functions we have created.
# call my_fullname function
my_fullname(first_name = "John", last_name = "Peter")
# call my_multiplication function
my_multiplication() # uses default values of the arguments
my_multiplication(first_num = 125, second_num = 61) # uses supplied values to arguments
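Arguments can be matched by position as well as by name. The sketch below uses a hypothetical function, multiply_numbers(), that returns the plain number so the results are easy to compare. It is not one of the functions defined earlier.

```r
# hypothetical helper that returns the product as a number
multiply_numbers <- function(first_num = 10, second_num = 6) {
  first_num * second_num
}

# matched by position: first_num = 2, second_num = 5
by_position <- multiply_numbers(2, 5)

# matched by name: the order no longer matters
by_name <- multiply_numbers(second_num = 5, first_num = 2)

# only first_num supplied: second_num keeps its default of 6
partly_default <- multiply_numbers(first_num = 3)

print(c(by_position, by_name, partly_default))
```

Naming arguments tends to make calls easier to read once a function has more than one or two arguments.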
If functions have documentation, like built-in functions and functions from installed packages, we can access it using the help() function or the ? symbol. The documentation describes the functions and how they can be used.
# using help()
help("paste")
help(paste)
# using ?
?"paste"
?paste
Variables can store data of different data types. In R, the data type of the R-object becomes the data type of the variable.
- numeric - (6.5, 71, 217)
- integer - (1L, 33L, 301L, “L” declares this as an integer)
- complex - (9 + 3i, where “i” is the imaginary part)
- character (a.k.a. string) - (“m”, “TRUE”, “Getting started with R”, “FALSE”, “23.1”)
- logical (a.k.a. boolean) - (TRUE or FALSE)
The class() function is used to check the data type of a variable.
greeting <- "Hello"
class(greeting)
my_data <- c("m", "TRUE", "Getting started with R", "FALSE", "23.1")
class(my_data)
A vector is a common data structure in R and is composed of a series of values of the same data type: numbers, characters or logical values.
We create a vector using the c() function and separate the items using commas.
# Create a vector.
fisher_men <- c("Peter", "Andrew", "James", "John") # character vector
print(fisher_men)
# Get the class of our vector.
print(class(fisher_men))
vector_num <- c(11, 101, 23, 50.3) # numeric vector
class_vec_num <- class(vector_num)
vector_log <- c(TRUE, FALSE, FALSE, TRUE) # logical vector
class_vec_log <- class(vector_log)
print(paste("The class of vector_num is: ", class_vec_num))
print(paste("The class of vector_log is: ", class_vec_log))
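Because all values in a vector share one data type, R converts mixed inputs to a common type. The sketch below shows this with two illustrative vectors, mixed_num_chr and mixed_num_lgl.

```r
# numbers and logicals mixed with text are all converted to character
mixed_num_chr <- c(1, "a", TRUE)
print(mixed_num_chr)
print(class(mixed_num_chr))

# logicals mixed with numbers are converted to numeric (TRUE = 1, FALSE = 0)
mixed_num_lgl <- c(1, TRUE, FALSE)
print(mixed_num_lgl)
print(class(mixed_num_lgl))
```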
To access elements of a vector, we use indices. The index of the first element in a vector is 1, the second is 2 …
We provide the index in square brackets to retrieve the desired element from the vector.
fisher_men <- c("Peter", "Andrew", "James", "John") # character vector
# first element
fisher_men[1]
# last element
fisher_men[length(fisher_men)]
# particular elements
fisher_men[fisher_men %in% c("John", "Andrew")]
vector_num <- c(11, 101, 23, 50.3) # numeric vector
# first element
vector_num[1]
# last element
vector_num[length(vector_num)]
# particular elements
vector_num[vector_num > 15]
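Indices can also be supplied as a vector, and negative indices drop elements. A short sketch of these variations, with illustrative variable names:

```r
fisher_men <- c("Peter", "Andrew", "James", "John")

# a range of indices
first_two <- fisher_men[1:2]

# several specific indices
first_and_last <- fisher_men[c(1, 4)]

# a negative index removes that element
without_first <- fisher_men[-1]

# which() returns the positions where a condition is TRUE
positions <- which(fisher_men %in% c("John", "Andrew"))

print(first_two)
print(first_and_last)
print(without_first)
print(positions)
```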
We can repeat vectors using the rep() function.
# each
rep(c("Peter", "Andrew", "James", "John"), each = 5)
# times
rep(c("Peter", "Andrew", "James", "John"), times = 5)
rep(c("Peter", "Andrew", "James", "John"), times = c(4, 2, 3, 1))
Use the : symbol or the seq() function to create a vector of sequences.
# using ':'
1:20
# using 'seq()'
seq(from = 1, to = 20, by = 1)
seq(from = 1, to = 20, by = 4)
seq(from = 1, to = 100, by = 5)
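Besides by, seq() accepts length.out to ask for a given number of evenly spaced values. The helpers seq_len() and seq_along() cover two common cases. A brief sketch:

```r
# five evenly spaced values between 0 and 1
five_points <- seq(from = 0, to = 1, length.out = 5)

# the integers 1 to 10
first_ten <- seq_len(10)

# one index per element of an existing vector
fisher_men <- c("Peter", "Andrew", "James", "John")
positions <- seq_along(fisher_men)

print(five_points)
print(first_ten)
print(positions)
```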
A list can contain objects of different data types (a mixture of numeric, character, logical …).
We use the list() function to create a list.
fisher_men <- c("Peter", "Andrew", "James", "John") # character vector
vector_num <- c(11, 101, 23, 50.3) # numeric vector
vector_log <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
my_list <- list(fisher_men, vector_num, vector_log)
# access elements of a list
my_list[[1]]
my_list[[length(my_list)]]
# named list elements
my_list <- list("first_elem" = fisher_men,
"second_elem" = vector_num,
"third_elem" = vector_log)
my_list$first_elem
my_list$second_elem
# Get the type of an object using the typeof() function
typeof(my_list)
# Get the length of the list using the length() function
length(my_list)
# last element of a list
my_list[[length(my_list)]]
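It may help to see how single and double square brackets differ on a list: [ keeps the result as a list, while [[ extracts the element itself. A minimal sketch with a small illustrative list:

```r
my_list <- list(first_elem = c("Peter", "Andrew"),
                second_elem = c(11, 101))

# single brackets return a list that contains the selected element
sub_list <- my_list["first_elem"]
print(class(sub_list))

# double brackets return the element itself, here a character vector
element <- my_list[["first_elem"]]
print(class(element))
```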
Data frames are tabular data objects. They can store data in columns of different data types: some columns can be character, others numeric and others logical. A data frame is a two-dimensional object and is stored as a list of vectors of the same length.
A data frame is the most common way of storing tabular (table or spreadsheet) data and is something you will work with often.
We use the function data.frame() to create a data frame.
my_dataframe <- data.frame(
id = 1:4,
name = c("Peter", "Andrew", "James", "John"),
height=c(1.65, 1.61, 1.72, 1.55)
)
We can check the structure of the data frame using the str() function.
str(my_dataframe)
We can use the [, [[ and $ operators to access the elements of the data frame.
# [
my_dataframe["name"]
# [[
my_dataframe[["name"]]
# $
my_dataframe$name
We use the dim() function to find the number of rows and columns of a data frame. We can also use the nrow() function to find the number of rows and the ncol() function to find the number of columns of the data frame.
# dimension of the data frame
dim(my_dataframe)
# number of rows
nrow(my_dataframe)
# number of columns
ncol(my_dataframe)
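The same operators can be used to add a column or keep only some rows. The sketch below uses base R only. The column height_cm and the 1.6 cut-off are illustrative choices.

```r
my_dataframe <- data.frame(
  id = 1:4,
  name = c("Peter", "Andrew", "James", "John"),
  height = c(1.65, 1.61, 1.72, 1.55)
)

# add a new column computed from an existing one
my_dataframe$height_cm <- my_dataframe$height * 100

# keep the rows where a condition holds (all columns)
tall <- my_dataframe[my_dataframe$height > 1.6, ]

# keep the same rows but only the name column
tall_names <- my_dataframe[my_dataframe$height > 1.6, "name"]

print(tall)
print(tall_names)
```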
We are going to practice with swirl. swirl is an R package that consists of a collection of interactive courses for learning R.
Use the following code to install the swirl package.
install.packages("swirl")
Load the swirl package using the library() function to make the functions in the package available.
library("swirl")
Use the install_course() function to install the R_Programming_E course we shall be following for our first practice.
install_course("R_Programming_E")
swirl()
Enjoy coding.
In order to manipulate data, we need to create it or read/import it into the R environment using appropriate functions. We can use functions provided in the base system where applicable or functions in other containers (packages) specialized for specific file types.
R comes with functionality in the base system. This functionality is what you get when you download and install R. This can be extended using containers (packages) that have other functionality or different implementation. These packages contain functions to perform specific tasks.
For packages on The Comprehensive R Archive Network (CRAN), you can install them using the install.packages() function, specifying the name of the package in quotes. This can be done in the console pane since we do not need to save this in a script.
The code below will install all tidyverse related packages. tidyverse is a group of packages designed with a similar philosophy, grammar, and data structures aimed at simplifying data management and analysis.
install.packages("tidyverse")
NOTE: Packages under development that may not be on CRAN can be installed using the devtools or remotes packages. Let us install the devtools package with install.packages("devtools").
We can now install a package from GitHub using devtools.
install.packages("devtools")
devtools::install_github("twesigye10/supporteR")
We can load packages using the library() function.
library(tidyverse)
library(supporteR)
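Installing a package is only needed once, so it can be convenient to check whether it is already available before installing. The sketch below wraps that check in a hypothetical helper, install_if_missing(), which is not part of any package. It relies on requireNamespace() from base R.

```r
# hypothetical helper: install a CRAN package only when it is not yet available
install_if_missing <- function(pkg) {
  already_installed <- requireNamespace(pkg, quietly = TRUE)
  if (!already_installed) {
    install.packages(pkg)
  }
  # return whether the package was already there, without printing it
  invisible(already_installed)
}

# "stats" ships with R, so nothing is installed here
was_installed <- install_if_missing("stats")
print(was_installed)
```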
Before we start reading and writing files, let us add more sub folders to our project.
We can use supporteR to easily create sub folders in our project using the create_project_sub_folders() function. This will help us to keep the folder names uniform as we walk the journey together. You can write the following code in the console and hit Enter.
supporteR::create_project_sub_folders()
We are going to use the iris data that is included in the datasets package. We shall export this data into a csv file for illustration purposes.
We can write a csv file using the write_csv() function from the readr package. After exporting the data into the outputs folder, copy it over into the inputs folder. This indicates that we can always place the data we want to read/import into the inputs folder and whatever we want to export into the outputs folder.
library(tidyverse)
# write_csv is a function inside readr package of tidyverse
write_csv(datasets::iris, file = "outputs/iris_data.csv")
We are now going to read in this data stored in the inputs folder of the project.
library(tidyverse)
# read_csv is a function inside readr package of tidyverse
df_iris <- read_csv("inputs/iris_data.csv")
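Writing a file fails when the target folder does not exist, so it can be useful to create the folder first. The sketch below shows a write and read round trip. To keep it self-contained it makes two assumptions: it writes to a temporary folder rather than the project folders, and it uses the base R functions write.csv() and read.csv() as stand-ins for write_csv() and read_csv().

```r
# assumption: a temporary "outputs" folder stands in for the project folder
output_dir <- file.path(tempdir(), "outputs")
if (!dir.exists(output_dir)) {
  dir.create(output_dir, recursive = TRUE)
}

output_file <- file.path(output_dir, "iris_data.csv")

# base R stand-in for write_csv(); row.names = FALSE avoids an extra column
write.csv(datasets::iris, file = output_file, row.names = FALSE)

# read the file back and confirm the dimensions are unchanged
df_check <- read.csv(output_file)
print(dim(df_check))
```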
We use the select() function from the dplyr package to subset columns. That is, to choose or remove columns of interest from the data frame and work only with the desired columns.
With the iris example, we are going to select columns of interest by name or using a pattern.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
select(.data = df_iris, Species, Petal.Length)
select(.data = df_iris, c("Species", "Petal.Length"))
# another way. Using a pattern
select(.data = df_iris, starts_with("Sepal."))
select(.data = df_iris, ends_with(".Width"))
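Columns can also be removed or chosen by type. The sketch below assumes the dplyr package is installed and uses datasets::iris directly instead of the csv file so that it runs on its own.

```r
library(dplyr)

df_iris <- datasets::iris

# remove a column by prefixing its name with a minus sign
no_species <- select(.data = df_iris, -Species)

# keep only the columns that satisfy a condition, here the numeric ones
numeric_cols <- select(.data = df_iris, where(is.numeric))

print(names(no_species))
print(names(numeric_cols))
```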
We use the filter() function from the dplyr package to subset rows of interest. This can be based on entries in one or more columns.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
# considering one specific entry
filter(.data = df_iris, Species %in% c("versicolor"))
# considering two specific entries
filter(.data = df_iris, Species %in% c("versicolor", "virginica"))
# considering key word
filter(.data = df_iris, str_detect(string = Species, pattern = "versi"))
# considering more than one column
filter(.data = df_iris, Sepal.Length > 5, str_detect(string = Species, pattern = "versi"))
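Conditions separated by a comma must all hold, which is the same as joining them with &. To keep rows that satisfy at least one condition, use |. A sketch under the assumption that dplyr is installed, using datasets::iris directly. The thresholds are illustrative.

```r
library(dplyr)

df_iris <- datasets::iris

# a comma between conditions behaves like &
with_comma <- filter(.data = df_iris, Sepal.Length > 5, Species == "versicolor")
with_and <- filter(.data = df_iris, Sepal.Length > 5 & Species == "versicolor")

# | keeps rows where at least one condition is TRUE
either <- filter(.data = df_iris, Sepal.Length > 7.5 | Petal.Width < 0.2)

# != keeps rows that do not match a value
not_setosa <- filter(.data = df_iris, Species != "setosa")

print(nrow(with_comma) == nrow(with_and))
print(nrow(not_setosa))
```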
We can add columns to the data frame using the mutate() function from the dplyr package. This adds the new columns at the end of the data frame.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
mutate(.data = df_iris, sepal_length_category = ifelse(Sepal.Length > 6, "greater_than_6", "less_than_6"))
mutate(.data = df_iris, sepal_length_category = case_when(Sepal.Length < 5 ~ "cat_less_than_5",
Sepal.Length < 7 ~ "cat_5_6",
Sepal.Length >= 7 ~ "cat_7+"))
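case_when() checks its conditions from top to bottom and uses the first one that matches, so the order of the conditions matters. The sketch below assumes dplyr is installed and uses a small illustrative vector, sepal_lengths, to show the effect. The final TRUE condition acts as a catch-all.

```r
library(dplyr)

sepal_lengths <- c(4.5, 5.0, 6.9, 7.0)

# narrowest condition first: each value lands in the intended category
cat_ordered <- case_when(sepal_lengths < 5 ~ "cat_less_than_5",
                         sepal_lengths < 7 ~ "cat_5_6",
                         TRUE ~ "cat_7+")

# broader condition first: values below 5 are already caught by "< 7"
cat_reversed <- case_when(sepal_lengths < 7 ~ "cat_5_6",
                          sepal_lengths < 5 ~ "cat_less_than_5",
                          TRUE ~ "cat_7+")

print(cat_ordered)
print(cat_reversed)
```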
We can use the rename() function to rename columns. We use the format new_name = old_name.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
rename(.data = df_iris, sepal_length = Sepal.Length)
Use the rename_with() function to rename multiple columns at once following a certain pattern. We specify a function to do the renaming.
# convert all columns to upper case
rename_with(.data = df_iris, .fn = toupper)
# convert columns that start with "Petal" to lower case
rename_with(.data = df_iris, .fn = tolower, .cols = starts_with("Petal"))
Use the summarise() function to collapse the data frame into fewer rows based on the summary you are creating. If no groups are present in the data frame, the resulting data frame will be one row with the summary.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
summarise(.data = df_iris,
mean_sepal_length = mean(Sepal.Length),
mean_sepal_width = mean(Sepal.Width)
)
# summarise based on grouping
df_iris_grp <- group_by(.data = df_iris, Species)
summarise(.data = df_iris_grp,
mean_sepal_length = mean(Sepal.Length),
mean_sepal_width = mean(Sepal.Width)
)
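Several summaries can be computed in one call, and n() counts the rows in each group. The sketch below assumes dplyr is installed and uses datasets::iris directly. The .groups = "drop" argument returns an ungrouped result.

```r
library(dplyr)

df_iris_grp <- group_by(.data = datasets::iris, Species)

df_summary <- summarise(.data = df_iris_grp,
                        n_flowers = n(),                        # rows per group
                        mean_sepal_length = mean(Sepal.Length),
                        max_sepal_length = max(Sepal.Length),
                        .groups = "drop")                       # ungroup the result

print(df_summary)
```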
Pipes help us to pipe/chain operations together. This way we could combine renaming columns, filtering data and summarizing together. R comes with the |> operator starting from R version 4.1.0. Before that, there was the %>% operator from the magrittr package of tidyverse.
When using pipes, the data frame resulting from one operation is the input data frame for the next operation.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
df_iris |>
filter(Species %in% c("versicolor", "virginica")) |>
mutate(sepal_length_category = case_when(Sepal.Length < 5 ~ "cat_less_than_5",
Sepal.Length < 7 ~ "cat_5_6",
Sepal.Length >= 7 ~ "cat_7+")) |>
rename_with(.fn = tolower, .cols = starts_with("Petal"))
# summarise
# group numbers
df_iris |>
group_by(Species) |>
summarise(count = n())
# mean
df_iris |>
group_by(Species) |>
summarise(mean_sepal_length = mean(Sepal.Length),
mean_sepal_width = mean(Sepal.Width)
)
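To see what the pipe does, it can help to compare a piped expression with the equivalent nested call. The sketch below uses only base R (version 4.1.0 or later for |>). The numbers are illustrative.

```r
# nested call: read from the inside out
nested <- sum(sqrt(c(1, 4, 9)))

# piped call: read from left to right, each result feeds the next function
piped <- c(1, 4, 9) |>
  sqrt() |>
  sum()

# the piped value becomes the first argument; other arguments are given as usual
rounded <- c(1.234, 5.678) |>
  round(digits = 1)

print(nested)
print(piped)
print(rounded)
```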
We can export results the same way we exported the iris dataset. We can create a variable and use it to export or we can still use pipes.
df_iris <- read_csv("inputs/iris_data.csv")
df_sep_measurements_mean <- df_iris |>
group_by(Species) |>
summarise(mean_sepal_length = mean(Sepal.Length),
mean_sepal_width = mean(Sepal.Width)
)
write_csv(df_sep_measurements_mean, "outputs/sep_measurements_mean.csv")
In the second practice, we are going to focus on using dplyr and tidyr to clean data and make it tidy. We shall use the Getting and Cleaning Data swirl course for this.
After course installation, you can run swirl() and select the course you will have installed.
library("swirl")
install_course("Getting and Cleaning Data")
swirl()
Happy R coding days.
We can create graphs in R using different packages. For the purpose of our learning, we shall use ggplot2 from tidyverse. This package helps us to quickly create beautiful graphs using data in the data frame and we can easily customize these graphs according to our preference.
From the reference, you can click on the icons of desired graphs under the Geoms sub heading to find out more about the graph and how to create it.
There is also another resource of R Graph Gallery on ggplot2 where you can access different graphs and explore how to create and customize them.
ggplot graphs are built step by step, with each new step added at the end using a + sign.
- Attach the data frame to the ggplot using the data argument
- Specify the mappings/aesthetics (aes): the columns and other properties to visualize
- Specify the type of plot/graph by adding the geom_*() functions.
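The three steps above can be sketched by storing the plot after each step. This assumes the ggplot2 package is installed and uses datasets::iris directly. The names base_layer, bar_layer and final_plot are illustrative.

```r
library(ggplot2)

# steps 1 and 2: attach the data and specify the aesthetics
base_layer <- ggplot(data = datasets::iris, mapping = aes(x = Species))

# step 3: add a geom to choose the type of graph
bar_layer <- base_layer + geom_bar()

# further steps, such as a theme, are added the same way
final_plot <- bar_layer + theme_bw()

# printing the object draws the plot
print(final_plot)
```

Storing intermediate steps is optional. It mainly helps when several graphs share the same data and aesthetics.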
We use the geom_bar() or geom_col() functions to create bar graphs. geom_bar() uses stat_count() by default and makes the height of the bar proportional to the number of cases in each group. geom_col() uses stat_identity() and the heights of the bars represent values in the data.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv") |>
mutate(sepal_length_category = case_when(Sepal.Length < 5 ~ "cat_less_than_5",
Sepal.Length < 7 ~ "cat_5_6",
Sepal.Length >= 7 ~ "cat_7+"))
# basic geom_bar
ggplot(data = df_iris, aes(y = Species)) + # x or y column provided
geom_bar(fill = "blue") + # uses stat_count() by default
theme_bw()
# geom_bar color by category
ggplot(data = df_iris, aes(x = Species)) + # x or y column provided
geom_bar(aes(fill = sepal_length_category)) + # uses stat_count() by default
theme_bw()
# basic geom_col
ggplot(data = df_iris, aes(x = Sepal.Length, y = Species)) + # both x and y provided
geom_col(fill = "blue") + # uses stat_identity() by default
theme_bw()
# geom_col color by category
ggplot(data = df_iris, aes(x = Sepal.Length, y = Species)) + # both x and y provided
geom_col(aes(fill = sepal_length_category)) + # uses stat_identity() by default
theme_bw()
# summarizing data and plotting
df_mean_sep_len <- df_iris |>
select(Sepal.Length, Species) |>
group_by(Species) |>
summarise(`Mean Length` = mean(Sepal.Length))
# customizing the graph
ggplot(data = df_mean_sep_len, aes(x = `Mean Length`, y = Species)) +
geom_col(fill = "blue") +
theme_bw() +
theme(axis.ticks = element_blank(),
axis.text.x = element_text(face = "bold", size=12),
axis.text.y = element_text(face = "bold", size=12),
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
panel.grid.minor.y = element_blank(),
panel.border = element_blank(),
axis.title = element_text(size=12)) +
ylab(label = "")
The scatter plot helps us to analyse the relationship between observations. It can be handy during Exploratory Data Analysis (EDA) to explore the data.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
# Basic scatter
ggplot(data = df_iris, aes(x = Sepal.Length, y = Sepal.Width)) +
geom_point() +
labs(title = "Plot of Sepal measurements")
# scatter with color categories
ggplot(data = df_iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
geom_point() +
labs(title = "Plot of Sepal measurements")
The boxplot can help us to check the distribution of the data.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
# basic Boxplot
ggplot(data = df_iris, aes(x = Species, y = Sepal.Length)) +
geom_boxplot()
# Boxplot with jitter
ggplot(data = df_iris, aes(x = Species, y = Sepal.Width, colour = Species)) +
geom_boxplot() +
geom_jitter()
We can use ggsave() to export graphs to use in presentations and reports. By default, it saves the last plot displayed.
ggsave("outputs/plot.png", scale = 2)
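ggsave() can also be given a specific plot and an explicit size, which avoids depending on whichever plot was displayed last. A sketch assuming ggplot2 is installed. The temporary file path and the size values are illustrative.

```r
library(ggplot2)

scatter_plot <- ggplot(data = datasets::iris,
                       aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
  geom_point() +
  labs(title = "Plot of Sepal measurements")

# assumption: a temporary file stands in for "outputs/plot.png"
plot_file <- file.path(tempdir(), "sepal_scatter.png")

# plot names the graph to save; width, height and dpi set the size and resolution
ggsave(filename = plot_file, plot = scatter_plot,
       width = 6, height = 4, units = "in", dpi = 300)

print(file.exists(plot_file))
```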
We are going to use the iris dataset to test our skills in working with data.
1. Import the iris dataset into a variable named df_iris. For the following steps, create a new variable df_iris_data, assigning to it the data you imported, and use the pipe operator to add steps 2 to 5.
2. Subset the dataset to keep columns that start with Petal and Species
3. Rename columns that start with Petal by removing Petal. from the column names, maintaining only Length and Width
4. Subset the dataset to keep only rows where Length is greater than 1.2
5. Create a new column Length_Category with ranges < 3 “less_than_3”, < 5 “btn_3_and_5”, > 5 “greater_than_5”
6. Create a bar graph indicating the Species and colour the graph based on Length_Category.
7. Calculate the mean Length for each Species and store the result in the df_mean_length variable.
8. Create a bar graph for the result from the previous step.
9. Calculate the proportions based on Species and Length_Category. Export this result into a csv file.
R Markdown helps us to generate reports. The reports could be based on changing data, such as during data collection when you have to communicate daily progress to different teams. It could also be a dynamic report generated regularly, like a weekly or monthly report. When we connect these reports with data, it becomes easy for us to run the code and generate these reports.
These reports can include different elements like text, tables, graphs, maps etc. that are based on the data that we are reporting on.
The reports can be generated in different formats like HTML files, PDF files, Word files, Presentation files …
You can create R Markdown files the same way we create R files. From the File menu, click New File then R Markdown…
From the pop-up pane, choose the format of the report to generate, give the Title of the document and click OK.
It will create a file with sample content containing the basic parts of the report.
Save the file into an appropriate folder, in our case the R folder in our project, and start modifying it, or clear the sample content and add your own content.
The YAML header controls the type of document to create. In this header, you can specify the type of document, the author, the title and other properties like a table of contents for the document if appropriate.
The global options chunk allows us to specify options that affect the entire document. These could include code folding, evaluation, display of messages, how to handle results and whether to include the chunk. These options can be overridden in the individual chunk options.
In the body, we put content that will be seen in the report. The content may include headings, text, graphs, tables, maps, code etc.
We use the # symbol to define the heading level.
# First-level header
## Second-level header
### Third-level header
We write free text outside of chunks and this text will be generated in the report. We can also style this text by giving it colour or making it bold or italic.
Normal text
Bold text or Another bold text
Italic text or Another italic text
This is coloured text
Normal text
**Bold text** or __Another bold text__
*Italic text* or _Another italic text_
<span style="color:red"> This is coloured text </span>
We can also add links to reports. Next is the link to our repository.
[Getting started with R](https://github.com/twesigye10/getting_started_with_R)
[Markdown Basics](https://rmarkdown.rstudio.com/authoring_basics.html)
Inside a code chunk, we can put R code that will be run and results added to the report. Code can also be added in text as inline code to return some results from the code. If your results are assigned to a variable, remember to call the variable after evaluation. This way, it will show the output.
summary(iris)
Inline code. The number of observations for iris data is: 150
Inline code. The number of observations for iris data is: **`r nrow(iris)`**
You generate the report by knitting the document using the knit button on top of the document. Or you can use the keyboard shortcut Ctrl + Shift + K.
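A report can also be generated from code with the render() function of the rmarkdown package, which is what the knit button uses behind the scenes. The sketch below is only an illustration. It assumes rmarkdown and pandoc are available (pandoc ships with RStudio) and writes a small demo file to a temporary location. The file content and names are made up for the example.

```r
# write a minimal R Markdown file to a temporary location
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Demo report\"",
  "output: html_document",
  "---",
  "",
  "# First-level header",
  "",
  "The iris data has `r nrow(iris)` observations."
), con = rmd_file)

# render only when rmarkdown and pandoc are available
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() returns the path of the generated report
  report_file <- rmarkdown::render(input = rmd_file, quiet = TRUE)
  print(report_file)
}
```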
- Ctrl + Alt + I. Used to create a new code chunk
- Ctrl + Shift + K. Used to knit the file
You can easily visualize tables with the DT package, an interface to the JavaScript library DataTables. It can be installed using the following code. An example of usage has also been included.
install.packages("DT")
library(DT)
iris |>
datatable()
The scatter plot helps us to analyse the relationship between observations. It can be handy during Exploratory Data Analysis (EDA) to explore the data.
library(tidyverse)
df_iris <- read_csv("inputs/iris_data.csv")
# scatter with color categories
ggplot(data = df_iris, aes(x = Sepal.Length, y = Sepal.Width, colour = Species)) +
geom_point() +
labs(title = "Plot of Sepal measurements")
Transfer exercise 1 into R Markdown to generate an HTML report.
Version control involves managing the versions of our code as individual developers or while collaborating as teams. We can track the changes we make to the code, and in this way we can even revert the code to an earlier version.
There are many ways of using R Studio, Git and GitHub. For the purpose of this guide, we shall focus on setting up an environment for version control using Git, GitHub and Sourcetree.
Git is a free and open source distributed version control system that we can use to manage our code. GitHub, on the other hand, is an internet hosting service for projects that use Git as their version control system. Hosted projects can be public, accessible by anyone, or private.
We can download Git and install it on our computers. The installation also includes Git Bash and Git GUI (Graphical User Interface). Both interfaces can be used to manage code: Git Bash through the command line and Git GUI through a graphical user interface. For the purpose of this guide, we shall use Sourcetree, an easy-to-use tool that does not require the command line.
We can use GitHub to store and manage our code. To use it for this purpose, we need to create an account if we do not have one already. You can sign up from the GitHub sign up page.
Sourcetree is a free Git client from Atlassian that simplifies working with Git repositories. You can easily manage and visualize your code.
You can download Sourcetree from the software page of Atlassian and install it on your computer.
After installing Git and Sourcetree and creating a GitHub account, we can now configure our tools to easily manage the versions of our code.
To enable version control in R Studio, open R Studio, click on the Tools menu, select Global Options, then Git/SVN and activate Enable version control interface for R Studio projects. The Git executable box should now be filled with the location of the Git executable file.
To connect a GitHub account to Sourcetree, under the Remote Repositories tab in Sourcetree, click on Add an account… and edit the information in the Edit Hosting Account menu that pops up.
For Hosting Service, choose GitHub.
For Authentication, choose OAuth and click Refresh OAuth Token.
This may open a web page for you to sign in to GitHub.
Once signed in, the account connection will display under the Remote Repositories tab. If you click on your account and then click on the refresh button, you will see your repositories on GitHub.
We also need to configure GitHub to accept communication from Sourcetree.
For this we shall log in to GitHub. Then click on your profile located in the top right of the page, select Settings, then Developer settings located in the bottom left of the page, then locate Personal access tokens and click Tokens (classic).
On the Personal access tokens (classic) page, we shall generate a token for authenticating Sourcetree while pushing code and set it so that it does not expire. We shall be adding the generated token to repositories to allow communication from Sourcetree. Copy this token and keep it safe.
This is a one-time process that you do not need to repeat.
Try to avoid spaces in repository names, instead use underscores between words. An example “first_repository”, “cbi_data_mgt”
Create a new R Studio project using the option for Version Control to check out a project from a version control repository, then choose Git. Specify the link to the GitHub repository created and create the project.
Open Sourcetree and click on the plus sign on the tabs bar of Sourcetree to add a new tab. While Local is selected, click on the Add tool to add the created R Studio project to Sourcetree.
Once the R Studio project is open in Sourcetree, click on the Settings tool, click on the origin url and click on Edit. We shall add the personal-access-token to the url. Add your personal-access-token followed by @ before github.com in the url:
https://{personal-access-token}@github.com/{my_repository}
Note: The contents inside {} change accordingly.
You are now ready to do version management on the repository and push your changes to GitHub.
- commit. You can do this inside R Studio. Ensure you add a commit message that reflects what has changed.
- push. You can do this inside R Studio or Sourcetree
- pull. You can do this inside R Studio or Sourcetree
- branch. You can do this in Sourcetree
- merge branches. You can do this in Sourcetree