Chapter 6 Working with Strings
6.1 Remove a pattern from a string
library(tidyverse) # provides tribble() and the stringr functions used in this chapter
price_table = tribble(~car, ~price,
                      "Corvette", "$65,000",
                      "Mustang GT", "$40,000")
# BASE R METHOD (sub by replacing something with nothing)
gsub("\\$", "", price_table$price) # (pattern, replace with, object$column)
## [1] "65,000" "40,000"
# TIDYVERSE METHOD
str_remove(price_table$price, pattern = "\\$")
## [1] "65,000" "40,000"
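To go one step further and get numeric prices, the comma has to go as well. The sketch below is one way to do that, using a stand-in vector (`prices`) in place of price_table$price. It relies on str_remove_all(), the stringr counterpart that removes every match.

```r
library(stringr)

# Stand-in for price_table$price
prices <- c("$65,000", "$40,000")

# str_remove() drops only the first match of the pattern in each string
str_remove(prices, pattern = "[$,]")
## [1] "65,000" "40,000"

# str_remove_all() drops every match, so both "$" and "," are removed
clean_prices <- str_remove_all(prices, pattern = "[$,]")

# Base R equivalent: gsub() replaces all matches
clean_prices_base <- gsub("[$,]", "", prices)

# With the symbols gone, the strings convert cleanly to numbers
as.numeric(clean_prices)
## [1] 65000 40000
```

Inside square brackets the dollar sign is treated literally, so it does not need the double backslash used above.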
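You can remove digits with the pattern "[:digit:]". Note that str_remove() removes only the first digit in each string; use str_remove_all() to strip every digit.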
panss_sem_data$cgi_sev = str_remove(panss_sem_data$cgi_sev, pattern = "[:digit:]")
6.2 Replace one pattern in a string with another
Tidyverse command: str_replace() or str_replace_all()
Base R command: gsub()
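The two tidyverse commands differ in how many matches they touch: str_replace() changes only the first match in each string, while str_replace_all() changes every match. Base R has the same split between sub() and gsub(). A small sketch on a made-up vector (the `fruit` values are illustrative only):

```r
library(stringr)

fruit <- c("banana", "apple")  # illustrative data

# First match only
str_replace(fruit, pattern = "a", replacement = "A")
## [1] "bAnana" "Apple"
sub("a", "A", fruit)
## [1] "bAnana" "Apple"

# Every match
str_replace_all(fruit, pattern = "a", replacement = "A")
## [1] "bAnAnA" "Apple"
gsub("a", "A", fruit)
## [1] "bAnAnA" "Apple"
```

Note the argument order: the stringr functions take the string first, whereas sub() and gsub() take the pattern first.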
# base R
gsub("e|a", "ZZZZ", iris$Species) |> # (pattern, replace with, object$column)
head()
#tidyverse
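# str_replace_all() replaces every match; "e|a" matches either letter
# (a vector such as c("e", "a") is recycled across the strings, one pattern per string)
str_replace_all(iris$Species, pattern = "e|a", replacement = "ZZZZ") |>
head()
# str_replace() replaces only the first match in each string
str_replace(iris$Species, pattern = "e|a", replacement = "ZZZZ") |>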
head()
6.3 Find (i.e., filter for) all instances of a string
This is useful for finding specific values inside a column (e.g., one particular person’s name in a roster of names, or everyone with a particular last name).
Tidyverse command: str_detect()
Base R command: grepl()
Note that both must be nested inside filter() when used to subset the rows of a data frame.
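The reason both work inside filter() is that each returns a logical vector with one TRUE/FALSE per element, which filter() then uses to keep or drop rows. A short sketch with a made-up character vector (`car_names` is illustrative only):

```r
library(stringr)

car_names <- c("Pontiac Firebird", "Fiat 128", "Honda Civic")  # illustrative data

# Both functions return one TRUE/FALSE per element
str_detect(car_names, "Firebird")   # string first, then pattern
## [1]  TRUE FALSE FALSE
grepl("Firebird", car_names)        # pattern first, then string
## [1]  TRUE FALSE FALSE

# The logical vector can also subset a plain vector directly, without filter()
car_names[str_detect(car_names, "Firebird")]
## [1] "Pontiac Firebird"
```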
cars_df=rownames_to_column(mtcars, var = "car")
# base R
cars_df |> filter(grepl("Firebird", car))
# tidyverse
cars_df |> filter(str_detect(car, "Firebird"))
You can also search for multiple strings simultaneously by including the “or” logical operator inside the quotes.
cars_df |> filter(str_detect(car, "Firebird|Fiat"))
You can also include the negation logical operator to filter for all instances except those with the specified string.
# base R
cars_df |> filter(!(grepl("Pontiac", car)))
# tidyverse
cars_df |> filter(!(str_detect(car, "Pontiac")))
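The same approach covers the roster use case mentioned at the start of this section. A minimal sketch, assuming a hypothetical `roster` tibble with a single `name` column (the names are made up); it also shows str_detect()'s `negate` argument as an alternative to wrapping the call in !():

```r
library(dplyr)
library(stringr)

# Hypothetical roster; names are made up for illustration
roster <- tibble(name = c("Ana Smith", "Ben Jones", "Cara Smith", "Dev Patel"))

# Everyone with the last name Smith
smiths <- roster |> filter(str_detect(name, "Smith"))

# Everyone except the Smiths; negate = TRUE flips the TRUE/FALSE results
not_smiths <- roster |> filter(str_detect(name, "Smith", negate = TRUE))

# Either of two last names, using the "or" operator inside the pattern
smith_or_patel <- roster |> filter(str_detect(name, "Smith|Patel"))
```

Because the pattern matches anywhere in the string, "Smith" would also match a first name such as "Smithson"; a stricter pattern may be needed for real rosters.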