Control matching behaviour with modifier functions.

fixed: Compare literal bytes in the string. This is very fast, but not usually what you want for non-ASCII character sets.
coll: Compare strings respecting standard collation rules.
regex: The default. Uses ICU regular expressions.
boundary: Match boundaries between characters, lines, sentences, or words.

Usage

fixed(pattern, ignore_case = FALSE)

coll(pattern, ignore_case = FALSE, locale = "en", ...)

regex(pattern, ignore_case = FALSE, multiline = FALSE,
  comments = FALSE, dotall = FALSE, ...)

boundary(type = c("character", "line_break", "sentence", "word"),
  skip_word_none = NA, ...)

Arguments

pattern	Pattern to modify behaviour.
ignore_case	Should case differences be ignored in the match?
locale	Locale to use for comparisons. See `stringi::stri_locale_list()` for all possible options. Defaults to "en" (English) to ensure that the default collation is consistent across platforms.
...	Other less frequently used arguments passed on to `stringi::stri_opts_collator()`, `stringi::stri_opts_regex()`, or `stringi::stri_opts_brkiter()`
multiline	If `TRUE`, `^` and `$` match the beginning and end of each line. If `FALSE`, the default, they only match the start and end of the input.
comments	If `TRUE`, white space and comments beginning with `#` are ignored. Escape a literal space with a backslash (`\\ ` inside an R string).
dotall	If `TRUE`, `.` will also match line terminators.
type	Boundary type to detect.
	`character`: Every character is a boundary.
	`line_break`: Boundaries are places where it is acceptable to have a line break in the current locale.
	`sentence`: The beginnings and ends of sentences are boundaries, using intelligent rules to avoid counting abbreviations.
	`word`: The beginnings and ends of words are boundaries.
skip_word_none	Ignore "words" that don't contain any letters or numbers, i.e. punctuation. The default, `NA`, skips such "words" only when splitting on `word` boundaries.
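
The `type` and `skip_word_none` arguments can be illustrated with a minimal sketch like the one below. The input strings and variable names are made up for illustration, and the exact pieces reported when `skip_word_none = FALSE` may depend on the ICU version, so only the counts that follow from the argument descriptions are shown.

```r
library(stringr)

# Hypothetical inputs, chosen only for illustration
sentences <- "This is a sentence. This is another."
words <- "These are   some words."

# Each of the two sentences is counted once
n_sentences <- str_count(sentences, boundary("sentence"))
n_sentences
#> [1] 2

# Every character is a boundary, so this counts the characters
n_chars <- str_count("abc", boundary("character"))
n_chars
#> [1] 3

# By default, "words" made only of spaces or punctuation are skipped ...
n_words <- str_count(words, boundary("word"))
n_words
#> [1] 4

# ... while skip_word_none = FALSE also counts those pieces,
# so the result is larger than the default count
n_all <- str_count(words, boundary("word", skip_word_none = FALSE))
```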

Examples

pattern <- "a.b"
strings <- c("abb", "a.b")
str_detect(strings, pattern)
#> [1] TRUE TRUE
str_detect(strings, fixed(pattern))
#> [1] FALSE  TRUE
str_detect(strings, coll(pattern))
#> [1] FALSE  TRUE

# coll() is useful for locale-aware case-insensitive matching
i <- c("I", "\u0130", "i")
i
#> [1] "I" "İ" "i"
str_detect(i, fixed("i", TRUE))
#> [1]  TRUE FALSE  TRUE
str_detect(i, coll("i", TRUE))
#> [1]  TRUE FALSE  TRUE
str_detect(i, coll("i", TRUE, locale = "tr"))
#> [1] FALSE  TRUE  TRUE

# Word boundaries
words <- c("These are   some words.")
str_count(words, boundary("word"))
#> [1] 4
str_split(words, " ")[[1]]
#> [1] "These"  "are"    ""       ""       "some"   "words."
str_split(words, boundary("word"))[[1]]
#> [1] "These" "are"   "some"  "words"

# Regular expression variations
str_extract_all("The Cat in the Hat", "[a-z]+")
#> [[1]]
#> [1] "he"  "at"  "in"  "the" "at" 
#> 
str_extract_all("The Cat in the Hat", regex("[a-z]+", TRUE))
#> [[1]]
#> [1] "The" "Cat" "in"  "the" "Hat"
#> 

str_extract_all("a\nb\nc", "^.")
#> [[1]]
#> [1] "a"
#> 
str_extract_all("a\nb\nc", regex("^.", multiline = TRUE))
#> [[1]]
#> [1] "a" "b" "c"
#> 

str_extract_all("a\nb\nc", "a.")
#> [[1]]
#> character(0)
#> 
str_extract_all("a\nb\nc", regex("a.", dotall = TRUE))
#> [[1]]
#> [1] "a\n"
#>
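
The examples above do not cover `comments = TRUE`. As a rough sketch, with a made-up digit pattern and hypothetical variable names, a pattern can be laid out over several lines with `#` comments, and a literal space has to be escaped to survive:

```r
library(stringr)

# White space and `#` comments inside the pattern are ignored
digits <- regex("
  \\d{3}   # three digits
  -        # a literal hyphen
  \\d{4}   # four digits
", comments = TRUE)

hyphenated <- str_detect(c("555-1234", "5551234"), digits)
hyphenated
#> [1]  TRUE FALSE

# An unescaped space is dropped from the pattern; `\\ ` keeps it as a literal space
spaced <- regex("\\d{3} \\  \\d{4}", comments = TRUE)

with_space <- str_detect(c("555 1234", "5551234"), spaced)
with_space
#> [1]  TRUE FALSE
```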
