- The replacement for perl() is regex(), not regexp() (#61).
- str_subset() now allows custom options to be set for fixed pattern search (#79).
- stringr is now powered by stringi instead of base R regular expressions. This improves Unicode support and makes most operations considerably faster. If you find stringr inadequate for your string processing needs, I highly recommend looking at stringi in more detail.
- stringr gains a vignette, currently a straightforward update of the article that appeared in the R Journal.
- str_c() now returns a zero-length vector if any of its inputs are zero-length vectors. This is consistent with all other functions, and standard R recycling rules. Similarly, using str_c("x", NA) now yields NA. If you want "xNA", use str_replace_na() on the inputs.
- str_replace_all() gains a convenient syntax for applying multiple pairs of pattern and replacement to the same vector:

      input <- c("abc", "def")
      str_replace_all(input, c("[ad]" = "!", "[cf]" = "?"))
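
The new str_c() rules can be checked with a short sketch. It assumes stringr is installed and attached; the variable names are illustrative only and do not come from the package.

```r
library(stringr)

# A zero-length input gives a zero-length result.
empty <- str_c("x", character(0))

# A missing value propagates instead of being pasted in as text.
missing <- str_c("x", NA)

# Convert NA to the string "NA" first to recover the old pasting behaviour.
pasted <- str_c("x", str_replace_na(NA_character_))
```

Calling str_replace_na() on the inputs makes the choice to treat NA as text explicit at the call site, rather than relying on implicit coercion.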
- str_match() now returns NA if an optional group doesn't match (previously it returned ""). This is more consistent with str_extract() and other match failures.
- New str_subset() keeps values that match a pattern. It's a convenient wrapper for x[str_detect(x, pattern)] (#21, @jiho).
- New str_order() and str_sort() allow you to sort and order strings in a specified locale.
- New str_conv() converts strings from a specified encoding to UTF-8.
- New modifier boundary() allows you to count, locate and split by character, word, line and sentence boundaries.
- The documentation got a lot of love, and very similar functions (e.g. first and all variants) are now documented together. This should hopefully make it easier to locate the function you need.
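
The additions above can be tried together in a small sketch. It assumes stringr is attached; the sample vectors and variable names are made up for illustration.

```r
library(stringr)

# An optional group that takes no part in the match is reported as NA.
m <- str_match("a", "(a)(b)?")

# str_subset() keeps the elements that match, like x[str_detect(x, pattern)].
fruit <- c("apple", "banana", "cherry")
kept <- str_subset(fruit, "an")
same <- fruit[str_detect(fruit, "an")]

# Locale-aware ordering and sorting.
letters_in <- c("b", "a", "c")
ord <- str_order(letters_in, locale = "en")
srt <- str_sort(letters_in, locale = "en")

# boundary() lets you count and split by words instead of by a regex.
n_words <- str_count("one two three", boundary("word"))
words <- str_split("one two three", boundary("word"))[[1]]
```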
- ignore.case(x) has been deprecated in favour of fixed|regex|coll(x, ignore_case = TRUE), and perl(x) has been deprecated in favour of regex(x).
- str_join() is deprecated, please use str_c() instead.
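
A sketch of the replacement calls for the deprecated helpers follows. It assumes a current stringr release, where the modifier argument is spelled ignore_case; the sample strings are illustrative.

```r
library(stringr)

# Case-insensitive matching through each modifier, instead of ignore.case().
by_regex <- str_detect("Hello", regex("hello", ignore_case = TRUE))
by_fixed <- str_detect("Hello", fixed("hello", ignore_case = TRUE))
by_coll  <- str_detect("Hello", coll("hello", ignore_case = TRUE))

# str_c() takes the place of str_join().
joined <- str_c("a", "b", sep = "-")
```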
- fixed path in str_wrap example so it works for more R installations.
- remove dependency on plyr
- Zero input to str_split_fixed returns 0 row matrix with n columns
- Export str_join
- new modifier perl that switches to Perl regular expressions
- str_match now uses new base function regmatches to extract matches - this should hopefully be faster than my previous pure R algorithm
- new str_wrap function which gives strwrap output in a more convenient format
- new word function extracts words from a string given a user-defined separator (thanks to a suggestion by David Cooper)
- str_locate now returns consistent type when matching empty string (thanks to Stavros Macrakis)
- new str_count counts number of matches in a string.
- str_pad and str_trim receive performance tweaks - for large vectors this should give a speed up of at least two orders of magnitude
- str_length returns NA for invalid multibyte strings
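
The helpers introduced in this group can be sketched as below. The example assumes stringr is attached and uses current argument names; the inputs are made up for illustration.

```r
library(stringr)

# Count the matches of a pattern in a string.
n_a <- str_count("banana", "a")

# Extract a word by position, with the default separator or a custom one.
second <- word("The quick brown fox", 2)
field  <- word("a,b,c", 2, sep = fixed(","))

# str_wrap() returns one string with embedded newlines,
# where strwrap() would return a vector of lines.
wrapped <- str_wrap("aaa bbb ccc", width = 5)

# Padding and trimming.
padded  <- str_pad("a", 3, side = "both")
trimmed <- str_trim("  a  ")
```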
- fix small bug in internal recyclable function
- all functions now vectorised with respect to string, pattern and (where appropriate) replacement parameters
- fixed() function now tells stringr functions to use fixed matching, rather than escaping the regular expression. Should improve performance for large vectors.
- new ignore.case() modifier tells stringr functions to ignore case of pattern.
- str_replace renamed to str_replace_all and new str_replace function added. This makes str_replace consistent with all functions.
- new str_sub<- function (analogous to substring<-) for substring replacement
- str_sub now understands negative positions as a position from the end of the string. -1 replaces Inf as indicator for string end.
- str_pad side argument can be left, right, or both (instead of center)
- str_trim gains side argument to better match str_pad
- stringr now has a namespace and imports plyr (rather than requiring it)
- fixed() now also escapes |
- str_join() renamed to str_c()
- all functions more carefully check input and return informative error messages if not as expected.
- add invert_match() function to convert a matrix of location of matches to locations of non-matches
- add fixed() function to allow matching of fixed strings.
- str_length now returns correct results when used with factors
- str_sub now correctly replaces Inf in end argument with length of string
- new function str_split_fixed returns fixed number of splits in a character matrix
- str_split no longer uses strsplit, so that trailing breaks are preserved
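
Several of the changes listed above can be illustrated with a short sketch. It assumes stringr is attached; the strings and variable names are illustrative only.

```r
library(stringr)

# Negative positions count from the end of the string; -1 is the last character.
tail3 <- str_sub("hello", -3, -1)
rest  <- str_sub("hello", 2, -1)

# The replacement form str_sub<- edits part of a string in place.
greeting <- "hello"
str_sub(greeting, 1, 1) <- "J"

# str_replace() changes the first match, str_replace_all() changes every match.
first_only <- str_replace("banana", "a", "o")
every      <- str_replace_all("banana", "a", "o")

# fixed() matches the pattern literally, so "." is not a wildcard.
dot_literal <- str_detect("a.b", fixed("."))
dot_absent  <- str_detect("axb", fixed("."))

# str_split_fixed() returns a character matrix with n columns.
parts <- str_split_fixed("a-b-c", "-", n = 2)
```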