This method takes an object x that represents a sequence of character data,
such as an object of class tokens, and truncates it at the position where a
match for the argument pattern is found. It is currently implemented only for
tokens objects.
Usage
trunc_at(x, pattern, ...)
# S3 method for tokens
trunc_at(
  x,
  pattern,
  keep_this = FALSE,
  last_match = FALSE,
  from_end = FALSE,
  ...
)
Arguments
- x
An object that represents a sequence of character data.
- pattern
A regular expression.
- ...
Additional arguments.
- keep_this
Logical. Whether the matching token itself should be kept. If TRUE, the
truncating happens right after the matching token; if FALSE, right before.
- last_match
Logical. In case there are several matching tokens, if last_match is TRUE,
the last match will be used as truncating point; otherwise, the first match will.
- from_end
Logical. If FALSE, the match starts from the first token progressing forward;
if TRUE, it starts from the last token progressing backward.
If from_end is FALSE, the part of x that is kept after truncation is the head
of x. If it is TRUE instead, the part that is kept after truncation is the
tail of x.
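The interaction of keep_this, last_match and from_end can be sketched in base R on a plain character vector. This is only an illustration of how the argument descriptions above may be read, not the package's implementation: trunc_sketch is a hypothetical helper, it takes an ordinary regular expression string instead of an re() object, and the choices to reverse the vector for from_end and to leave the input unchanged when nothing matches are assumptions.

```r
# Hypothetical helper, NOT part of the package: a base R sketch of the
# truncation rules described in the Arguments section.
trunc_sketch <- function(x, pattern,
                         keep_this = FALSE,
                         last_match = FALSE,
                         from_end = FALSE) {
  # Assumption: searching backward is modelled by reversing the vector.
  if (from_end) x <- rev(x)

  hits <- grep(pattern, x, perl = TRUE)

  # Assumption: without a match, the input is returned unchanged.
  if (length(hits) == 0) {
    return(if (from_end) rev(x) else x)
  }

  # First or last match, counted in the direction of the search.
  pos <- if (last_match) max(hits) else min(hits)

  # Drop the matching token itself unless keep_this is TRUE.
  if (!keep_this) pos <- pos - 1

  kept <- x[seq_len(pos)]

  # Restore the original order, so that the tail of x is returned.
  if (from_end) rev(kept) else kept
}

toks <- c("this", "is", "a", "first", "sentence", ".",
          "this", "is", "a", "second", "sentence", ".")

trunc_sketch(toks, "[.]")
#> [1] "this"     "is"       "a"        "first"    "sentence"

trunc_sketch(toks, "[.]", keep_this = TRUE)
#> [1] "this"     "is"       "a"        "first"    "sentence" "."

trunc_sketch(toks, "[.]", last_match = TRUE, from_end = TRUE)
#> [1] "this"     "is"       "a"        "second"   "sentence" "."
```

Under this reading, last_match is interpreted relative to the search direction: with from_end = TRUE, the last match met while moving backward is the earliest matching token in x, so the kept tail starts right after it.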
Examples
(toks <- tokenize('This is a first sentence . This is a second sentence .',
                  re_token_splitter = '\\s+'))
#> Token sequence of length 12
#> idx token
#> --- --------
#>   1 this
#>   2 is
#>   3 a
#>   4 first
#>   5 sentence
#>   6 .
#>   7 this
#>   8 is
#>   9 a
#>  10 second
#>  11 sentence
#>  12 .
trunc_at(toks, re("[.]"))
trunc_at(toks, re("[.]"), last_match = TRUE)
trunc_at(toks, re("[.]"), last_match = TRUE, from_end = TRUE)
