Arguments
- x
Object of class table, or a named numeric vector that will be interpreted as such.
- tot_n_tokens
Number representing the total number of tokens in the corpus from which the frequency list is derived. When tot_n_tokens is NULL, this total number of tokens is taken to be the sum of the frequencies in x.
- sort_by_ranks
Logical. If TRUE, the items in the frequency list are sorted by frequency rank. If FALSE, the items in the frequency list are, depending on the input type, either sorted alphabetically or not sorted at all.
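The two defaults described above can be sketched in base R. This is only an illustration under stated assumptions: sketch_freqs() is a hypothetical helper, not the package's implementation, and its handling of ties (original order is kept) is an assumption as well.

```r
# Hypothetical helper, not part of the package: mimics the documented
# defaults for tot_n_tokens and sort_by_ranks using base R only.
sketch_freqs <- function(x, tot_n_tokens = NULL, sort_by_ranks = TRUE) {
  # When no total is supplied, fall back to the sum of the frequencies in x.
  if (is.null(tot_n_tokens)) tot_n_tokens <- sum(x)
  # Rank sorting: most frequent item first (ties keep their input order).
  if (sort_by_ranks) x <- x[order(-x)]
  list(freqs = x, tot_n_tokens = tot_n_tokens)
}

res <- sketch_freqs(c(a = 12, toy = 53, example = 20))
names(res$freqs)   # "toy" "example" "a"
res$tot_n_tokens   # 85
```

Passing tot_n_tokens = 1300 to the helper would leave the order unchanged and only replace the stored total.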
Value
An object of class freqlist, which is based on the class table.
It has additional attributes and methods such as:
- base print(), as_data_frame(), summary() and sort() methods,
- an interactive explore() method,
- various getters, including tot_n_tokens(), n_types(), n_tokens(), values that are also returned by summary(), and more,
- subsetting methods such as keep_types(), keep_pos(), etc., including [] subsetting (see brackets).
Additional manipulation functions include type_freqs() to extract the frequencies of different items, freqlist_merge() to combine frequency lists, and freqlist_diff() to subtract one frequency list from another.
Objects of class freqlist can be saved to file with write_freqlist(); these files can be read with read_freqlist().
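As a brief sketch of the getters named above: the snippet assumes that the package providing as_freqlist() is mclm, that it is installed, and that each getter takes the frequency list as its only required argument. These are assumptions to check against the package reference.

```r
library(mclm)  # assumption: the package that provides as_freqlist()

# Same toy frequency list as in the Examples, with an explicit corpus size.
flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20),
                     tot_n_tokens = 1300)

n_types(flist)       # number of types in the list
n_tokens(flist)      # sum of the frequencies in the list
tot_n_tokens(flist)  # total number of tokens supplied above
```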
Examples
toy_corpus <- "Once upon a time there was a tiny toy corpus.
It consisted of three sentences. And it lived happily ever after."
## make frequency list in a roundabout way
tokens <- tokenize(toy_corpus)
flist <- as_freqlist(table(tokens))
flist
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
## more direct procedure
freqlist(toy_corpus, as_text = TRUE)
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
## build frequency list from scratch: example 1
flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20))
flist
#> Frequency list (types in list: 3, tokens in list: 85)
#> rank    type abs_freq nrm_freq
#> ---- ------- -------- --------
#>    1     toy       53 6235.294
#>    2 example       20 2352.941
#>    3       a       12 1411.765
## build frequency list from scratch: example 2
flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20),
                     tot_n_tokens = 1300)
flist
#> Frequency list (types in list: 3, tokens in list: 85)
#> <total number of tokens: 1300>
#> rank    type abs_freq nrm_freq
#> ---- ------- -------- --------
#>    1     toy       53  407.692
#>    2 example       20  153.846
#>    3       a       12   92.308
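
The nrm_freq values in the last two outputs are consistent with a rate per 10,000 tokens, computed against the total number of tokens. The scaling factor is inferred from the printed numbers, not taken from the package documentation, so treat it as an assumption. A base R check:

```r
## check the normalized frequencies by hand (base R only)
abs_freq <- c(toy = 53, example = 20, a = 12)

# Assumption inferred from the printed output: frequencies per 10,000 tokens.
per <- 10000

# Example 1: no tot_n_tokens given, so the total is sum(abs_freq) = 85.
nrm_default <- round(abs_freq / sum(abs_freq) * per, 3)
nrm_default
# toy 6235.294, example 2352.941, a 1411.765

# Example 2: total number of tokens set to 1300.
nrm_1300 <- round(abs_freq / 1300 * per, 3)
nrm_1300
# toy 407.692, example 153.846, a 92.308
```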