Arguments
- x
Object of class table or named numeric vector that will be interpreted as such.
- tot_n_tokens
Number representing the total number of tokens in the corpus from which the frequency list is derived. When tot_n_tokens is NULL, this total number of tokens will be taken to be the sum of the frequencies in x.
- sort_by_ranks
Logical. If TRUE, the items in the frequency list are sorted by frequency rank. If FALSE, the items in the frequency list, depending on the input type, either are sorted alphabetically or are not sorted at all.
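The effect of tot_n_tokens and of rank sorting can be illustrated in base R, without the package. This is only a sketch: the per-10,000-tokens scale is inferred from the printed output in the Examples section, and the alphabetical tie-breaking is an assumption, not something the argument descriptions state.

```r
# Toy counts, same values as in the "from scratch" examples.
freqs <- c("a" = 12, "toy" = 53, "example" = 20)

# tot_n_tokens = NULL: the total defaults to the sum of the frequencies in x.
tot_default <- sum(freqs)

# Normalized frequencies, assuming a per-10,000-tokens scale
# (inferred from the printed examples, not documented above).
nrm_default <- freqs / tot_default * 10000
nrm_custom  <- freqs / 1300 * 10000   # as if tot_n_tokens = 1300

# Rank order: highest frequency first; ties broken alphabetically
# (the tie-breaking rule is an assumption made for this sketch).
by_rank <- freqs[order(-freqs, names(freqs))]

round(nrm_default, 3)
round(nrm_custom, 3)
names(by_rank)
```

With the default total, the normalized values are relative to the 85 tokens in the list itself; with a total of 1300 they shrink accordingly, while the absolute frequencies and the ranking stay the same.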
Value
An object of class freqlist, which is based on the class table.
It has additional attributes and methods such as:
- base print(), as_data_frame(), summary() and sort(),
- an interactive explore() method,
- various getters, including tot_n_tokens(), n_types(), n_tokens(), values that are also returned by summary(), and more,
- subsetting methods such as keep_types(), keep_pos(), etc. including [] subsetting (see brackets).
Additional manipulation functions include type_freqs() to extract the frequencies
of different items, freqlist_merge() to combine frequency lists, and
freqlist_diff() to subtract a frequency list from another.
Objects of class freqlist can be saved to file with write_freqlist();
these files can be read with read_freqlist().
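A brief sketch of how some of the functions named above might be used together. It assumes the mclm package is installed and that each function accepts the frequency list as its first argument; the temporary file path is purely illustrative, and outputs are not shown.

```r
library(mclm)

flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20),
                     tot_n_tokens = 1300)

# Getters named in the Value section.
n_types(flist)        # number of types in the list
n_tokens(flist)       # sum of the frequencies in the list
tot_n_tokens(flist)   # total number of tokens used for normalization

# Frequencies of selected items (second argument assumed to be a
# character vector of types).
type_freqs(flist, c("toy", "a"))

# Round trip through a file; the path is a throwaway temporary file.
path <- tempfile(fileext = ".csv")
write_freqlist(flist, path)
flist2 <- read_freqlist(path)
```

The three getters should correspond to the figures printed in the header of the frequency list.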
Examples
toy_corpus <- "Once upon a time there was a tiny toy corpus.
It consisted of three sentences. And it lived happily ever after."
## make frequency list in a roundabout way
tokens <- tokenize(toy_corpus)
flist <- as_freqlist(table(tokens))
flist
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
## more direct procedure
freqlist(toy_corpus, as_text = TRUE)
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
## build frequency list from scratch: example 1
flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20))
flist
#> Frequency list (types in list: 3, tokens in list: 85)
#> rank    type abs_freq nrm_freq
#> ---- ------- -------- --------
#>    1     toy       53 6235.294
#>    2 example       20 2352.941
#>    3       a       12 1411.765
## build frequency list from scratch: example 2
flist <- as_freqlist(c("a" = 12, "toy" = 53, "example" = 20),
                     tot_n_tokens = 1300)
flist
#> Frequency list (types in list: 3, tokens in list: 85)
#> <total number of tokens: 1300>
#> rank    type abs_freq nrm_freq
#> ---- ------- -------- --------
#>    1     toy       53  407.692
#>    2 example       20  153.846
#>    3       a       12   92.308
