This function reads a frequency list (an object of class freqlist) from a csv file. The file is assumed to contain two columns: the type, followed by the frequency of that type. It is also assumed to start with a header line that names both columns. Note that, although the file is referred to as a csv file, the default column separator is a tab (sep = "\t").
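To make the expected layout concrete, the following base R sketch writes a small two-column, tab-separated file with a header line and reads it back. It does not use read_freqlist itself; the column names "type" and "frequency" and the three sample rows are assumptions made for illustration only.

```r
# Hypothetical input file: two tab-separated columns plus a header line.
# The column names "type" and "frequency" are assumed, not taken from the package.
path <- tempfile(fileext = ".csv")
writeLines(c("type\tfrequency", "a\t2", "it\t2", "after\t1"), path)

# Plain base R read of such a file (not the package's own implementation).
tab <- read.delim(path, sep = "\t", header = TRUE,
                  fileEncoding = "UTF-8", stringsAsFactors = FALSE)
str(tab)

# Clean up the temporary file.
unlink(path)
```

A file with this shape, saved with the default tab separator, is the kind of input the function description refers to.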

Usage

read_freqlist(file, sep = "\t", file_encoding = "UTF-8", ...)

Arguments

file

Character vector of length 1. Path to the input file.

sep

Character vector of length 1. Column separator.

file_encoding

File encoding used in the input file.

...

Additional arguments (not implemented).

Value

Object of class freqlist.

Details

read_freqlist not only reads the input file, but also checks whether a configuration file exists whose name is identical to that of file, except that it has the filename extension ".yaml".

If such a file exists, it is taken to 'belong' to file and is read as well; the frequency list attributes "tot_n_tokens" and "tot_n_types" are then retrieved from it.

If no such configuration file exists, then the values for "tot_n_tokens" and "tot_n_types" are calculated on the basis of the frequencies in the frequency list.
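The lookup described above can be sketched in base R as follows. This is a minimal illustration, not the package's code: resolve_totals() is a hypothetical helper, the rule of swapping the file extension for ".yaml" is one reading of the naming convention, and counting entries to obtain the number of types is an assumption about how the fallback values are computed.

```r
# Sketch only: resolve_totals() is a hypothetical helper, not part of the package.
resolve_totals <- function(file, freqs) {
  # Assumed naming rule: same path, with the extension replaced by ".yaml".
  config_file <- paste0(tools::file_path_sans_ext(file), ".yaml")
  if (file.exists(config_file)) {
    # A real implementation would parse the YAML file here and
    # take the two totals from it.
    return(list(source = "config", config_file = config_file))
  }
  # Fallback: derive the totals from the frequencies themselves.
  list(source = "computed",
       tot_n_tokens = sum(freqs),     # sum of all frequencies
       tot_n_types = length(freqs))   # assumed: one entry per type
}

# Toy frequencies and a path with no accompanying ".yaml" file.
toy_freqs <- c(a = 2, it = 2, after = 1)
totals <- resolve_totals(tempfile(fileext = ".csv"), toy_freqs)
```

One practical consequence of this design is that totals stored in a configuration file can differ from the sums over the listed items, for instance when the saved list is only part of a larger frequency list.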

Examples

toy_corpus <- "Once upon a time there was a tiny toy corpus.
It consisted of three sentences. And it lived happily ever after."
freqs <- freqlist(toy_corpus, as_text = TRUE)

print(freqs, n = 1000)
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
.old_wd <- setwd(tempdir())
write_freqlist(freqs, "example_freqlist.csv")
freqs2 <- read_freqlist("example_freqlist.csv")
print(freqs2, n = 1000)
#> Frequency list (types in list: 19, tokens in list: 21)
#> rank      type abs_freq nrm_freq
#> ---- --------- -------- --------
#>    1         a        2  952.381
#>    2        it        2  952.381
#>    3     after        1  476.190
#>    4       and        1  476.190
#>    5 consisted        1  476.190
#>    6    corpus        1  476.190
#>    7      ever        1  476.190
#>    8   happily        1  476.190
#>    9     lived        1  476.190
#>   10        of        1  476.190
#>   11      once        1  476.190
#>   12 sentences        1  476.190
#>   13     there        1  476.190
#>   14     three        1  476.190
#>   15      time        1  476.190
#>   16      tiny        1  476.190
#>   17       toy        1  476.190
#>   18      upon        1  476.190
#>   19       was        1  476.190
setwd(.old_wd)