This function finds matches for an XPath query in a corpus.

Usage

find_xpath(x, pattern, fun = NULL, final_fun = NULL, namespaces = NULL, ...)

Arguments

x

A corpus: an fnames object, a character vector containing XML source, or a document parsed with xml2::read_xml().

pattern

An XPath query.

fun

Function to be applied to the individual nodes prior to returning the result.

final_fun

Function to be applied to the complete list of matches prior to returning the result.

namespaces

Namespace definitions, as generated by xml2::xml_ns().

...

Additional arguments passed on to fun, such as attr = "pos" when fun is xml2::xml_attr.

Value

A nodeset or, if fun is supplied, the output of applying fun to the nodeset.
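
To make the two kinds of return value concrete, here is a minimal sketch that uses xml2 directly. It is an illustration of what "a nodeset" and "the output of applying a function to a nodeset" look like, not a description of how find_xpath() is implemented internally; the variable names are hypothetical.

```r
library(xml2)

# Parse a small XML string (illustrative input, chosen for this sketch).
doc <- read_xml('<p><w pos="at">The</w><w pos="nn">example</w><punct>.</punct></p>')

# An XPath query on a parsed document yields an xml_nodeset.
nodes <- xml_find_all(doc, "//w")

# Applying a function to the nodeset yields an ordinary R vector instead.
texts <- xml_text(nodes)
tags  <- xml_attr(nodes, attr = "pos")
```

Here nodes is a nodeset with two elements, while texts and tags are plain character vectors of the same length.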

Examples

test_xml <- '
<p>
  <w pos="at">The</w>
  <w pos="nn">example</w>
  <punct>.</punct>
</p>'

find_xpath(test_xml, "//w")
#> {xml_nodeset (2)}
#> [1] <w pos="at">The</w>
#> [2] <w pos="nn">example</w>
find_xpath(test_xml, "//@pos")
#> {xml_nodeset (2)}
#> [1]  pos="at"
#> [2]  pos="nn"
find_xpath(test_xml, "//w[@pos='nn']")
#> {xml_nodeset (1)}
#> [1] <w pos="nn">example</w>

find_xpath(test_xml, "//w", fun = xml2::xml_text)
#> [1] "The"     "example"
find_xpath(test_xml, "//w", fun = xml2::xml_attr, attr = "pos")
#> [1] "at" "nn"
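
The namespaces argument matters when the XML declares a default namespace, because an unprefixed query such as "//w" then matches nothing. The sketch below shows this with xml2 alone, using a made-up namespace URI; the commented-out find_xpath() call at the end is an assumption about the corresponding usage, based on the argument descriptions above.

```r
library(xml2)

# XML with a default namespace (the URI is a hypothetical example).
ns_xml <- '<p xmlns="http://example.org/ns"><w pos="at">The</w></p>'
doc <- read_xml(ns_xml)

# xml_ns() collects the namespaces; a default namespace gets the prefix "d1".
ns <- xml_ns(doc)

# Without a prefix the query finds no nodes; with the prefix it finds one.
no_nodes <- xml_find_all(doc, "//w")
ns_nodes <- xml_find_all(doc, "//d1:w", ns)

# Presumed equivalent with this function (assumption, not run here):
# find_xpath(doc, "//d1:w", namespaces = ns)
```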