Add unit test for convert_tokens
lebebr01 committed Mar 30, 2018
1 parent aee2d3d commit 27ecb5a
Showing 2 changed files with 22 additions and 0 deletions.
8 changes: 8 additions & 0 deletions NEWS.md
@@ -1,3 +1,11 @@
# pdfsearch 0.2.0

* Added `remove_hyphen` argument to remove hyphens from words that wrap across two lines.
* Added `convert_tokens` function that uses the tokenizers R package to convert text to tokens (see the usage sketch below).
* Created vignette with expanded details.
* Created JOSS paper for submission.
* Created code of conduct and contributing policies.

# pdfsearch 0.1.1

* Added additional examples of usage to documentation.
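A short usage sketch of the new `convert_tokens` function, based only on the arguments that appear in the test added in this commit (`x`, `path`, and `token_function`); treat it as a minimal example under those assumptions, not the full documented interface.

```r
library(pdfsearch)

# Path to a pdf bundled with the package.
path <- system.file('pdf', '1610.00147.pdf', package = 'pdfsearch')

# Default behavior: one list element per page, each holding word tokens.
word_tokens <- convert_tokens(x = path, path = TRUE)

# Any tokenizer from the tokenizers package can be swapped in.
line_tokens <- convert_tokens(x = path, path = TRUE,
                              token_function = tokenizers::tokenize_lines)
```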
14 changes: 14 additions & 0 deletions tests/testthat/test_convert_tokens.r
@@ -0,0 +1,14 @@
context("Test token conversion")

test_that('convert_tokens splits a pdf into tokens by page', {
  path <- system.file('pdf', '1610.00147.pdf', package = 'pdfsearch')

  # The example pdf has 31 pages, so the result should be a list of 31.
  expect_output(str(convert_tokens(x = path, path = TRUE)),
                "List of 31")

  page_one_words <- length(convert_tokens(x = path, path = TRUE)[[1]][[1]])

  # Tokenizing by lines should yield fewer tokens on page one than
  # tokenizing by words.
  expect_lt(length(convert_tokens(x = path, path = TRUE,
                                  token_function = tokenizers::tokenize_lines)[[1]][[1]]),
            page_one_words)
})
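The `expect_lt` check rests on a page having fewer lines than words. A standalone sketch using only the two tokenizers functions referenced in the test (not part of this commit) illustrates the relationship:

```r
library(tokenizers)

txt <- "pdf text usually has\nmany words per line"

# Word tokens outnumber line tokens for the same text.
length(tokenize_words(txt)[[1]])  # 8 word tokens
length(tokenize_lines(txt)[[1]])  # 2 line tokens
```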
