Skip to content

Latest commit

 

History

History
57 lines (45 loc) · 1.62 KB

README.md

File metadata and controls

57 lines (45 loc) · 1.62 KB

poisrrr

The package poisrrr implements the Poisson Reduced Rank method introduced in C. Jentsch, E. R. Lee and E. Mammen (2020) Time-dependent Poisson reduced rank models for political text data analysis. Computational Statistics and Data Analysis, 142, 106813. See also C. Jentsch, E. Mammen and E. R. Lee (2021) Poisson reduced rank models with an application to political text data. Biometrika, 108, 2, 455 - 468

Installation

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("chroetz/poisrrr")

Example

Create a Term-Document-Matrix form quanteda’s inaugural address corpus.

library(magrittr)
quanteda::data_corpus_inaugural %>% 
  quanteda::tokens(remove_punct = TRUE, remove_symbols = TRUE, remove_numbers = TRUE) %>% 
  quanteda::dfm(verbose = FALSE) %>% 
  as.matrix() %>% 
  t() ->
  tdm
tdm <- tdm[rowSums(tdm) > 5, ] # remove rare words

Apply the method for K = 2 dimensions and plot the resulting plane with document positions.

library(poisrrr)
K <- 2
theta <- estim(tdm, K, verbose=FALSE)
lst <- theta2plist(theta, K)
v <- lst$v
plot(NA, xlim=c(-0.2, 0.22), ylim=c(-0.3, 0.25), xlab="Dimension 1", ylab="Dimension 2")
points(v)
lines(v)
labels <- rownames(v)
labels[-c(1,4,7,19,20,22,29,32,33,37,38,39,40,46,55,56,58)] <- NA
text(v, labels=labels, cex=0.8, pos=3, offset=0.2)