spark - open-ended tracking of user behavior on Stack Overflow

• Established relationships between the following pairs [lxml, pyspark]
    ◦ favorites and up/down vote ratio
    ◦ user reputation and post type ratio (question vs. answer)
    ◦ user reputation and number of posts
    ◦ time of day and waiting time for an answer
    ◦ first post response quality and site tenure
• Developed synonym finder [word2vec]
• Predicted question tags from body text [pyspark.ml, regex]
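
As an illustration of the post type ratio analysis listed above, here is a minimal sketch of counting questions and answers per user. It is not the project's code: it uses the standard library's `xml.etree.ElementTree` in place of lxml and plain Python in place of pyspark so that it runs without extra dependencies. The sample rows are made up, and the attribute names (`PostTypeId`, `OwnerUserId`, with `1` meaning question and `2` meaning answer) are assumed to follow the public Stack Overflow data dump's `Posts.xml` layout.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample lines, assumed to mimic the data dump's Posts.xml rows.
SAMPLE_ROWS = [
    '<row Id="1" PostTypeId="1" OwnerUserId="10" />',
    '<row Id="2" PostTypeId="2" OwnerUserId="10" />',
    '<row Id="3" PostTypeId="2" OwnerUserId="10" />',
    '<row Id="4" PostTypeId="1" OwnerUserId="20" />',
    '<posts>',  # non-row line, skipped by parse_row
]


def parse_row(line):
    """Return (user_id, post_kind) for a <row .../> line, or None if unusable."""
    line = line.strip()
    if not line.startswith("<row"):
        return None
    try:
        attrs = ET.fromstring(line).attrib
    except ET.ParseError:
        return None
    user = attrs.get("OwnerUserId")
    post_type = attrs.get("PostTypeId")
    # Assumption: "1" marks a question and "2" an answer; other types are ignored.
    if user is None or post_type not in ("1", "2"):
        return None
    return int(user), ("question" if post_type == "1" else "answer")


def post_type_counts(lines):
    """Count questions and answers per user id."""
    counts = defaultdict(lambda: {"question": 0, "answer": 0})
    for line in lines:
        parsed = parse_row(line)
        if parsed is not None:
            user, kind = parsed
            counts[user][kind] += 1
    return dict(counts)


def answer_ratio(count):
    """Fraction of a user's posts that are answers (0.0 if the user has no posts)."""
    total = count["question"] + count["answer"]
    return count["answer"] / total if total else 0.0


counts = post_type_counts(SAMPLE_ROWS)
ratios = {user: answer_ratio(c) for user, c in counts.items()}
```

Because `parse_row` works on one line at a time and returns `None` for lines it cannot use, the same function could in principle be mapped over a distributed collection of lines and filtered before a per-user aggregation. Joining the resulting ratios with user reputation is left out of this sketch.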