Spark - open-ended tracking of user behavior on Stack Overflow
• Established relationships between the following pairs [lxml, pyspark]:
◦ favorites and up/down vote ratio
◦ user reputation and post type ratio (question vs. answer)
◦ user reputation and number of posts
◦ time of day and waiting time for an answer
◦ first post response quality and site tenure
• Developed synonym finder [word2vec]
• Predicted question tags from body text [pyspark.ml, regex]
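
As an illustration of the "user reputation and number of posts" relation listed above, here is a minimal sketch. It rests on several assumptions: that the input resembles the `<row .../>` layout of the Stack Exchange data dump (attributes `Id`, `Reputation`, `OwnerUserId`, `PostTypeId`), that the standard-library `xml.etree.ElementTree` stands in for lxml, and that plain Python stands in for the pyspark join and aggregation. The sample rows and the helper names (`parse_rows`, `pearson`, `reputation_vs_post_count`) are hypothetical and not taken from the project.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from math import sqrt

# Hypothetical sample rows, shaped like the Stack Exchange data dump (an assumption).
USERS_XML = """<users>
  <row Id="1" Reputation="100" />
  <row Id="2" Reputation="2500" />
  <row Id="3" Reputation="9000" />
</users>"""

POSTS_XML = """<posts>
  <row Id="10" PostTypeId="1" OwnerUserId="1" />
  <row Id="11" PostTypeId="2" OwnerUserId="2" />
  <row Id="12" PostTypeId="2" OwnerUserId="2" />
  <row Id="13" PostTypeId="1" OwnerUserId="3" />
  <row Id="14" PostTypeId="2" OwnerUserId="3" />
  <row Id="15" PostTypeId="2" OwnerUserId="3" />
</posts>"""


def parse_rows(xml_text):
    """Return the attribute dict of every <row> element in the document."""
    return [dict(row.attrib) for row in ET.fromstring(xml_text).iter("row")]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def reputation_vs_post_count(users_xml, posts_xml):
    """Join users to their post counts; returns (reputation, n_posts) pairs."""
    reputation = {u["Id"]: int(u["Reputation"]) for u in parse_rows(users_xml)}
    # Posts without an owner (e.g. deleted accounts) are skipped.
    post_counts = Counter(
        p["OwnerUserId"] for p in parse_rows(posts_xml) if "OwnerUserId" in p
    )
    return [(reputation[uid], post_counts.get(uid, 0)) for uid in sorted(reputation)]


pairs = reputation_vs_post_count(USERS_XML, POSTS_XML)
r = pearson([rep for rep, _ in pairs], [n for _, n in pairs])
print(pairs, round(r, 3))
```

At the scale of the full site, the same join and count would more plausibly be expressed as a pyspark DataFrame `join` followed by `groupBy(...).count()`. The in-memory version here only shows the shape of the computation.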