
While at Smartvid, I was responsible for automating what had been a painfully manual task for our sales team. I built a new pipeline that pulls data from the database, transforms it in Python to generate the appropriate statistics and metrics, and deposits the result in Google Drive as a CSV for easy access.
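As an illustration only (this is not the actual Smartvid code), the general shape of such an extract, transform, and export pipeline can be sketched with Python's standard library. The `deals` table, its columns, and the sample rows are hypothetical, an in-memory SQLite database stands in for the real one, and the Google Drive step is reduced to writing a CSV file that could live in a Drive-synced folder.

```python
import csv
import sqlite3


def fetch_rows(conn):
    """Extract: pull raw records from the (hypothetical) deals table."""
    return conn.execute("SELECT rep, amount FROM deals").fetchall()


def summarize(rows):
    """Transform: compute deal count, total, and average amount per rep."""
    totals = {}
    for rep, amount in rows:
        count, total = totals.get(rep, (0, 0.0))
        totals[rep] = (count + 1, total + amount)
    return [
        {"rep": rep, "deals": count, "total": total, "average": total / count}
        for rep, (count, total) in sorted(totals.items())
    ]


def write_csv(stats, path):
    """Export: write the summary as a CSV, e.g. into a Drive-synced folder."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["rep", "deals", "total", "average"])
        writer.writeheader()
        writer.writerows(stats)


def run_pipeline(conn, path):
    """Run all three stages and return the computed summary."""
    stats = summarize(fetch_rows(conn))
    write_csv(stats, path)
    return stats


def demo_connection():
    """Build an in-memory database with made-up rows for demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE deals (rep TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO deals VALUES (?, ?)",
        [("ana", 100.0), ("ana", 300.0), ("ben", 50.0)],
    )
    return conn
```

Calling `run_pipeline(demo_connection(), "summary.csv")` writes one row per rep. Keeping the three stages as separate functions means each can be tested or swapped out independently.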

My Master's thesis focused on understanding the moderation of online scientific discourse. I created a lightweight website using Flask (Python) and MySQL to collect data from moderators of the /r/science community on Reddit. Participants were asked to determine whether a given comment posted to /r/science should be permitted, and to explain why. This gave me a deeper understanding of the standards to which the moderators hold the community, and of how they go about their work. This expert-labeled data was used alongside a larger set of Reddit comments to train a neural network and several traditional models (SVC, DT, GB) on the task of moderating /r/science comments. I obtained an accuracy similar to the inter-annotator agreement, suggesting that automated moderation can be used to assist human moderators.
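To make the comparison between model accuracy and inter-annotator agreement concrete, here is a small sketch on made-up labels. The agreement measure used in the thesis is not stated on this page, so simple percent agreement and Cohen's kappa are assumptions here, and none of the numbers are results from the study.

```python
from collections import Counter


def percent_agreement(a, b):
    """Fraction of items on which two label sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(a)
    observed = percent_agreement(a, b)
    freq_a, freq_b = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same label at random.
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up labels: 1 = permit the comment, 0 = remove it.
moderator_1 = [1, 1, 0, 0, 1, 0, 1, 1]
moderator_2 = [1, 0, 0, 0, 1, 0, 1, 1]
model_preds = [1, 1, 0, 1, 1, 0, 1, 1]

human_agreement = percent_agreement(moderator_1, moderator_2)
model_accuracy = percent_agreement(moderator_1, model_preds)
human_kappa = cohen_kappa(moderator_1, moderator_2)
```

When `model_accuracy` is close to `human_agreement`, the model disagrees with a moderator about as often as a second moderator would, which is the sense in which human agreement acts as a practical ceiling for this kind of task.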

More to come - this page is a work in progress!


E. Lucas, C. O. Alm and R. Bailey, "Understanding Human and Predictive Moderation of Online Science Discourse," 2019 IEEE Western New York Image and Signal Processing Workshop (WNYISPW), Rochester, NY, USA, 2019, pp. 1-5, doi: 10.1109/WNYIPW.2019.8923109.

E. Lucas, "Interstitial Content Detection," arXiv preprint arXiv:1708.04879, 2017.