News & Events

  • 05/2019—08/2019: Data Scientist internship at Google in Mountain View, CA.

  • 01/2019: Presentation at the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society in Honolulu, HI.

  • 07/2018: Presentation at the Doctoral Consortium at the Educational Data Mining conference in Buffalo, NY.

  • 06/2018—08/2018: Data Scientist internship at Google in New York City.

  • 06/2018: Poster presentation at the Classification Society Meeting in Stony Brook, NY.

  • 05/2018: Poster presentation at the Atlantic Causal Inference Conference in Pittsburgh, PA.

  • 11/2017: National finalist at the Citadel Data Open in New York, NY.

  • 09/16/2017: Winner of the Citadel Data Open datathon at Carnegie Mellon, with fellow students Nic Dalmasso, Kwhangho Kim, and Chirag Nagpal. (550+ student applications, around 125 students selected to compete)

  • 05/2018—08/2018: Data Science internship at Box in Redwood City, CA.

Research & Reading Groups

  • Parameter estimation in cognitive diagnosis models using clustering

    Cognitive Diagnosis Models (CDMs) are a type of latent class model used frequently in education and psychometrics. One typical application is to estimate whether students in a learning environment have mastered or not mastered each of a set of K skills. Traditional likelihood-based estimation methods are computationally intensive and become intractable for large datasets. I am investigating how to estimate skill profiles using a lightweight, model-agnostic alternative: clustering, via k-means-type algorithms or hierarchical agglomerative clustering. In particular, I am studying how best to perform this clustering when there are hierarchical relationships among skills that restrict the skill space by rendering some skill profiles impossible. (Supervised by Professor Rebecca Nugent)

  • Counterfactual formulations of (algorithmic) fairness criteria

    The subject of algorithmic fairness has received increasing attention in recent years, in particular since the publication of a 2016 ProPublica article alleging that software used in the criminal justice system to predict recidivism is biased against Black defendants. There has been much debate about what criteria are appropriate for designating an algorithm (or a decision process generally) as “fair.” I am thinking about ways to frame fairness in counterfactual terms within a causal inference framework; e.g., by answering questions like “would this person recidivate if they were granted bail?” as opposed to “what is the likelihood of this person recidivating generally?” (Supervised by Professor Edward Kennedy)

  • Causal Inference Reading Group

    A weekly reading group on methods and theoretical issues in causal inference.

  • Data Science Research and Education Group (Carnegie Mellon)

    A weekly reading group covering a broad range of topics, including clustering, unsupervised learning, data visualization, online learning, and data science education.

Selected Publications

(2018). Clustering students and inferring skill profiles with skill hierarchies. Doctoral consortium paper presented at the 11th International Conference on Educational Data Mining, Buffalo, NY.

(2018). Counterfactual prediction and fairness in risk assessment tools. Poster presented at the Atlantic Causal Inference Conference, Pittsburgh, PA.

(2018). Human use of machine translation to extract information from texts. In Isabel Lacruz and Riitta Jääskeläinen (eds.), Innovation and expansion in translation process research. Amsterdam: John Benjamins Publishing Company.

(2017). Filtering tweets for social unrest. Proceedings of the IEEE 11th International Conference on Semantic Computing (ICSC).

(2016). Machine classification of social media text along useful sociocultural dimensions. Technical Report: University of Maryland Center for Advanced Study of Language.

(2016). Modeling Triage Decision Making. Proceedings of the 38th Annual Conference of the Cognitive Science Society.

(2015). Reading between the lines: A prototype model for detecting Twitter sockpuppet accounts using language-agnostic processes. Poster presented at the 17th International Conference on Human-Computer Interaction. Los Angeles, CA.

(2015). Using structural topic modeling to detect events and cluster Twitter users in the Ukrainian crisis. Poster presented at the 17th International Conference on Human-Computer Interaction. Los Angeles, CA.