Natural Language Processing Engineer
Owned and improved an Information Extraction engine processing thousands of documents each week, using Stanford's TokensRegex system.
Designed, developed, and deployed a system matching named entities with topics.
Master's Thesis & Project
Supervisor: Emily Bender
I extended the Grammar Matrix, an open-source grammar engineering project, to support cross-linguistic morphological, syntactic, and semantic analysis of adjectives.
with Woodley Packard & Melanie Bolla
We designed a TREC-style Question Answering system using open-source deep processing tools such as the Stanford CoreNLP dcoref coreference resolver, WordNet, DELPH-IN syntax/semantics processing, and NLTK.
with Ryan Aldrich
We used the English Resource Grammar, a large open-source HPSG grammar, to extend the Stanford CoreNLP dcoref coreference resolution system, augmenting its existing functionality with semantic representations.
with Yi-Shu Wei
We designed and implemented an end-to-end Sentiment Analyzer using Machine Learning Classifiers in MALLET. We implemented a feature selection algorithm using Latent Dirichlet Allocation to divide the data by topic in an attempt to improve training. We showed that LDA topic modeling did not improve classifier performance.
Coursework & Projects in Natural Language Processing, Machine Learning, Statistics, Systems Engineering, and Linguistics.
Coursework in Syntax, Semantics, Morphology, Phonology, Phonetics, Psycholinguistics, and Neurolinguistics.
I worked in a team to implement and test several Machine Learning algorithms and techniques, including Decision Trees, k-Nearest Neighbors (KNN), Naive Bayes, and Support Vector Machines. I also developed systems for improving Machine Learning, such as feature selection algorithms (chi-squared) and boosting methods (such as Transformation Based Learning).
I worked in a team to implement several deep processing methods, such as parsing, word sense disambiguation, and coreference resolution, using techniques such as CKY, (P)CFGs, and Hobbs' algorithm.
This course included several presentations and discussion of cutting-edge sentiment analysis research, such as review mining, aspect extraction, recognizing spam, and summarization. I developed an end-to-end sentiment analysis system using MALLET (see above).
This course consisted of presentations and discussions of several NLP applications, including sentiment analysis, summarization, and coreference resolution. It focused on how deep processing techniques, especially graph-based sentential semantic models (Minimal Recursion Semantics), can improve existing cutting-edge systems.