Past Projects

(Doctoral Research et al.)

Natural Language Processing

I build models of the language people consume to understand how that exposure shapes the language varieties people prefer.

Using people's reports of the popular media they consume (TV, movies, etc.), I examine how the language people expose themselves to affects the language they are comfortable using. To do this, I collect a language sample comprising all the media a person reports consuming. I then use language models (statistical and machine learning, all in Python) to capture the regularities present in these samples. With these models, I evaluate candidate language materials for predicted bias, in order to choose the best items for test-takers. This helps ensure that people from diverse language backgrounds are comfortable with the language we use.
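
As a rough illustration, here is a minimal sketch of that kind of pipeline using NLTK's n-gram language models; the sentences, candidate items, and bigram order are hypothetical stand-ins for real media transcripts and test materials.

```python
# Sketch: train an n-gram model on a person's reported media "diet" and
# rank candidate test items by how familiar their language should feel.
from nltk.lm import Laplace
from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
from nltk.util import ngrams

ORDER = 2  # bigrams, for the sketch

media_sentences = [                       # e.g., subtitle transcripts
    "the detectives reviewed the evidence".split(),
    "she could not believe the verdict".split(),
]

train, vocab = padded_everygram_pipeline(ORDER, media_sentences)
model = Laplace(ORDER)                    # add-one smoothing
model.fit(train, vocab)

def item_perplexity(item):
    """Lower perplexity = language more expected given this media diet."""
    tokens = item.lower().split()
    grams = list(ngrams(pad_both_ends(tokens, n=ORDER), ORDER))
    return model.perplexity(grams)

candidates = ["the jury reached a verdict", "the gourmand savored the broth"]
ranked = sorted(candidates, key=item_perplexity)  # least surprising first
```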

I also previously worked on a project evaluating the effectiveness of vector representations of words (word embeddings, e.g., Word2Vec) across languages. Holding the classifier constant on a sentiment analysis task, my team compared features generated from word embeddings of different languages. These languages differed in morphological complexity (i.e., how much information is packed into each word), which influenced the effectiveness of the word embeddings.
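
A minimal sketch of that setup, assuming gensim's Word2Vec and scikit-learn, with a toy two-sentence corpus standing in for real data; the same classifier would be refit per language:

```python
# Sketch: averaged Word2Vec features feeding one fixed classifier,
# repeated per language. The corpus here is a toy stand-in.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

corpus = [["this", "film", "was", "wonderful"],
          ["a", "dull", "and", "tedious", "movie"]]
labels = np.array([1, 0])  # 1 = positive sentiment

w2v = Word2Vec(sentences=corpus, vector_size=50, min_count=1, seed=0)

def doc_vector(tokens):
    """Average the embeddings of in-vocabulary tokens."""
    vecs = [w2v.wv[t] for t in tokens if t in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.wv.vector_size)

X = np.stack([doc_vector(doc) for doc in corpus])
clf = LogisticRegression().fit(X, labels)
# In a morphologically rich language, each surface form is rarer, so the
# same pipeline tends to produce noisier vectors and weaker features.
```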

Speech and Digital Signal Processing

I create audio files that combine speech and specially formulated noise for hearing experiments (see below).

I use MATLAB to create noise with the same frequency (spectral) properties as speech. For this, I generate filters (fit with high-order LPC) to shape white noise. I then process this new noise to impose specific temporal (envelope) characteristics. By combining this noise with processed speech samples, I generate auditory stimuli with precisely controlled acoustic properties.
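
For illustration, here is a minimal Python stand-in for that MATLAB pipeline (using librosa's LPC fit plus SciPy's filtering and Hilbert transform); the file name, filter order, and the choice to reuse the speech's own envelope are assumptions:

```python
# Sketch: shape white noise with a high-order LPC fit to speech, then
# impose a temporal envelope on the result.
import numpy as np
import librosa
from scipy.signal import lfilter, hilbert

speech, sr = librosa.load("speech.wav", sr=None)  # hypothetical file

# 1. Fit a high-order all-pole (LPC) model of the speech spectrum.
a = librosa.lpc(speech, order=50)

# 2. Excite the all-pole filter with white noise -> speech-shaped noise.
rng = np.random.default_rng(0)
shaped = lfilter([1.0], a, rng.standard_normal(len(speech)))

# 3. Impose a temporal envelope (here, the speech's Hilbert envelope).
envelope = np.abs(hilbert(speech))
noise = shaped * envelope

# 4. Match RMS level to the speech before mixing the two.
noise *= np.sqrt(np.mean(speech**2) / np.mean(noise**2))
```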

I then use functional MRI to image people's brains while they listen to these stimuli. The MRI signals represent brain activity over time and require signal processing to denoise them, identify activation trends and relationships, and determine whether (and where) activation is driven by the stimuli.
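
The core of that kind of analysis is a general linear model (GLM): regress each voxel's time series on the stimulus timing convolved with a hemodynamic response function. A minimal sketch, with all timing values purely illustrative:

```python
# Sketch: GLM activation analysis for one simulated voxel.
import numpy as np
from scipy.stats import gamma

TR, n_scans = 2.0, 200                    # 2 s per scan, 200 scans
t = np.arange(n_scans) * TR

# Simple double-gamma HRF: early peak plus a late undershoot.
hrf_t = np.arange(0, 32, TR)
hrf = gamma.pdf(hrf_t, 6) - 0.35 * gamma.pdf(hrf_t, 16)

# Block design: stimulus on for 20 s, off for 20 s.
boxcar = ((t % 40) < 20).astype(float)
regressor = np.convolve(boxcar, hrf)[:n_scans]

# Design matrix (regressor + intercept), least-squares fit per voxel.
X = np.column_stack([regressor, np.ones(n_scans)])
voxel = 0.5 * regressor + np.random.default_rng(0).normal(size=n_scans)
beta, *_ = np.linalg.lstsq(X, voxel, rcond=None)
print(beta[0])  # positive weight -> activity tracks the stimulus
```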

I have also previously worked on two additional projects: a negotiating dialogue agent that used affective ASR and TTS (the ability to recognize and simulate emotion in speech), and a computational auditory scene analysis (CASA) system that used deep learning to separate speech from background noise.

Miscellaneous Programming Projects

In addition to my research, I program for numerous other projects.

I use statistical software (Stata, as well as Python and MATLAB) to clean, transform, and analyze my experimental data. The scientific claims I can make depend directly on these inferential statistics.
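
A minimal sketch of that clean-transform-test loop in Python (pandas + SciPy); the file, column names, and conditions are hypothetical:

```python
# Sketch: clean trial-level data, transform it, and run a paired test.
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("experiment.csv")            # raw trial-level data
df = df.dropna(subset=["rt_ms"])              # clean: drop missing trials
df = df[df["rt_ms"].between(200, 3000)]       # clean: drop implausible RTs
df["log_rt"] = np.log(df["rt_ms"])            # transform: reduce skew
means = df.groupby(["subject", "condition"])["log_rt"].mean().unstack()
t, p = stats.ttest_rel(means["regular"], means["sporadic"])  # paired test
print(f"t = {t:.2f}, p = {p:.3f}")
```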

Additionally, when I was a consultant, I created an application to develop linguistically optimal speech prompts for a speech technology company.

Lastly, I program for personal projects. From computational pun finders to predicting my preferred web content, I like when computers save people (me) time. I also built the website you're currently enjoying (minus style templating).

Why Did I Do It?

(For Science!)

Language Bias in Cognitive Testing

I examine how your language variety affects your performance on clinical tests.

Clinical tests use standardized language in an effort to make testing fair. Keeping the language the same, however, does not account for the fact that people's language experiences differ. Precisely because the wording is identical for everyone, this diversity in language backgrounds makes the same test harder for some test-takers than for others.

My team recently found that people's performance on these tests differs significantly depending on which popular media they consume. We found this variability to be unrelated to traditionally considered demographic dimensions (race, class, etc.). This may indicate that people's media exposure can influence the language they feel comfortable with.

When we trained neural language models on the language contained in these media sources, the models' predictions correlated significantly with people's test performance. This implies that media have measurable linguistic properties that predict the test performance of the people who consume them.
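
Conceptually, the analysis boils down to correlating a model-derived measure (for instance, the surprisal of the test items under a media-trained model) with test scores. A sketch with purely illustrative numbers:

```python
# Sketch: correlate media-model surprisal with test performance.
import numpy as np
from scipy.stats import pearsonr

model_surprisal = np.array([3.1, 4.0, 2.7, 3.6, 4.4])    # one per person
test_scores     = np.array([0.92, 0.78, 0.95, 0.84, 0.71])

r, p = pearsonr(model_surprisal, test_scores)
print(f"r = {r:.2f}, p = {p:.3f}")  # negative r: more surprising, lower score
```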

Speech Perception in Noise

I study speech perception in adverse listening conditions (like those in our noisy everyday lives).

While noise is often sporadic, many noises have a recurring, well-behaved structure (like a train engine or a ticking clock). I study the auditory system's ability to learn this regular temporal structure. Identifying the structure lets listeners predict, and therefore overcome, the noise, helping us hear signals within it. This incredible process happens constantly but is only noticed in extremely challenging situations, such as loud bars or concerts. To study it, I mostly use behavioral hearing tests, but I have also used functional MRI neuroimaging.
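
To make "regular" versus "sporadic" concrete, here is a minimal sketch of gated noise in the two flavors; the sampling rate, segment length, and duty cycle are illustrative, not the actual experimental parameters:

```python
# Sketch: matched noise with predictable vs. unpredictable on/off gating.
import numpy as np

sr = 16000
rng = np.random.default_rng(0)
carrier = rng.standard_normal(2 * sr)         # 2 s of white noise

chunk = sr // 10                              # 100 ms gating segments
n_chunks = len(carrier) // chunk
regular = np.tile([1.0, 0.0], n_chunks // 2).repeat(chunk)           # periodic
sporadic = (rng.random(n_chunks) < 0.5).astype(float).repeat(chunk)  # random

regular_noise = carrier[:len(regular)] * regular      # predictable gaps
sporadic_noise = carrier[:len(sporadic)] * sporadic   # unpredictable gaps
```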

My initial findings showed that people were less hindered by the noise when it was regular rather than sporadic. Our follow-up experiments showed that this process is not as straightforward as we anticipated: some people benefit from the repetition, while others are actually hampered by it. We are currently investigating what teases these two populations apart. So far, we have ruled out general cognitive ability (working memory, executive function) and musical experience.

Past Projects

I began my career studying the articulatory and acoustic properties of speech production.

In my undergraduate work, I used acoustic phonetic analysis to document dying languages. I built online "talking dictionaries" containing sound recordings of words produced by the last native speakers of these languages. To create dictionary entries, I used phonetic analysis (in Praat) to transcribe how the words were pronounced. My transcriptions, while useful, were no substitute for the nuanced speech patterns of actual native speakers, so the dictionaries include both. I then wrote my undergraduate thesis on a new combination of sounds present in one of these languages, Bugun.
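
For a flavor of the acoustic measurements behind a dictionary entry, here is a minimal sketch using parselmouth, a Python interface to Praat; the file name and measurement points are hypothetical:

```python
# Sketch: measure formants and pitch at the midpoint of one recording.
import parselmouth

snd = parselmouth.Sound("bugun_word.wav")
formants = snd.to_formant_burg()              # Burg-method formant tracks
t_mid = snd.duration / 2
f1 = formants.get_value_at_time(1, t_mid)     # F1 (Hz) at the midpoint
f2 = formants.get_value_at_time(2, t_mid)     # F2 (Hz) at the midpoint
f0 = snd.to_pitch().get_value_at_time(t_mid)  # F0 (Hz); NaN if unvoiced
print(f1, f2, f0)
```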

In the early years of my PhD, I studied the post-operative speech outcomes of tongue cancer patients. To do this, I used real-time Magnetic Resonance Imaging of patients' vocal tracts as they spoke, before and after surgery. After performing image processing on the videos of patients' speech, I modeled their articulatory behavior to draw meaningful inferences about whether (and how) their speech motor behavior changed post-operatively.

Current Professional Status

I am currently a Machine Learning Engineering Manager at Kensho, located in the Research Triangle, North Carolina.

Contact Details

For employment opportunities, questions about my future work availability, or general technology or personal questions, feel free to contact me at:

For speaking engagements, collaboration requests, publication correspondence, and academic or research inquiries, please contact me at: