From PhD to Postdoc - Lessons Learned in Speech Science Research

As I settle into my new role as a postdoctoral researcher at Laboratoire Parole et Langage (LPL) at Aix-Marseille University, I find myself reflecting on the incredible journey that brought me here. From starting my PhD at the University of Ferrara as part of the EU’s Conversational Brains project to now working on prosodic coordination with Dr. Leonardo Lancia, it’s been a transformative experience.

The PhD Journey: Challenges and Discoveries

The Conversational Brains Project

My PhD centered on speech entrainment, the tendency of conversation partners to become more alike in how they speak. It was a phenomenon I initially approached with more enthusiasm than understanding. The Conversational Brains project, funded by the European Union, provided an extraordinary interdisciplinary environment where neuroscientists, linguists, and AI researchers collaborated to understand how our brains coordinate during conversation.

Key insight: Real breakthrough research happens at the intersection of disciplines. My background in translation studies and computer science, combined with the neuroscience focus of the project, led to novel approaches that wouldn’t have emerged from any single field alone.

Technical Challenges

Working with speech data presents unique challenges that traditional machine learning textbooks don’t prepare you for:

  1. Data Quality: Natural conversational speech is messy—overlapping speakers, background noise, emotional variations, and technical recording issues all impact analysis.

  2. Temporal Complexity: Speech entrainment unfolds across multiple timescales simultaneously—from millisecond-level acoustic adjustments to minute-long prosodic adaptations.

  3. Individual Differences: What constitutes “entrainment” varies dramatically between speakers, cultures, and conversation contexts.
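To make the timescale point concrete, here is a minimal sketch that measures how strongly two speakers' feature tracks co-vary over short and long windows. It assumes both tracks are already aligned and equally long; the function name `windowed_correlation` and the window sizes are illustrative choices, not part of any particular toolkit.

```python
import numpy as np

def windowed_correlation(a, b, window):
    """Pearson correlation of two aligned series in consecutive,
    non-overlapping windows of `window` samples (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("series must have the same length")
    n_windows = len(a) // window
    scores = []
    for i in range(n_windows):
        sl = slice(i * window, (i + 1) * window)
        seg_a, seg_b = a[sl], b[sl]
        # Correlation is undefined when a segment is constant.
        if seg_a.std() == 0 or seg_b.std() == 0:
            scores.append(np.nan)
        else:
            scores.append(np.corrcoef(seg_a, seg_b)[0, 1])
    return np.array(scores)

# Synthetic tracks standing in for two speakers' pitch values per frame.
rng = np.random.default_rng(0)
speaker_a = rng.normal(size=600)
speaker_b = 0.5 * speaker_a + rng.normal(size=600)

short_scale = windowed_correlation(speaker_a, speaker_b, window=20)
long_scale = windowed_correlation(speaker_a, speaker_b, window=200)
print(len(short_scale), len(long_scale))
```

Comparing the short-window and long-window scores side by side is one simple way to see whether coordination is local, global, or both.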

Methodological Evolution

My approach evolved significantly over the four years:

Years 1-2: Focus on traditional acoustic features (MFCCs and other spectral measures)

  • Lesson: While these features are interpretable, they miss crucial prosodic information

Years 2-3: Integration of prosodic analysis using Praat and custom algorithms

  • Lesson: Prosodic features require careful normalization across speakers and recording conditions
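One common way to normalize pitch across speakers is to convert F0 to semitones relative to each speaker's own median and then z-score within speaker. The sketch below assumes F0 values in Hz with unvoiced frames stored as 0 or NaN. The function names are hypothetical and this is not necessarily the scheme used in the project.

```python
import numpy as np

def hz_to_semitones(f0_hz, ref_hz):
    """Convert F0 in Hz to semitones relative to a reference frequency."""
    return 12.0 * np.log2(np.asarray(f0_hz, dtype=float) / ref_hz)

def normalize_f0_by_speaker(f0_hz, speaker_ids):
    """Z-score F0 (in semitones re: speaker median) within each speaker.

    Unvoiced frames (0 or NaN) are returned as NaN.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    speakers = np.asarray(speaker_ids)
    out = np.full(f0.shape, np.nan)
    voiced = np.isfinite(f0) & (f0 > 0)
    for spk in np.unique(speakers):
        mask = voiced & (speakers == spk)
        if mask.sum() < 2:
            continue  # not enough voiced frames to estimate a spread
        st = hz_to_semitones(f0[mask], np.median(f0[mask]))
        sd = st.std()
        if sd > 0:
            out[mask] = (st - st.mean()) / sd
    return out

# Speaker B speaks exactly one octave above speaker A.
f0 = [100, 110, 120, 0, 200, 220, 240, np.nan]
ids = ["A", "A", "A", "A", "B", "B", "B", "B"]
z = normalize_f0_by_speaker(f0, ids)
print(z)
```

Because the semitone scale is logarithmic, the two speakers in the example end up with the same normalized contour even though their raw pitch ranges differ by an octave.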

Years 3-4: Deep learning approaches with attention mechanisms

  • Lesson: Neural networks can capture complex temporal patterns, but require careful validation to ensure they’re learning meaningful linguistic phenomena rather than spurious correlations
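A simple guard against spurious correlations is a permutation baseline: compare the observed similarity between two speakers with the similarity obtained after shuffling one speaker's values. The sketch below assumes one feature value per turn for each speaker. The function name `permutation_p_value` is illustrative, and plain shuffling is only a first check, since it ignores autocorrelation in real speech.

```python
import numpy as np

def permutation_p_value(speaker_a, speaker_b, n_permutations=1000, seed=0):
    """Return (|r|, p) where p estimates how often shuffled data reaches
    a correlation at least as strong as the observed one."""
    a = np.asarray(speaker_a, dtype=float)
    b = np.asarray(speaker_b, dtype=float)
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(a, b)[0, 1])
    null = np.array([
        abs(np.corrcoef(a, rng.permutation(b))[0, 1])
        for _ in range(n_permutations)
    ])
    # Add-one correction keeps the estimate away from exactly zero.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_permutations)
    return observed, p_value

# Synthetic per-turn values: speaker B closely tracks speaker A.
turns_a = np.linspace(0.0, 1.0, 30)
turns_b = turns_a + 0.01 * np.sin(np.arange(30))
r, p = permutation_p_value(turns_a, turns_b)
print(r, p)
```

If a model's entrainment score does not beat this kind of shuffled baseline, that suggests it may be picking up something other than coordination between the speakers.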

International Collaboration: The DAVI Experience

My internship at DAVI the Humanizers in Reims was a crucial bridge between academic research and industry application. Working on voice similarity analysis for conversational AI taught me two things:

Product-Focused Research

Academic research often prioritizes novelty and theoretical understanding, while industry needs robust, scalable solutions. The challenge was adapting my entrainment detection work into a real-time voice analysis system that could handle:

  • Diverse accents and speaking styles
  • Noisy recording conditions
  • Computational constraints of web deployment

User-Centered Design

Developing the emotional speech collection system through language games highlighted the importance of user experience in research tools. The most sophisticated algorithm is useless if researchers can’t easily collect quality data with it.

Transitioning to LPL: New Perspectives on Prosody

Moving to Aix-Marseille University and working with Dr. Leonardo Lancia has opened new avenues in my research:

Prosody as Dynamic Coordination

The “Prosody AS Dynamic COordinative Device” project frames prosody not just as a linguistic feature, but as a real-time coordination mechanism between speakers. This perspective has profound implications for:

  • Understanding autism spectrum communication differences
  • Developing more natural speech synthesis systems
  • Designing effective language learning tools

Cross-linguistic Perspectives

Working in a French research environment while studying cross-linguistic prosodic patterns has given me firsthand experience of the phenomena I study. The subtle prosodic adjustments I make when switching between English, Mandarin, and French in daily life inform my theoretical understanding in ways that monolingual research environments couldn’t provide.

Lessons for Emerging Researchers

1. Embrace Interdisciplinarity

The most interesting research questions exist at the boundaries between fields. Don’t be afraid to learn new methods, attend conferences outside your primary field, and collaborate with researchers from different backgrounds.

2. Balance Theory and Application

Pure theoretical research and applied development both have their place, but the most impactful work often bridges both. My speech entrainment research became more robust when I had to make it work in real-world applications.

3. Document Everything

Research is inherently collaborative and iterative. The analysis script you write today will be invaluable to a collaborator (or future you) next year. Good documentation and reproducible workflows aren’t just good practice—they’re essential for meaningful scientific progress.

4. Stay Connected to the Bigger Picture

It’s easy to get lost in technical details, but regularly stepping back to consider the broader implications of your work helps maintain motivation and identify new research directions.

Looking Forward: The Next Chapter

As I continue my research at LPL, I’m excited about several emerging directions:

Integration with Large Language Models

How can we incorporate understanding of prosodic coordination into next-generation conversational AI systems? The challenge is maintaining the nuanced, context-sensitive nature of human prosodic patterns while scaling to diverse applications.

Clinical Applications

Speech entrainment research has direct implications for understanding and treating communication disorders. I’m particularly interested in developing assessment tools that can track therapeutic progress in naturalistic conversation settings.

Cultural and Social Dimensions

Prosodic coordination patterns vary across cultures and social contexts in ways we’re only beginning to understand. Future research needs to embrace this diversity rather than treating it as noise to be controlled for.

Final Thoughts

The transition from PhD to postdoc represents more than just a change in position. It's a shift from learning to conduct research to actually conducting it independently. The questions are bigger and the stakes are higher, but so is the potential for impact.

To current PhD students navigating similar journeys: embrace the uncertainty, seek out diverse perspectives, and remember that the most important research often emerges from the intersections between established fields. Your unique background and perspective are assets, not obstacles to overcome.

The future of speech science lies in understanding not just how we produce and perceive speech, but how we coordinate with each other through speech. It’s a future I’m excited to help build.


What challenges have you faced in your research journey? I’d love to hear about your experiences in the comments below.