Senior Machine Learning Engineer

Company: Textkernel by Bullhorn, Machine intelligence for people and jobs
Location: Amsterdam, Netherlands
Period: Feb 2022 - Present

Domain: HR Tech, High-volume Parsing, NLP & LLM Integration, Microservices

  • Architecture & Modernization
    • Standardization: Led the migration of 10+ services to Python 3.11 and Pydantic v2, significantly reducing technical debt.
    • Infrastructure: Managed the migration of Docker images for 20+ repositories to AWS ECR, updating Helm charts to align with company-wide standards.
    • Monolith to microservices: Drove the extraction of core logic from legacy Perl monoliths into standalone, modern Python services.
    • Modular parsing: Designed and implemented a modular pipeline architecture that separates extraction from derivation logic (a major refactoring effort), simplifying the integration of new features and reducing maintenance overhead while preserving support for legacy features.
  • Performance & Efficiency
    • Optimization: Delivered 20–50% performance improvements across key services and reduced the production footprint of high-traffic services by 50%.
    • Latency: Reduced complex document processing time from >5s to <160ms by resolving critical tokenizer bottlenecks.
    • Resilience: Eliminated recurring outages caused by corrupt input data by implementing robust input validation patterns.
  • AI & LLM Strategy
    • Knowledge distillation: Used LLMs as teacher models to generate synthetic training data, improving the performance of existing efficient, low-latency production models.
    • Data quality: Integrated error-analysis steps into ML training pipelines to ensure consistent output quality.
    • Innovation: Co-developed a new LLM-based parsing service as a high-quality alternative engine for in-house CV parsing models.
    • LLM Ops: Built an end-to-end evaluation framework to measure LLM cost, latency, and quality to validate production readiness.
  • Operational Excellence
    • API evolution: Delivered a new version of the Candidate Input API, enabling full and partial ATS data updates.
    • Cross-team delivery: Led the end-to-end delivery of a new feature across seven products, coordinating multiple teams to ensure synchronized production deployment.
    • Observability: Standardized Grafana dashboards for consistent 4xx/5xx error tracking, speeding up root-cause analysis.

Skills: Python (FastAPI, Pydantic v2), Perl, Docker, Kubernetes (Helm), AWS (ECR), LLMs (Distillation, Parsing, Evaluation), Grafana

Machine Learning Engineer

Company: Textkernel by Bullhorn, Machine intelligence for people and jobs
Location: Amsterdam, Netherlands
Period: Jan 2019 - Jan 2022

Domain: HR Tech, NLP, Skills Intelligence, ML Platform Engineering

  • Service Ownership: Owned the full lifecycle of the “Skills Extraction” microservice, scaling it to handle millions of documents per day across 25+ languages with <50ms (p90) latency. Delivered it both as a standalone product and as an integrated component of CV/vacancy parsing.
  • Taxonomy & ML Integration: Engineered a hybrid extraction engine combining a strict skills taxonomy with ML-based contextual validation; automated a “feedback loop” (ELK, Jira) that identified “unknown skills” in production traffic and reported them to the Knowledge team, driving continuous taxonomy improvement.
  • Complex Parsing Solutions: Engineered a custom parser for PDF LinkedIn profiles and collaborated on the multi-column CV rendering upgrade by designing validation UIs and implementing extraction heuristics.
  • CI/CD Standardization: Unified engineering workflows by creating generic GitLab CI/CD templates; drove department-wide adoption, with every engineer migrating their service configurations to the new standard, ensuring consistent performance tracking and release automation across the fleet.
  • Resilience Engineering: Designed and implemented a standardized internal service client library with built-in retries, timeout policies, and user-friendly error messages. Impact: Adopted by the team for 10+ microservices, significantly stabilizing upstream communication.
  • Observability & DevEx: Built an internal debug endpoint and Kibana dashboards to trace extraction logic, drastically reducing the “Time to Resolution” for support tickets about skill extraction errors.
  • DevOps & Support: Ensured platform stability by leading incident response and resolving systemic defects in core parsing products.
  • Technical Mentorship: Upskilled the team on Python profiling, CI/CD, and GitOps workflows to foster technical excellence.

Skills: Microservices, NLP, Elasticsearch, CI/CD (GitLab), API Design, Python, Docker, Kubernetes

Machine Learning Engineer

Company: Qwant, European search engine
Location: Epinal, France
Period: Nov 2017 - Aug 2018

Domain: Search, NLP, Query Understanding

  • Query Correction: Worked on automatic query correction for the Qwant European search engine.
  • System Design: Designed and implemented a baseline correction system for isolated spelling errors in user search queries, using a two-stage NLP solution:
    • generated candidate corrections using edit-distance–based spell checking,
    • re-ranked them using a language model.

Source code: ccquery
Skills: NLP, Query Correction, Language Models, Edit Distance, Python (spaCy, fasttext, hunspell, symspell, PyNLPl), SRILM, Docker, GitLab

Machine Learning Engineer

Company: Xilopix, French search engine
Location: Epinal, France
Period: Oct 2016 - Nov 2017

Domain: Machine Learning, NLP, Information Retrieval, Search Engine

  • End-to-end ownership: Owned the Machine Learning workflow from raw data acquisition (Elasticsearch) to production deployment.
  • Modeling: Built neural network classifiers for webpage topic detection (TF-IDF, LSA) and image color classification:
    • trained an LSA model on 40M documents to transform raw text into 300-dimensional semantic vectors.
    • achieved 96% F1 score on the held-out dataset.
  • Integration: Solved a complex integration constraint by training models in Python and re-implementing inference logic in Ruby, ensuring prediction parity.
  • Deployment: Deployed models into production indexing, exposing predictions and probabilities to enable threshold tuning and downstream filtering.
  • Engineering rigor: Strengthened software practices around modularity, versioning, testing, performance optimization, and reproducibility.

Source code: xi-ml-topicdiscovery, xi-dip
Skills: Machine Learning, NLP, Text Classification, Neural Networks, TF-IDF, LSA/LSI, Elasticsearch, Python (gensim, scikit-learn), Ruby, Docker, CI/CD

PhD student

Company: Inria (French national research institute), Université de Lorraine
Location: Nancy, France
Period: Dec 2012 - Feb 2016

Domain: Speech Recognition, Language Modeling, Assistive Technologies

  • Project RAPSODIE: Conducted doctoral research on building communication aids for the deaf and hearing-impaired. Focused on accurate Speech-to-Text systems for embedded devices with limited memory.
  • Hybrid Language Model: Proposed and implemented a novel model combining words and syllables to handle out-of-vocabulary words while preserving accuracy on frequent words.
  • Vocabulary expansion: Investigated word similarity based on contextual distributions to support dynamic vocabulary growth in language models.
  • Intent detection: Developed a real-time system to detect questions vs. statements using lexical and prosodic features (n-gram likelihoods, pitch), training classifiers such as logistic regression, decision trees, and shallow neural networks.

Skills: Speech Recognition, Language Models, Feature Engineering, Experimental ML, Distributed Computing, Git, Perl, Java, LaTeX

Support for international employees

Company: Inria, French national research institute
Location: Nancy, France
Period: Feb 2014 - Aug 2015
Facilitated the integration of international researchers by organizing monthly group activities and assisting with administrative procedures.

Junior Research Engineer

Company: Inria, French national research institute
Location: Nancy, France
Period: Oct 2011 - Dec 2012

Domain: Speech Recognition, Applied Machine Learning

  • Speech Verification: Developed ML models to detect mismatches between speech and text in non-native speech for language learning systems.
  • Feature Engineering: Engineered feature sets comparing constrained (forced) vs. unconstrained (phonetic) text-to-speech alignments, leveraging domain-specific knowledge (phonetic classes, phoneme durations, n-gram likelihoods).
  • Analysis: Trained logistic regression models and conducted systematic experiments to analyze the effects of pronunciation variations and training data quality.

Skills: Speech Recognition, Feature Engineering, Logistic Regression, Data Analysis, Model Evaluation

Intern - Research & Software Development

University: Université de Lorraine
Location: Nancy, France
Period: Feb 2011 - Jun 2011

Domain: Speech Recognition, Remote sound, Home Automation

  • Optimization: Tested multiple acoustic/language models and decoding configurations to optimize speech recognition in noisy, remote-microphone setups.
  • Automation: Developed scripts (Java, Perl, Shell) to automate performance testing across different environments.

Skills: Automatic Speech Recognition (ASR), Remote sound, Java, Perl, Linux

Intern - Research & Software Development

University: Universitatea ‘Stefan cel Mare’
Location: Suceava, Romania
Period: Feb 2010 - Jun 2010

Domain: Human–Computer Interaction (HCI), Gesture-Based Gaming Interfaces

  • Prototyping: Engineered a custom setup using a Wii remote and IR LED glasses to capture head position.
  • Algorithm Design: Implemented algorithms to filter noise and map natural head motions to video game inputs in real-time.

Skills: C#, Computer Vision, Algorithm Design, Rapid Prototyping, Human-Computer Interaction (HCI)