Noam Ordan

Linguistics & NLP consultant — Hebrew & Arabic.

I help teams evaluate, annotate, and ship language models for Hebrew and Arabic. Twenty years in computational linguistics and translation studies, with a focus on bringing linguistic insight to bear on engineering decisions.

01 What I do

02 Selected engagements

2025–
Open-LLM evaluation, Hebrew & Arabic
Freelance — Google, PwC
2021–24
NLP technology lead
IAHLT — Hebrew & Arabic NLP tools and corpora
2018–20
R&D advisor
EDT Software — active learning for legal discovery, PII detection

03 Background

Most of my published research concerns translationese — the systematic features that distinguish translated text from text written natively. PhD in Translation Studies (Bar-Ilan, 2011); MA in Philosophy of Science (Tel Aviv). Lectured and held research fellowships at Haifa, Saarland, and the Arab Academic College of Education.

04 Selected publications

2017
Found in Translation: Reconstructing Phylogenetic Language Trees from Translations
ACL

The fingerprint a source language leaves on translated English is strong enough to recover language-family trees from translations alone — phylogeny without ever looking at the originals.

2016
On the Similarities Between Native, Non-native and Translated Texts
ACL

Translated and non-native English share more than either shares with native English: both are lexically thinner, lean on explicit cohesion, and use fewer pronouns. A common second-pass fingerprint.

2015
On the Features of Translationese
Digital Scholarship in the Humanities

A computational audit of the translation universals. Across ten source languages in Europarl, some classic features of translationese hold up; others collapse to chance.

2012
Language Models for Machine Translation: Original vs. Translated Texts
Computational Linguistics

A counter-intuitive result: language models trained on translations beat models trained on originals for MT — because MT output already lives in the same dialect as human translation.

2011
Translationese and Its Dialects
ACL

Translations differ from originals in two distinct ways — source-language interference and general translation effects. Function-word frequencies alone recover the source language with 92.7% accuracy.

Full list on Google Scholar. Code on GitHub.

05 Contact

[email protected]