Assistant Professor at Open University of the Netherlands, Department of Computer Science
Department of Computer Science at OU
Guest Researcher in the Bernoulli Institute - Artificial Intelligence
Postdoctoral researcher in the Bernoulli Institute - Artificial Intelligence
AI at the Bernoulli Institute
Supervised by Prof. Lambert Schomaker
Postdoctoral researcher in the ADAPT Centre at Dublin City University
Marie Skłodowska-Curie EDGE Fellow in the ADAPT Centre at Dublin City University
ADAPT Centre at DCU
Supervised by Prof. Andy Way
PhD in Computer Science from the University of Amsterdam
M.Sc. in Artificial Intelligence, Intelligent Systems
Visiting/Postal Address:
University of Groningen
Faculty of Science and Engineering
Artificial Intelligence - Bernoulli Institute
Nijenborgh 9, Room 0309
9747 AG Groningen
The Netherlands
Office:
Room 0309 (third floor)
About me:
I received my Master's diploma (cum laude) from the University of Amsterdam in November 2008, after completing an innovative Master's thesis project at EPFL in Lausanne. During my thesis research in Lausanne I worked on gesture recognition with Hidden Markov Models. Back in Amsterdam I completed another extensive research project, on obstacle and free-space recognition for robot navigation. I published this research together with Dr. Arnoud Visser and Tijn Schmits and presented it in September 2009 at the ECMR conference in Croatia.
Starting in May 2009 I worked in Germany at the Rheinische Friedrich-Wilhelms-Universität Bonn. We participated in the RoboCup@Home competition of the international RoboCup, where I was responsible for our team's visual methods for people detection and people recognition. After the RoboCup I worked on object recognition using visual features (SIFT) and support vector machines.
On 1 June 2010 I started my work on Statistical Machine Translation at the Institute for Logic, Language and Computation (ILLC) with Dr. Khalil Sima'an, in his project "Machine Translation When Exact Pattern Match Fails", funded by NWO Exact Sciences Free Competition. In June 2016 I defended my PhD thesis, titled "Aligning the Foundations of Hierarchical Statistical Machine Translation". Following my PhD, I worked as a postdoctoral researcher at the ILLC until October 2016. In November 2016 I started a postdoc with Prof. Andy Way in ADAPT, at Dublin City University, working on hierarchical statistical machine translation and neural machine translation. In 2017 I obtained a Marie Skłodowska-Curie EDGE Grant for my project BAIT: Bilingual Association in Neural Machine Translation [ EDGE project ], and in May 2017 I started working on this project.
Over the last year I invested in becoming an expert in deep learning and PyTorch programming, which allows me to implement deep learning models from the ground up when necessary. This investment is now starting to pay off, opening up new opportunities for multi-modal deep learning for neural machine translation and handwritten text recognition.
Research Interests and Current Work:
My research interests include machine translation (including syntax, morphology and semantics), handwritten text recognition, deep learning, computer vision, scholarly document processing and general machine learning. My current work focuses on developing new models and techniques for scholarly document processing and (neural) handwritten text recognition. I have a special interest in applying multi-modal techniques to take models in both fields to the next level. More information about my EDGE project can be found at [ EDGE project ].
My Master's thesis project consisted of two parts. In the first part I automatically analyzed the structure of Hidden Markov Models (HMMs) and used it to segment gesture sequences into their underlying primitive gestures. In the second part I developed a technique to automatically merge or compress gesture models (HMMs).
Publications:
PhD thesis:
Diploma thesis:
Teaching:
Software:
Selected older reports from Bachelor and Master:
Recent Developments:
11-11-2021
We have a new paper on Active Learning, which my student Pieter Jacobs presented today at BNAIC/BENELEARN 2021 in Luxembourg.
[ BNAIC/BENELEARN 2021 ]
[ Our paper on arXiv ]
26-12-2020
I was consulted as a deep learning expert by journalist Koen Marée of the Dutch newspaper
Dagblad van het Noorden on the topic of deep fakes. My comments on how we might defend against deep
fakes are available in the resulting article:
[ Article deep fakes 26-12-2020, Dagblad van het Noorden ]
‘Science has been working on the technology for much longer, says Gideon Maillette de Buy Wenniger.
At the University of Groningen (RUG) institute for artificial intelligence he does research on deep learning.
"As early as 1940, people were looking at how computers could mimic the human brain. Since 2006 we have
been speaking of a new wave of deep learning. Both the algorithms and the capacity of
computers to perform calculations are improving very rapidly."
In the Lubach broadcast, the emphasis was mainly on the 'scary' side of the technology. Deepfakes
give free rein to fraudsters who can pose as someone else in order to swindle people out of
money, to the creation of fake porn that can damage someone's name, or even to
influencing politics by having a deepfaked president make crazy statements.
Maillette de Buy Wenniger struggles with that too. He himself obtained his PhD in the field of
machine translation, a field that underwent a revolution in the last few years through the transition
to deep learning and made giant strides in the quality of generated
translations. "Apart from the fact that it does pose a threat to the jobs of people in the
translation industry, high-quality, real-time machine translation becoming widely
available is a development with many positive sides. The problem is that the underlying deep
learning technology is much more general and can be applied freely. That brings risks with
it.
...
"An alternative is therefore an approach that focuses more on the source of the video. You could add a
'hash' to a video or image file. That is a special code that records
what the original file looked like, and that is immediately published on the internet." In that case it can,
when in doubt, be checked whether it concerns, for example, a real or a deepfaked statement by a
politician.’
30-1-2020
Presented our work on "Predicting the number of citations of scientific articles with shallow and deep model"
at CLIN 2020.
https://clin30.sites.uu.nl/programme/detailed/
23-9-2019
Presented our work "No Padding Please: Efficient Neural Handwriting Recognition"
at ICDAR 2019.
1-3-2019
Our new paper "No Padding Please: Efficient Neural Handwriting Recognition", which proposes new methods for
efficient neural handwriting recognition with multi-dimensional long short-term memories (MDLSTMs), is now on
arXiv. This work also involves an efficient reimplementation of MDLSTMs from scratch in PyTorch, and a large number
of experiments and comparisons against literature results on the popular multi-writer IAM (handwriting) database.
22-2-2019
Our paper "Adaptation of Machine Translation Models with Back-translated Data using Transductive Data Selection Methods"
got accepted at CICLing 2019.
21-2-2019
Presented the continued work on handwriting recognition with minimal padding in an invited talk
for the research team led by Prof. Dr. Ing. Rozenn DAHYOT, at Trinity College Dublin.
31-1-2019
Presented our work on handwriting recognition with minimal padding at CLIN 2019 in Groningen.
[ CLIN 2019 website ]
30-4-2018
Two of our papers got accepted at EAMT 2018 [ EAMT 2018 ].
6-8-2010
Made a fix for loading the optimizer state for Adam in opennmt_py,
for the OpenNMT neural machine translation project.
[ OpenNMT main website ]
[ Issue and fix in the opennmt_py open neural machine translation repository ].
21-9-2017
Presented new paper "Elastic-substitution decoding for Hierarchical SMT:
efficiency, richer search and double labels" at MT Summit, in Nagoya, Japan.
[ MT Summit 2017 ].
17-7-2017 -- 21-7-2017
Attended the International Summer School on Deep Learning 2017 in Bilbao, Spain
[ DeepLearn 2017 ].
6-8-2010 - Added support for m-n alignments to the tool
Software:
Based on software developed by Federico Sangati, and in close collaboration with him, I developed an extension to his Tree visualization tool that allows the simultaneous visualization of source parse trees and the associated word alignments for SMT.
Tree Alignment Violations
Recently a feature was added that allows visualizing alignment constraint violations, assuming a reordering model that only allows the children of each node to be permuted. Once a given source node n and its descending terminals "claim" a certain range in the target sentence, any source word outside the subtree rooted at n that tries to align within the same range causes a crossing of alignments and hence an alignment violation. Alignment violations are indicated in pink: the offending words are drawn in pink and connected by striped alignment lines for clarity. Furthermore, the words that cause alignment violations with a certain subtree are listed behind the root node of that subtree. We are still thinking about how to optimize the visualization for clarity and to avoid overlap with parts of the tree.
University of Amsterdam, Science Faculty Institute for Logic Language and Computation