Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and introduces the DocLLM model. She explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout when processing enterprise documents such as reports and complex contracts. We dig into her team’s approach to training DocLLM, their choice of a generative model over an encoder-based approach, the datasets they used to build the model, how they incorporated layout information, and the various ways they evaluated the model’s performance.
The complete show notes for this episode can be found at twimlai.com/go/672.