Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and introduces the DocLLM model. She explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout when processing enterprise documents such as reports and complex contracts. We dig into her team’s approach to training DocLLM, their choice of a generative model over an encoder-based approach, the datasets they used to build the model, their approach to incorporating layout information, and the various ways they evaluated the model’s performance.
The complete show notes for this episode can be found at twimlai.com/go/672.
Advancing Hands-On Machine Learning Education with Sebastian Raschka - #565
Big Science and Embodied Learning at Hugging Face 🤗 with Thomas Wolf - #564
Full-Stack AI Systems Development with Murali Akula - #563
100x Improvements in Deep Learning Performance with Sparsity, w/ Subutai Ahmad - #562
Scaling BERT and GPT for Financial Services with Jennifer Glore - #561
Trends in Deep Reinforcement Learning with Kamyar Azizzadenesheli - #560
Deep Reinforcement Learning at the Edge of the Statistical Precipice with Rishabh Agarwal - #559
Designing New Energy Materials with Machine Learning with Rafael Gomez-Bombarelli - #558
Differentiable Programming for Oceanography with Patrick Heimbach - #557
Trends in Machine Learning & Deep Learning with Zachary Lipton - #556
Solving the Cocktail Party Problem with Machine Learning, w/ Jonathan Le Roux - #555
Machine Learning for Earthquake Seismology with Karianne Bergen - #554
The New DBfication of ML/AI with Arun Kumar - #553
Building Public Interest Technology with Meredith Broussard - #552
A Universal Law of Robustness via Isoperimetry with Sebastien Bubeck - #551
Trends in NLP with John Bohannon - #550
Trends in Computer Vision with Georgia Gkioxari - #549
Kids Run the Darndest Experiments: Causal Learning in Children with Alison Gopnik - #548
Hypergraphs, Simplicial Complexes and Graph Representations of Complex Systems with Tina Eliassi-Rad - #547
Deep Learning, Transformers, and the Consequences of Scale with Oriol Vinyals - #546