Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Large Language Models
DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines
Matryoshka Representation Learning
How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security
Self-Rewarding Language Models
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering
Quantifying Language Models’ Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting
LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
Large Language Models for Generative Information Extraction: A Survey
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon
Parameter-Efficient Transfer Learning for NLP
Mixtral of Experts
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
WikiChat: Stopping the Hallucination of Large Language Model Chatbots by Few-Shot Grounding on Wikipedia
Video Understanding with Large Language Models: A Survey
GPT-4V(ision) is a Generalist Web Agent, if Grounded