LLMPapers
DocLLM: A layout-aware generative language model for multimodal document understanding
Paper
• 2401.00908
• Published • 189
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Paper
• 2401.04658
• Published • 27
Weaver: Foundation Models for Creative Writing
Paper
• 2401.17268
• Published • 45
Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper
• 2401.17464
• Published • 21
Shortened LLaMA: A Simple Depth Pruning for Large Language Models
Paper
• 2402.02834
• Published • 17
CroissantLLM: A Truly Bilingual French-English Language Model
Paper
• 2402.00786
• Published • 26
Self-Discover: Large Language Models Self-Compose Reasoning Structures
Paper
• 2402.03620
• Published • 117
Fine-Tuned Language Models Generate Stable Inorganic Materials as Text
Paper
• 2402.04379
• Published • 8
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Paper
• 2402.07033
• Published • 19
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper
• 2402.17764
• Published • 626
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Paper
• 2402.17177
• Published • 87
Towards Optimal Learning of Language Models
Paper
• 2402.17759
• Published • 18
StarCoder 2 and The Stack v2: The Next Generation
Paper
• 2402.19173
• Published • 154
AST-T5: Structure-Aware Pretraining for Code Generation and Understanding
Paper
• 2401.03003
• Published • 14
Stealing Part of a Production Language Model
Paper
• 2403.06634
• Published • 91
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper
• 2403.03163
• Published • 98
LLM Agent Operating System
Paper
• 2403.16971
• Published • 73
Can large language models explore in-context?
Paper
• 2403.15371
• Published • 33
The Unreasonable Ineffectiveness of the Deeper Layers
Paper
• 2403.17887
• Published • 82
Transformers Can Do Arithmetic with the Right Embeddings
Paper
• 2405.17399
• Published • 54
Efficient Detection of Toxic Prompts in Large Language Models
Paper
• 2408.11727
• Published • 13
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique
Paper
• 2408.10701
• Published • 12
SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
Paper
• 2408.15545
• Published • 38
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers
Paper
• 2409.04196
• Published • 17
Fact, Fetch, and Reason: A Unified Evaluation of Retrieval-Augmented Generation
Paper
• 2409.12941
• Published • 25
MinerU: An Open-Source Solution for Precise Document Content Extraction
Paper
• 2409.18839
• Published • 40
Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models
Paper
• 2410.01782
• Published • 10
A Survey of Small Language Models
Paper
• 2410.20011
• Published • 46
Cut Your Losses in Large-Vocabulary Language Models
Paper
• 2411.09009
• Published • 49
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
Paper
• 2510.16872
• Published • 112