Papers that exist - a DogManTC Collection

updated 3 days ago

Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification

Paper • 2509.15591 • Published Sep 19, 2025 • 45
A Survey on Latent Reasoning

Paper • 2507.06203 • Published Jul 8, 2025 • 94
Quantized Evolution Strategies: High-precision Fine-tuning of Quantized LLMs at Low-precision Cost

Paper • 2602.03120 • Published Feb 3 • 1
TADA! Tuning Audio Diffusion Models through Activation Steering

Paper • 2602.11910 • Published Feb 12 • 2
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models

Paper • 2602.13191 • Published Feb 13 • 30
GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics

Paper • 2602.12617 • Published Feb 13 • 20
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence

Paper • 2602.08683 • Published Feb 9 • 52
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions

Paper • 2602.13013 • Published Feb 13 • 54
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Paper • 2411.04282 • Published Nov 6, 2024 • 37
Latent Flow Transformer

Paper • 2505.14513 • Published May 20, 2025 • 29
LoRA-Drop: Temporal LoRA Decoding for Efficient LLM Inference

Paper • 2601.02569 • Published Jan 5
LLMs + Persona-Plug = Personalized LLMs

Paper • 2409.11901 • Published Sep 18, 2024 • 35
Thanos: Enhancing Conversational Agents with Skill-of-Mind-Infused Large Language Model

Paper • 2411.04496 • Published Nov 7, 2024 • 22
FoNE: Precise Single-Token Number Embeddings via Fourier Features

Paper • 2502.09741 • Published Feb 13, 2025 • 15
FLEXITOKENS: Flexible Tokenization for Evolving Language Models

Paper • 2507.12720 • Published Jul 17, 2025 • 10
Distilling Token-Trained Models into Byte-Level Models

Paper • 2602.01007 • Published Feb 1
Multiscale Byte Language Models -- A Hierarchical Architecture for Causal Million-Length Sequence Modeling

Paper • 2502.14553 • Published Feb 20, 2025 • 1
Byte Latent Transformer: Patches Scale Better Than Tokens

Paper • 2412.09871 • Published Dec 13, 2024 • 108
Delta Attention: Fast and Accurate Sparse Attention Inference by Delta Correction

Paper • 2505.11254 • Published May 16, 2025 • 48
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6, 2025 • 511
OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering

Paper • 2409.08250 • Published Sep 12, 2024 • 1
LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published Oct 21, 2025 • 115
The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 119
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 133
Internalizing Meta-Experience into Memory for Guided Reinforcement Learning in Large Language Models

Paper • 2602.10224 • Published Feb 10 • 19
ProRAG: Process-Supervised Reinforcement Learning for Retrieval-Augmented Generation

Paper • 2601.21912 • Published Jan 29 • 1
xRAG: Extreme Context Compression for Retrieval-augmented Generation with One Token

Paper • 2405.13792 • Published May 22, 2024 • 1
ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations

Paper • 2505.02819 • Published May 5, 2025 • 26
Make LoRA Great Again: Boosting LoRA with Adaptive Singular Values and Mixture-of-Experts Optimization Alignment

Paper • 2502.16894 • Published Feb 24, 2025 • 32
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

Paper • 2602.12205 • Published Feb 12 • 80
MiniCPM-SALA: Hybridizing Sparse and Linear Attention for Efficient Long-Context Modeling

Paper • 2602.11761 • Published Feb 12 • 7
CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling

Paper • 2602.01766 • Published Feb 2
Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Paper • 2601.17367 • Published Jan 24 • 34
MemFly: On-the-Fly Memory Optimization via Information Bottleneck

Paper • 2602.07885 • Published Feb 8 • 7
Voxtral Realtime

Paper • 2602.11298 • Published Feb 11 • 19
UMEM: Unified Memory Extraction and Management Framework for Generalizable Memory

Paper • 2602.10652 • Published Feb 11 • 3
Weight Decay Improves Language Model Plasticity

Paper • 2602.11137 • Published Feb 11 • 2
Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

Paper • 2602.10229 • Published Feb 10 • 5
Stroke3D: Lifting 2D strokes into rigged 3D model via latent diffusion models

Paper • 2602.09713 • Published Feb 10 • 8
Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models

Paper • 2602.07106 • Published Feb 6 • 11
TimeChat-Captioner: Scripting Multi-Scene Videos with Time-Aware and Structural Audio-Visual Captions

Paper • 2602.08711 • Published Feb 9 • 28
How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning

Paper • 2602.10622 • Published Feb 11 • 27
Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 182
Möbius Transform for Mitigating Perspective Distortions in Representation Learning

Paper • 2405.02296 • Published Mar 7, 2024 • 4
AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 37
Flash-VStream: Memory-Based Real-Time Understanding for Long Video Streams

Paper • 2406.08085 • Published Jun 12, 2024 • 17
Badllama 3: removing safety finetuning from Llama 3 in minutes

Paper • 2407.01376 • Published Jul 1, 2024
RoseLoRA: Row and Column-wise Sparse Low-rank Adaptation of Pre-trained Language Model for Knowledge Editing and Fine-tuning

Paper • 2406.10777 • Published Jun 16, 2024 • 2
OLoRA: Orthonormal Low-Rank Adaptation of Large Language Models

Paper • 2406.01775 • Published Jun 3, 2024 • 3
RAG-Instruct: Boosting LLMs with Diverse Retrieval-Augmented Instructions

Paper • 2501.00353 • Published Dec 31, 2024
Video Streaming Thinking: VideoLLMs Can Watch and Think Simultaneously

Paper • 2603.12262 • Published 11 days ago • 30
Think While Watching: Online Streaming Segment-Level Memory for Multi-Turn Video Reasoning in Multimodal Large Language Models

Paper • 2603.11896 • Published 11 days ago • 8
WeEdit: A Dataset, Benchmark and Glyph-Guided Framework for Text-centric Image Editing

Paper • 2603.11593 • Published 11 days ago • 25
One Model, Many Budgets: Elastic Latent Interfaces for Diffusion Transformers

Paper • 2603.12245 • Published 11 days ago • 18
EVATok: Adaptive Length Video Tokenization for Efficient Visual Autoregressive Generation

Paper • 2603.12267 • Published 11 days ago • 13
OmniStream: Mastering Perception, Reconstruction and Action in Continuous Streams

Paper • 2603.12265 • Published 11 days ago • 12
Training Language Models via Neural Cellular Automata

Paper • 2603.10055 • Published 13 days ago • 7
Accent Vector: Controllable Accent Manipulation for Multilingual TTS Without Accented Data

Paper • 2603.07534 • Published 15 days ago • 5
FireRedASR2S: A State-of-the-Art Industrial-Grade All-in-One Automatic Speech Recognition System

Paper • 2603.10420 • Published 12 days ago • 6
NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks

Paper • 2603.06922 • Published 16 days ago • 2
Flash-KMeans: Fast and Memory-Efficient Exact K-Means

Paper • 2603.09229 • Published 13 days ago • 79
EvoTok: A Unified Image Tokenizer via Residual Latent Evolution for Visual Understanding and Generation

Paper • 2603.12108 • Published 11 days ago • 8
Aligning Visual Foundation Encoders to Tokenizers for Diffusion Models

Paper • 2509.25162 • Published Sep 29, 2025 • 3
RecTok: Reconstruction Distillation along Rectified Flow

Paper • 2512.13421 • Published Dec 15, 2025 • 5
ε-VAE: Denoising as Visual Decoding

Paper • 2410.04081 • Published Oct 5, 2024 • 7
TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation

Paper • 2412.03069 • Published Dec 4, 2024 • 34
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs

Paper • 2502.14837 • Published Feb 20, 2025 • 3
Sparse-BitNet: 1.58-bit LLMs are Naturally Friendly to Semi-Structured Sparsity

Paper • 2603.05168 • Published 18 days ago • 4
Truncated Step-Level Sampling with Process Rewards for Retrieval-Augmented Reasoning

Paper • 2602.23440 • Published 24 days ago • 3
BBQ-to-Image: Numeric Bounding Box and Qolor Control in Large-Scale Text-to-Image Models

Paper • 2602.20672 • Published 27 days ago • 9
LLaDA-o: An Effective and Length-Adaptive Omni Diffusion Model

Paper • 2603.01068 • Published 22 days ago • 22
NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

Paper • 2603.02802 • Published 20 days ago • 7
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions

Paper • 2603.03646 • Published 19 days ago • 8
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published 19 days ago • 19
MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning

Paper • 2603.03379 • Published 20 days ago • 31
Proact-VL: A Proactive VideoLLM for Real-Time AI Companions

Paper • 2603.03447 • Published 19 days ago • 36
Helios: Real Real-Time Long Video Generation Model

Paper • 2603.04379 • Published 18 days ago • 173
Beyond the Grid: Layout-Informed Multi-Vector Retrieval with Parsed Visual Document Representations

Paper • 2603.01666 • Published 21 days ago • 1
WildActor: Unconstrained Identity-Preserving Video Generation

Paper • 2603.00586 • Published 23 days ago • 37
ByteFlow: Language Modeling through Adaptive Byte Compression without a Tokenizer

Paper • 2603.03583 • Published 19 days ago • 2
HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing

Paper • 2603.07236 • Published 16 days ago • 3
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation

Paper • 2603.08652 • Published 14 days ago • 39
Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

Paper • 2603.05890 • Published 17 days ago • 91
BrandFusion: A Multi-Agent Framework for Seamless Brand Integration in Text-to-Video Generation

Paper • 2603.02816 • Published 20 days ago • 2
Towards a Neural Debugger for Python

Paper • 2603.09951 • Published 13 days ago • 5
Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Paper • 2603.09095 • Published 13 days ago • 28
Fish Audio S2 Technical Report

Paper • 2603.08823 • Published 13 days ago • 34
InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published 13 days ago • 47
UniCom: Unified Multimodal Modeling via Compressed Continuous Semantic Representations

Paper • 2603.10702 • Published 12 days ago • 4
According to Me: Long-Term Personalized Referential Memory QA

Paper • 2603.01990 • Published 21 days ago • 5
ID-LoRA: Identity-Driven Audio-Video Personalization with In-Context LoRA

Paper • 2603.10256 • Published 12 days ago • 19
HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

Paper • 2603.07815 • Published 14 days ago • 10
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

Paper • 2603.11647 • Published 11 days ago • 31
Cheers: Decoupling Patch Details from Semantic Representations Enables Unified Multimodal Comprehension and Generation

Paper • 2603.12793 • Published 10 days ago • 37
MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization

Paper • 2603.12743 • Published 10 days ago • 3
Attention Residuals

Paper • 2603.15031 • Published 7 days ago • 140
Test-Time Strategies for More Efficient and Accurate Agentic RAG

Paper • 2603.12396 • Published 10 days ago • 1
SuperLocalMemory V3: Information-Geometric Foundations for Zero-LLM Enterprise Agent Memory

Paper • 2603.14588 • Published 7 days ago • 2
MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games

Paper • 2603.09022 • Published 13 days ago • 24
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent

Paper • 2603.13875 • Published 9 days ago • 32
Rethinking UMM Visual Generation: Masked Modeling for Efficient Image-Only Pre-training

Paper • 2603.16139 • Published 6 days ago • 31
