GitHub - Xuchen-Li/cv-arxiv-daily: Automatically update arXiv papers about SOT & VLT, Multi-modal Learning, LLM and Video Understanding using Github Actions.

Updated on 2025.02.04

Table of Contents

Single Object & Visual Language Tracking
Large Language Model
Video Understanding
Multi-modal Learning

Single Object & Visual Language Tracking

Publish Date	Title	Authors	PDF	Code
2025-01-13	Robust Single Object Tracking in LiDAR Point Clouds under Adverse Weather Conditions	Xiantong Zhao et.al.	2501.07133	null
2025-01-05	DeTrack: In-model Latent Denoising Learning for Visual Object Tracking	Xinyu Zhou et.al.	2501.02467	null
2025-01-13	FusionSORT: Fusion Methods for Online Multi-object Visual Tracking	Nathanael L. Baisa et.al.	2501.00843	link
2025-01-01	Less is More: Token Context-aware Learning for Object Tracking	Chenlong Xu et.al.	2501.00758	null
2024-12-28	Learning Adaptive and View-Invariant Vision Transformer with Multi-Teacher Knowledge Distillation for Real-Time UAV Tracking	You Wu et.al.	2412.20002	link
2024-12-26	SUTrack: Towards Simple and Unified Single Object Tracking	Xin Chen et.al.	2412.19138	link
2024-12-15	Exploring Enhanced Contextual Information for Video-Level Object Tracking	Ben Kang et.al.	2412.11023	link
2024-12-13	Visual Object Tracking across Diverse Data Modalities: A Review	Mengmeng Wang et.al.	2412.09991	null
2024-12-13	MVCTrack: Boosting 3D Point Cloud Tracking via Multimodal-Guided Virtual Cues	Zhaofeng Hu et.al.	2412.02734	link
2024-12-03	GSOT3D: Towards Generic 3D Single Object Tracking in the Wild	Yifan Jiao et.al.	2412.02129	link
2024-11-28	Improving Accuracy and Generalization for Efficient Visual Tracking	Ram Zaveri et.al.	2411.18855	null
2024-11-27	A comparison of extended object tracking with multi-modal sensors in indoor environment	Jiangtao Shuai et.al.	2411.18476	null
2024-12-04	A Distractor-Aware Memory for Visual Object Tracking with SAM2	Jovana Videnovic et.al.	2411.17576	link
2024-11-23	How Texts Help? A Fine-grained Evaluation to Reveal the Role of Language in Vision-Language Tracking	Xuchen Li et.al.	2411.15600	null
2024-11-24	ClickTrack: Towards Real-time Interactive Single Object Tracking	Kuiran Wang et.al.	2411.13183	null
2024-11-30	SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory	Cheng-Yen Yang et.al.	2411.11922	link
2024-12-09	Vision Eagle Attention: a new lens for advancing image classification	Mahmudul Hasan et.al.	2411.10564	link
2024-11-14	MFTIQ: Multi-Flow Tracker with Independent Matching Quality Estimation	Jonas Serych et.al.	2411.09551	link
2024-11-12	Visual Tracking with Intermittent Visibility: Switched Control Design and Implementation	Yangge Li et.al.	2411.08144	null
2024-12-16	ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model	Yiming Sun et.al.	2411.01756	null
2024-10-30	IP-MOT: Instance Prompt Learning for Cross-Domain Multi-Object Tracking	Run Luo et.al.	2410.23907	null
2024-10-27	NT-VOT211: A Large-Scale Benchmark for Night-time Visual Object Tracking	Yu Liu et.al.	2410.20421	link
2024-10-19	The Solution for Single Object Tracking Task of Perception Test Challenge 2024	Zhiqiang Zhong et.al.	2410.16329	null
2024-10-13	Gaussian Splatting Visual MPC for Granular Media Manipulation	Wei-Cheng Tseng et.al.	2410.09740	null
2024-10-09	DTVLT: A Multi-modal Diverse Text Benchmark for Visual Language Tracking Based on LLM	Xuchen Li et.al.	2410.02492	null
2024-09-30	Opt-in Camera: Person Identification in Video via UWB Localization and Its Application to Opt-in Systems	Matthew Ishige et.al.	2409.19891	null
2024-09-27	Improving Visual Object Tracking through Visual Prompting	Shih-Fang Chen et.al.	2409.18901	link
2024-09-26	General Compression Framework for Efficient Transformer Object Tracking	Lingyi Hong et.al.	2409.17564	null
2024-09-25	Towards Underwater Camouflaged Object Tracking: An Experimental Evaluation of SAM and SAM 2	Chunhui Zhang et.al.	2409.16902	link
2024-09-25	Conditional Generative Denoiser for Nighttime UAV Tracking	Yucheng Wang et.al.	2409.16834	link
2024-09-25	Progressive Representation Learning for Real-Time UAV Tracking	Changhong Fu et.al.	2409.16652	link
2024-09-25	Enhancing Nighttime UAV Tracking with Light Distribution Suppression	Liangliang Yao et.al.	2409.16631	link
2024-09-19	WeHelp: A Shared Autonomy System for Wheelchair Users	Abulikemu Abuduweili et.al.	2409.12159	link
2024-09-18	Distilling Channels for Efficient Deep Tracking	Shiming Ge et.al.	2409.11785	null
2024-09-13	Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark	Xuchen Li et.al.	2409.08887	null
2024-09-10	VBIT: Towards Enhancing Privacy Control Over IoT Devices	Jad Al Aaraj et.al.	2409.06233	null
2024-09-03	Ultra-broadband room-temperature Fourier transform spectrometer with watt-level power consumption	Jakub Mnich et.al.	2409.01875	null
2024-08-25	Camouflaged_Object_Tracking__A_Benchmark	Xiaoyu Guo et.al.	2408.13877	null
2024-08-21	Low-Light Object Tracking: A Benchmark	Pengzhi Zhong et.al.	2408.11463	link
2024-08-20	MambaEVT: Event Stream based Visual Object Tracking using State Space Model	Xiao Wang et.al.	2408.10487	link
2024-08-05	VoxelTrack: Exploring Voxel Representation for 3D Point Cloud Object Tracking	Yuxuan Lu et.al.	2408.02263	null
2024-09-06	3D Single-object Tracking in Point Clouds with High Temporal Variation	Qiao Wu et.al.	2408.02049	null
2024-09-09	SiamMo: Siamese Motion-Centric 3D Object Tracking	Yuxiang Yang et.al.	2408.01688	link
2024-08-02	Visible-Thermal Multiple Object Tracking: Large-scale Video Dataset and Progressive Fusion Approach	Yabin Zhu et.al.	2408.00969	link
2024-08-06	Broadband THz wave generation and detection in organic crystal PNPA at MHz repetition rates	Lukasz A. Sterczewski et.al.	2407.20745	null
2024-07-16	Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers	Zhengbo Zhang et.al.	2407.08394	null
2024-07-11	PINN-Ray: A Physics-Informed Neural Network to Model Soft Robotic Fin Ray Fingers	Xing Wang et.al.	2407.08222	null
2024-07-07	Addressing single object tracking in satellite imagery through prompt-engineered solutions	Athena Psalta et.al.	2407.05518	null
2024-07-07	Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking	You Wu et.al.	2407.05383	null
2024-07-09	P2P: Part-to-Part Motion Cues Guide a Strong Tracking Framework for LiDAR Point Clouds	Jiahao Nie et.al.	2407.05238	link
2024-07-07	Tracking Reflected Objects: A Benchmark	Xiaoyu Guo et.al.	2407.05235	null
2024-07-04	TrackPGD: A White-box Attack using Binary Masks against Robust Transformer Trackers	Fatemeh Nourilenjan Nokabadi et.al.	2407.03946	link
2024-07-02	FlowTrack: Point-level Flow Network for 3D Single Object Tracking	Shuo Li et.al.	2407.01959	null
2024-09-07	eMoE-Tracker: Environmental MoE-based Transformer for Robust Event-guided Object Tracking	Yucheng Chen et.al.	2406.20024	null
2024-06-14	Constrained Motion Planning for a Robotic Endoscope Holder based on Hierarchical Quadratic Programming	Jacinto Colan et.al.	2406.09982	null
2024-06-14	Robust compressive tracking via online weighted multiple instance learning	Sandeep Singh Sengar et.al.	2406.09914	null
2024-07-01	Adaptively Bypassing Vision Transformer Blocks for Efficient Visual Tracking	Xiangyang Yang et.al.	2406.08037	null
2024-06-07	Multi-Granularity Language-Guided Multi-Object Tracking	Yuhao Li et.al.	2406.04844	link
2024-06-02	Robust Visual Tracking via Iterative Gradient Descent and Threshold Selection	Zhuang Qi et.al.	2406.00589	null
2024-05-28	Reliable Object Tracking by Multimodal Hybrid Feature Extraction and Transformer-Based Fusion	Hongze Sun et.al.	2405.17903	link
2024-05-27	LoReTrack: Efficient and Accurate Low-Resolution Transformer Tracking	Shaohua Dong et.al.	2405.17660	null
2024-05-31	Awesome Multi-modal Object Tracking	Chunhui Zhang et.al.	2405.14200	link
2024-05-20	DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM	Xuchen Li et.al.	2405.12139	null
2024-05-16	A Novel Bounding Box Regression Method for Single Object Tracking	Omar Abdelaziz et.al.	2405.10444	null
2024-05-16	Beyond Traditional Single Object Tracking: A Survey	Omar Abdelaziz et.al.	2405.10439	null
2024-05-08	TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking	Pengcheng Shao et.al.	2405.05004	link
2024-04-22	360VOTS: Visual Object Tracking and Segmentation in Omnidirectional Videos	Yinzhe Xu et.al.	2404.13953	null
2024-05-25	An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training	Jin Gao et.al.	2404.12210	link
2024-04-16	Attention-Aware Visualization: Tracking and Responding to User Perception Over Time	Arvind Srinivasan et.al.	2404.10732	null
2024-04-15	Empowering Embodied Visual Tracking with Visual Foundation Models and Offline RL	Fangwei Zhong et.al.	2404.09857	null
2024-04-15	Learning Tracking Representations from Single Point Annotations	Qiangqiang Wu et.al.	2404.09504	null
2024-04-11	PillarTrack: Redesigning Pillar-based Transformer Network for Single Object Tracking on Point Clouds	Weisheng Xu et.al.	2404.07495	link
2024-05-02	Longitudinal Analysis and Quantitative Assessment of Child Development through Mobile Interaction	Juan Carlos Ruiz-Garcia et.al.	2404.06919	link
2024-04-09	LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks	Jianlang Chen et.al.	2404.06247	link
2024-04-08	Semi-Supervised Novelty Detection for Precise Ultra-Wideband Error Signal Prediction	Umberto Albertin et.al.	2404.05351	null
2024-03-29	Context-Aware Integration of Language and Visual References for Natural Language Tracking	Yanyan Shao et.al.	2403.19975	null
2024-03-27	TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial Scenes	Liangyu Xu et.al.	2403.18238	null
2024-03-26	OmniVid: A Generative Framework for Universal Video Understanding	Junke Wang et.al.	2403.17935	link
2024-03-26	Exploring Dynamic Transformer for Efficient Object Tracking	Jiawen Zhu et.al.	2403.17651	null
2024-03-29	Elysium: Exploring Object-level Perception in Videos via MLLM	Han Wang et.al.	2403.16558	link
2024-03-25	Multi-attention Associate Prediction Network for Visual Tracking	Xinglong Sun et.al.	2403.16395	null
2024-03-28	SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking	Xiaojun Hou et.al.	2403.16002	link
2024-03-23	Spatio-Temporal Bi-directional Cross-frame Memory for Distractor Filtering Point Cloud Single Object Tracking	Shaoyu Sun et.al.	2403.15831	null
2024-03-19	TON-VIO: Online Time Offset Modeling Networks for Robust Temporal Alignment in High Dynamic Motion VIO	Chaoran Xiong et.al.	2403.12504	null
2024-03-18	Pedestrian Tracking with Monocular Camera using Unconstrained 3D Motion Model	Jan Krejčí et.al.	2403.11978	null
2024-03-16	A Spectrum-based Image Denoising Method with Edge Feature Enhancement	Peter Luvton et.al.	2403.11036	null
2024-03-15	Autoregressive Queries for Adaptive Tracking with Spatio-TemporalTransformers	Jinxia Xie et.al.	2403.10574	null
2024-03-14	OneTracker: Unifying Visual Object Tracking with Foundation Models and Efficient Tuning	Lingyi Hong et.al.	2403.09634	null
2024-02-27	ACTrack: Adding Spatio-Temporal Condition for Visual Object Tracking	Yushan Han et.al.	2403.07914	null
2024-04-03	Long-term Frame-Event Visual Tracking: Benchmark Dataset and Baseline	Xiao Wang et.al.	2403.05839	link
2024-03-08	Tracking Meets LoRA: Faster Training, Larger Model, Stronger Performance	Liting Lin et.al.	2403.05231	link
2024-03-08	Motion-Guided Dual-Camera Tracker for Low-Cost Skill Evaluation of Gastric Endoscopy	Yuelin Zhang et.al.	2403.05146	link
2024-03-06	VastTrack: Vast Category Visual Object Tracking	Liang Peng et.al.	2403.03493	link
2024-02-28	Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks	Zhewei Wu et.al.	2402.17976	null
2024-02-26	SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking	Yu Lin et.al.	2402.16249	link
2024-02-26	Reading Relevant Feature from Global Representation Memory for Visual Object Tracking	Xinyu Zhou et.al.	2402.14392	null
2024-02-13	Optimized Information Flow for Transformer Tracking	Janani Kugarajeevan et.al.	2402.08195	link
2024-02-07	BioDrone: A Bionic Drone-based Single Object Tracking Benchmark for Robust Vision	Xin Zhao et.al.	2402.04519	null
2024-02-04	Spatio-temporal Prompting Network for Robust Video Feature Extraction	Guanxiong Sun et.al.	2402.02574	link
2024-01-24	Small Object Tracking in LiDAR Point Cloud: Learning the Target-awareness Prototype and Fine-grained Search Region	Shengjing Tian et.al.	2401.13285	null
2024-01-23	Correlation-Embedded Transformer Tracking: A Single-Branch Framework	Fei Xie et.al.	2401.12743	link
2024-01-20	Unifying Visual and Vision-Language Tracking via Contrastive Learning	Yinchao Ma et.al.	2401.11228	link
2024-01-20	Towards Category Unification of 3D Single Object Tracking on Point Clouds	Jiahao Nie et.al.	2401.11204	null
2024-01-18	Multi-task Learning for Joint Re-identification, Team Affiliation, and Role Classification for Sports Visual Tracking	Amir M. Mansourian et.al.	2401.09942	null
2024-01-12	Dense Optical Flow Estimation Using Sparse Regularizers from Reduced Measurements	Muhammad Wasim Nawaz et.al.	2401.06396	null
2024-01-18	Hold 'em and Fold 'em: Towards Human-scale, Feedback-Controlled Soft Origami Robots	Immanuel Ampomah Mensah et.al.	2401.04650	null
2024-01-06	Explicit Visual Prompts for Visual Object Tracking	Liangtao Shi et.al.	2401.03142	link
2024-01-03	ODTrack: Online Dense Temporal Token Learning for Visual Tracking	Yaozong Zheng et.al.	2401.01686	link
2023-12-27	X Modality Assisting RGBT Object Tracking	Zhaisheng Ding et.al.	2312.17273	null
2023-12-22	Cross-Modal Object Tracking via Modality-Aware Fusion Network and A Large-Scale Dataset	Lei Liu et.al.	2312.14446	link
2023-12-18	Multi-Correlation Siamese Transformer Network with Dense Connection for 3D Single Object Tracking	Shihao Feng et.al.	2312.11051	link
2023-12-17	Robust 3D Tracking with Quality-Aware Shape Completion	Jingwen Zhang et.al.	2312.10608	null
2023-12-15	Tracking Skiers from the Top to the Bottom	Matteo Dunnhofer et.al.	2312.09723	null
2023-12-11	M3SOT: Multi-frame, Multi-field, Multi-space 3D Single Object Tracking	Jiaming Liu et.al.	2312.06117	link
2023-12-07	Instance Tracking in 3D Scenes from Egocentric Videos	Yunhan Zhao et.al.	2312.04117	link
2024-02-19	Beyond Visual Cues: Synchronously Exploring Target-Centric Semantics for Vision-Language Tracking	Jiawei Ge et.al.	2311.17085	null
2023-11-21	Visual tracking brain computer interface	Changxing Huang et.al.	2311.12592	null
2024-01-10	ViKi-HyCo: A Hybrid-Control approach for complex car-like maneuvers	Edison P. Velasco Sánchez et.al.	2311.07268	null

(back to top)

Large Language Model

Publish Date	Title	Authors	PDF	Code
2025-01-31	Low-Rank Adapting Models for Sparse Autoencoders	Matthew Chen et.al.	2501.19406	null
2025-01-31	Vintix: Action Model via In-Context Reinforcement Learning	Andrey Polubarov et.al.	2501.19400	link
2025-01-31	Scalable-Softmax Is Superior for Attention	Ken M. Nakanishi et.al.	2501.19399	null
2025-01-31	Do LLMs Strategically Reveal, Conceal, and Infer Information? A Theoretical and Empirical Analysis in The Chameleon Game	Mustafa O. Karabag et.al.	2501.19398	null
2025-01-31	s1: Simple test-time scaling	Niklas Muennighoff et.al.	2501.19393	link
2025-01-31	Cache Me If You Must: Adaptive Key-Value Quantization for Large Language Models	Alina Shutova et.al.	2501.19392	null
2025-01-31	Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models	Wenzhi Fang et.al.	2501.19389	null
2025-01-31	Decoding-based Regression	Xingyou Song et.al.	2501.19383	link
2025-01-31	TableMaster: A Recipe to Advance Table Understanding with Language Models	Lang Cao et.al.	2501.19378	null
2025-01-31	SELMA: A Speech-Enabled Language Model for Virtual Assistant Interactions	Dominik Wagner et.al.	2501.19377	null
2025-01-31	We're Different, We're the Same: Creative Homogeneity Across LLMs	Emily Wenger et.al.	2501.19361	null
2025-01-31	Mechanical Properties of the Meninges: Large Language Model Assisted Systematic Review of over 25,000 Studies	Brandon P. Chelstrom et.al.	2501.19359	null
2025-01-31	The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking	Yuchun Miao et.al.	2501.19358	null
2025-01-31	Towards Adaptive Self-Improvement for Smarter Energy Systems	Alexander Sommer et.al.	2501.19340	null
2025-01-31	PixelWorld: Towards Perceiving Everything as Pixels	Zhiheng Lyu et.al.	2501.19339	null
2025-01-31	Homogeneity Bias as Differential Sampling Uncertainty in Language Models	Messi H. J. Lee et.al.	2501.19337	null
2025-01-31	Reward-Guided Speculative Decoding for Efficient LLM Reasoning	Baohao Liao et.al.	2501.19324	null
2025-01-31	MINDSTORES: Memory-Informed Neural Decision Synthesis for Task-Oriented Reinforcement in Embodied Systems	Anirudh Chari et.al.	2501.19318	null
2025-01-31	LLM-based Affective Text Generation Quality Based on Different Quantization Values	Yarik Menchaca Resendiz et.al.	2501.19317	null
2025-01-31	An Efficient Approach for Machine Translation on Low-resource Languages: A Case Study in Vietnamese-Chinese	Tran Ngoc Son et.al.	2501.19314	null
2025-01-30	Foundational Models for 3D Point Clouds: A Survey and Outlook	Vishal Thengane et.al.	2501.18594	null
2025-01-30	Advances in Multimodal Adaptation and Generalization: From Traditional Approaches to Foundation Models	Hao Dong et.al.	2501.18592	link
2025-01-30	Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs	Yue Wang et.al.	2501.18585	null
2025-01-30	Prediction-Powered Inference with Imputed Covariates and Nonuniform Sampling	Dan M. Kluger et.al.	2501.18577	link
2025-01-30	Token-Hungry, Yet Precise: DeepSeek R1 Highlights the Need for Multi-Step Reasoning Over Speed in MATH	Evgenii Evstafev et.al.	2501.18576	null
2025-01-30	BounTCHA: A CAPTCHA Utilizing Boundary Identification in AI-extended Videos	Lehao Lin et.al.	2501.18565	null
2025-01-30	SAM2Act: Integrating Visual Foundation Model with A Memory Architecture for Robotic Manipulation	Haoquan Fang et.al.	2501.18564	null
2025-01-30	Semantic Web and Creative AI -- A Technical Report from ISWS 2023	Raia Abu Ahmad et.al.	2501.18542	null
2025-01-30	Loss Functions and Operators Generated by f-Divergences	Vincent Roulet et.al.	2501.18537	null
2025-01-30	Illusions of Relevance: Using Content Injection Attacks to Deceive Retrievers, Rerankers, and LLM Judges	Manveer Singh Tamber et.al.	2501.18536	link
2025-01-30	Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models	Yi Ding et.al.	2501.18533	null
2025-01-30	Differentially Private Steering for Large Language Model Alignment	Anmol Goel et.al.	2501.18532	link
2025-01-30	Learn from the Past: Language-conditioned Object Rearrangement with Large Language Models	Guanqun Cao et.al.	2501.18516	null
2025-01-30	Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch	Arthur Douillard et.al.	2501.18512	null
2025-01-30	WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training	Benjamin Feuer et.al.	2501.18511	link
2025-01-30	CLEAR: Cue Learning using Evolution for Accurate Recognition Applied to Sustainability Data Extraction	Peter J. Bentley et.al.	2501.18504	null
2025-01-30	A Tool for In-depth Analysis of Code Execution Reasoning of Large Language Models	Changshu Liu et.al.	2501.18482	null
2025-01-30	CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization	Yanxia Deng et.al.	2501.18475	null
2025-01-30	Tuning Vision Foundation Model via Test-Time Prompt-Guided Training for VFSS Segmentations	Chengxi Zeng et.al.	2501.18474	null
2025-01-30	A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models	Shiho Noda et.al.	2501.18463	link
2025-01-29	Learning Beyond the Surface: How Far Can Continual Pre-Training with LoRA Enhance LLMs' Domain-Specific Insight Learning?	Pouya Pezeshkpour et.al.	2501.17840	link
2025-01-29	Matrix Product Sketching via Coordinated Sampling	Majid Daliri et.al.	2501.17836	null
2025-01-29	Aggregation Schemes for Single-Vector WSI Representation Learning in Digital Pathology	Sobhan Hemati et.al.	2501.17822	null
2025-01-29	Leveraging Multimodal LLM for Inspirational User Interface Search	Seokhyeon Park et.al.	2501.17799	link
2025-01-29	BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights	Chan-Jan Hsu et.al.	2501.17790	null
2025-01-29	Reasoning Over the Glyphs: Evaluation of LLM's Decipherment of Rare Scripts	Yu-Fei Shih et.al.	2501.17785	null
2025-01-29	AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing	Peter Pak et.al.	2501.17784	null
2025-01-29	2SSP: A Two-Stage Framework for Structured Pruning of LLMs	Fabrizio Sandri et.al.	2501.17771	link
2025-01-29	Hybrid Graphs for Table-and-Text based Question Answering using LLMs	Ankush Agarwal et.al.	2501.17767	null
2025-01-29	On the Partitioning of GPU Power among Multi-Instances	Tirth Vamja et.al.	2501.17752	null
2025-01-29	Early External Safety Testing of OpenAI's o3-mini: Insights from the Pre-Deployment Evaluation	Aitor Arrieta et.al.	2501.17749	null
2025-01-29	A technical review of multi-omics data integration methods: from classical statistical to deep generative approaches	Ana R. Baião et.al.	2501.17729	null
2025-01-29	Using Code Generation to Solve Open Instances of Combinatorial Design Problems	Christopher D. Rosin et.al.	2501.17725	link
2025-01-29	RICoTA: Red-teaming of In-the-wild Conversation with Test Attempts	Eujeong Choi et.al.	2501.17715	link
2025-01-29	Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate	Yubo Wang et.al.	2501.17703	null
2025-01-29	Planning with Vision-Language Models and a Use Case in Robot-Assisted Teaching	Xuzhe Dang et.al.	2501.17665	null
2025-01-29	Exploring Vision Language Models for Multimodal and Multilingual Stance Detection	Jake Vasilakes et.al.	2501.17654	null
2025-01-29	Tonguescape: Exploring Language Models Understanding of Vowel Articulation	Haruki Sakajo et.al.	2501.17643	link
2025-01-29	Efficient Redundancy Reduction for Open-Vocabulary Semantic Segmentation	Lin Chen et.al.	2501.17642	null
2025-01-29	In-Context Meta LoRA Generation	Yihua Shao et.al.	2501.17635	null
2025-01-28	SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training	Tianzhe Chu et.al.	2501.17161	null
2025-01-28	AxBench: Steering LLMs? Even Simple Baselines Outperform Sparse Autoencoders	Zhengxuan Wu et.al.	2501.17148	link
2025-01-28	FactCG: Enhancing Fact Checkers with Graph-Based Multi-Hop Data	Deren Lei et.al.	2501.17144	link
2025-01-28	ASTRAL: Automated Safety Testing of Large Language Models	Miriam Ugarte et.al.	2501.17132	null
2025-01-28	Scenario Understanding of Traffic Scenes Through Large Visual Language Models	Rivera Esteban et.al.	2501.17131	null
2025-01-28	Histoires Morales: A French Dataset for Assessing Moral Alignment	Thibaud Leteno et.al.	2501.17117	link
2025-01-28	Optimizing Large Language Model Training Using FP4 Quantization	Ruizhe Wang et.al.	2501.17116	null
2025-01-28	Unlocking Transparent Alignment Through Enhanced Inverse Constitutional AI for Principle Extraction	Carl-Leander Henneking et.al.	2501.17112	null
2025-01-28	COS(M+O)S: Curiosity and RL-Enhanced MCTS for Exploring Story Space via Language Models	Tobias Materzok et.al.	2501.17104	null
2025-01-28	Token-by-Token Regeneration and Domain Biases: A Benchmark of LLMs on Advanced Mathematical Problem-Solving	Evgenii Evstafev et.al.	2501.17084	null
2025-01-28	Contextual Self-paced Learning for Weakly Supervised Spatio-Temporal Video Grounding	Akash Kumar et.al.	2501.17053	null
2025-01-28	How Linguistics Learned to Stop Worrying and Love the Language Models	Richard Futrell et.al.	2501.17047	null
2025-01-28	Enhanced Retrieval of Long Documents: Leveraging Fine-Grained Block Representations with Large Language Models	Minghan Li et.al.	2501.17039	null
2025-01-28	Challenges in Ensuring AI Safety in DeepSeek-R1 Models: The Shortcomings of Reinforcement Learning Strategies	Manojkumar Parmar et.al.	2501.17030	null
2025-01-28	Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs	Alessandro Midolo et.al.	2501.17024	link
2025-01-28	Mobile Manipulation Instruction Generation from Multiple Images with Automatic Metric Enhancement	Kei Katsumata et.al.	2501.17022	link
2025-01-28	Large Language Models for Code Generation: The Practitioners Perspective	Zeeshan Rasheed et.al.	2501.16998	link
2025-01-28	Artificial Intelligence Clones	Annie Liang et.al.	2501.16996	null
2025-01-28	FedEFM: Federated Endovascular Foundation Model with Unseen Data	Tuong Do et.al.	2501.16992	null
2025-01-28	Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection	Xiangyu Gao et.al.	2501.16981	null
2025-01-27	LUCY: Linguistic Understanding and Control Yielding Early Stage of Her	Heting Gao et.al.	2501.16327	link
2025-01-27	Evaluating The Performance of Using Large Language Models to Automate Summarization of CT Simulation Orders in Radiation Oncology	Meiyun Cao et.al.	2501.16309	null
2025-01-27	RAPID: Retrieval-Augmented Parallel Inference Drafting for Text-Based Video Event Retrieval	Long Nguyen et.al.	2501.16303	null
2025-01-27	Matryoshka Re-Ranker: A Flexible Re-Ranking Architecture With Configurable Depth and Width	Zheng Liu et.al.	2501.16302	null
2025-01-27	Large Models in Dialogue for Active Perception and Anomaly Detection	Tzoulio Chamiti et.al.	2501.16300	link
2025-01-27	FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers	Renshan Zhang et.al.	2501.16297	null
2025-01-27	Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models	Jing Zhang et.al.	2501.16282	null
2025-01-27	Do LLMs Have Visualization Literacy? An Evaluation on Modified Visualizations to Test Generalization in Data Interpretation	Jiayi Hong et.al.	2501.16277	link
2025-01-27	URAG: Implementing a Unified Hybrid RAG for Precise Answers in University Admission Chatbots -- A Case Study at HCMUT	Long Nguyen et.al.	2501.16276	null
2025-01-27	Return of the Encoder: Maximizing Parameter Efficiency for SLMs	Mohamed Elfeki et.al.	2501.16273	link
2025-01-27	A foundation model for human-AI collaboration in medical literature mining	Zifeng Wang et.al.	2501.16255	null
2025-01-27	Multi-Agent Geospatial Copilots for Remote Sensing Workflows	Chaehong Lee et.al.	2501.16254	null
2025-01-27	Zero-Shot Decision Tree Construction via Large Language Models	Lucas Carrasco et.al.	2501.16247	null
2025-01-27	CLISC: Bridging clip and sam by enhanced cam for unsupervised brain tumor segmentation	Xiaochuan Ma et.al.	2501.16246	null
2025-01-27	Phase Transitions in Large Language Models and the $O(N)$ Model	Youran Sun et.al.	2501.16241	null
2025-01-27	AiGet: Transforming Everyday Moments into Hidden Knowledge Discovery with AI Assistance on Smart Glasses	Runze Cai et.al.	2501.16240	null
2025-01-27	Distilling foundation models for robust and efficient models in digital pathology	Alexandre Filiot et.al.	2501.16239	null
2025-01-27	Language-Based Bayesian Optimization Research Assistant (BORA)	Abdoulatif Cissé et.al.	2501.16224	null
2025-01-27	Enhancing Visual Inspection Capability of Multi-Modal Large Language Models on Medical Time Series with Supportive Conformalized and Interpretable Small Specialized Models	Huayu Li et.al.	2501.16215	link
2025-01-27	Provence: efficient and robust context pruning for retrieval-augmented generation	Nadezhda Chirkova et.al.	2501.16214	null
2025-01-24	HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation	Xin Zhou et.al.	2501.14729	link
2025-01-24	Do LLMs Provide Consistent Answers to Health-Related Questions across Languages?	Ipek Baris Schlicht et.al.	2501.14719	null
2025-01-24	Towards Better Understanding Table Instruction Tuning: Decoupling the Effects from Data versus Models	Naihao Deng et.al.	2501.14717	null
2025-01-24	FlexiGPT: Pruning and Extending Large Language Models with Low-Rank Weight Sharing	James Seale Smith et.al.	2501.14713	null
2025-01-24	The Karp Dataset	Mason DiCicco et.al.	2501.14705	null
2025-01-24	Rethinking Table Instruction Tuning	Naihao Deng et.al.	2501.14693	null
2025-01-24	Rethinking Foundation Models for Medical Image Classification through a Benchmark Study on MedMNIST	Fuping Wu et.al.	2501.14685	null
2025-01-24	An Empirical Study on LLM-based Classification of Requirements-related Provisions in Food-safety Regulations	Shabnam Hassani et.al.	2501.14683	null
2025-01-24	Diffusion based Text-to-Music Generationwith Global and Local Text based Conditioning	Jisi Zhang et.al.	2501.14680	null
2025-01-24	MedAgentBench: Dataset for Benchmarking LLMs as Agents in Medical Applications	Yixing Jiang et.al.	2501.14654	link
2025-01-24	Investigating the (De)Composition Capabilities of Large Language Models in Natural-to-Formal Language Conversion	Ziyao Xu et.al.	2501.14649	link
2025-01-24	Recommending Actionable Strategies: A Semantic Approach to Integrating Analytical Frameworks with Decision Heuristics	Renato Ghisellini et.al.	2501.14634	null
2025-01-24	Extracting Problem Structure with LLMs for Optimized SAT Local Search	André Schilder et.al.	2501.14630	null
2025-01-24	ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations	Tianming Liang et.al.	2501.14607	null
2025-01-24	Knowledge Graphs Construction from Criminal Court Appeals: Insights from the French Cassation Court	Alexander V. Belikov et.al.	2501.14579	null
2025-01-24	ZETA: Leveraging Z-order Curves for Efficient Top-k Attention	Qiuhao Zeng et.al.	2501.14577	null
2025-01-24	Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding	Zhongyi Shui et.al.	2501.14548	link
2025-01-24	Leveraging ChatGPT's Multimodal Vision Capabilities to Rank Satellite Images by Poverty Level: Advancing Tools for Social Science Research	Hamid Sarmadi et.al.	2501.14546	null
2025-01-24	VERUS-LM: a Versatile Framework for Combining LLMs with Symbolic Reasoning	Benjamin Callewaert et.al.	2501.14540	null
2025-01-24	Design and Implementation of a Psychiatry Resident Training System Based on Large Language Models	Zhenguang Zhong et.al.	2501.14530	link
2025-01-23	CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation	Guofeng Cui et.al.	2501.13927	null
2025-01-23	The Breeze 2 Herd of Models: Traditional Chinese LLMs Based on Llama with Vision-Aware and Function-Calling Capabilities	Chan-Jan Hsu et.al.	2501.13921	link
2025-01-23	Analysis of Indic Language Capabilities in LLMs	Aatman Vaidya et.al.	2501.13912	null
2025-01-23	Privacy-Preserving Personalized Federated Prompt Learning for Multimodal Large Language Models	Linh Tran et.al.	2501.13904	null
2025-01-23	Exploring Finetuned Audio-LLM on Heart Murmur Features	Adrian Florea et.al.	2501.13884	null
2025-01-23	The machine learning platform for developers of large systems	Alexey Naikov et.al.	2501.13881	null
2025-01-23	A RAG-Based Institutional Assistant	Gustavo Kuratomi et.al.	2501.13880	null
2025-01-23	Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning	Shiyu Zhang et.al.	2501.13859	null
2025-01-23	Large Vision-Language Models for Knowledge-Grounded Data Annotation of Memes	Shiling Deng et.al.	2501.13851	link
2025-01-23	Think Outside the Data: Colonial Biases and Systemic Issues in Automated Moderation Pipelines for Low-Resource Languages	Farhana Shahid et.al.	2501.13836	null
2025-01-23	On the Reasoning Capacity of AI Models and How to Quantify It	Santosh Kumar Radha et.al.	2501.13833	null
2025-01-23	Predicting Compact Phrasal Rewrites with Large Language Models for ASR Post Editing	Hao Zhang et.al.	2501.13831	null
2025-01-23	Hallucinations Can Improve Large Language Models in Drug Discovery	Shuzhou Yuan et.al.	2501.13824	null
2025-01-23	Large Language Model driven Policy Exploration for Recommender Systems	Jie Wang et.al.	2501.13816	null
2025-01-23	Enhancing LLMs for Governance with Human Oversight: Evaluating and Aligning LLMs on Expert Classification of Climate Misinformation for Detecting False or Misleading Claims about Climate Change	Mowafak Allaham et.al.	2501.13802	null
2025-01-23	PromptMono: Cross Prompting Attention for Self-Supervised Monocular Depth Estimation in Challenging Environments	Changhao Wang et.al.	2501.13796	null
2025-01-23	Training-Free Zero-Shot Temporal Action Detection with Vision-Language Models	Chaolei Han et.al.	2501.13795	null
2025-01-23	Parameter-Efficient Fine-Tuning for Foundation Models	Dan Zhang et.al.	2501.13787	link
2025-01-23	Not Every AI Problem is a Data Problem: We Should Be Intentional About Data Scaling	Tanya Rodchenko et.al.	2501.13779	null
2025-01-23	Explainable XR: Understanding User Behaviors of XR Environments using LLM-assisted Analytics Framework	Yoonsang Kim et.al.	2501.13778	link
2025-01-22	VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding	Boqiang Zhang et.al.	2501.13106	link
2025-01-22	Refining Input Guardrails: Enhancing LLM-as-a-Judge Efficiency Through Chain-of-Thought Fine-Tuning and Alignment	Melissa Kazemi Rad et.al.	2501.13080	null
2025-01-22	Autonomy-of-Experts Models	Ang Lv et.al.	2501.13074	null
2025-01-22	Does Table Source Matter? Benchmarking and Improving Multimodal Scientific Table Understanding and Reasoning	Bohao Yang et.al.	2501.13042	link
2025-01-22	Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament	Yantao Liu et.al.	2501.13007	link
2025-01-22	Large Language Model-Based Semantic Communication System for Image Transmission	Soheyb Ribouh et.al.	2501.12988	null
2025-01-22	LLM4WM: Adapting LLM for Wireless Multi-Tasking	Xuanyu Liu et.al.	2501.12983	null
2025-01-22	OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models	Chongren Sun et.al.	2501.12975	link
2025-01-22	Accessible Smart Contracts Verification: Synthesizing Formal Models with Tamed LLMs	Jan Corazza et.al.	2501.12972	null
2025-01-22	It's complicated. The relationship of algorithmic fairness and non-discrimination regulations in the EU AI Act	Kristof Meding et.al.	2501.12962	null
2025-01-22	Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference	Weizhi Fei et.al.	2501.12959	null
2025-01-22	GANQ: GPU-Adaptive Non-Uniform Quantization for Large Language Models	Pengxiang Zhao et.al.	2501.12956	null
2025-01-22	Correctness Assessment of Code Generated by Large Language Models Using Internal Representations	Tuan-Dung Bui et.al.	2501.12934	null
2025-01-22	DynamicEarth: How Far are We from Open-Vocabulary Change Detection?	Kaiyu Li et.al.	2501.12931	null
2025-01-22	A Functional Software Reference Architecture for LLM-Integrated Systems	Alessio Bucaioni et.al.	2501.12904	null
2025-01-22	Architectural Fusion Through Contextual Partitioning in Large Language Models: A Novel Approach to Parameterized Knowledge Integration	Offa Kingsleigh et.al.	2501.12901	null
2025-01-22	Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback	Yafu Li et.al.	2501.12895	link
2025-01-22	Generative AI Misuse Potential in Cyber Security Education: A Case Study of a UK Degree Program	Carlton Shepherd et.al.	2501.12883	null
2025-01-22	WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge	Jingyuan Chen et.al.	2501.12877	null
2025-01-22	HierPromptLM: A Pure PLM-based Framework for Representation Learning on Heterogeneous Text-rich Networks	Qiuyu Zhu et.al.	2501.12857	null
2025-01-21	InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling	Yi Wang et.al.	2501.12386	link
2025-01-21	MMVU: Measuring Expert-Level Multi-Discipline Video Understanding	Yilun Zhao et.al.	2501.12380	link
2025-01-21	Expertise elevates AI usage: experimental evidence comparing laypeople and professional artists	Thomas F. Eisenmann et.al.	2501.12374	link
2025-01-21	Is Long Context All You Need? Leveraging LLM's Extended Context for NL2SQL	Yeounoh Chung et.al.	2501.12372	null
2025-01-21	Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models	Samira Abnar et.al.	2501.12370	null
2025-01-21	InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model	Yuhang Zang et.al.	2501.12368	link
2025-01-21	Vision-Language Models for Automated Chest X-ray Interpretation: Leveraging ViT and GPT-2	Md. Rakibul Islam et.al.	2501.12356	null
2025-01-21	Automatic Labelling with Open-source LLMs using Dynamic Label Schema Integration	Thomas Walshe et.al.	2501.12332	null
2025-01-21	Cinepro: Robust Training of Foundation Models for Cancer Detection in Prostate Ultrasound Cineloops	Mohamed Harmanani et.al.	2501.12331	link
2025-01-21	VARGPT: Unified Understanding and Generation in a Visual Autoregressive Multimodal Large Language Model	Xianwei Zhuang et.al.	2501.12327	link
2025-01-21	LLM-Assisted Knowledge Graph Completion for Curriculum and Domain Modelling in Personalized Higher Education Recommendations	Hasan Abu-Rasheed et.al.	2501.12300	null
2025-01-21	MoGERNN: An Inductive Traffic Predictor for Unobserved Locations in Dynamic Sensing Networks	Qishen Zhou et.al.	2501.12281	link
2025-01-21	Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement	Maosong Cao et.al.	2501.12273	link
2025-01-21	CBVLM: Training-free Explainable Concept-based Large Vision Language Models for Medical Image Classification	Cristiano Patrício et.al.	2501.12266	null
2025-01-21	FOCUS: First Order Concentrated Updating Scheme	Yizhou Liu et.al.	2501.12243	null
2025-01-21	InsTALL: Context-aware Instructional Task Assistance with Multi-modal Large Language Models	Pha Nguyen et.al.	2501.12231	null
2025-01-21	CDW-CoT: Clustered Distance-Weighted Chain-of-Thoughts Reasoning	Yuanheng Fang et.al.	2501.12226	null
2025-01-21	Leveraging Large Language Models for Realizing Truly Intelligent User Interfaces	Allard Oelen et.al.	2501.12221	null
2025-01-21	You Can't Eat Your Cake and Have It Too: The Performance Degradation of LLMs with Jailbreak Defense	Wuyuao Mai et.al.	2501.12210	null
2025-01-21	Fixing Imbalanced Attention to Mitigate In-Context Hallucination of Large Vision-Language Model	Kazi Hasan Ibn Arif et.al.	2501.12206	link
2025-01-17	FaceXBench: Evaluating Multimodal LLMs on Face Understanding	Kartik Narayan et.al.	2501.10360	link
2025-01-17	Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems	Weibo Gao et.al.	2501.10332	null
2025-01-17	BoK: Introducing Bag-of-Keywords Loss for Interpretable Dialogue Response Generation	Suvodip Dey et.al.	2501.10328	link
2025-01-17	Large language models for automated scholarly paper review: A survey	Zhenzhen Zhuang et.al.	2501.10326	null
2025-01-17	Hierarchical Autoregressive Transformers: Combining Byte-~and Word-Level Processing for Robust, Adaptable Language Models	Pit Neitemeier et.al.	2501.10322	null
2025-01-17	HiMix: Reducing Computational Complexity in Large Vision-Language Models	Xuange Zhang et.al.	2501.10318	null
2025-01-17	Addressing Popularity Bias in Third-Party Library Recommendations Using LLMs	Claudio Di Sipio et.al.	2501.10313	null
2025-01-17	Computational Protein Science in the Era of Large Language Models (LLMs)	Wenqi Fan et.al.	2501.10282	null
2025-01-17	Test Wars: A Comparative Study of SBST, Symbolic Execution, and LLM-Based Approaches to Unit Test Generation	Azat Abdullin et.al.	2501.10200	null
2025-01-17	Generative Artificial Intelligence: Implications for Biomedical and Health Professions Education	William Hersh et.al.	2501.10186	null
2025-01-17	Multi-stage Training of Bilingual Islamic LLM for Neural Passage Retrieval	Vera Pavlova et.al.	2501.10175	null
2025-01-17	Dual Debiasing: Remove Stereotypes and Keep Factual Gender for Fair Language Modeling and Translation	Tomasz Limisiewicz et.al.	2501.10150	null
2025-01-17	A Vision-Language Framework for Multispectral Scene Representation Using Language-Grounded Features	Enes Karanfil et.al.	2501.10144	null
2025-01-17	Exploring the Impact of Generative Artificial Intelligence in Education: A Thematic Analysis	Abhishek Kaushik et.al.	2501.10134	null
2025-01-17	ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario	Lucen Zhong et.al.	2501.10132	link
2025-01-17	PaSa: An LLM Agent for Comprehensive Academic Paper Search	Yichen He et.al.	2501.10120	link
2025-01-17	LLM Reasoner and Automated Planner: A new NPC approach	Israel Puerta-Merino et.al.	2501.10106	null
2025-01-17	Universal Actions for Enhanced Embodied Foundation Models	Jinliang Zheng et.al.	2501.10105	link
2025-01-17	Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks	Michael Schwingshackl et.al.	2501.10080	link
2025-01-17	SpatialCoT: Advancing Spatial Reasoning through Coordinate Alignment and Chain-of-Thought for Embodied Task Planning	Yuecheng Liu et.al.	2501.10074	null
2025-01-16	Distilling Multi-modal Large Language Models for Autonomous Driving	Deepti Hegde et.al.	2501.09757	null
2025-01-16	Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues	Youngjoon Jang et.al.	2501.09754	null
2025-01-16	OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking	Zekun Xi et.al.	2501.09751	null
2025-01-16	Enhancing Lexicon-Based Text Embeddings with Large Language Models	Yibin Lei et.al.	2501.09749	null
2025-01-16	Suggesting Code Edits in Interactive Machine Learning Notebooks Using Large Language Models	Bihui Jin et.al.	2501.09745	null
2025-01-16	Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps	Nanye Ma et.al.	2501.09732	null
2025-01-16	A Simple Aerial Detection Baseline of Multimodal Language Models	Qingyun Li et.al.	2501.09720	link
2025-01-16	CyberMentor: AI Powered Learning Tool Platform to Address Diverse Student Needs in Cybersecurity Education	Tianyu Wang et.al.	2501.09709	link
2025-01-16	Domain Adaptation of Foundation LLMs for e-Commerce	Christian Herold et.al.	2501.09706	null
2025-01-16	Cueless EEG imagined speech for subject identification: dataset and benchmarks	Ali Derakhshesh et.al.	2501.09700	link
2025-01-16	Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key	Zhihe Yang et.al.	2501.09695	link
2025-01-16	Simulated Interactive Debugging	Yannic Noller et.al.	2501.09694	null
2025-01-16	Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models	Fengli Xu et.al.	2501.09686	null
2025-01-16	Reward-Guided Controlled Generation for Inference-Time Alignment in Diffusion Models: Tutorial and Review	Masatoshi Uehara et.al.	2501.09685	null
2025-01-16	Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark	Alexis Roger et.al.	2501.09672	null
2025-01-16	A Survey of Research in Large Language Models for Electronic Design Automation	Jingyu Pan et.al.	2501.09655	null
2025-01-16	The Heap: A Contamination-Free Multilingual Code Dataset for Evaluating Large Language Models	Jonathan Katzy et.al.	2501.09653	null
2025-01-16	CarMem: Enhancing Long-Term Memory in LLM Voice Assistants through Category-Bounding	Johannes Kirmayr et.al.	2501.09645	link
2025-01-16	LLM-Based Routing in Mixture of Experts: A Novel Framework for Trading	Kuan-Ming Liu et.al.	2501.09636	null
2025-01-16	Empowering Large Language Models in Wireless Communication: A Novel Dataset and Fine-Tuning Framework	Yushen Lin et.al.	2501.09631	null
2025-01-15	Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians	Ishan Amin et.al.	2501.09009	link
2025-01-15	Aegis2.0: A Diverse AI Safety Dataset and Risks Taxonomy for Alignment of LLM Guardrails	Shaona Ghosh et.al.	2501.09004	null
2025-01-15	Vision Foundation Models for Computed Tomography	Suraj Pai et.al.	2501.09001	null
2025-01-15	CityLoc: 6 DoF Localization of Text Descriptions in Large-Scale Scenes with Gaussian Representation	Qi Ma et.al.	2501.08982	null
2025-01-15	Development and Validation of the Provider Documentation Summarization Quality Instrument for Large Language Models	Emma Croxford et.al.	2501.08977	null
2025-01-15	Learning to Extract Cross-Domain Aspects and Understanding Sentiments Using Large Language Models	Karukriti Kaushik Ghosh et.al.	2501.08974	null
2025-01-15	Analyzing the Ethical Logic of Six Large Language Models	W. Russell Neuman et.al.	2501.08951	null
2025-01-15	Applying General Turn-taking Models to Conversational Human-Robot Interaction	Gabriel Skantze et.al.	2501.08946	null
2025-01-15	Disentangling Exploration of Large Language Models by Optimal Exploitation	Tim Grams et.al.	2501.08925	null
2025-01-15	GenAI Content Detection Task 3: Cross-Domain Machine-Generated Text Detection Challenge	Liam Dugan et.al.	2501.08913	link
2025-01-15	Leveraging Large Language Models as Knowledge-Driven Agents for Reliable Retrosynthesis Planning	Qinyu Ma et.al.	2501.08897	link
2025-01-15	Generative Planning with 3D-vision Language Pre-training for End-to-End Autonomous Driving	Tengpeng Li et.al.	2501.08861	null
2025-01-15	Exploring Task-Level Optimal Prompts for Visual In-Context Learning	Yan Zhu et.al.	2501.08841	null
2025-01-15	IDEA: Image Description Enhanced CLIP-Adapter	Zhipeng Ye et.al.	2501.08816	link
2025-01-15	How Developers Interact with AI: A Taxonomy of Human-AI Collaboration in Software Engineering	Christoph Treude et.al.	2501.08774	null
2025-01-15	Admitting Ignorance Helps the Video Question Answering Models to Answer	Haopeng Li et.al.	2501.08771	null
2025-01-15	Enhanced Large Language Models for Effective Screening of Depression and Anxiety	June M. Liu et.al.	2501.08769	null
2025-01-15	Leveraging LLM Agents for Translating Network Configurations	Yunze Wei et.al.	2501.08760	null
2025-01-15	Expanding Vietnamese SentiWordNet to Improve Performance of Vietnamese Sentiment Analysis Models	Hong-Viet Tran et.al.	2501.08758	null
2025-01-15	The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities	Irina Bigoulaeva et.al.	2501.08716	link
2025-01-14	PokerBench: Training Large Language Models to become Professional Poker Players	Richard Zhuang et.al.	2501.08328	link
2025-01-14	Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks	Miran Heo et.al.	2501.08326	null
2025-01-14	ADAM-1: AI and Bioinformatics for Alzheimer's Detection and Microbiome-Clinical Data Integrations	Ziyuan Huang et.al.	2501.08324	null
2025-01-14	Exploring Robustness of Multilingual LLMs on Real-World Noisy Data	Amirhossein Aliakbarzadeh et.al.	2501.08322	link
2025-01-14	Enhancing Automated Interpretability with Output-Centric Feature Descriptions	Yoav Gur-Arieh et.al.	2501.08319	link
2025-01-14	MiniMax-01: Scaling Foundation Models with Lightning Attention	MiniMax et.al.	2501.08313	null
2025-01-14	HALoGEN: Fantastic LLM Hallucinations and Where to Find Them	Abhilasha Ravichander et.al.	2501.08292	null
2025-01-14	LLaVA-ST: A Multimodal Large Language Model for Fine-Grained Spatial-Temporal Understanding	Hongyu Li et.al.	2501.08282	link
2025-01-14	Exploring Robustness of LLMs to Sociodemographically-Conditioned Paraphrasing	Pulkit Arora et.al.	2501.08276	null
2025-01-14	Addressing the sustainable AI trilemma: a case study on LLM agents and RAG	Hui Wu et.al.	2501.08262	null
2025-01-14	Eliciting In-context Retrieval and Reasoning for Long-context Large Language Models	Yifu Qiu et.al.	2501.08248	null
2025-01-14	Text-Diffusion Red-Teaming of Large Language Models: Unveiling Harmful Behaviors with Proximity Constraints	Jonathan Nöther et.al.	2501.08246	null
2025-01-14	Investigating Energy Efficiency and Performance Trade-offs in LLM Inference Across Tasks and DVFS Settings	Paul Joe Maliakel et.al.	2501.08219	null
2025-01-14	ASTRID -- An Automated and Scalable TRIaD for the Evaluation of RAG-based Clinical Question Answering Systems	Mohita Chowdhury et.al.	2501.08208	null
2025-01-14	ArithmAttack: Evaluating Robustness of LLMs to Noisy Context in Math Problem Solving	Zain Ul Abedin et.al.	2501.08203	null
2025-01-14	CWEval: Outcome-driven Evaluation on Functionality and Security of LLM Code Generation	Jinjun Peng et.al.	2501.08200	link
2025-01-14	OpenCSG Chinese Corpus: A Series of High-quality Chinese Datasets for LLM Training	Yijiong Yu et.al.	2501.08197	link
2025-01-14	PRESERVE: Prefetching Model Weights and KV-Cache in Distributed LLM Serving	Ahmet Caner Yüzügüler et.al.	2501.08192	null
2025-01-14	A Critical Synthesis of Uncertainty Quantification and Foundation Models in Monocular Depth Estimation	Steven Landgraf et.al.	2501.08188	null
2025-01-14	A Multi-Modal AI Copilot for Single-Cell Analysis with Instruction Following	Yin Fang et.al.	2501.08187	link
2025-01-13	Training-Free Motion-Guided Video Generation with Enhanced Temporal Consistency Using Motion Consistency Loss	Xinyu Zhang et.al.	2501.07563	null
2025-01-13	SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing	Varun Biyyala et.al.	2501.07554	link
2025-01-13	Imagine while Reasoning in Space: Multimodal Visualization-of-Thought	Chengzu Li et.al.	2501.07542	null
2025-01-13	ML Mule: Mobile-Driven Context-Aware Collaborative Learning	Haoxiang Yu et.al.	2501.07536	null
2025-01-13	Investigating Large Language Models in Inferring Personality Traits from User Conversations	Jianfeng Zhu et.al.	2501.07532	null
2025-01-13	RadAlign: Advancing Radiology Report Generation with Vision-Language Concept Alignment	Difei Gu et.al.	2501.07525	link
2025-01-13	Parallel Key-Value Cache Fusion for Position Invariant RAG	Philhoon Oh et.al.	2501.07523	null
2025-01-13	Exploring and Mitigating Adversarial Manipulation of Voting-Based Leaderboards	Yangsibo Huang et.al.	2501.07493	null
2025-01-13	TiEBe: A Benchmark for Assessing the Current Knowledge of Large Language Models	Thales Sales Almeida et.al.	2501.07482	null
2025-01-13	A Survey of Embodied AI in Healthcare: Techniques, Applications, and Opportunities	Yihao Liu et.al.	2501.07468	null
2025-01-13	Understanding and Benchmarking Artificial Intelligence: OpenAI's o3 Is Not AGI	Rolf Pfister et.al.	2501.07458	null
2025-01-13	Enhancing LLM's Ability to Generate More Repository-Aware Unit Tests Through Precise Contextual Information Injection	Xin Yin et.al.	2501.07425	null
2025-01-13	Initial Findings on Sensor based Open Vocabulary Activity Recognition via Text Embedding Inversion	Lala Shakti Swarup Ray et.al.	2501.07408	null
2025-01-13	Zero-Shot Scene Understanding for Automatic Target Recognition Using Large Vision-Language Models	Yasiru Ranasinghe et.al.	2501.07396	null
2025-01-13	Enhancing Retrieval-Augmented Generation: A Study of Best Practices	Siran Li et.al.	2501.07391	link
2025-01-13	Extracting Participation in Collective Action from Social Media	Arianna Pera et.al.	2501.07368	null
2025-01-13	Emergent effects of scaling on the functional hierarchies within large language models	Paul C. Bogdan et.al.	2501.07359	null
2025-01-13	Evaluating Pre-Trained Models for Multi-Language Vulnerability Patching	Zanis Ali Khan et.al.	2501.07339	null
2025-01-13	TempoGPT: Enhancing Temporal Reasoning via Quantizing Embedding	Haochuan Zhang et.al.	2501.07335	null
2025-01-13	Foundation Models at Work: Fine-Tuning for Fairness in Algorithmic Hiring	Buse Sibel Korkmaz et.al.	2501.07324	link
2025-01-10	LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs	Omkar Thawakar et.al.	2501.06186	link
2025-01-10	PEACE: Empowering Geologic Map Holistic Understanding with MLLMs	Yangyu Huang et.al.	2501.06184	null
2025-01-10	VideoAuteur: Towards Long Narrative Video Generation	Junfei Xiao et.al.	2501.06173	null
2025-01-10	Multilingual Performance of a Multimodal Artificial Intelligence System on Multisubject Physics Concept Inventories	Gerd Kortemeyer et.al.	2501.06143	null
2025-01-10	Supervision policies can shape long-term risk management in general-purpose AI models	Manuel Cebrian et.al.	2501.06137	link
2025-01-10	CoDriveVLM: VLM-Enhanced Urban Cooperative Dispatching and Motion Planning for Future Autonomous Mobility on Demand Systems	Haichao Liu et.al.	2501.06132	link
2025-01-10	Contextual ASR Error Handling with LLMs Augmentation for Goal-Oriented Conversational AI	Yuya Asano et.al.	2501.06129	null
2025-01-10	Merging Feed-Forward Sublayers for Compressed Transformers	Neha Verma et.al.	2501.06126	link
2025-01-10	Fleurs-SLU: A Massively Multilingual Benchmark for Spoken Language Understanding	Fabian David Schmidt et.al.	2501.06117	link
2025-01-10	From Conversation to Automation: Leveraging Large Language Models to Analyze Strategies in Problem Solving Therapy	Elham Aghakhani et.al.	2501.06101	null
2025-01-10	Personalized Language Model Learning on Text Data Without User Identifiers	Yucheng Ding et.al.	2501.06062	link
2025-01-10	AI-powered virtual tissues from spatial proteomics for clinical diagnostics and biomedical discovery	Johann Wenckstern et.al.	2501.06039	link
2025-01-10	Generate, Transduct, Adapt: Iterative Transduction with VLMs	Oindrila Saha et.al.	2501.06031	null
2025-01-10	Addressing speaker gender bias in large scale speech translation systems	Shubham Bansal et.al.	2501.05989	null
2025-01-10	Comparing Self-Supervised Learning Models Pre-Trained on Human Speech and Animal Vocalizations for Bioacoustics Processing	Eklavya Sarkar et.al.	2501.05987	link
2025-01-10	Exploring LLMs for Automated Pre-Testing of Cross-Cultural Surveys	Divya Mani Adhikari et.al.	2501.05985	null
2025-01-10	Hermit Kingdom Through the Lens of Multiple Perspectives: A Case Study of LLM Hallucination on North Korea	Eunjung Cho et.al.	2501.05981	null
2025-01-10	Model Inversion in Split Learning for Personalized LLMs: New Insights from Information Bottleneck Theory	Yunmeng Shu et.al.	2501.05965	null
2025-01-10	Effective faking of verbal deception detection with target-aligned adversarial attacks	Bennett Kleinberg et.al.	2501.05962	null
2025-01-10	Scalable Vision Language Model Training via High Quality Data Curation	Hongyuan Dong et.al.	2501.05952	null
2025-01-09	An Empirical Study of Autoregressive Pre-training from Videos	Jathushan Rajasegaran et.al.	2501.05453	null
2025-01-09	ReFocus: Visual Editing as a Chain of Thought for Structured Image Understanding	Xingyu Fu et.al.	2501.05452	null
2025-01-09	Relative Pose Estimation through Affine Corrections of Monocular Depth Priors	Yifan Yu et.al.	2501.05446	link
2025-01-09	Can MLLMs Reason in Multimodality? EMMA: An Enhanced MultiModal ReAsoning Benchmark	Yunzhuo Hao et.al.	2501.05444	null
2025-01-09	A survey of textual cyber abuse detection using cutting-edge language models and large language models	Jose A. Diaz-Garcia et.al.	2501.05443	null
2025-01-09	Using LLMs to Infer Non-Binary COVID-19 Sentiments of Chinese Micro-bloggers	Jerry Chongyi Hu et.al.	2501.05423	null
2025-01-09	LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation	Xi Ye et.al.	2501.05414	null
2025-01-09	Seeing Sound: Assembling Sounds from Visuals for Audio-to-Image Generation	Darius Petermann et.al.	2501.05413	null
2025-01-09	A Novel Pathology Foundation Model by Mayo Clinic, Charité, and Aignostics	Maximilian Alber et.al.	2501.05409	null
2025-01-09	Mechanistic understanding and validation of large AI models with SemanticLens	Maximilian Dreyer et.al.	2501.05398	null
2025-01-09	FairCode: Evaluating Social Bias of LLMs in Code Generation	Yongkang Du et.al.	2501.05396	link
2025-01-09	Large Physics Models: Towards a collaborative approach with Large Language Models and Foundation Models	Kristian G. Barman et.al.	2501.05382	null
2025-01-09	Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance	Dimitrios Gerogiannis et.al.	2501.05379	null
2025-01-09	Accelerated Diffusion Models via Speculative Sampling	Valentin De Bortoli et.al.	2501.05370	null
2025-01-09	Stream Aligner: Efficient Sentence-Level Alignment via Distribution Induction	Hantao Lou et.al.	2501.05336	link
2025-01-09	"What's Happening"- A Human-centered Multimodal Interpreter Explaining the Actions of Autonomous Vehicles	Xuewen Luo et.al.	2501.05322	null
2025-01-09	Comparison Study: Glacier Calving Front Delineation in Synthetic Aperture Radar Images With Deep Learning	Nora Gourmelon et.al.	2501.05281	link
2025-01-09	CellViT++: Energy-Efficient and Adaptive Cell Segmentation and Classification Using Foundation Models	Fabian Hörst et.al.	2501.05269	link
2025-01-09	Enhancing Plagiarism Detection in Marathi with a Weighted Ensemble of TF-IDF and BERT Embeddings for Low-Resource Language Processing	Atharva Mutsaddi et.al.	2501.05260	link
2025-01-09	CallNavi: A Study and Challenge on Function Calling Routing and Invocation in Large Language Models	Yewei Song et.al.	2501.05255	null
2025-01-08	EditAR: Unified Conditional Generation with Autoregressive Models	Jiteng Mu et.al.	2501.04699	null
2025-01-08	Re-ranking the Context for Multimodal Retrieval Augmented Generation	Matin Mortaheb et.al.	2501.04695	null
2025-01-08	URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics	Ruilin Luo et.al.	2501.04686	link
2025-01-08	Enhancing Financial VQA in Vision Language Models using Intermediate Structured Representations	Archita Srivastava et.al.	2501.04675	null
2025-01-08	DRIVINGVQA: Analyzing Visual Chain-of-Thought Reasoning of Vision Language Models in Real-World Scenarios with Driving Theory Tests	Charles Corbière et.al.	2501.04671	null
2025-01-08	On The Origin of Cultural Biases in Language Models: From Pre-training Data to Linguistic Phenomena	Tarek Naous et.al.	2501.04662	null
2025-01-08	Assessing Language Comprehension in Large Language Models Using Construction Grammar	Wesley Scivetti et.al.	2501.04661	null
2025-01-08	Multi-task retriever fine-tuning for domain-specific and efficient RAG	Patrice Béchard et.al.	2501.04652	null
2025-01-08	FlairGPT: Repurposing LLMs for Interior Designs	Gabrielle Littlefair et.al.	2501.04648	null
2025-01-08	A Statistical Theory of Contrastive Pre-training and Multimodal Generative AI	Kazusato Oko et.al.	2501.04641	link
2025-01-08	Knowledge Retrieval Based on Generative AI	Te-Lun Yang et.al.	2501.04635	null
2025-01-08	"Can you be my mum?": Manipulating Social Robots in the Large Language Models Era	Giulio Antonio Abbo et.al.	2501.04633	null
2025-01-08	MedCoDi-M: A Multi-Prompt Foundation Model for Multimodal Medical Data Generation	Daniele Molino et.al.	2501.04614	null
2025-01-08	Quantum-inspired Embeddings Projection and Similarity Metrics for Representation Learning	Ivan Kankeu et.al.	2501.04591	link
2025-01-08	Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models	Miaoyang He et.al.	2501.04582	null
2025-01-08	InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection	Yuhang Liu et.al.	2501.04575	link
2025-01-08	Supervision-free Vision-Language Alignment	Giorgio Giannone et.al.	2501.04568	null
2025-01-08	OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis	Run Luo et.al.	2501.04561	link
2025-01-08	The Impostor is Among Us: Can Large Language Models Capture the Complexity of Human Personas?	Christopher Lazik et.al.	2501.04543	null
2025-01-08	rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking	Xinyu Guan et.al.	2501.04519	null
2025-01-07	LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving	Lingdong Kong et.al.	2501.04005	null
2025-01-07	Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives	Shaoyuan Xie et.al.	2501.04003	link
2025-01-07	Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos	Haobo Yuan et.al.	2501.04001	link
2025-01-07	RAG-Check: Evaluating Multimodal Retrieval Augmented Generation Performance	Matin Mortaheb et.al.	2501.03995	null
2025-01-07	Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles	Yuxi Xia et.al.	2501.03991	null
2025-01-07	(De)-Indexing and the Right to be Forgotten	Salvatore Vilella et.al.	2501.03989	null
2025-01-07	VLM-driven Behavior Tree for Context-aware Task Planning	Naoki Wake et.al.	2501.03968	link
2025-01-07	Vision Language Models as Values Detectors	Giulio Antonio Abbo et.al.	2501.03957	null
2025-01-07	Localizing AI: Evaluating Open-Weight Language Models for Languages of Baltic States	Jurgita Kapočiūtė-Dzikienė et.al.	2501.03952	null
2025-01-07	Not all tokens are created equal: Perplexity Attention Weighted Networks for AI generated text detection	Pablo Miralles-González et.al.	2501.03940	null
2025-01-07	Visual question answering: from early developments to recent advances -- a survey	Ngoc Dung Huynh et.al.	2501.03939	null
2025-01-07	Exploring the Potential of Large Language Models in Public Transportation: San Antonio Case Study	Ramya Jonnala et.al.	2501.03904	null
2025-01-07	LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token	Shaolei Zhang et.al.	2501.03895	link
2025-01-07	AlphaPO -- Reward shape matters for LLM alignment	Aman Gupta et.al.	2501.03884	null
2025-01-07	CL3DOR: Contrastive Learning for 3D Large Multimodal Models via Odds Ratio on High-Resolution Point Clouds	Keonwoo Kim et.al.	2501.03879	null
2025-01-07	Improving Dialectal Slot and Intent Detection with Auxiliary Tasks: A Multi-Dialectal Bavarian Case Study	Xaver Maria Krückl et.al.	2501.03863	link
2025-01-07	Progressive Document-level Text Simplification via Large Language Models	Dengzhao Fang et.al.	2501.03857	null
2025-01-07	BabyLMs for isiXhosa: Data-Efficient Language Modelling in a Low-Resource Context	Alexis Matzopoulos et.al.	2501.03855	null
2025-01-07	OmniManip: Towards General Robotic Manipulation via Object-Centric Interaction Primitives as Spatial Constraints	Mingjie Pan et.al.	2501.03841	null
2025-01-07	MedFocusCLIP : Improving few shot classification in medical datasets using pixel wise attention	Aadya Arora et.al.	2501.03839	null
2025-01-06	BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning	Beichen Zhang et.al.	2501.03226	link
2025-01-06	Automated Generation of Challenging Multiple-Choice Questions for Vision Language Model Evaluation	Yuhui Zhang et.al.	2501.03225	link
2025-01-06	Leveraging Explainable AI for LLM Text Attribution: Differentiating Human-Written and Multiple LLMs-Generated Text	Ayat Najjar et.al.	2501.03212	null
2025-01-06	Detecting AI-Generated Text in Educational Content: Leveraging Machine Learning and Explainable AI for Academic Integrity	Ayat A. Najjar et.al.	2501.03203	null
2025-01-06	The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input	Alon Jacovi et.al.	2501.03200	null
2025-01-06	CLIX: Cross-Lingual Explanations of Idiomatic Expressions	Aaron Gluck et.al.	2501.03191	null
2025-01-06	Semantic Captioning: Benchmark Dataset and Graph-Aware Few-Shot In-Context Learning for SQL2Text	Ali Al-Lawati et.al.	2501.03166	link
2025-01-06	Segment Anything Model for Zero-shot Single Particle Tracking in Liquid Phase Transmission Electron Microscopy	Risha Goel et.al.	2501.03153	link
2025-01-06	Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches	Alhassan Mumuni et.al.	2501.03151	null
2025-01-06	VicSim: Enhancing Victim Simulation with Emotional and Linguistic Fidelity	Yerong Li et.al.	2501.03139	null
2025-01-06	PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models	Mingyang Song et.al.	2501.03124	link
2025-01-06	CAT: Content-Adaptive Image Tokenization	Junhong Shen et.al.	2501.03120	null
2025-01-06	LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases	Dylan Bouchard et.al.	2501.03112	link
2025-01-06	Sentiment-guided Commonsense-aware Response Generation for Mental Health Counseling	Aseem Srivastava et.al.	2501.03088	null
2025-01-06	Retrieval-Augmented TLAPS Proof Generation with Large Language Models	Yuhao Zhou et.al.	2501.03073	null
2025-01-06	Trust Modeling in Counseling Conversations: A Benchmark Study	Aseem Srivastava et.al.	2501.03064	null
2025-01-06	ChronoSense: Exploring Temporal Understanding in Large Language Models with Time Intervals of Events	Duygu Sezen Islakoglu et.al.	2501.03040	null
2025-01-06	Piano Transcription by Hierarchical Language Modeling with Pretrained Roll-based Encoders	Dichucheng Li et.al.	2501.03038	null
2025-01-06	Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning	Zhen Li et.al.	2501.03035	null
2025-01-06	CALM: Curiosity-Driven Auditing for Large Language Models	Xiang Zheng et.al.	2501.02997	link
2025-01-03	VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction	Chaoyou Fu et.al.	2501.01957	link
2025-01-03	Metadata Conditioning Accelerates Language Model Pre-training	Tianyu Gao et.al.	2501.01956	link
2025-01-03	Cold-Start Recommendation towards the Era of Large Language Models (LLMs): A Comprehensive Survey and Roadmap	Weizhi Zhang et.al.	2501.01945	link
2025-01-03	Abstractive Text Summarization for Contemporary Sanskrit Prose: Issues and Challenges	Shagun Sinha et.al.	2501.01933	null
2025-01-03	Bridging Classification and Segmentation in Osteosarcoma Assessment via Foundation and Discrete Diffusion Models	Manh Duong Nguyen et.al.	2501.01932	link
2025-01-03	Mitigating Hallucination for Large Vision Language Model by Inter-Modality Correlation Calibration Decoding	Jiaming Li et.al.	2501.01926	link
2025-01-03	Virgo: A Preliminary Exploration on Reproducing o1-like MLLM	Yifan Du et.al.	2501.01904	link
2025-01-03	QuArch: A Question-Answering Dataset for AI Agents in Computer Architecture	Shvetank Prakash et.al.	2501.01892	null
2025-01-03	Turning Logic Against Itself : Probing Model Defenses Through Contrastive Questions	Rachneet Sachdeva et.al.	2501.01872	link
2025-01-03	Multi-Agent Conversational Online Learning for Adaptive LLM Response Identification	Xiangxiang Dai et.al.	2501.01849	link
2025-01-03	MoColl: Agent-Based Specific and General Model Collaboration for Image Captioning	Pu Yang et.al.	2501.01834	null
2025-01-03	Time Series Language Model for Descriptive Caption Generation	Mohamed Trabelsi et.al.	2501.01832	null
2025-01-03	Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models	Yanjiang Liu et.al.	2501.01830	null
2025-01-03	SDPO: Segment-Level Direct Preference Optimization for Social Agents	Aobo Kong et.al.	2501.01821	link
2025-01-03	BERT4MIMO: A Foundation Model using BERT Architecture for Massive MIMO Channel State Information Prediction	Ferhat Ozgur Catak et.al.	2501.01802	link
2025-01-03	Reading Between the Lines: A dataset and a study on why some texts are tougher than others	Nouran Khallaf et.al.	2501.01796	link
2025-01-03	Creating Artificial Students that Never Existed: Leveraging Large Language Models and CTGANs for Synthetic Data Generation	Mohammad Khalil et.al.	2501.01793	link
2025-01-03	Efficient LLM Inference with Activation Checkpointing and Hybrid Caching	Sanghyeon Lee et.al.	2501.01792	null
2025-01-03	LogicAD: Explainable Anomaly Detection via VLM-based Text Feature Extraction	Er Jin et.al.	2501.01767	null
2025-01-03	SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation	Mingjie Li et.al.	2501.01765	null
2025-01-02	GPT4Scene: Understand 3D Scenes from Videos with Vision-Language Models	Zhangyang Qi et.al.	2501.01428	null
2025-01-02	Unifying Specialized Visual Encoders for Video Language Models	Jihoon Chung et.al.	2501.01426	link
2025-01-02	Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models	Jingfeng Yao et.al.	2501.01423	link
2025-01-02	OmniChat: Enhancing Spoken Dialogue Systems with Scalable Synthetic Data for Diverse Scenarios	Xize Cheng et.al.	2501.01384	null
2025-01-02	Training Medical Large Vision-Language Models with Abnormal-Aware Feedback	Yucheng Zhou et.al.	2501.01377	null
2025-01-02	ScarNet: A Novel Foundation Model for Automated Myocardial Scar Quantification from LGE in Cardiac MRI	Neda Tavakoli et.al.	2501.01372	link
2025-01-02	CLIP-UP: CLIP-Based Unanswerable Problem Detection for Visual Question Answering	Ben Vardi et.al.	2501.01371	null
2025-01-02	Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability	Dong Shu et.al.	2501.01346	null
2025-01-02	Aligning Large Language Models for Faithful Integrity Against Opposing Argument	Yong Zhao et.al.	2501.01336	link
2025-01-02	CySecBench: Generative AI-based CyberSecurity-focused Prompt Dataset for Benchmarking Large Language Models	Johan Wahréus et.al.	2501.01335	link
2025-01-02	Decoding Knowledge in Large Language Models: A Framework for Categorization and Comprehension	Yanbo Fang et.al.	2501.01332	null
2025-01-02	The Prompt Alchemist: Automated LLM-Tailored Prompt Optimization for Test Case Generation	Shuzheng Gao et.al.	2501.01329	null
2025-01-02	Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking	Xiaoxue Cheng et.al.	2501.01306	null
2025-01-02	Large Language Models for Mental Health Diagnostic Assessments: Exploring The Potential of Large Language Models for Assisting with Mental Health Diagnostic Assessments -- The Depression and Anxiety Case	Kaushik Roy et.al.	2501.01305	null
2025-01-02	NeutraSum: A Language Model can help a Balanced Media Diet by Neutralizing News Summaries	Xi Luo et.al.	2501.01284	null
2025-01-02	CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries	Shudong Liu et.al.	2501.01282	null
2025-01-02	Language Models for Code Optimization: Survey, Challenges and Future Directions	Jingzhi Gong et.al.	2501.01277	link
2025-01-02	Does a Large Language Model Really Speak in Human-Like Language?	Mose Park et.al.	2501.01273	null
2025-01-02	ProgCo: Program Helps Self-Correction of Large Language Models	Xiaoshuai Song et.al.	2501.01264	null
2025-01-02	CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings	Shanghaoran Quan et.al.	2501.01257	null
2024-12-30	Distributed Mixture-of-Agents for Edge Inference with Large Language Models	Purbesh Mitra et.al.	2412.21200	link
2024-12-31	HumanEval Pro and MBPP Pro: Evaluating Large Language Models on Self-invoking Code Generation	Zhaojian Yu et.al.	2412.21199	link
2024-12-30	Aviary: training language agents on challenging scientific tasks	Siddharth Narayanan et.al.	2412.21154	null
2024-12-30	Facilitating large language model Russian adaptation with Learned Embedding Propagation	Mikhail Tikhomirov et.al.	2412.21140	link
2024-12-30	Training Software Engineering Agents and Verifiers with SWE-Gym	Jiayi Pan et.al.	2412.21139	link
2024-12-30	Adaptive Batch Size Schedules for Distributed Training of Language Models with Data and Model Parallelism	Tim Tsz-Kit Lau et.al.	2412.21124	null
2024-12-30	ExpShield: Safeguarding Web Text from Unauthorized Crawling and Language Modeling Exploitation	Ruixuan Liu et.al.	2412.21123	null
2024-12-30	Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model	Yifei Huang et.al.	2412.21080	link
2024-12-30	Efficient Multi-Task Inferencing with a Shared Backbone and Lightweight Task-Specific Adapters for Automatic Scoring	Ehsan Latif et.al.	2412.21065	null
2024-12-30	Toward Intelligent and Secure Cloud: Large Language Model Empowered Proactive Defense	Yuyang Zhou et.al.	2412.21051	link
2024-12-30	Visual Style Prompt Learning Using Diffusion Models for Blind Face Restoration	Wanglong Lu et.al.	2412.21042	link
2024-12-30	TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization	Chia-Yu Hung et.al.	2412.21037	link
2024-12-30	GePBench: Evaluating Fundamental Geometric Perception for Multimodal Large Language Models	Shangyu Xing et.al.	2412.21036	null
2024-12-30	MapQaTor: A System for Efficient Annotation of Map Query Datasets	Mahir Labib Dihan et.al.	2412.21015	link
2024-12-31	Verbosity-Aware Rationale Reduction: Effective Reduction of Redundant Rationale via Principled Criteria	Joonwon Jang et.al.	2412.21006	null
2024-12-30	Plug-and-Play Training Framework for Preference Optimization	Jingyuan Ma et.al.	2412.20996	null
2024-12-30	KARPA: A Training-free Method of Adapting Knowledge Graph as References for Large Language Model's Reasoning Path Aggregation	Siyuan Fang et.al.	2412.20995	null
2024-12-30	Efficiently Serving LLM Reasoning Programs with Certaindex	Yichao Fu et.al.	2412.20993	null
2024-12-30	AlignAb: Pareto-Optimal Energy Alignment for Designing Nature-Like Antibodies	Yibo Wen et.al.	2412.20984	null
2024-12-30	UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI	Fangwei Zhong et.al.	2412.20977	null
2024-12-27	MVTamperBench: Evaluating Robustness of Vision-Language Models	Amit Agarwal et.al.	2412.19794	null
2024-12-27	InfAlign: Inference-aware language model alignment	Ananth Balashankar et.al.	2412.19792	null
2024-12-27	Enhancing Whisper's Accuracy and Speed for Indian Languages through Prompt-Tuning and Tokenization	Kumud Tripathi et.al.	2412.19785	null
2024-12-27	Can AI Help with Your Personal Finances?	Oudom Hean et.al.	2412.19784	null
2024-12-27	Fortran2CPP: Automating Fortran-to-C++ Migration using LLMs via Multi-Turn Dialogue and Dual-Agent Integration	Le Chen et.al.	2412.19770	link
2024-12-27	On dual-projectively equivalent connections associated to second order superintegrable systems	Andreas Vollmer et.al.	2412.19739	null
2024-12-27	Can Large Language Models Adapt to Other Agents In-Context?	Matthew Riemer et.al.	2412.19726	null
2024-12-27	OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis	Qiushi Sun et.al.	2412.19723	null
2024-12-27	Toward Adaptive Reasoning in Large Language Models with Thought Rollback	Sijia Chen et.al.	2412.19707	link
2024-12-27	A Large-scale Interpretable Multi-modality Benchmark for Facial Image Forgery Localization	Jingchun Lian et.al.	2412.19685	null
2024-12-27	Boosting Private Domain Understanding of Efficient MLLMs: A Tuning-free, Adaptive, Universal Prompt Optimization Framework	Jiang Liu et.al.	2412.19684	null
2024-12-27	CAD-GPT: Synthesising CAD Construction Sequence with Spatial Reasoning-Enhanced Multimodal LLMs	Siyu Wang et.al.	2412.19663	null
2024-12-27	Asymmetrical Reciprocity-based Federated Learning for Resolving Disparities in Medical Diagnosis	Jiaqi Wang et.al.	2412.19654	link
2024-12-27	FreStega: A Plug-and-Play Method for Boosting Imperceptibility and Capacity in Generative Linguistic Steganography for Real-World Scenarios	Kaiyi Pang et.al.	2412.19652	null
2024-12-27	Xmodel-2 Technical Report	Wang Qun et.al.	2412.19638	null
2024-12-27	IMTP: Search-based Code Generation for In-memory Tensor Programs	Yongwon Shin et.al.	2412.19630	null
2024-12-27	Signatures of prediction during natural listening in MEG data?	Sahel Azizpour et.al.	2412.19622	null
2024-12-27	Gradient Weight-normalized Low-rank Projection for Efficient LLM Training	Jia-Hong Huang et.al.	2412.19616	link
2024-12-27	Let Watermarks Speak: A Robust and Unforgeable Watermark for Language Models	Minhao Bai et.al.	2412.19603	null
2024-12-27	SocRATES: Towards Automated Scenario-based Testing of Social Navigation Algorithms	Shashank Rao Marpally et.al.	2412.19595	null
2024-12-24	Video-Panda: Parameter-efficient Alignment for Encoder-free Video-Language Models	Jinhui Yi et.al.	2412.18609	link
2024-12-24	Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models	Zehan Wang et.al.	2412.18605	link
2024-12-24	Explaining in Diffusion: Explaining a Classifier Through Hierarchical Semantics with Text-to-Image Diffusion Models	Tahira Kazimi et.al.	2412.18604	null
2024-12-24	Long-Form Speech Generation with Spoken Language Models	Se Jin Park et.al.	2412.18603	link
2024-12-24	Decentralized Intelligence in GameFi: Embodied AI Agents and the Convergence of DeFi and Virtual Ecosystems	Fernando Jia et.al.	2412.18601	link
2024-12-24	A Paragraph is All It Takes: Rich Robot Behaviors from Interacting, Trusted LLMs	OpenMind et.al.	2412.18588	null
2024-12-24	Exploring Embedding Priors in Prompt-Tuning for Improved Interpretability and Control	Sergey Sedov et.al.	2412.18582	null
2024-12-24	Zero-resource Speech Translation and Recognition with LLMs	Karel Mundnich et.al.	2412.18566	null
2024-12-24	Distilling Fine-grained Sentiment Understanding from Large Language Models	Yice Zhang et.al.	2412.18552	link
2024-12-24	Token-Budget-Aware LLM Reasoning	Tingxu Han et.al.	2412.18547	link
2024-12-24	Consistency Checks for Language Model Forecasters	Daniel Paleka et.al.	2412.18544	null
2024-12-24	PLD-Tree: Persistent Laplacian Decision Tree for Protein-Protein Binding Free Energy Prediction	Xingjian Xu et.al.	2412.18541	null
2024-12-24	Harnessing Large Language Models for Knowledge Graph Question Answering via Adaptive Multi-Aspect Retrieval-Augmentation	Derong Xu Xinhang Li et.al.	2412.18537	link
2024-12-24	Automated Code Review In Practice	Umut Cihan et.al.	2412.18531	null
2024-12-24	The Key of Understanding Vision Tasks: Explanatory Instructions	Yang Shen et.al.	2412.18525	link
2024-12-24	Large Language Model guided Deep Reinforcement Learning for Decision Making in Autonomous Driving	Hao Pang et.al.	2412.18511	null
2024-12-24	Think or Remember? Detecting and Directing LLMs Towards Memorization or Generalization	Yi-Fu Fu et.al.	2412.18497	null
2024-12-24	Generating event descriptions under syntactic and semantic constraints	Angela Cao et.al.	2412.18496	link
2024-12-24	Segment-Based Attention Masking for GPTs	Shahar Katz et.al.	2412.18487	link
2024-12-24	3DGraphLLM: Combining Semantic Graphs and Large Language Models for 3D Scene Understanding	Tatiana Zemskova et.al.	2412.18450	link
2024-12-23	ChatGarment: Garment Estimation, Generation and Editing via Large Language Models	Siyuan Bian et.al.	2412.17811	null
2024-12-23	Reconstructing People, Places, and Cameras	Lea Müller et.al.	2412.17806	null
2024-12-23	Examining Imbalance Effects on Performance and Demographic Fairness of Clinical Language Models	Precious Jones et.al.	2412.17803	null
2024-12-23	Comprehensive Multi-Modal Prototypes are Simple and Effective Classifiers for Vast-Vocabulary Object Detection	Yitong Chen et.al.	2412.17800	link
2024-12-23	Automating the Search for Artificial Life with Foundation Models	Akarsh Kumar et.al.	2412.17799	link
2024-12-23	Memory makes computation universal, remember?	Erik Garrison et.al.	2412.17794	null
2024-12-23	Cross-Lingual Text-Rich Visual Comprehension: An Information Theory Perspective	Xinmiao Yu et.al.	2412.17787	null
2024-12-23	PepTune: De Novo Generation of Therapeutic Peptides with Multi-Objective-Guided Discrete Diffusion	Sophia Tang et.al.	2412.17780	null
2024-12-23	ResearchTown: Simulator of Human Research Community	Haofei Yu et.al.	2412.17767	link
2024-12-23	Survey of Large Multimodal Model Datasets, Application Categories and Taxonomy	Priyaranjan Pattnayak et.al.	2412.17759	null
2024-12-23	ADC: Enhancing Function Calling Via Adversarial Datasets and Code Line-Level Feedback	Wei Zhang et.al.	2412.17754	null
2024-12-23	Deliberation in Latent Space via Differentiable Cache Augmentation	Luyang Liu et.al.	2412.17747	null
2024-12-23	YuLan-Mini: An Open Data-efficient Language Model	Yiwen Hu et.al.	2412.17743	link
2024-12-23	Reasoning to Attend: Try to Understand How Token Works	Rui Qian et.al.	2412.17741	link
2024-12-23	Fourier Position Embedding: Enhancing Attention's Periodic Extension for Length Generalization	Ermo Hua et.al.	2412.17739	link
2024-12-23	Knowledge Editing through Chain-of-Thought	Changyue Wang et.al.	2412.17727	link
2024-12-23	From Models to Microtheories: Distilling a Model's Topical Knowledge for Grounded Question Answering	Nathaniel Weir et.al.	2412.17701	link
2024-12-23	Understanding the Logic of Direct Preference Alignment through Logic	Kyle Richardson et.al.	2412.17696	null
2024-12-23	FedTLU: Federated Learning with Targeted Layer Updates	Jong-Ik Park et.al.	2412.17692	null
2024-12-23	Large Language Model Safety: A Holistic Survey	Dan Shi et.al.	2412.17686	link
2024-12-20	HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding	Chenxin Tao et.al.	2412.16158	null
2024-12-20	Frequency Is What You Need: Word-frequency Masking Benefits Vision-Language Model Pre-training	Mingliang Liang et.al.	2412.16148	link
2024-12-20	Offline Reinforcement Learning for LLM Multi-Step Reasoning	Huaijie Wang et.al.	2412.16145	link
2024-12-20	Can LLMs Obfuscate Code? A Systematic Analysis of Large Language Models into Assembly Code Obfuscation	Seyedreza Mohseni et.al.	2412.16135	null
2024-12-20	Data-Driven Mechanism Design: Jointly Eliciting Preferences and Information	Dirk Bergemann et.al.	2412.16132	null
2024-12-20	PromptOptMe: Error-Aware Prompt Compression for LLM-based MT Evaluation Metrics	Daniil Larionov et.al.	2412.16120	null
2024-12-20	Deciphering the Underserved: Benchmarking LLM OCR for Low-Resource Scripts	Muhammad Abdullah Sohail et.al.	2412.16119	link
2024-12-20	PruneVid: Visual Token Pruning for Efficient Video Large Language Models	Xiaohu Huang et.al.	2412.16117	link
2024-12-20	The Content Moderator's Dilemma: Removal of Toxic Content and Distortions to Online Discourse	Mahyar Habibi et.al.	2412.16114	null
2024-12-20	Demystifying the Potential of ChatGPT-4 Vision for Construction Progress Monitoring	Ahmet Bahaddin Ersoz et.al.	2412.16108	null
2024-12-20	Interleaved Speech-Text Language Models are Simple Streaming Text to Speech Synthesizers	Yifan Yang et.al.	2412.16102	null
2024-12-20	Logical Consistency of Large Language Models in Fact-checking	Bishwamittra Ghosh et.al.	2412.16100	null
2024-12-20	The Evolution of LLM Adoption in Industry Data Curation Practices	Crystal Qian et.al.	2412.16089	null
2024-12-20	Efficient MedSAMs: Segment Anything in Medical Images on Laptop	Jun Ma et.al.	2412.16085	link
2024-12-20	Formal Mathematical Reasoning: A New Frontier in AI	Kaiyu Yang et.al.	2412.16075	null
2024-12-20	A Framework for Streaming Event-Log Prediction in Business Processes	Benedikt Bollig et.al.	2412.16032	null
2024-12-20	The Only Way is Ethics: A Guide to Ethical Research with Large Language Models	Eddie L. Ungless et.al.	2412.16022	link
2024-12-20	Fearful Falcons and Angry Llamas: Emotion Category Annotations of Arguments by Humans and LLMs	Lynn Greschner et.al.	2412.15993	null
2024-12-20	BabyHGRN: Exploring RNNs for Sample-Efficient Training of Language Models	Patrick Haller et.al.	2412.15978	null
2024-12-20	Legommenders: A Comprehensive Content-Based Recommendation Library with LLM Support	Qijiong Liu et.al.	2412.15973	link
2024-12-19	PRIMA: Multi-Image Vision-Language Models for Reasoning Segmentation	Muntasir Wahed et.al.	2412.15209	null
2024-12-19	OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving	Shuo Xing et.al.	2412.15208	link
2024-12-19	AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving	Shuo Xing et.al.	2412.15206	link
2024-12-19	MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark	Qihao Zhao et.al.	2412.15194	link
2024-12-19	EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues	Sagar Soni et.al.	2412.15190	null
2024-12-19	LlamaFusion: Adapting Pretrained Language Models for Multimodal Generation	Weijia Shi et.al.	2412.15188	null
2024-12-19	Data for Mathematical Copilots: Better Ways of Presenting Proofs for Machine Learning	Simon Frieder et.al.	2412.15184	null
2024-12-19	STRAP: Robot Sub-Trajectory Retrieval for Augmented Policy Learning	Marius Memmel et.al.	2412.15182	null
2024-12-19	HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages	Aman Chaturvedi et.al.	2412.15178	null
2024-12-19	Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying	Federico Castagna et.al.	2412.15177	link
2024-12-19	Rethinking Uncertainty Estimation in Natural Language Generation	Lukas Aichberger et.al.	2412.15176	null
2024-12-19	Language Models as Continuous Self-Evolving Data Engineers	Peidong Wang et.al.	2412.15151	null
2024-12-19	Adaptive Pruning for Large Language Models with Structural Importance Awareness	Haotian Zheng et.al.	2412.15127	null
2024-12-19	Outcome-Refining Process Supervision for Code Generation	Zhuohao Yu et.al.	2412.15118	link
2024-12-19	Qwen2.5 Technical Report	Qwen et.al.	2412.15115	link
2024-12-19	Associative memory inspires improvements for in-context learning using a novel attention residual stream architecture	Thomas F Burns et.al.	2412.15113	link
2024-12-19	Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search	Lei Tan et.al.	2412.15106	null
2024-12-19	Review-Then-Refine: A Dynamic Framework for Multi-Hop Question Answering with Temporal Adaptability	Xiangsen Chen et.al.	2412.15101	null
2024-12-19	Nano-ESG: Extracting Corporate Sustainability Information from News Articles	Fabian Billert et.al.	2412.15093	link
2024-12-19	ScamChatBot: An End-to-End Analysis of Fake Account Recovery on Social Media via Chatbots	Bhupendra Acharya et.al.	2412.15072	null
2024-12-18	Thinking in Space: How Multimodal Large Language Models See, Remember, and Recall Spaces	Jihan Yang et.al.	2412.14171	link
2024-12-18	TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks	Frank F. Xu et.al.	2412.14161	link
2024-12-18	Advanced Reasoning and Transformation Engine for Multi-Step Insight Synthesis in Data Analytics with Large Language Models	Atin Sakkeer Hussain et.al.	2412.14146	null
2024-12-18	Incorporating Feature Pyramid Tokenization and Open Vocabulary Semantic Segmentation	Jianyu Zhang et.al.	2412.14145	null
2024-12-18	LLMs can realize combinatorial creativity: generating creative ideas via LLMs for scientific research	Tianyang Gu et.al.	2412.14141	null
2024-12-18	Design choices made by LLM-based test generators prevent them from finding bugs	Noble Saji Mathews et.al.	2412.14137	null
2024-12-18	Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models	Ido Cohen et.al.	2412.14133	link
2024-12-18	Foundation Models Meet Low-Cost Sensors: Test-Time Adaptation for Rescaling Disparity for Zero-Shot Metric Depth Estimation	Rémi Marsal et.al.	2412.14103	null
2024-12-18	Adaptive Concept Bottleneck for Foundation Models Under Distribution Shifts	Jihye Choi et.al.	2412.14097	null
2024-12-18	Alignment faking in large language models	Ryan Greenblatt et.al.	2412.14093	link
2024-12-18	Future Research Avenues for Artificial Intelligence in Digital Gaming: An Exploratory Report	Markus Dablander et.al.	2412.14085	null
2024-12-18	Rango: Adaptive Retrieval-Augmented Proving for Automated Software Verification	Kyle Thompson et.al.	2412.14063	link
2024-12-18	Understanding and Evaluating Trust in Generative AI and Large Language Models for Spreadsheets	Simon Thorne et.al.	2412.14062	null
2024-12-18	Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models	Xinghang Li et.al.	2412.14058	null
2024-12-18	A Review of Multimodal Explainable Artificial Intelligence: Past, Present and Future	Shilin Sun et.al.	2412.14056	link
2024-12-18	Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment	Kevin You et.al.	2412.14054	null
2024-12-18	Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation	Vera Neplenbroek et.al.	2412.14050	link
2024-12-18	CAD-Recode: Reverse Engineering CAD Code from Point Clouds	Danila Rukhovich et.al.	2412.14042	link
2024-12-18	Hansel: Output Length Controlling Framework for Large Language Models	Seoha Song et.al.	2412.14033	null
2024-12-18	Discovering maximally consistent distribution of causal tournaments with Large Language Models	Federico Baldo et.al.	2412.14019	null
2024-12-17	Proposer-Agent-Evaluator(PAE): Autonomous Skill Discovery For Foundation Model Internet Agents	Yifei Zhou et.al.	2412.13194	null
2024-12-17	GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding	Haoyi Jiang et.al.	2412.13193	link
2024-12-17	HandsOnVLM: Vision-Language Models for Hand-Object Interaction Prediction	Chen Bao et.al.	2412.13187	null
2024-12-17	Feather the Throttle: Revisiting Visual Token Pruning for Vision-Language Model Acceleration	Mark Endo et.al.	2412.13180	null
2024-12-17	SafeAgentBench: A Benchmark for Safe Task Planning of Embodied LLM Agents	Sheng Yin et.al.	2412.13178	link
2024-12-17	DnDScore: Decontextualization and Decomposition for Factuality Verification in Long-Form Text Generation	Miriam Wanner et.al.	2412.13175	null
2024-12-17	Locate n' Rotate: Two-stage Openable Part Detection with Foundation Model Priors	Siqi Li et.al.	2412.13173	link
2024-12-17	Compressed Chain of Thought: Efficient Reasoning Through Dense Representations	Jeffrey Cheng et.al.	2412.13171	null
2024-12-17	Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study	Bolei Ma et.al.	2412.13169	link
2024-12-17	C-FedRAG: A Confidential Federated Retrieval-Augmented Generation System	Parker Addison et.al.	2412.13163	null
2024-12-17	SWAN: Preprocessing SGD Enables Adam-Level Performance On LLM Training With Significant Memory Reduction	Chao Ma et.al.	2412.13148	null
2024-12-17	Are Your LLMs Capable of Stable Reasoning?	Junnan Liu et.al.	2412.13147	link
2024-12-17	A Knowledge-enhanced Pathology Vision-language Foundation Model for Cancer Diagnosis	Xiao Zhou et.al.	2412.13126	null
2024-12-17	AI PERSONA: Towards Life-long Personalization of LLMs	Tiannan Wang et.al.	2412.13103	null
2024-12-17	AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark	Jianlyu Chen et.al.	2412.13102	link
2024-12-17	Uchaguzi-2022: A Dataset of Citizen Reports on the 2022 Kenyan Election	Roberto Mondini et.al.	2412.13098	null
2024-12-17	LMUnit: Fine-grained Evaluation with Natural Language Unit Tests	Jon Saad-Falcon et.al.	2412.13091	null
2024-12-17	Taming Multi-Domain, -Fidelity Data: Towards Foundation Models for Atomistic Scale Simulations	Tomoya Shiota et.al.	2412.13088	link
2024-12-17	Modality-Inconsistent Continual Learning of Multimodal Large Language Models	Weiguo Pian et.al.	2412.13050	null
2024-12-17	Harnessing Event Sensory Data for Error Pattern Prediction in Vehicles: A Language Model Approach	Hugo Math et.al.	2412.13041	link
2024-12-16	SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator	Guoxuan Chen et.al.	2412.12094	link
2024-12-16	Instruction-based Image Manipulation by Watching How Things Move	Mingdeng Cao et.al.	2412.12087	null
2024-12-16	CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology	Yuxuan Sun et.al.	2412.12077	null
2024-12-16	CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding	Guo Chen et.al.	2412.12075	null
2024-12-16	Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats	Kuleen Sasse et.al.	2412.12072	link
2024-12-16	How Private are Language Models in Abstractive Summarization?	Anthony Hughes et.al.	2412.12040	null
2024-12-16	Can LLM Prompting Serve as a Proxy for Static Analysis in Vulnerability Detection	Ira Ceka et.al.	2412.12039	null
2024-12-16	FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning	Gaojian Wang et.al.	2412.12032	link
2024-12-16	SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval	Yueqian Lin et.al.	2412.12009	null
2024-12-16	The Open Source Advantage in Large Language Models (LLMs)	Jiya Manchanda et.al.	2412.12004	null
2024-12-16	LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts	Zhuhao Wang et.al.	2412.12001	link
2024-12-16	SAMIC: Segment Anything with In-Context Spatial Prompt Engineering	Savinay Nagendra et.al.	2412.11998	null
2024-12-16	Combining Large Language Models with Tutoring System Intelligence: A Case Study in Caregiver Homework Support	Devika Venugopalan et.al.	2412.11995	link
2024-12-16	ExecRepoBench: Multi-level Executable Code Completion Evaluation	Jian Yang et.al.	2412.11990	null
2024-12-16	SciFaultyQA: Benchmarking LLMs on Faulty Science Question Detection with a GAN-Inspired Approach to Synthetic Dataset Generation	Debarshi Kundu et.al.	2412.11988	link
2024-12-16	Cost-Effective Label-free Node Classification with LLMs	Taiyan Zhang et.al.	2412.11983	null
2024-12-16	AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws	Oren Neumann et.al.	2412.11979	link
2024-12-16	Speech Foundation Models and Crowdsourcing for Efficient, High-Quality Data Collection	Beomseok Lee et.al.	2412.11978	null
2024-12-16	Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning	Qi Sun et.al.	2412.11974	link
2024-12-16	DARWIN 1.5: Large Language Models as Materials Science Adapted Learners	Tong Xie et.al.	2412.11970	link
2024-12-13	UniMed-CLIP: Towards a Unified Image-Text Pretraining Paradigm for Diverse Medical Imaging Modalities	Muhammad Uzair Khattak et.al.	2412.10372	link
2024-12-13	A Grounded Typology of Word Classes	Coleman Haley et.al.	2412.10369	null
2024-12-13	Robust image classification with multi-modal large language models	Francesco Villani et.al.	2412.10353	null
2024-12-13	Towards a foundation model for heavy-ion collision experiments through point cloud diffusion	Manjunath Omana Kuttan et.al.	2412.10352	null
2024-12-13	A dual contrastive framework	Yuan Sun et.al.	2412.10348	null
2024-12-13	COMET: Benchmark for Comprehensive Biological Multi-omics Evaluation Tasks and Language Models	Yuchen Ren et.al.	2412.10347	null
2024-12-13	Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining	Zhiqi Ge et.al.	2412.10342	null
2024-12-13	AdvPrefix: An Objective for Nuanced LLM Jailbreaks	Sicheng Zhu et.al.	2412.10321	link
2024-12-13	BrushEdit: All-In-One Image Inpainting and Editing	Yaowei Li et.al.	2412.10316	null
2024-12-13	DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding	Zhiyu Wu et.al.	2412.10302	link
2024-12-13	Still "Talking About Large Language Models": Some Clarifications	Murray Shanahan et.al.	2412.10291	null
2024-12-13	One world, one opinion? The superstar effect in LLM responses	Sofie Goethals et.al.	2412.10281	null
2024-12-13	Benchmarking Linguistic Diversity of Large Language Models	Yanzhu Guo et.al.	2412.10271	link
2024-12-13	Cultural Evolution of Cooperation among LLM Agents	Aron Vallinder et.al.	2412.10270	null
2024-12-13	Does Multiple Choice Have a Future in the Age of Generative AI? A Posttest-only RCT	Danielle R. Thomas et.al.	2412.10267	link
2024-12-13	Reasoner Outperforms: Generative Stance Detection with Rationalization for Social Media	Jiaqing Yuan et.al.	2412.10266	null
2024-12-13	Targeted Angular Reversal of Weights (TARS) for Knowledge Removal in Large Language Models	Harry J. Davies et.al.	2412.10257	null
2024-12-13	Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Unanswerable Questions and Ambiguous Prompts	Hazel Kim et.al.	2412.10246	null
2024-12-13	Efficient Continual Pre-training of LLMs for Low-resource Languages	Arijit Nag et.al.	2412.10244	null
2024-12-13	Retrieval-Augmented Semantic Parsing: Using Large Language Models to Improve Generalization	Xiao Zhang et.al.	2412.10207	null
2024-12-12	EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM	Zhuofan Zong et.al.	2412.09618	null
2024-12-12	V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding	Junqi Ge et.al.	2412.09616	link
2024-12-12	PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models	Chenyu Yang et.al.	2412.09613	null
2024-12-12	Olympus: A Universal Task Router for Computer Vision Tasks	Yuanze Lin et.al.	2412.09612	link
2024-12-12	Feat2GS: Probing Visual Foundation Models with Gaussian Splatting	Yue Chen et.al.	2412.09606	null
2024-12-12	AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials	Yiheng Xu et.al.	2412.09605	null
2024-12-12	SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding	Hao Li et.al.	2412.09604	null
2024-12-12	Do Multimodal Large Language Models See Like Humans?	Jiaying Lin et.al.	2412.09603	null
2024-12-12	Hidden Biases of End-to-End Driving Datasets	Julian Zimmerlin et.al.	2412.09602	link
2024-12-12	InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions	Pan Zhang et.al.	2412.09596	link
2024-12-12	OpenNER 1.0: Standardized Open-Access Named Entity Recognition Datasets in 50+ Languages	Chester Palen-Michel et.al.	2412.09587	null
2024-12-12	DiverseAgentEntropy: Quantifying Black-Box LLM Uncertainty through Diverse Perspectives and Multi-Agent Interaction	Yu Feng et.al.	2412.09572	null
2024-12-12	Does Representation Matter? Exploring Intermediate Layers in Large Language Models	Oscar Skean et.al.	2412.09563	null
2024-12-12	Foundational Large Language Models for Materials Research	Vaibhav Mishra et.al.	2412.09560	link
2024-12-12	Video Creation by Demonstration	Yihong Sun et.al.	2412.09551	null
2024-12-12	Exemplar Masking for Multimodal Incremental Learning	Yi-Lun Lee et.al.	2412.09549	link
2024-12-12	Capturing the Temporal Dependence of Training Data Influence	Jiachen T. Wang et.al.	2412.09538	null
2024-12-12	Dynamic-VLM: Simple Dynamic Visual Token Compression for VideoLLM	Han Wang et.al.	2412.09530	link
2024-12-12	Can Modern LLMs Act as Agent Cores in Radiology~Environments?	Qiaoyu Zheng et.al.	2412.09529	link
2024-12-12	Efficient and Comprehensive Feature Extraction in Large Vision-Language Model for Clinical Pathology Analysis	Shengxuming Zhang et.al.	2412.09521	null
2024-12-11	Generative Semantic Communication: Architectures, Technologies, and Applications	Jinke Ren et.al.	2412.08642	null
2024-12-11	Fast Prompt Alignment for Text-to-Image Generation	Khalil Mrini et.al.	2412.08639	link
2024-12-11	Multimodal Latent Language Modeling with Next-Token Diffusion	Yutao Sun et.al.	2412.08635	link
2024-12-11	Synthetic Vision: Training Vision-Language Models to Understand Physics	Vahid Balazadeh et.al.	2412.08619	null
2024-12-11	Exploiting the Index Gradients for Optimization-Based Jailbreaking on Large Language Models	Jiahui Li et.al.	2412.08615	link
2024-12-11	Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning	Fan Lu et.al.	2412.08614	link
2024-12-11	Competition and Diversity in Generative AI	Manish Raghavan et.al.	2412.08610	link
2024-12-11	AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models	Mintong Kang et.al.	2412.08608	null
2024-12-11	Preference Discerning with LLM-Enhanced Generative Retrieval	Fabian Paischer et.al.	2412.08604	null
2024-12-11	Empirical Measurements of AI Training Power Demand on a GPU-Accelerated Node	Imran Latif et.al.	2412.08602	null
2024-12-11	Leveraging Graph-RAG and Prompt Engineering to Enhance LLM-Based Automated Requirement Traceability and Compliance Checks	Arsalan Masoudifard et.al.	2412.08593	null
2024-12-11	Advancing Single- and Multi-task Text Classification through Large Language Model Fine-tuning	Hang Zhao et.al.	2412.08587	null
2024-12-11	TURBOATTENTION: Efficient Attention Approximation For High Throughputs LLMs	Hao Kang et.al.	2412.08585	null
2024-12-11	LAION-SG: An Enhanced Large-Scale Dataset for Training Complex Image-Text Models with Structural Annotations	Zejian Li et.al.	2412.08580	link
2024-12-11	Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning	Rongzhe Wei et.al.	2412.08559	null
2024-12-11	MaestroMotif: Skill Design from Artificial Intelligence Feedback	Martin Klissarov et.al.	2412.08542	null
2024-12-11	SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting	Pallavi Jain et.al.	2412.08536	link
2024-12-11	Continual Learning for Encoder-only Language Models via a Discrete Key-Value Bottleneck	Andor Diera et.al.	2412.08528	null
2024-12-11	EMS: Adaptive Evict-then-Merge Strategy for Head-wise KV Cache Compression Based on Global-Local Importance	Yingxin Li et.al.	2412.08521	null
2024-12-11	Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation	Pengyue Jia et.al.	2412.08519	null
2024-12-10	Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences	Alan Nawzad Amin et.al.	2412.07763	link
2024-12-10	SAT: Spatial Aptitude Training for Multimodal Language Models	Arijit Ray et.al.	2412.07755	null
2024-12-10	LoRA3D: Low-Rank Self-Calibration of 3D Geometric Foundation Models	Ziqi Lu et.al.	2412.07746	null
2024-12-10	Zero-Shot ATC Coding with Large Language Models for Clinical Assessments	Zijian Chen et.al.	2412.07743	null
2024-12-10	AI Expands Scientists' Impact but Contracts Science's Focus	Qianyue Hao et.al.	2412.07727	link
2024-12-10	Granite Guardian	Inkit Padhi et.al.	2412.07724	link
2024-12-10	Leveraging Content and Context Cues for Low-Light Image Enhancement	Igor Morawski et.al.	2412.07693	link
2024-12-10	DriveMM: All-in-One Large Multimodal Model for Autonomous Driving	Zhijian Huang et.al.	2412.07689	link
2024-12-10	Privacy-Preserving Customer Support: A Framework for Secure and Scalable Interactions	Anant Prakash Awasthi et.al.	2412.07687	null
2024-12-10	TRIM: Token Reduction and Inference Modeling for Cost-Effective Language Generation	Alfredo Garrachón Ruiz et.al.	2412.07682	null
2024-12-10	RADIO Amplified: Improved Baselines for Agglomerative Vision Foundation Models	Greg Heinrich et.al.	2412.07679	link
2024-12-10	Ask Humans or AI? Exploring Their Roles in Visualization Troubleshooting	Shuyu Shen et.al.	2412.07673	link
2024-12-10	FlexLLM: Exploring LLM Customization for Moving Target Defense on Black-Box LLMs Against Jailbreak Attacks	Bocheng Chen et.al.	2412.07672	null
2024-12-10	Automating Business Intelligence Requirements with Generative AI and Semantic Search	Nimrod Busany et.al.	2412.07668	null
2024-12-10	Searching for Structure: Investigating Emergent Communication with Large Language Models	Tom Kouwenhoven et.al.	2412.07646	null
2024-12-10	TrojanWhisper: Evaluating Pre-trained LLMs to Detect and Localize Hardware Trojans	Md Omar Faruque et.al.	2412.07636	null
2024-12-10	ChocoLlama: Lessons Learned From Teaching Llamas Dutch	Matthieu Meeus et.al.	2412.07633	null
2024-12-10	Piece of Table: A Divide-and-Conquer Approach for Selecting Sub-Tables in Table Question Answering	Wonjin Lee et.al.	2412.07629	null
2024-12-10	OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations	Linke Ouyang et.al.	2412.07626	link
2024-12-10	DRUM: Learning Demonstration Retriever for Large MUlti-modal Models	Ellen Yi-Ge et.al.	2412.07619	null
2024-12-09	Delve into Visual Contrastive Decoding for Hallucination Mitigation of Large Vision-Language Models	Yi-Lun Lee et.al.	2412.06775	link
2024-12-09	Visual Lexicon: Rich Image Features in Language Space	XuDong Wang et.al.	2412.06774	null
2024-12-09	Training Large Language Models to Reason in a Continuous Latent Space	Shibo Hao et.al.	2412.06769	null
2024-12-09	Ranking-aware adapter for text-driven image ordering with CLIP	Wei-Hsiang Yu et.al.	2412.06760	link
2024-12-09	Why Do Developers Engage with ChatGPT in Issue-Tracker? Investigating Usage and Reliance on ChatGPT-Generated Code	Joy Krishan Das et.al.	2412.06757	null
2024-12-09	Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models	Neel Jain et.al.	2412.06748	null
2024-12-09	ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities	Adhiraj Ghosh et.al.	2412.06745	null
2024-12-09	JAPAGEN: Efficient Few/Zero-shot Learning via Japanese Training Dataset Generation with LLM	Takuro Fujii et.al.	2412.06738	link
2024-12-09	AutoDCWorkflow: LLM-based Data Cleaning Workflow Auto-Generation and Benchmark	Lan Li et.al.	2412.06724	link
2024-12-09	How to Merge Your Multimodal Models Over Time?	Sebastian Dziadzio et.al.	2412.06712	link
2024-12-09	OmniEvalKit: A Modular, Lightweight Toolbox for Evaluating Large Language Model and its Omni-Extensions	Yi-Kai Zhang et.al.	2412.06693	null
2024-12-09	Exploring Critical Testing Scenarios for Decision-Making Policies: An LLM Approach	Weichao Xu et.al.	2412.06684	null
2024-12-09	Toward LLM-Agent-Based Modeling of Transportation Systems: A Conceptual Framework	Tianming Liu et.al.	2412.06681	null
2024-12-09	I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token	Roi Cohen et.al.	2412.06676	null
2024-12-09	ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance	Chunwei Wang et.al.	2412.06673	null
2024-12-09	MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models	Shansong Liu et.al.	2412.06660	link
2024-12-09	Chatbots im Schulunterricht: Wir testen das Fobizz-Tool zur automatischen Bewertung von Hausaufgaben	Rainer Mühlhoff et.al.	2412.06651	null
2024-12-09	The Narrow Gate: Localized Image-Text Communication in Vision-Language Models	Alessandro Serra et.al.	2412.06646	null
2024-12-09	MAVias: Mitigate any Visual Bias	Ioannis Sarridis et.al.	2412.06632	null
2024-12-09	Copyright-Protected Language Generation via Adaptive Model Fusion	Javier Abad et.al.	2412.06619	link
2024-12-06	Birth and Death of a Rose	Chen Geng et.al.	2412.05278	null
2024-12-06	Sparse autoencoders reveal selective remapping of visual concepts during adaptation	Hyesu Lim et.al.	2412.05276	link
2024-12-06	Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling	Zhe Chen et.al.	2412.05271	link
2024-12-06	APOLLO: SGD-like Memory, AdamW-level Performance	Hanqing Zhu et.al.	2412.05270	link
2024-12-06	Uncertainty Quantification for Transformer Models for Dark-Pattern Detection	Javier Muñoz et.al.	2412.05251	null
2024-12-06	Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization	Luca Masserano et.al.	2412.05244	null
2024-12-06	CompCap: Improving Multimodal Large Language Models with Composite Captions	Xiaohui Chen et.al.	2412.05243	null
2024-12-06	MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale	Jarvis Guo et.al.	2412.05237	null
2024-12-06	BEExformer: A Fast Inferencing Transformer Architecture via Binarization with Multiple Early Exits	Wazib Ansar et.al.	2412.05225	null
2024-12-06	100% Hallucination Elimination Using Acurai	Michael C. Wood et.al.	2412.05223	link
2024-12-06	Evaluating and Aligning CodeLLMs on Human Preference	Jian Yang et.al.	2412.05210	null
2024-12-06	A Survey of Large Language Model-Based Generative AI for Text-to-SQL: Benchmarks, Applications, Use Cases, and Challenges	Aditi Singh et.al.	2412.05208	null
2024-12-06	Are Frontier Large Language Models Suitable for Q&A in Science Centres?	Jacob Watson et.al.	2412.05200	null
2024-12-06	SurgBox: Agent-Driven Operating Room Sandbox with Surgery Copilot	Jinlin Wu et.al.	2412.05187	link
2024-12-06	LinVT: Empower Your Image-level Large Language Model to Understand Videos	Lishuai Gao et.al.	2412.05185	link
2024-12-06	QueEn: A Large Language Model for Quechua-English Translation	Junhao Chen et.al.	2412.05184	null
2024-12-06	Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models	Kuofeng Gao et.al.	2412.05167	null
2024-12-06	Enhancing Cross-Language Code Translation via Task-Specific Embedding Alignment in Retrieval-Augmented Generation	Manish Bhattarai et.al.	2412.05159	null
2024-12-06	Multimodal Fact-Checking with Vision Language Models: A Probing Classifier based Solution with Embedding Strategies	Recep Firat Cekinel et.al.	2412.05155	link
2024-12-06	A text-to-tabular approach to generate synthetic patient data using LLMs	Margaux Tornqvist et.al.	2412.05153	link
2024-12-05	Stereo Anywhere: Robust Zero-Shot Deep Stereo Matching Even Where Either Stereo or Mono Fail	Luca Bartolomei et.al.	2412.04472	link
2024-12-05	NVILA: Efficient Frontier Visual Language Models	Zhijian Liu et.al.	2412.04468	null
2024-12-05	VisionZip: Longer is Better but Not Necessary in Vision Language Models	Senqiao Yang et.al.	2412.04467	link
2024-12-05	Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection	Enshen Zhou et.al.	2412.04455	null
2024-12-05	p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay	Jun Zhang et.al.	2412.04449	link
2024-12-05	EgoPlan-Bench2: A Benchmark for Multimodal Large Language Model Planning in Real-World Scenarios	Lu Qiu et.al.	2412.04447	null
2024-12-05	DiCoDe: Diffusion-Compressed Deep Tokens for Autoregressive Video Generation with Language Models	Yizhuo Li et.al.	2412.04446	null
2024-12-05	Moto: Latent Motion Token as the Bridging Language for Robot Manipulation	Yi Chen et.al.	2412.04445	null
2024-12-05	Towards Real-Time Open-Vocabulary Video Instance Segmentation	Bin Yan et.al.	2412.04434	null
2024-12-05	Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation	Yuying Ge et.al.	2412.04432	link
2024-12-05	Grounding Descriptions in Images informs Zero-Shot Visual Recognition	Shaunak Halbe et.al.	2412.04429	link
2024-12-05	Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion	Jiuhai Chen et.al.	2412.04424	link
2024-12-05	Targeting the Core: A Simple and Effective Method to Attack RAG-based Agents via Direct LLM Manipulation	Xuying Li et.al.	2412.04415	null
2024-12-05	Establishing Task Scaling Laws via Compute-Efficient Model Ladders	Akshita Bhagia et.al.	2412.04403	null
2024-12-05	SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding	Rong Li et.al.	2412.04383	null
2024-12-05	Discriminative Fine-tuning of LVLMs	Yassine Ouali et.al.	2412.04378	null
2024-12-05	Finer Behavioral Foundation Models via Auto-Regressive Features and Advantage Weighting	Edoardo Cetin et.al.	2412.04368	null
2024-12-05	Approximate Top- $k$ for Increased Parallelism	Oscar Key et.al.	2412.04358	null
2024-12-05	Retrieval-Augmented Machine Translation with Unstructured Knowledge	Jiaan Wang et.al.	2412.04342	link
2024-12-05	Liquid: Language Models are Scalable Multi-modal Generators	Junfeng Wu et.al.	2412.04332	link
2024-12-04	From Individual to Society: A Survey on Social Simulation Driven by Large Language Model-based Agents	Xinyi Mou et.al.	2412.03563	link
2024-12-04	FLAIR: VLM with Fine-grained Language-informed Image Representations	Rui Xiao et.al.	2412.03561	link
2024-12-04	Best-of-N Jailbreaking	John Hughes et.al.	2412.03556	link
2024-12-04	PaliGemma 2: A Family of Versatile VLMs for Transfer	Andreas Steiner et.al.	2412.03555	null
2024-12-04	SPICE: Smart Projection Interface for Cooking Enhancement	Vera Prohaska et.al.	2412.03551	null
2024-12-04	Perception Tokens Enhance Visual Reasoning in Multimodal Language Models	Mahtab Bigverdi et.al.	2412.03548	null
2024-12-04	Evaluating Gender Bias Transfer between Pre-trained and Prompt-Adapted Language Models	Natalie Mackraz et.al.	2412.03537	null
2024-12-04	A Review on Scientific Knowledge Extraction using Large Language Models in Biomedical Sciences	Gabriel Lino Garcia et.al.	2412.03531	null
2024-12-04	FANAL -- Financial Activity News Alerting Language Modeling Framework	Urjitkumar Patel et.al.	2412.03527	null
2024-12-04	You're (Not) My Type -- Can LLMs Generate Feedback of Specific Types for Introductory Programming Tasks?	Dominic Lohr et.al.	2412.03516	null
2024-12-04	Distillation of Diffusion Features for Semantic Correspondence	Frank Fundel et.al.	2412.03512	null
2024-12-04	Tight PAC-Bayesian Risk Certificates for Contrastive Learning	Anna van Elst et.al.	2412.03486	link
2024-12-04	Training-Free Mitigation of Language Reasoning Degradation After Multimodal Instruction Tuning	Neale Ratzlaff et.al.	2412.03467	null
2024-12-04	Pre-trained Multiple Latent Variable Generative Models are good defenders against Adversarial Attacks	Dario Serez et.al.	2412.03453	link
2024-12-04	From Words to Workflows: Automating Business Processes	Laura Minkova et.al.	2412.03446	null
2024-12-04	Assessing Foundation Models' Transferability to Physiological Signals in Precision Medicine	Matthias Christenson et.al.	2412.03427	null
2024-12-04	PrefixKV: Adaptive Prefix KV Cache is What Vision Instruction-Following Models Need for Efficient Generation	Ao Wang et.al.	2412.03409	link
2024-12-04	RedStone: Curating General, Code, Math, and QA Data for Large Language Models	Yaoyao Chang et.al.	2412.03398	null
2024-12-04	Enhancing Supply Chain Visibility with Generative AI: An Exploratory Case Study on Relationship Prediction in Knowledge Graphs	Ge Zheng et.al.	2412.03390	null
2024-12-04	WiS Platform: Enhancing Evaluation of LLM-Based Multi-Agent Systems Through Game-Based Analysis	Chengwei Hu et.al.	2412.03359	null
2024-12-03	T-REG: Preference Optimization with Token-Level Reward Regularization	Wenxuan Zhou et.al.	2412.02685	null
2024-12-03	Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models	Yuda Song et.al.	2412.02674	null
2024-12-03	LLM-Enhanced Path Planning: Safe and Efficient Autonomous Navigation with Instructional Inputs	Pranav Doma et.al.	2412.02655	null
2024-12-03	Time-Reversal Provides Unsupervised Feedback to LLMs	Yerram Varun et.al.	2412.02626	null
2024-12-03	Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions	Kai Sun et.al.	2412.02621	null
2024-12-03	Improving Dynamic Object Interactions in Text-to-Video Generation with AI Feedback	Hiroki Furuta et.al.	2412.02617	null
2024-12-03	GLM-4-Voice: Towards Intelligent and Human-Like End-to-End Spoken Chatbot	Aohan Zeng et.al.	2412.02612	link
2024-12-03	AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?	Kaixiong Gong et.al.	2412.02611	null
2024-12-03	Interpretable Company Similarity with Sparse Autoencoders	Marco Molinari et.al.	2412.02605	null
2024-12-03	CEGI: Measuring the trade-off between efficiency and carbon emissions for SLMs and VLMs	Abhas Kumar et.al.	2412.02602	null
2024-12-03	PrefixLLM: LLM-aided Prefix Circuit Design	Weihua Xiao et.al.	2412.02594	null
2024-12-03	OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation	Junyuan Zhang et.al.	2412.02592	link
2024-12-03	Explainable CTR Prediction via LLM Reasoning	Xiaohan Yu et.al.	2412.02588	null
2024-12-03	Remote Sensing Temporal Vision-Language Models: A Comprehensive Survey	Chenyang Liu et.al.	2412.02573	link
2024-12-03	SJTU:Spatial judgments in multimodal models towards unified segmentation through coordinate detection	Joongwon Chae et.al.	2412.02565	link
2024-12-03	Semantic Tokens in Retrieval Augmented Generation	Joel Suro et.al.	2412.02563	null
2024-12-03	Patent-CR: A Dataset for Patent Claim Revision	Lekang Jiang et.al.	2412.02549	null
2024-12-03	Multimodal Remote Sensing Scene Classification Using VLMs and Dual-Cross Attention Networks	Jinjin Cai et.al.	2412.02531	null
2024-12-03	LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data	Hanyu Zhang et.al.	2412.02525	null
2024-12-03	OODFace: Benchmarking Robustness of Face Recognition under Common Corruptions and Appearance Variations	Caixin Kang et.al.	2412.02479	null
2024-12-02	T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs	Shukang Yin et.al.	2411.19951	link
2024-12-02	Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability	Zicheng Lin et.al.	2411.19943	null
2024-11-29	VLSBench: Unveiling Visual Leakage in Multimodal Safety	Xuhao Hu et.al.	2411.19939	null
2024-11-29	On Domain-Specific Post-Training for Multimodal Large Language Models	Daixuan Cheng et.al.	2411.19930	null
2024-11-29	SIMS: Simulating Human-Scene Interactions with Real World Script Planning	Wenjia Wang et.al.	2411.19921	null
2024-11-29	FlowCLAS: Enhancing Normalizing Flow Via Contrastive Learning For Anomaly Segmentation	Chang Won Lee et.al.	2411.19888	null
2024-11-29	PDDLFuse: A Tool for Generating Diverse Planning Domains	Vedant Khandelwal et.al.	2411.19886	null
2024-12-02	LUMIA: Linear probing for Unimodal and MultiModal Membership Inference Attacks leveraging internal LLM states	Luis Ibanez-Lissen et.al.	2411.19876	null
2024-11-29	DeMo: Decoupled Momentum Optimization	Bowen Peng et.al.	2411.19870	link
2024-11-29	AIDetx: a compression-based method for identification of machine-learning generated text	Leonardo Almeida et.al.	2411.19869	link
2024-11-29	Reverse Thinking Makes LLMs Stronger Reasoners	Justin Chih-Yao Chen et.al.	2411.19865	null
2024-11-29	Cross-Domain Recommendation Meets Large Language Models	Ajay Krishna Vajjala et.al.	2411.19862	link
2024-11-29	What fifty-one years of Linguistics and Artificial Intelligence research tell us about their correlation: A scientometric review	Mohammed Q. Shormani et.al.	2411.19858	null
2024-11-29	Sensitive Content Classification in Social Media: A Holistic Resource and Evaluation	Dimosthenis Antypas et.al.	2411.19832	null
2024-11-29	Advanced System Integration: Analyzing OpenAPI Chunking for Retrieval-Augmented Generation	Robin D. Pesl et.al.	2411.19804	null
2024-11-29	INCLUDE: Evaluating Multilingual Language Understanding with Regional Knowledge	Angelika Romanou et.al.	2411.19799	null
2024-11-29	MoTe: Learning Motion-Text Diffusion Model for Multiple Generation Tasks	Yiming Wu et.al.	2411.19786	null
2024-11-29	PerLA: Perceptive 3D Language Assistant	Guofeng Mei et.al.	2411.19774	null
2024-11-29	LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos	Tiantian Geng et.al.	2411.19772	null
2024-11-29	Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models	Kaican Li et.al.	2411.19757	link
2024-11-27	Lift3D Foundation Policy: Lifting 2D Large-Scale Pretrained Models for Robust 3D Robotic Manipulation	Yueru Jia et.al.	2411.18623	null
2024-11-27	Cross-modal Information Flow in Multimodal Large Language Models	Zhi Zhang et.al.	2411.18620	null
2024-11-27	Diffusion Self-Distillation for Zero-Shot Customized Image Generation	Shengqu Cai et.al.	2411.18616	null
2024-11-27	Automated Literature Review Using NLP Techniques and LLM-Based Retrieval-Augmented Generation	Nurshat Fateh Ali et.al.	2411.18583	null
2024-11-27	Challenges in Adapting Multilingual LLMs to Low-Resource Languages using LoRA PEFT Tuning	Omkar Khade et.al.	2411.18571	null
2024-11-27	A Pipeline of Neural-Symbolic Integration to Enhance Spatial Reasoning in Large Language Models	Rong Wang et.al.	2411.18564	null
2024-11-27	DexDiffuser: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation	Zhixuan Liang et.al.	2411.18562	null
2024-11-27	Retrofitting (Large) Language Models with Dynamic Tokenization	Darius Feher et.al.	2411.18553	null
2024-11-27	AdaVLN: Towards Visual Language Navigation in Continuous Indoor Environments with Moving Humans	Dillon Loh et.al.	2411.18539	link
2024-11-27	Emergence of Self-Identity in AI: A Mathematical Framework and Empirical Study with Generative Large Language Models	Minhyeok Lee et.al.	2411.18530	link
2024-11-27	LLM-ABBA: Understand time series via symbolic approximation	Erin Carson et.al.	2411.18506	null
2024-11-27	GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation	Pengfei Zhou et.al.	2411.18499	null
2024-11-27	Beyond Examples: High-level Automated Reasoning Paradigm in In-Context Learning via MCTS	Jinyang Wu et.al.	2411.18478	null
2024-11-27	Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding	Ziyin Zhang et.al.	2411.18462	link
2024-11-27	Is my Meeting Summary Good? Estimating Quality with a Multi-LLM Evaluator	Frederic Kirstein et.al.	2411.18444	null
2024-11-27	An AI-Assisted Multi-Agent Dual Dialogue System to Support Mental Health Care Providers	Onno P. Kampman et.al.	2411.18429	null
2024-11-27	FastSwitch: Optimizing Context Switching Efficiency in Fairness-aware Large Language Model Serving	Ao Shen et.al.	2411.18424	null
2024-11-27	Politicians vs ChatGPT. A study of presuppositions in French and Italian political communication	Davide Garassino et.al.	2411.18403	null
2024-11-27	Topic Modeling and Sentiment Analysis on Japanese Online Media's Coverage of Nuclear Energy	Yifan Sun et.al.	2411.18383	null
2024-11-27	ChatGPT as speechwriter for the French presidents	Dominique Labbé et.al.	2411.18382	null
2024-11-26	Adaptive Deployment of Untrusted LLMs Reduces Distributed Threats	Jiaxin Wen et.al.	2411.17693	null
2024-11-26	Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens	Xu Ouyang et.al.	2411.17691	null
2024-11-26	Rethinking Token Reduction in MLLMs: Towards a Unified Paradigm for Training-Free Acceleration	Yuhang Han et.al.	2411.17686	null
2024-11-26	Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning	Zhu Xu et.al.	2411.17679	link
2024-11-26	Instance-Aware Graph Prompt Learning	Jiazheng Li et.al.	2411.17676	null
2024-11-26	Push the Limit of Multi-modal Emotion Recognition by Prompting LLMs with Receptive-Field-Aware Attention Weighting	Liyun Zhang et.al.	2411.17674	null
2024-11-26	SketchAgent: Language-Driven Sequential Sketch Generation	Yael Vinker et.al.	2411.17673	null
2024-11-26	Synthetic Data Generation with LLM for Improved Depression Prediction	Andrea Kang et.al.	2411.17672	null
2024-11-26	How do Multimodal Foundation Models Encode Text and Speech? An Analysis of Cross-Lingual and Cross-Modal Representations	Hyunji Lee et.al.	2411.17666	null
2024-11-26	Toward High-Performance LLM Serving: A Simulation-Based Approach for Identifying Optimal Parallelism	Yi-Chien Lin et.al.	2411.17651	null
2024-11-26	On Limitations of LLM as Annotator for Low Resource Languages	Suramya Jadhav et.al.	2411.17637	null
2024-11-26	MALMM: Multi-Agent Large Language Models for Zero-Shot Robotics Manipulation	Harsh Singh et.al.	2411.17636	null
2024-11-26	Data-driven development of cycle prediction models for lithium metal batteries using multi modal mining	Jaewoong Lee et.al.	2411.17625	null
2024-11-26	Scaling Speech-Text Pre-training with Synthetic Interleaved Data	Aohan Zeng et.al.	2411.17607	null
2024-11-26	HyperSeg: Towards Universal Visual Segmentation with Large Language Model	Cong Wei et.al.	2411.17606	link
2024-11-26	Making History Readable	Bipasha Banerjee et.al.	2411.17600	null
2024-11-26	Agentic AI for Improving Precision in Identifying Contributions to Sustainable Development Goals	William A. Ingram et.al.	2411.17598	null
2024-11-26	Can artificial intelligence predict clinical trial outcomes?	Shuyi Jin et.al.	2411.17595	null
2024-11-26	RTL-Breaker: Assessing the Security of LLMs against Backdoor Attacks on HDL Code Generation	Lakshmi Likhitha Mankali et.al.	2411.17569	null
2024-11-26	Natural Language Understanding and Inference with MLLM in Visual Question Answering: A Survey	Jiayi Kuang et.al.	2411.17558	null
2024-11-25	Do Large Language Models Perform Latent Multi-Hop Reasoning without Exploiting Shortcuts?	Sohee Yang et.al.	2411.16679	null
2024-11-25	Diffusion Features for Zero-Shot 6DoF Object Pose Estimation	Bernd Von Gimborn et.al.	2411.16668	null
2024-11-25	DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation	Zun Wang et.al.	2411.16657	null
2024-11-25	Self-Generated Critiques Boost Reward Modeling for Language Models	Yue Yu et.al.	2411.16646	null
2024-11-25	Preventing Jailbreak Prompts as Malicious Tools for Cybercriminals: A Cyber Defense Perspective	Jean Marie Tshimula et.al.	2411.16642	null
2024-11-25	StructFormer: Document Structure-based Masked Attention and its Impact on Language Model Pre-Training	Kaustubh Ponkshe et.al.	2411.16618	null
2024-11-25	Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models	Ronghuan Wu et.al.	2411.16602	null
2024-11-25	From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge	Dawei Li et.al.	2411.16594	link
2024-11-25	Large Language Model-based Decision-making for COLREGs and the Control of Autonomous Surface Vehicles	Klinsmann Agyei et.al.	2411.16587	link
2024-11-25	MarketGPT: Developing a Pre-trained transformer (GPT) for Modeling Financial Time Series	Aaron Wheeler et.al.	2411.16585	link
2024-11-25	Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision	Zhiheng Xi et.al.	2411.16579	null
2024-11-25	Predictive Power of LLMs in Financial Markets	Jerick Shi et.al.	2411.16569	null
2024-11-25	EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code	Shahriyar Zaman Ridoy et.al.	2411.16561	null
2024-11-25	Generating Out-Of-Distribution Scenarios Using Language Models	Erfan Aasi et.al.	2411.16554	null
2024-11-25	Representation Collapsing Problems in Vector Quantization	Wenhao Zhao et.al.	2411.16550	null
2024-11-25	RoboSpatial: Teaching Spatial Understanding to 2D and 3D Vision-Language Models for Robotics	Chan Hee Song et.al.	2411.16537	null
2024-11-25	Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings	Carolin M. Schuster et.al.	2411.16527	null
2024-11-25	Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency	Jerry Yao-Chieh Hu et.al.	2411.16525	null
2024-11-25	LaB-RAG: Label Boosted Retrieval Augmented Generation for Radiology Report Generation	Steven Song et.al.	2411.16523	link
2024-11-25	Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis	Boming Miao et.al.	2411.16503	null
2024-11-22	Measuring Bullshit in the Language Games played by ChatGPT	Alessandro Trevisan et.al.	2411.15129	null
2024-11-22	Health AI Developer Foundations	Atilla P. Kiraly et.al.	2411.15128	null
2024-11-22	TÜLU 3: Pushing Frontiers in Open Language Model Post-Training	Nathan Lambert et.al.	2411.15124	link
2024-11-22	RE-Bench: Evaluating frontier AI R&D capabilities of language model agents against human experts	Hjalmar Wijk et.al.	2411.15114	link
2024-11-22	Efficient Pruning of Text-to-Image Models: Insights from Pruning Stable Diffusion	Samarth N Ramesh et.al.	2411.15113	null
2024-11-22	AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution	Fengyuan Liu et.al.	2411.15102	link
2024-11-22	What You See is Not What You Get: Neural Partial Differential Equations and The Illusion of Learning	Arvind Mohan et.al.	2411.15101	null
2024-11-22	XGrammar: Flexible and Efficient Structured Generation Engine for Large Language Models	Yixin Dong et.al.	2411.15100	null
2024-11-22	Context-Aware Multimodal Pretraining	Karsten Roth et.al.	2411.15099	null
2024-11-22	mR $^2$ AG: Multimodal Retrieval-Reflection-Augmented Generation for Knowledge-Based VQA	Tao Zhang et.al.	2411.15041	null
2024-11-22	One to rule them all: natural language to bind communication, perception and action	Simone Colombani et.al.	2411.15033	null
2024-11-22	Time is on my sight: scene graph filtering for dynamic environment perception in an LLM-driven robot	Simone Colombani et.al.	2411.15027	null
2024-11-22	DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models	Keda Tao et.al.	2411.15024	link
2024-11-22	FTA generation using GenAI with an Autonomy sensor Usecase	Sneha Sudhir Shetiya et.al.	2411.15007	null
2024-11-22	ScribeAgent: Towards Specialized Web Agents Using Production-Scale Workflow Data	Junhong Shen et.al.	2411.15004	link
2024-11-22	Generative AI may backfire for counterspeech	Dominik Bär et.al.	2411.14986	null
2024-11-22	Exploring Foundation Models Fine-Tuning for Cytology Classification	Manon Dausort et.al.	2411.14975	link
2024-11-22	Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models	Alec Wright et.al.	2411.14972	link
2024-11-22	SwissADT: An Audio Description Translation System for Swiss Languages	Lukas Fischer et.al.	2411.14967	null
2024-11-22	LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement	Jieming Bian et.al.	2411.14961	null
2024-11-21	Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models	Yuhao Dong et.al.	2411.14432	link
2024-11-21	Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation	Zhuoman Liu et.al.	2411.14423	null
2024-11-21	From RNNs to Foundation Models: An Empirical Study on Commercial Building Energy Consumption	Shourya Bose et.al.	2411.14421	null
2024-11-21	Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding	Yiming Zhang et.al.	2411.14401	null
2024-11-21	Lightweight Safety Guardrails Using Fine-tuned BERT Embeddings	Aaron Zheng et.al.	2411.14398	null
2024-11-21	UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages	Bethel Melesse Tessema et.al.	2411.14343	link
2024-11-21	SplatR : Experience Goal Visual Rearrangement with 3D Gaussian Splatting and Dense Feature Matching	Arjun P S et.al.	2411.14322	link
2024-11-21	Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training	Zheheng Luo et.al.	2411.14318	null
2024-11-21	Automated Generation of Code Debugging Exercises	Victor-Alexandru Pădurean et.al.	2411.14303	null
2024-11-21	Auto-SPICE: Leveraging LLMs for Dataset Creation via Automated SPICE Netlist Extraction from Analog Circuit Diagrams	Jitendra Bhandari et.al.	2411.14299	link
2024-11-21	EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild	Yumeng Liu et.al.	2411.14280	null
2024-11-21	Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance	Haozhe Zhao et.al.	2411.14279	null
2024-11-21	Efficient Aspect-Based Summarization of Climate Change Reports with Small Language Models	Iacopo Ghinassi et.al.	2411.14272	link
2024-11-21	Knowledge Graphs, Large Language Models, and Hallucinations: An NLP Perspective	Ernests Lavrinovics et.al.	2411.14258	null
2024-11-21	Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models	Javier Ferrando et.al.	2411.14257	null
2024-11-21	Generalizing End-To-End Autonomous Driving In Real-World Environments Using Zero-Shot LLMs	Zeyu Dong et.al.	2411.14256	null
2024-11-21	Intent-Aware Dialogue Generation and Multi-Task Contrastive Learning for Multi-Turn Intent Classification	Junhua Liu et.al.	2411.14252	null
2024-11-21	Natural Language Reinforcement Learning	Xidong Feng et.al.	2411.14251	link
2024-11-21	FocusLLaVA: A Coarse-to-Fine Approach for Efficient and Effective Visual Token Compression	Yuke Zhu et.al.	2411.14228	null
2024-11-21	Towards Context-Rich Automated Biodiversity Assessments: Deriving AI-Powered Insights from Camera Trap Data	Paul Fergus et.al.	2411.14219	null
2024-11-20	Find Any Part in 3D	Ziqi Ma et.al.	2411.13550	null
2024-11-20	SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs	Shirley Kokane et.al.	2411.13547	null
2024-11-20	Promoting User Data Autonomy During the Dissolution of a Monopolistic Firm	Rushabh Solanki et.al.	2411.13546	null
2024-11-20	BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games	Davide Paglieri et.al.	2411.13543	null
2024-11-20	Metacognition for Unknown Situations and Environments (MUSE)	Rodolfo Valiente et.al.	2411.13537	null
2024-11-20	Predictive Insights into LGBTQ+ Minority Stress: A Transductive Exploration of Social Media Discourse	S. Chapagain et.al.	2411.13534	link
2024-11-20	Advancing Complex Medical Communication in Arabic with Sporo AraSum: Surpassing Existing Large Language Models	Chanseo Lee et.al.	2411.13518	null
2024-11-20	Disentangling Memory and Reasoning Ability in Large Language Models	Mingyu Jin et.al.	2411.13504	link
2024-11-20	Neural machine translation of seismic waves for petrophysical inversion	José Cunha Teixeira et.al.	2411.13491	null
2024-11-20	Utilizing Large Language Models to Synthesize Product Desirability Datasets	John D. Hastings et.al.	2411.13485	null
2024-11-20	PatentEdits: Framing Patent Novelty as Textual Entailment	Ryan Lee et.al.	2411.13477	null
2024-11-20	When Precision Meets Position: BFloat16 Breaks Down RoPE in Long-Context Training	Haonan Wang et.al.	2411.13476	link
2024-11-20	SoK: A Systems Perspective on Compound AI Threats and Countermeasures	Sarbartha Banerjee et.al.	2411.13459	null
2024-11-20	LIMBA: An Open-Source Framework for the Preservation and Valorization of Low-Resource Languages using Generative Models	Salvatore Mario Carta et.al.	2411.13453	null
2024-11-20	AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations	Gaurav Verma et.al.	2411.13451	null
2024-11-20	WaterPark: A Robustness Assessment of Language Model Watermarking	Jiacheng Liang et.al.	2411.13425	link
2024-11-20	Unleashing the Power of Large Language Models for Group POI Recommendations	Jing Long et.al.	2411.13415	null
2024-11-20	A Survey On Enhancing Reinforcement Learning in Complex Environments: Insights from Human and LLM Feedback	Alireza Rashidi Laleh et.al.	2411.13410	null
2024-11-20	Unification of Balti and trans-border sister dialects in the essence of LLMs and AI Technology	Muhammad Sharif et.al.	2411.13409	null
2024-11-20	Transformer-Based Contextualized Language Models Joint with Neural Networks for Natural Language Inference in Vietnamese	Dat Van-Thanh Nguyen et.al.	2411.13407	null
2024-11-19	ACING: Actor-Critic for Instruction Learning in Black-Box Large Language Models	Salma Kharrat et.al.	2411.12736	link
2024-11-19	Information Theory of Meaningful Communication	Doron Sivan et.al.	2411.12728	null
2024-11-19	CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs	Zhehan Kan et.al.	2411.12713	null
2024-11-19	Enhancing Multi-Class Disease Classification: Neoplasms, Cardiovascular, Nervous System, and Digestive Disorders Using Advanced LLMs	Ahmed Akib Jawad Karim et.al.	2411.12712	null
2024-11-19	Strengthening Fake News Detection: Leveraging SVM and Sophisticated Text Vectorization Techniques. Defying BERT?	Ahmed Akib Jawad Karim et.al.	2411.12703	null
2024-11-19	When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations	Huaizhi Ge et.al.	2411.12701	null
2024-11-19	SparseInfer: Training-free Prediction of Activation Sparsity for Fast LLM Inference	Jiho Shin et.al.	2411.12692	null
2024-11-19	Neurosymbolic Graph Enrichment for Grounded World Models	Stefano De Giorgis et.al.	2411.12671	null
2024-11-19	DLBacktrace: A Model Agnostic Explainability for any Deep Learning Models	Vinay Kumar Sankarapu et.al.	2411.12643	link
2024-11-19	Improving Controllability and Editability for Pretrained Text-to-Music Generation Models	Yixiao Zhang et.al.	2411.12641	null
2024-11-19	Provable unlearning in topic modeling and downstream tasks	Stanley Wei et.al.	2411.12600	null
2024-11-19	AdaCM $^2$ : On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction	Yuanbin Man et.al.	2411.12593	null
2024-11-19	Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models	Laura Ruis et.al.	2411.12580	link
2024-11-19	Large Language Models for Combinatorial Optimization of Design Structure Matrix	Shuo Jiang et.al.	2411.12571	null
2024-11-19	Unlocking State-Tracking in Linear RNNs Through Negative Eigenvalues	Riccardo Grazzi et.al.	2411.12537	link
2024-11-19	Contourlet Refinement Gate Framework for Thermal Spectrum Distribution Regularized Infrared Image Super-Resolution	Yang Zou et.al.	2411.12530	link
2024-11-19	Enhancing Reasoning Capabilities of LLMs via Principled Synthetic Logic Corpus	Terufumi Morishita et.al.	2411.12498	link
2024-11-19	AI Flow at the Network Edge	Jiawei Shao et.al.	2411.12469	null
2024-11-19	Guide-to-Explain for Controllable Summarization	Sangwon Ryu et.al.	2411.12460	null
2024-11-19	\textsc{Neon}: News Entity-Interaction Extraction for Enhanced Question Answering	Sneha Singhania et.al.	2411.12449	null
2024-11-18	Bi-Mamba: Towards Accurate 1-Bit State Space Models	Shengkun Tang et.al.	2411.11843	null
2024-11-18	Tackling prediction tasks in relational databases with LLMs	Marek Wydmuch et.al.	2411.11829	null
2024-11-18	Exploring adversarial robustness of JPEG AI: methodology, comparison and new methods	Egor Kovalev et.al.	2411.11795	null
2024-11-18	LLM-IE: A Python Package for Generative Information Extraction with Large Language Models	Enshuo Hsu et.al.	2411.11779	null
2024-11-18	sMoRe: Enhancing Object Manipulation and Organization in Mixed Reality Spaces with LLMs and Generative AI	Yunhao Xing et.al.	2411.11752	null
2024-11-18	BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration	Yuzong Chen et.al.	2411.11745	link
2024-11-18	Moral Persuasion in Large Language Models: Evaluating Susceptibility and Ethical Alignment	Allison Huang et.al.	2411.11731	link
2024-11-18	Semantic-Geometric-Physical-Driven Robot Manipulation Skill Transfer via Skill Library and Tactile Representation	Mingchao Qi et.al.	2411.11714	link
2024-11-18	FedCoLLM: A Parameter-Efficient Federated Co-tuning Framework for Large and Small Language Models	Tao Fan et.al.	2411.11707	null
2024-11-18	MC-LLaVA: Multi-Concept Personalized Vision-Language Model	Ruichuan An et.al.	2411.11706	link
2024-11-18	Technical Report: Enhancing LLM Reasoning with Reward-guided Tree Search	Jinhao Jiang et.al.	2411.11694	null
2024-11-18	TrojanRobot: Backdoor Attacks Against Robotic Manipulation in the Physical World	Xianlong Wang et.al.	2411.11683	null
2024-11-18	PSPO: An Effective Process-supervised Policy Optimization for Reasoning Alignment*	Jiawei Li et.al.	2411.11681	link
2024-11-18	Dissecting Misalignment of Multimodal Large Language Models via Influence Function	Lijie Hu et.al.	2411.11667	null
2024-11-18	TSINR: Capturing Temporal Continuity via Implicit Neural Representations for Time Series Anomaly Detection	Mengxuan Li et.al.	2411.11641	link
2024-11-18	Chapter 7 Review of Data-Driven Generative AI Models for Knowledge Extraction from Scientific Literature in Healthcare	Leon Kopitar et.al.	2411.11635	null
2024-11-18	Signaling and Social Learning in Swarms of Robots	Leo Cazenille et.al.	2411.11616	null
2024-11-18	Leveraging Computational Pathology AI for Noninvasive Optical Imaging Analysis Without Retraining	Danny Barash et.al.	2411.11613	null
2024-11-18	VLN-Game: Vision-Language Equilibrium Search for Zero-Shot Semantic Navigation	Bangguo Yu et.al.	2411.11609	null
2024-11-18	Exploring LLMs for Verifying Technical System Specifications Against Requirements	Lasse M. Reinpold et.al.	2411.11582	null
2024-11-15	VeriGraph: Scene Graphs for Execution Verifiable Robot Planning	Daniel Ekpo et.al.	2411.10446	null
2024-11-15	Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization	Weiyun Wang et.al.	2411.10442	null
2024-11-15	LLaVA-o1: Let Vision Language Models Reason Step-by-Step	Guowei Xu et.al.	2411.10440	link
2024-11-15	MARS: Unleashing the Power of Variance Reduction for Training Large Models	Huizhuo Yuan et.al.	2411.10438	link
2024-11-15	Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization	Yuhan Fu et.al.	2411.10436	null
2024-11-15	Evaluating Creativity and Deception in Large Language Models: A Simulation Framework for Multi-Agent Balderdash	Parsa Hejabi et.al.	2411.10422	link
2024-11-15	On the Foundation Model for Cardiac MRI Reconstruction	Chi Zhang et.al.	2411.10403	null
2024-11-15	Interactive Cycle Model -- The Linkage Combination among Automatic Speech Recognition, Large Language Models and Smart Glasses	Libo Wang et.al.	2411.10362	link
2024-11-15	Bias Unveiled: Investigating Social Bias in LLM-Generated Code	Lin Ling et.al.	2411.10351	null
2024-11-15	Y-MAP-Net: Real-time depth, normals, segmentation, multi-label captioning and 2D human pose in RGB images	Ammar Qammaz et.al.	2411.10334	null
2024-11-15	Number it: Temporal Grounding Videos like Flipping Manga	Yongliang Wu et.al.	2411.10332	link
2024-11-15	Modification Takes Courage: Seamless Image Stitching via Reference-Driven Inpainting	Ziqi Xie et.al.	2411.10309	link
2024-11-15	Static network structure cannot stabilize cooperation among Large Language Model agents	Jin Han et.al.	2411.10294	null
2024-11-15	Scaling Law for Post-training after Model Pruning	Xiaodong Chen et.al.	2411.10272	null
2024-11-15	Visual-Linguistic Agent: Towards Collaborative Contextual Object Reasoning	Jingru Yang et.al.	2411.10252	null
2024-11-15	Measuring Non-Adversarial Reproduction of Training Data in Large Language Models	Michael Aerni et.al.	2411.10242	null
2024-11-15	Generative AI in Multimodal User Interfaces: Trends, Challenges, and Cross-Platform Adaptability	J. Bieniek et.al.	2411.10234	null
2024-11-15	An Empirical Study on LLM-based Agents for Automated Bug Fixing	Xiangxin Meng et.al.	2411.10213	null
2024-11-15	Agentic LLMs in the Supply Chain: Towards Autonomous Multi-Agent Consensus-Seeking	Valeria Jannelli et.al.	2411.10184	null
2024-11-15	CART: Compositional Auto-Regressive Transformer for Image Generation	Siddharth Roheda et.al.	2411.10180	null
2024-11-14	MagicQuill: An Intelligent Interactive Image Editing System	Zichen Liu et.al.	2411.09703	null
2024-11-14	Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models	Wei Wang et.al.	2411.09691	null
2024-11-14	Squeezed Attention: Accelerating Long Context Length LLM Inference	Coleman Hooper et.al.	2411.09688	link
2024-11-14	Adaptive Decoding via Latent Preference Optimization	Shehzaad Dhuliawala et.al.	2411.09661	null
2024-11-14	On the Limits of Language Generation: Trade-Offs Between Hallucination and Mode Collapse	Alkis Kalavasis et.al.	2411.09642	null
2024-11-14	Local deployment of large-scale music AI models on commodity hardware	Xun Zhou et.al.	2411.09625	null
2024-11-14	PTR: Precision-Driven Tool Recommendation for Large Language Models	Hang Gao et.al.	2411.09613	null
2024-11-14	The Moral Foundations Weibo Corpus	Renjie Cao et.al.	2411.09612	null
2024-11-14	Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework	Ronak Pradeep et.al.	2411.09607	null
2024-11-14	Accelerating Knowledge Graph and Ontology Engineering with Large Language Models	Cogan Shimizu et.al.	2411.09601	null
2024-11-14	Assessing the Performance of the DINOv2 Self-supervised Learning Vision Transformer Model for the Segmentation of the Left Atrium from MRI Images	Bipasha Kundu et.al.	2411.09598	null
2024-11-14	LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models	Zhengyi Wang et.al.	2411.09595	null
2024-11-14	Adopting RAG for LLM-Aided Future Vehicle Design	Vahid Zolfaghari et.al.	2411.09590	null
2024-11-14	BabyLM Challenge: Exploring the Effect of Variation Sets on Language Model Training Efficiency	Akari Haga et.al.	2411.09587	null
2024-11-14	Software Performance Engineering for Foundation Model-Powered Software (FMware)	Haoxiang Zhang et.al.	2411.09580	null
2024-11-14	Piecing It All Together: Verifying Multi-Hop Multimodal Claims	Haoran Wang et.al.	2411.09547	null
2024-11-14	A Practical Guide to Fine-tuning Language Models with Limited Data	Márton Szép et.al.	2411.09539	null
2024-11-14	Navigating the Risks: A Survey of Security, Privacy, and Ethics Threats in LLM-Based Agents	Yuyou Gan et.al.	2411.09523	null
2024-11-14	Communication Compression for Tensor Parallel LLM Inference	Jan Hansen-Palmus et.al.	2411.09510	null
2024-11-14	Spider: Any-to-Many Multimodal LLM	Jinxiang Lai et.al.	2411.09439	null
2024-11-13	Large Wireless Model (LWM): A Foundation Model for Wireless Channels	Sadjad Alikhani et.al.	2411.08872	link
2024-11-13	The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models	Daniel P. Jeong et.al.	2411.08870	link
2024-11-13	CamemBERT 2.0: A Smarter French Language Model Aged to Perfection	Wissam Antoun et.al.	2411.08868	null
2024-11-13	LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs	Piyush Jha et.al.	2411.08862	null
2024-11-13	Multimodal Instruction Tuning with Hybrid State Space Models	Jianing Zhou et.al.	2411.08840	null
2024-11-13	FinRobot: AI Agent for Equity Research and Valuation with Large Language Models	Tianyu Zhou et.al.	2411.08804	link
2024-11-13	Evaluating World Models with LLM for Decision Making	Chang Yang et.al.	2411.08794	null
2024-11-13	Can sparse autoencoders be used to decompose and interpret steering vectors?	Harry Mayne et.al.	2411.08790	link
2024-11-13	Sharingan: Extract User Action Sequence from Desktop Recordings	Yanting Chen et.al.	2411.08768	null
2024-11-13	Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers	Clément Dumas et.al.	2411.08745	link
2024-11-13	A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models	Dingdong Wang et.al.	2411.08742	null
2024-11-13	Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models	Somanshu Singla et.al.	2411.08733	link
2024-11-13	Polymetis:Large Language Modeling for Multiple Material Domains	Chao Huang et.al.	2411.08728	null
2024-11-13	Voxeland: Probabilistic Instance-Aware Semantic Mapping with Evidence-based Uncertainty Quantification	Jose-Luis Matez-Bandera et.al.	2411.08727	link
2024-11-13	Theoretical Analysis of Byte-Pair Encoding	László Kozma et.al.	2411.08671	null
2024-11-13	OSMLoc: Single Image-Based Visual Localization in OpenStreetMap with Geometric and Semantic Guidances	Youqi Liao et.al.	2411.08665	link
2024-11-13	UniMat: Unifying Materials Embeddings through Multi-modal Learning	Janghoon Ock et.al.	2411.08664	null
2024-11-13	Accelerating Quasi-Static Time Series Simulations with Foundation Models	Alban Puech et.al.	2411.08652	null
2024-11-13	A System Level Performance Evaluation for Superconducting Digital Systems	Joyjit Kundu et.al.	2411.08645	null
2024-11-13	Towards Secure Intelligent O-RAN Architecture: Vulnerabilities, Threats and Promising Technical Solutions using LLMs	Mojdeh Karbalaee Motalleb et.al.	2411.08640	null
2024-11-12	Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data	Juanhui Li et.al.	2411.08028	null
2024-11-12	LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models	Anoop Cherian et.al.	2411.08027	null
2024-11-12	Language Models as Causal Effect Generators	Lucius E. J. Bynum et.al.	2411.08019	link
2024-11-12	ExpressivityArena: Can LLMs Express Information Implicitly?	Joshua Tint et.al.	2411.08010	null
2024-11-12	Can adversarial attacks by large language models be attributed?	Manuel Cebrian et.al.	2411.08003	null
2024-11-12	Derivational Morphology Reveals Analogical Generalization in Large Language Models	Valentin Hofmann et.al.	2411.07990	null
2024-11-12	JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation	Yiyang Ma et.al.	2411.07975	link
2024-11-12	From General to Specific: Utilizing General Hallucation to Automatically Measure the Role Relationship Fidelity for Specific Role-Play Agents	Chuyi Kong et.al.	2411.07965	null
2024-11-12	Towards Low-bit Communication for Tensor Parallel LLM Inference	Harry Dong et.al.	2411.07942	null
2024-11-12	Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer's Disease	Francesco Chiumento et.al.	2411.07871	null
2024-11-12	Trustful LLMs: Customizing and Grounding Text Generation with Knowledge Bases and Dual Decoders	Xiaofeng Zhu et.al.	2411.07870	null
2024-11-12	Verbosity $\neq$ Veracity: Demystify Verbosity Compensation Behavior of Large Language Models	Yusen Zhang et.al.	2411.07858	link
2024-11-12	Tucano: Advancing Neural Text Generation for Portuguese	Nicholas Kluge Corrêa et.al.	2411.07854	link
2024-11-12	NL-SLAM for OC-VLN: Natural Language Grounded SLAM for Object-Centric VLN	Sonia Raychaudhuri et.al.	2411.07848	null
2024-11-12	Chain Association-based Attacking and Shielding Natural Language Processing Systems	Jiacheng Huang et.al.	2411.07843	null
2024-11-12	FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training	Philip Zmushko et.al.	2411.07837	link
2024-11-12	Efficient Federated Finetuning of Tiny Transformers with Resource-Constrained Devices	Kilian Pfeiffer et.al.	2411.07826	null
2024-11-12	Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models	Youan Cong et.al.	2411.07820	null
2024-11-12	Federated Low-Rank Adaptation with Differential Privacy over Wireless Networks	Tianqu Kang et.al.	2411.07806	null
2024-11-12	Likelihood as a Performance Gauge for Retrieval-Augmented Generation	Tianyu Liu et.al.	2411.07773	link
2024-11-11	UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts	Bo Yang et.al.	2411.07240	link
2024-11-11	OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model	Sumeth Yuenyong et.al.	2411.07238	null
2024-11-11	Contextualized Evaluations: Taking the Guesswork Out of Language Model Evaluations	Chaitanya Malaviya et.al.	2411.07237	null
2024-11-11	Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving	Botao Yu et.al.	2411.07228	null
2024-11-11	TempCharBERT: Keystroke Dynamics for Continuous Access Control Based on Pre-trained Language Models	Matheus Simão et.al.	2411.07224	null
2024-11-11	Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks	Madeline Brumley et.al.	2411.07213	null
2024-11-11	General Geospatial Inference with a Population Dynamics Foundation Model	Mohit Agarwal et.al.	2411.07207	link
2024-11-11	DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID	Nyle Siddiqui et.al.	2411.07205	link
2024-11-11	The Super Weight in Large Language Models	Mengxia Yu et.al.	2411.07191	link
2024-11-11	NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics	David Robinson et.al.	2411.07186	null
2024-11-11	SAMPart3D: Segment Any Part in 3D Objects	Yunhan Yang et.al.	2411.07184	link
2024-11-11	Counterfactual Generation from Language Models	Shauli Ravfogel et.al.	2411.07180	link
2024-11-11	More Expressive Attention with Negative Weights	Ang Lv et.al.	2411.07176	link
2024-11-11	Continual Memorization of Factoids in Large Language Models	Howard Chen et.al.	2411.07175	link
2024-11-11	A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19	Vedant Khandelwal et.al.	2411.07163	null
2024-11-11	Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models	Yancheng He et.al.	2411.07140	null
2024-11-11	Stronger Models are NOT Stronger Teachers for Instruction Tuning	Zhangchen Xu et.al.	2411.07133	null
2024-11-11	Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis	Taihang Hu et.al.	2411.07132	link
2024-11-11	Retrieval or Global Context Understanding? On Many-Shot In-Context Learning for Long-Context Evaluation	Kaijian Zou et.al.	2411.07130	link
2024-11-11	Benchmarking LLMs' Judgments with No Gold Standard	Shengwei Xu et.al.	2411.07127	link
2024-11-08	Recycled Attention: Efficient inference for long-context language models	Fangyuan Xu et.al.	2411.05787	null
2024-11-08	Using Language Models to Disambiguate Lexical Choices in Translation	Josh Barua et.al.	2411.05781	link
2024-11-08	Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths?	Veronica Chatrath et.al.	2411.05775	null
2024-11-08	Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024	Christopher Malon et.al.	2411.05762	null
2024-11-08	End-to-End Navigation with Vision Language Models: Transforming Spatial Reasoning into Question-Answering	Dylan Goetting et.al.	2411.05755	link
2024-11-08	Aioli: A Unified Optimization Framework for Language Model Data Mixing	Mayee F. Chen et.al.	2411.05735	link
2024-11-08	Poze: Sports Technique Feedback under Data Constraints	Agamdeep Singh et.al.	2411.05734	null
2024-11-08	STARS: Sensor-agnostic Transformer Architecture for Remote Sensing	Ethan King et.al.	2411.05714	null
2024-11-08	Unmasking the Limits of Large Language Models: A Systematic Evaluation of Masked Text Processing Ability through MskQA and MskCal	Fuka Matsuzaki et.al.	2411.05665	link
2024-11-08	The influence of persona and conversational task on social interactions with a LLM-controlled embodied conversational agent	Leon O. H. Kroczek et.al.	2411.05653	null
2024-11-08	LightVA: Lightweight Visual Analytics with LLM Agent-Based Task Planning and Execution	Yuheng Zhao et.al.	2411.05651	null
2024-11-08	Harnessing High-Level Song Descriptors towards Natural Language-Based Music Recommendation	Elena V. Epure et.al.	2411.05649	link
2024-11-08	Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation	Long Truong To et.al.	2411.05641	null
2024-11-08	Assessing Open-Source Large Language Models on Argumentation Mining Subtasks	Mohammad Yeghaneh Abkenar et.al.	2411.05639	null
2024-11-08	A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis	Cristiano Patrício et.al.	2411.05609	link
2024-11-08	Evaluating and Adapting Large Language Models to Represent Folktales in Low-Resource Languages	JA Meaney et.al.	2411.05593	null
2024-11-08	Open-set object detection: towards unified problem formulation and benchmarking	Hejer Ammar et.al.	2411.05564	null
2024-11-08	Training objective drives the consistency of representational similarity across datasets	Laure Ciernik et.al.	2411.05561	link
2024-11-08	AcceLLM: Accelerating LLM Inference using Redundancy for Load Balancing and Data Locality	Ilias Bournias et.al.	2411.05555	null
2024-11-08	Assessing the Answerability of Queries in Retrieval-Augmented Code Generation	Geonmin Kim et.al.	2411.05547	null
2024-11-07	SVDQunat: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models	Muyang Li et.al.	2411.05007	link
2024-11-07	Analyzing The Language of Visual Tokens	David M. Chan et.al.	2411.05001	null
2024-11-07	Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks?	Jonathan Roberts et.al.	2411.05000	null
2024-11-07	DynaMem: Online Dynamic Spatio-Semantic Memory for Open World Mobile Manipulation	Peiqi Liu et.al.	2411.04999	link
2024-11-07	LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation	Weiquan Huang et.al.	2411.04997	link
2024-11-07	Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models	Weixin Liang et.al.	2411.04996	null
2024-11-07	Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives	Hao Sun et.al.	2411.04991	link
2024-11-07	The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities	Zhaofeng Wu et.al.	2411.04986	link
2024-11-07	Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries	Dylan Manuel et.al.	2411.04981	null
2024-11-07	SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference	Gabriele Oliaro et.al.	2411.04975	null
2024-11-07	BitNet a4.8: 4-bit Activations for 1-bit LLMs	Hongyu Wang et.al.	2411.04965	null
2024-11-07	Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability	Yanjun Gao et.al.	2411.04962	null
2024-11-07	CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM	Jingwei Xu et.al.	2411.04954	null
2024-11-07	M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding	Jaemin Cho et.al.	2411.04952	null
2024-11-07	A Reinforcement Learning-Based Automatic Video Editing Method Using Pre-trained Vision-Language Model	Panwen Hu et.al.	2411.04942	null
2024-11-07	VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos	Shehan Munasinghe et.al.	2411.04923	null
2024-11-07	GPTKB: Building Very Large Knowledge Bases from Language Models	Yujia Hu et.al.	2411.04920	link
2024-11-07	OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models	Siming Huang et.al.	2411.04905	null
2024-11-07	In the Era of Prompt Learning with Vision-Language Models	Ankit Jha et.al.	2411.04892	null
2024-11-07	GUI Agents with Foundation Models: A Comprehensive Survey	Shuai Wang et.al.	2411.04890	null
2024-11-06	Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress?	Daniel P. Jeong et.al.	2411.04118	link
2024-11-06	How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis	Guan Zhe Hong et.al.	2411.04105	null
2024-11-06	RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models	Maya Varma et.al.	2411.04097	link
2024-11-06	Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation	Ke Fan et.al.	2411.04079	null
2024-11-06	H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models	Nhi Pham et.al.	2411.04077	null
2024-11-06	M3SciQA: A Multi-Modal Multi-Document Scientific QA Benchmark for Evaluating Foundation Models	Chuhan Li et.al.	2411.04075	null
2024-11-06	Pseudo-labeling with Keyword Refining for Few-Supervised Video Captioning	Ping Li et.al.	2411.04059	link
2024-11-06	Beemo: Benchmark of Expert-edited Machine-generated Outputs	Ekaterina Artemova et.al.	2411.04032	null
2024-11-06	Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages	Aniket Deroy et.al.	2411.04025	null
2024-11-06	Select2Plan: Training-Free ICL-Based Planning through VQA and Memory Retrieval	Davide Buoso et.al.	2411.04006	null
2024-11-06	Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning	Jiawei Yao et.al.	2411.03978	link
2024-11-06	What Really is Commonsense Knowledge?	Quyet V. Do et.al.	2411.03964	null
2024-11-06	How Does A Text Preprocessing Pipeline Affect Ontology Syntactic Matching?	Zhangcheng Qiang et.al.	2411.03962	link
2024-11-06	Face Reconstruction from Face Embeddings using Adapter to a Face Foundation Model	Hatef Otroshi Shahreza et.al.	2411.03960	null
2024-11-06	Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation	Yuhang Liu et.al.	2411.03957	null
2024-11-06	Long-Form Text-to-Music Generation with Adaptive Prompts: A Case of Study in Tabletop Role-Playing Games Soundtracks	Felipe Marra et.al.	2411.03948	link
2024-11-06	Interactions Across Blocks in Post-Training Quantization of Large Language Models	Khasmamad Shabanovi et.al.	2411.03934	null
2024-11-06	Multi3Hate: Multimodal, Multilingual, and Multicultural Hate Speech Detection with Vision-Language Models	Minh Duc Bui et.al.	2411.03888	link
2024-11-06	Polynomial Composition Activations: Unleashing the Dynamics of Large Language Models	Zhijian Zhuo et.al.	2411.03884	link
2024-11-06	MEG: Medical Knowledge-Augmented Large Language Models for Question Answering	Laura Cabello et.al.	2411.03883	link
2024-11-05	Inference Optimal VLMs Need Only One Visual Token but Larger Models	Kevin Y. Li et.al.	2411.03312	link
2024-11-05	LLMs for Domain Generation Algorithm Detection	Reynier Leyva La O et.al.	2411.03307	null
2024-11-05	VERITAS: A Unified Approach to Reliability Evaluation	Rajkumar Ramamurthy et.al.	2411.03300	null
2024-11-05	Examining Human-AI Collaboration for Co-Writing Constructive Comments Online	Farhana Shahid et.al.	2411.03295	null
2024-11-05	Interaction2Code: How Far Are We From Automatic Interactive Webpage Generation?	Jingyu Xiao et.al.	2411.03292	link
2024-11-05	The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare	Souren Pashangpour et.al.	2411.03287	null
2024-11-05	SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents	Dawei Li et.al.	2411.03284	link
2024-11-05	Spontaneous Emergence of Agent Individuality through Social Interactions in LLM-Based Communities	Ryosuke Takata et.al.	2411.03252	null
2024-11-05	DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models	Ying Zhou et.al.	2411.03250	null
2024-11-05	From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice	Alicia Guo et.al.	2411.03137	null
2024-11-05	"Create a Fear of Missing Out" -- ChatGPT Implements Unsolicited Deceptive Designs in Generated Websites Without Warning	Veronika Krauß et.al.	2411.03108	null
2024-11-05	Utilizing Precise and Complete Code Context to Guide LLM in Automatic False Positive Mitigation	Jinbao Chen et.al.	2411.03079	null
2024-11-05	Predictor-Corrector Enhanced Transformers with Exponential Moving Average Coefficient Learning	Bei Li et.al.	2411.03042	null
2024-11-05	HumanVLM: Foundation for Human-Scene Vision-Language Model	Dawei Dai et.al.	2411.03034	null
2024-11-05	Leveraging Large Language Models in Code Question Answering: Baselines and Issues	Georgy Andryushchenko et.al.	2411.03012	link
2024-11-05	Controlling for Unobserved Confounding with Large Language Model Classification of Patient Smoking Status	Samuel Lee et.al.	2411.03004	null
2024-11-05	Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation	Junchen Fu et.al.	2411.02992	null
2024-11-05	Growing a Tail: Increasing Output Diversity in Large Language Models	Michal Shur-Ofry et.al.	2411.02989	null
2024-11-05	[Vision Paper] PRObot: Enhancing Patient-Reported Outcome Measures for Diabetic Retinopathy using Chatbots and Generative AI	Maren Pielka et.al.	2411.02973	null
2024-11-05	Multi-modal NeRF Self-Supervision for LiDAR Semantic Segmentation	Xavier Timoneda et.al.	2411.02969	null
2024-11-04	Training-free Regional Prompting for Diffusion Transformers	Anthony Chen et.al.	2411.02395	link
2024-11-04	Adaptive Length Image Tokenization via Recurrent Allocation	Shivam Duggal et.al.	2411.02393	link
2024-11-04	Attacking Vision-Language Computer Agents via Pop-ups	Yanzhe Zhang et.al.	2411.02391	link
2024-11-04	Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models	Guangzhi Xiong et.al.	2411.02382	null
2024-11-04	Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI	Ramneet Kaur et.al.	2411.02381	null
2024-11-04	Learning General-Purpose Biomedical Volume Representations using Randomized Synthesis	Neel Dey et.al.	2411.02372	link
2024-11-04	DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution	Yang Yue et.al.	2411.02359	link
2024-11-04	"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization	Eldar Kurtic et.al.	2411.02355	null
2024-11-04	Machine learning identification of maternal inflammatory response and histologic choroamnionitis from placental membrane whole slide images	Abhishek Sharma et.al.	2411.02354	null
2024-11-04	Social-RAG: Retrieving from Group Interactions to Socially Ground Proactive AI Generation to Group Preferences	Ruotong Wang et.al.	2411.02353	null
2024-11-04	Can Large Language Models generalize analogy solving like people can?	Claire E. Stevenson et.al.	2411.02348	null
2024-11-04	WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning	Zehan Qi et.al.	2411.02337	link
2024-11-04	Sparsing Law: Towards Large Language Models with Greater Activation Sparsity	Yuqi Luo et.al.	2411.02335	link
2024-11-04	Disrupting Test Development with AI Assistants	Vijay Joshi et.al.	2411.02328	null
2024-11-04	PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance	Ruyang Liu et.al.	2411.02327	link
2024-11-04	An Empirical Study on the Code Refactoring Capability of Large Language Models	Jonathan Cordeiro et.al.	2411.02320	null
2024-11-04	Evaluating the Ability of Large Language Models to Generate Verifiable Specifications in VeriFast	Marilyn Rego et.al.	2411.02318	null
2024-11-04	Defining and Evaluating Physical Safety for Large Language Models	Yung-Chen Tang et.al.	2411.02317	null
2024-11-04	Evaluating Creative Short Story Generation in Humans and Large Language Models	Mete Ismayilzada et.al.	2411.02316	link
2024-11-04	Taking AI Welfare Seriously	Robert Long et.al.	2411.00986	null
2024-10-31	P-Masking: Power Law Masking Improves Multi-attribute Controlled Generation	Mohamed Elgaar et.al.	2410.24201	null
2024-11-01	SelfCodeAlign: Self-Alignment for Code Generation	Yuxiang Wei et.al.	2410.24198	link
2024-10-31	DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models	Heng-Jui Chang et.al.	2410.24177	null
2024-10-31	Constraint Back-translation Improves Complex Instruction Following of Large Language Models	Yunjia Qi et.al.	2410.24175	null
2024-10-31	$π_0$ : A Vision-Language-Action Flow Model for General Robot Control	Kevin Black et.al.	2410.24164	null
2024-10-31	GPT or BERT: why not both?	Lucas Georges Gabriel Charpentier et.al.	2410.24159	link
2024-10-31	Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning	Jinghan Zhang et.al.	2410.24155	null
2024-10-31	Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning	Jiaqi Liu et.al.	2410.24152	null
2024-10-31	Exploring Vision Language Models for Facial Attribute Recognition: Emotion, Race, Gender, and Age	Nouar AlDahoul et.al.	2410.24148	null
2024-10-31	Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing	Akash Dhruv et.al.	2410.24119	link
2024-10-31	Repository-Level Compositional Code Translation and Validation	Ali Reza Ibrahimzada et.al.	2410.24117	link
2024-10-31	Matchmaker: Self-Improving Large Language Model Programs for Schema Matching	Nabeel Seedat et.al.	2410.24105	null
2024-10-31	Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning	Nabil Omi et.al.	2410.24096	null
2024-10-31	In-Context Fine-Tuning for Time-Series Foundation Models	Abhimanyu Das et.al.	2410.24087	null
2024-10-31	Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs	Muhammed Saeed et.al.	2410.24049	null
2024-10-31	Handwriting Recognition in Historical Documents with Multimodal LLM	Lucian Li et.al.	2410.24034	null
2024-10-31	Navigating the Unknown: A Chat-Based Collaborative Interface for Personalized Exploratory Tasks	Yingzhe Peng et.al.	2410.24032	null
2024-10-31	AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents	Yifan Xu et.al.	2410.24024	link
2024-10-31	SFM-Protein: Integrative Co-evolutionary Pre-training for Advanced Protein Sequence Representation	Liang He et.al.	2410.24022	null
2024-10-31	Speech is More Than Words: Do Speech-to-Text Translation Systems Leverage Prosody?	Ioannis Tsiamas et.al.	2410.24019	null
2024-10-30	ReferEverything: Towards Segmenting Everything We Can Speak of in Videos	Anurag Bagchi et.al.	2410.23287	null
2024-10-30	A Monte Carlo Framework for Calibrated Uncertainty Estimation in Sequence Prediction	Qidong Yang et.al.	2410.23272	null
2024-10-30	TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models	Ziyao Shangguan et.al.	2410.23266	link
2024-10-30	EMMA: End-to-End Multimodal Model for Autonomous Driving	Jyh-Jing Hwang et.al.	2410.23262	null
2024-10-30	Keypoint Abstraction using Large Models for Object-Relative Imitation Learning	Xiaolin Fang et.al.	2410.23254	null
2024-10-30	Evaluating Cultural and Social Awareness of LLM Web Agents	Haoyi Qiu et.al.	2410.23252	null
2024-10-30	Carrot and Stick: Eliciting Comparison Data and Beyond	Yiling Chen et.al.	2410.23243	null
2024-10-30	A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment	Matteo G. Mecattaf et.al.	2410.23242	link
2024-10-30	EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning	Peide Huang et.al.	2410.23234	null
2024-10-30	COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences	Yixin Liu et.al.	2410.23223	link
2024-10-30	Partial Channel Dependence with Channel Masks for Time Series Foundation Models	Seunghan Lee et.al.	2410.23222	null
2024-10-30	OS-ATLAS: A Foundation Action Model for Generalist GUI Agents	Zhiyong Wu et.al.	2410.23218	link
2024-10-31	Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval	Sheryl Hsu et.al.	2410.23214	null
2024-10-30	ProTransformer: Robustify Transformers via Plug-and-Play Paradigm	Zhichao Hou et.al.	2410.23182	link
2024-10-30	ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning	Millennium Bismay et.al.	2410.23180	link
2024-10-30	TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters	Haiyang Wang et.al.	2410.23168	link
2024-10-30	SciPIP: An LLM-based Scientific Paper Idea Proposer	Wenxiao Wang et.al.	2410.23166	link
2024-10-30	FlexTSF: A Universal Forecasting Model for Time Series with Variable Regularities	Jingge Xiao et.al.	2410.23160	link
2024-10-30	VisualPredicator: Learning Abstract World Models with Neuro-Symbolic Predicates for Robot Planning	Yichao Liang et.al.	2410.23156	null
2024-10-30	Public Domain 12M: A Highly Aesthetic Image-Text Dataset with Novel Governance Mechanisms	Jordan Meyer et.al.	2410.23144	null
2024-10-29	Local Policies Enable Zero-shot Long-horizon Manipulation	Murtaza Dalal et.al.	2410.22332	null
2024-10-29	Task Vectors are Cross-Modal	Grace Luo et.al.	2410.22330	null
2024-10-29	Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models	Seetharam Killivalavan et.al.	2410.22323	null
2024-10-29	Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting	Can Chen et.al.	2410.22318	link
2024-10-29	Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier	Kai Wang et.al.	2410.22317	link
2024-10-29	Natural Language Inference Improves Compositionality in Vision-Language Models	Paola Cascante-Bonilla et.al.	2410.22315	null
2024-10-29	Senna: Bridging Large Vision-Language Models and End-to-End Autonomous Driving	Bo Jiang et.al.	2410.22313	link
2024-10-29	GPT-4o reads the mind in the eyes	James W. A. Strachan et.al.	2410.22309	null
2024-10-29	SVIP: Towards Verifiable Inference of Open-source Large Language Models	Yifan Sun et.al.	2410.22307	null
2024-10-29	Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning	Yihe Deng et.al.	2410.22304	null
2024-10-29	LLMs are Highly-Constrained Biophysical Sequence Optimizers	Angelica Chen et.al.	2410.22296	null
2024-10-29	Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats	Mohammad Setak et.al.	2410.22293	null
2024-10-29	From melodic note sequences to pitches using word2vec	Daniel Defays et.al.	2410.22285	null
2024-10-29	Embedding-based classifiers can detect prompt injection attacks	Md. Ahsan Ayub et.al.	2410.22284	link
2024-10-29	Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models	Renzhe Yu et.al.	2410.22282	null
2024-10-29	Fourier Head: Helping Large Language Models Learn Complex Probability Distributions	Nate Gillman et.al.	2410.22269	null
2024-10-29	Meta-Learning Adaptable Foundation Models	Jacob L. Block et.al.	2410.22264	null
2024-10-29	FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation	Farima Fatahi Bayat et.al.	2410.22257	null
2024-10-29	Abrupt Learning in Transformers: A Case Study on Matrix Completion	Pulkit Gopalani et.al.	2410.22244	null
2024-10-29	Are Decoder-Only Large Language Models the Silver Bullet for Code Search?	Yuxuan Chen et.al.	2410.22240	link
2024-10-28	Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics	Yaniv Nikankin et.al.	2410.21272	link
2024-10-28	LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior	Hanyu Wang et.al.	2410.21264	null
2024-10-28	BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference	Changwoo Lee et.al.	2410.21262	link
2024-10-28	AutoBench-V: Can Large Vision-Language Models Benchmark Themselves?	Han Bao et.al.	2410.21259	link
2024-10-28	Multi-modal AI for comprehensive breast cancer prognostication	Jan Witowski et.al.	2410.21256	null
2024-10-28	LongReward: Improving Long-context Large Language Models with AI Feedback	Jiajie Zhang et.al.	2410.21252	link
2024-10-28	Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback	Nour Jedidi et.al.	2410.21242	null
2024-10-28	Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce	Zhantao Yang et.al.	2410.21237	null
2024-10-28	Flaming-hot Initiation with Regular Execution Sampling for Large Language Models	Weizhe Chen et.al.	2410.21236	null
2024-10-28	LoRA vs Full Fine-tuning: An Illusion of Equivalence	Reece Shuttleworth et.al.	2410.21228	null
2024-10-28	Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines	Zhixin Zhang et.al.	2410.21220	link
2024-10-28	Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations	Kaifeng Huang et.al.	2410.21218	null
2024-10-28	BongLLaMA: LLaMA for Bangla Language	Abdullah Khan Zehady et.al.	2410.21200	null
2024-10-28	Belief in the Machine: Investigating Epistemological Blind Spots of Language Models	Mirac Suzgun et.al.	2410.21195	link
2024-10-29	Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction	Qintong Zhang et.al.	2410.21169	null
2024-10-28	M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation	Jiaheng Liu et.al.	2410.21157	null
2024-10-28	Palisade -- Prompt Injection Detection Framework	Sahasra Kokkula et.al.	2410.21146	null
2024-10-28	LLM-initialized Differentiable Causal Discovery	Shiv Kampani et.al.	2410.21141	null
2024-10-28	Do LLMs generate test oracles that capture the actual or the expected program behaviour?	Michael Konstantinou et.al.	2410.21136	null
2024-10-28	Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments	Marharyta Domnich et.al.	2410.21131	link
2024-10-25	The Potential and Value of AI Chatbot in Personalized Cognitive Training	Zilong Wang et.al.	2410.19733	null
2024-10-25	Rethinking Visual Dependency in Long-Context Reasoning for Large Vision-Language Models	Yucheng Zhou et.al.	2410.19732	null
2024-10-25	Counting Ability of Large Language Models and Impact of Tokenization	Xiang Zhang et.al.	2410.19730	link
2024-10-25	FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning	Nicole Cho et.al.	2410.19727	null
2024-10-25	2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision	Shilong Li et.al.	2410.19720	null
2024-10-25	Multi-view biomedical foundation models for molecule-target and property prediction	Parthasarathy Suryanarayanan et.al.	2410.19704	link
2024-10-25	TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning	Xiangyu Zeng et.al.	2410.19702	null
2024-10-25	IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation	Kaixian Qu et.al.	2410.19697	null
2024-10-25	Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs	Yifei Zhang et.al.	2410.19694	null
2024-10-25	APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs	Huaxiaoyue Wang et.al.	2410.19656	null
2024-10-25	Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models	Shenghao Fu et.al.	2410.19635	null
2024-10-25	Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina	Yuan Gao et.al.	2410.19599	null
2024-10-25	Diverse Sign Language Translation	Xin Shen et.al.	2410.19586	link
2024-10-25	ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems	Ritvik Aggarwal Ishneet Sukhvinder Singh Ibrahim Allahverdiyev et.al.	2410.19572	null
2024-10-25	GeoLLaVA: Efficient Fine-Tuned Vision-Language Models for Temporal Change Detection in Remote Sensing	Hosam Elgendy et.al.	2410.19552	link
2024-10-25	Bongard in Wonderland: Visual Puzzles that Still Make AI Go Mad?	Antonia Wüst et.al.	2410.19546	link
2024-10-25	Brain-like Functional Organization within Large Language Models	H. Sun et.al.	2410.19542	null
2024-10-25	Detection of Human and Machine-Authored Fake News in Urdu	Muhammad Zain Ali et.al.	2410.19517	link
2024-10-25	SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models	Jahyun Koo et.al.	2410.19503	null
2024-10-25	Introducing MAPO: Momentum-Aided Gradient Descent Prompt Optimization	Anthony Cui et.al.	2410.19499	null
2024-10-24	Unbounded: A Generative Infinite Game of Character Life Simulation	Jialu Li et.al.	2410.18975	null
2024-10-24	Deep Insights into Cognitive Decline: A Survey of Leveraging Non-Intrusive Modalities with Deep Learning Techniques	David Ortiz-Perez et.al.	2410.18972	null
2024-10-24	ConceptDrift: Uncovering Biases through the Lens of Foundational Models	Cristian Daniel Păduraru et.al.	2410.18970	null
2024-10-24	Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms	Zhangheng Li et.al.	2410.18967	null
2024-10-24	Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions	Yujuan Fu et.al.	2410.18966	null
2024-10-24	On the Crucial Role of Initialization for Matrix Factorization	Bingcong Li et.al.	2410.18965	null
2024-10-24	OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning	Xiaoqiang Wang et.al.	2410.18963	null
2024-10-24	Context is Key: A Benchmark for Forecasting with Essential Textual Information	Andrew Robert Williams et.al.	2410.18959	link
2024-10-24	Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code	Jipeng Zhang et.al.	2410.18957	null
2024-10-24	BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning	Yujuan Velvin Fu et.al.	2410.18955	null
2024-10-24	Dynamic Vocabulary Pruning in Early-Exit LLMs	Jort Vincenti et.al.	2410.18952	link
2024-10-24	SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models	Zonghao Ying et.al.	2410.18927	null
2024-10-24	From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems	A M Muntasir Rahman et.al.	2410.18921	null
2024-10-25	A Survey on Speech Large Language Models	Jing Peng et.al.	2410.18908	null
2024-10-24	PRISM: A Methodology for Auditing Biases in Large Language Models	Leif Azzopardi et.al.	2410.18906	link
2024-10-24	LLMs for Extremely Low-Resource Finno-Ugric Languages	Taido Purason et.al.	2410.18902	null
2024-10-24	Creating and Repairing Robot Programs in Open-World Domains	Claire Schlesinger et.al.	2410.18893	null
2024-10-24	Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks	Graziano A. Manduzio et.al.	2410.18890	null
2024-10-24	Are LLMs Better than Reported? Detecting Label Errors and Mitigating Their Effect on Model Performance	Omer Nahum et.al.	2410.18889	null
2024-10-24	Provably Robust Watermarks for Open-Source Language Models	Miranda Christ et.al.	2410.18861	null
2024-10-23	TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts	Yuxuan Xie et.al.	2410.18071	null
2024-10-23	CLEAR: Character Unlearning in Textual and Visual Modalities	Alexey Dontsov et.al.	2410.18057	null
2024-10-23	LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering	Qingfei Zhao et.al.	2410.18050	link
2024-10-23	Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases	Anna Glazkova et.al.	2410.18040	null
2024-10-23	MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning	Jingfan Zhang et.al.	2410.18035	null
2024-10-23	GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration	Xin Li et.al.	2410.18032	link
2024-10-23	MiniFed : Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting	Sungil Seok et.al.	2410.18012	null
2024-10-23	Benchmarking Foundation Models on Exceptional Cases: Dataset Creation and Validation	Suho Kang et.al.	2410.18001	link
2024-10-23	MCUBERT: Memory-Efficient BERT Inference on Commodity Microcontrollers	Zebin Yang et.al.	2410.17957	null
2024-10-23	ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference	Xin He et.al.	2410.17954	null
2024-10-23	SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains	Ran Xu et.al.	2410.17952	null
2024-10-23	Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling	Nirav Bhan et.al.	2410.17950	null
2024-10-23	Toward path-invariant embeddings for local distance source characterization	Lisa Linville et.al.	2410.17937	null
2024-10-23	Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models	He Cao et.al.	2410.17922	link
2024-10-23	Scaling Diffusion Language Models via Adaptation from Autoregressive Models	Shansan Gong et.al.	2410.17891	link
2024-10-23	R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models	Linger Deng et.al.	2410.17885	link
2024-10-23	Lightweight Neural App Control	Filippos Christianos et.al.	2410.17883	null
2024-10-23	AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning	Yehonathan Refael et.al.	2410.17881	null
2024-10-23	Understanding Layer Significance in LLM Alignment	Guangyuan Shi et.al.	2410.17875	null
2024-10-23	DataTales: A Benchmark for Real-World Intelligent Data Narration	Yajing Yang et.al.	2410.17859	link
2024-10-22	PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction	Long Xing et.al.	2410.17247	link
2024-10-22	Towards Reliable Evaluation of Behavior Steering Interventions in LLMs	Itamar Pres et.al.	2410.17245	null
2024-10-22	Frontiers in Intelligent Colonoscopy	Ge-Peng Ji et.al.	2410.17241	link
2024-10-22	Large Language Models Empowered Personalized Web Agents	Hongru Cai et.al.	2410.17236	null
2024-10-22	Automated Spinal MRI Labelling from Reports Using a Large Language Model	Robin Y. Park et.al.	2410.17235	link
2024-10-22	Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy	Benedict Aaron Tjandra et.al.	2410.17234	null
2024-10-22	Few-shot In-Context Preference Learning Using Large Language Models	Chao Yu et.al.	2410.17233	null
2024-10-22	Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods	Tsachi Blau et.al.	2410.17222	null
2024-10-22	MiniPLM: Knowledge Distillation for Pre-Training Language Models	Yuxian Gu et.al.	2410.17215	link
2024-10-22	Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling	Azmine Toushik Wasi et.al.	2410.17210	link
2024-10-22	VoiceBench: Benchmarking LLM-Based Voice Assistants	Yiming Chen et.al.	2410.17196	link
2024-10-23	Non-myopic Generation of Language Model for Reasoning and Planning	Chang Ma et.al.	2410.17195	link
2024-10-22	Remote Timing Attacks on Efficient Language Model Inference	Nicholas Carlini et.al.	2410.17175	null
2024-10-22	From Attention to Activation: Unravelling the Enigmas of Large Language Models	Prannay Kaul et.al.	2410.17174	null
2024-10-22	Self-calibration for Language Model Quantization and Pruning	Miles Williams et.al.	2410.17170	null
2024-10-22	Interchangeable Token Embeddings for Extendable Vocabulary and Alpha-Equivalence	İlker Işık et.al.	2410.17161	null
2024-10-22	Improving Pinterest Search Relevance Using Large Language Models	Han Wang et.al.	2410.17152	null
2024-10-22	Are Visual-Language Models Effective in Action Recognition? A Comparative Study	Mahmoud Ali et.al.	2410.17149	null
2024-10-22	Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation ?	Jirat Chiaranaipanich et.al.	2410.17145	null
2024-10-22	Towards Automated Penetration Testing: Introducing LLM Benchmark, Analysis, and Improvements	Isamu Isozaki et.al.	2410.17141	link
2024-10-21	Reflection-Bench: probing AI intelligence with reflection	Lingyu Li et.al.	2410.16270	link
2024-10-21	SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree	Shuangrui Ding et.al.	2410.16268	link
2024-10-21	xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs	Michael S. Ryoo et.al.	2410.16267	null
2024-10-22	Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance	Zhangwei Gao et.al.	2410.16261	link
2024-10-21	Elucidating the design space of language models for image generation	Xuantong Liu et.al.	2410.16257	link
2024-10-21	CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution	Maosong Cao et.al.	2410.16256	link
2024-10-21	Can Knowledge Editing Really Correct Hallucinations?	Baixiang Huang et.al.	2410.16251	link
2024-10-21	Analyzing Context Contributions in LLM-based Machine Translation	Emmanouil Zaranis et.al.	2410.16246	null
2024-10-21	IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems	Yihuan Mao et.al.	2410.16237	null
2024-10-21	LLaVA-KD: A Framework of Distilling Multimodal Large Language Models	Yuxuan Cai et.al.	2410.16236	link
2024-10-21	ToW: Thoughts of Words Improve Reasoning in Large Language Models	Zhikun Xu et.al.	2410.16235	link
2024-10-21	Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping	Ryan Li et.al.	2410.16232	null
2024-10-21	Building A Coding Assistant via the Retrieval-Augmented Language Model	Xinze Li et.al.	2410.16229	link
2024-10-21	A Realistic Threat Model for Large Language Model Jailbreaks	Valentyn Boreiko et.al.	2410.16222	link
2024-10-21	Pre-training Distillation for Large Language Models: A Design Space Exploration	Hao Peng et.al.	2410.16215	null
2024-10-21	Comprehensive benchmarking of large language models for RNA secondary structure prediction	L. I. Zablocki et.al.	2410.16212	link
2024-10-21	CoT-TL: Low-Resource Temporal Knowledge Representation of Planning Instructions Using Chain-of-Thought Reasoning	Kumar Manas et.al.	2410.16207	null
2024-10-21	Improve Vision Language Model Chain-of-thought Reasoning	Ruohong Zhang et.al.	2410.16198	link
2024-10-22	LASER: Script Execution by Autonomous Agents for On-demand Traffic Simulation	Hao Gao et.al.	2410.16197	link
2024-10-21	Contamination Report for Multilingual Benchmarks	Sanchit Ahuja et.al.	2410.16186	null
2024-10-18	Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts	German Gritsai et.al.	2410.14677	link
2024-10-18	SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment	Qin Liu et.al.	2410.14676	null
2024-10-18	Enhancing Large Language Models' Situated Faithfulness to External Contexts	Yukun Huang et.al.	2410.14675	link
2024-10-18	Decomposing The Dark Matter of Sparse Autoencoders	Joshua Engels et.al.	2410.14670	link
2024-10-18	NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples	Baiqi Li et.al.	2410.14669	null
2024-10-18	MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps	Xiongtao Zhou et.al.	2410.14668	link
2024-10-18	A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning	Shengjie Sun et.al.	2410.14660	null
2024-10-18	Bridging the Training-Inference Gap in LLMs by Leveraging Self-Generated Tokens	Zhepeng Cen et.al.	2410.14655	null
2024-10-18	EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search	Oliver Sieberling et.al.	2410.14649	link
2024-10-18	Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs	Runchu Tian et.al.	2410.14641	link
2024-10-18	GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings	Raghuveer Thirukovalluru et.al.	2410.14635	link
2024-10-18	Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning	Yuxiang Lu et.al.	2410.14633	null
2024-10-18	On the Regularization of Learnable Embeddings for Time Series Processing	Luca Butera et.al.	2410.14630	null
2024-10-18	CELI: Controller-Embedded Language Model Interactions	Jan-Samuel Wagner et.al.	2410.14627	null
2024-10-18	DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search	Simon Lupart et.al.	2410.14609	null
2024-10-18	Teaching Models to Balance Resisting and Accepting Persuasion	Elias Stengel-Eskin et.al.	2410.14596	link
2024-10-18	Neuro-Symbolic Traders: Assessing the Wisdom of AI Crowds in Markets	Namid R. Stillman et.al.	2410.14587	null
2024-10-18	Do LLMs estimate uncertainty well in instruction-following?	Juyeon Heo et.al.	2410.14582	null
2024-10-18	Large Language Models Are Overparameterized Text Encoders	Thennal D K et.al.	2410.14578	null
2024-10-18	MomentumSMoE: Integrating Momentum into Sparse Mixture of Experts	Rachel S. Y. Teo et.al.	2410.14574	link
2024-10-17	Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens	Lijie Fan et.al.	2410.13863	null
2024-10-17	PUMA: Empowering Unified MLLM with Multi-granular Visual Generation	Rongyao Fang et.al.	2410.13861	link
2024-10-17	VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding	Runsen Xu et.al.	2410.13860	link
2024-10-17	$γ-$ MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models	Yaxin Luo et.al.	2410.13859	null
2024-10-17	How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs	Guhao Feng et.al.	2410.13857	null
2024-10-17	Can MLLMs Understand the Deep Implication Behind Chinese Images?	Chenhao Zhang et.al.	2410.13854	link
2024-10-17	Retrospective Learning from Interactions	Zizhao Chen et.al.	2410.13852	null
2024-10-17	Differentiable Robot Rendering	Ruoshi Liu et.al.	2410.13851	null
2024-10-17	SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction	Xuan Zhang et.al.	2410.13846	link
2024-10-17	A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models	Qiaoyu Tang et.al.	2410.13841	null
2024-10-17	Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs	Tianyu Guo et.al.	2410.13835	link
2024-10-17	A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement	Hui Yuan et.al.	2410.13828	link
2024-10-17	Unearthing Skill-Level Insights for Understanding Trade-Offs of Foundation Models	Mazda Moayeri et.al.	2410.13826	null
2024-10-17	AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents	Ke Yang et.al.	2410.13825	null
2024-10-18	Harnessing Webpage UIs for Text-Rich Visual Understanding	Junpeng Liu et.al.	2410.13824	null
2024-10-17	Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning	Xiaodan Xing et.al.	2410.13823	link
2024-10-17	Steering Your Generalists: Improving Robotic Foundation Models via Value Guidance	Mitsuhiko Nakamoto et.al.	2410.13816	null
2024-10-17	De-mark: Watermark Removal in Large Language Models	Ruibo Chen et.al.	2410.13808	null
2024-10-17	A Watermark for Order-Agnostic Language Models	Ruibo Chen et.al.	2410.13805	null
2024-10-18	BenTo: Benchmark Task Reduction with In-Context Transferability	Hongyu Zhao et.al.	2410.13804	link
2024-10-16	Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models	Ce Zhang et.al.	2410.12790	link
2024-10-16	Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception	Jihao Zhao et.al.	2410.12788	link
2024-10-16	In-Context Learning Enables Robot Action Prediction in LLMs	Yida Yin et.al.	2410.12782	null
2024-10-16	Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information	Yingya Li et.al.	2410.12774	null
2024-10-16	Harmon: Whole-Body Motion Generation of Humanoid Robots from Language Descriptions	Zhenyu Jiang et.al.	2410.12773	null
2024-10-16	Towards Zero-Shot Camera Trap Image Categorization	Jiří Vyskočil et.al.	2410.12769	null
2024-10-16	The Non-Local Model Merging Problem: Permutation Symmetries and Variance Collapse	Ekansh Sharma et.al.	2410.12766	null
2024-10-16	StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples	Ajay Patel et.al.	2410.12757	null
2024-10-17	CREAM: Consistency Regularized Self-Rewarding Language Models	Zhaoyang Wang et.al.	2410.12735	null
2024-10-16	WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation	João Matos et.al.	2410.12722	link
2024-10-16	FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression	Zhenheng Tang et.al.	2410.12707	null
2024-10-16	WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines	Genta Indra Winata et.al.	2410.12705	link
2024-10-16	Sarcasm Detection in a Less-Resourced Language	Lazar Đoković et.al.	2410.12704	link
2024-10-16	Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization	Xingqi Wang et.al.	2410.12700	link
2024-10-16	VividMed: Vision Language Model with Versatile Visual Grounding for Medicine	Lingxiao Luo et.al.	2410.12694	link
2024-10-16	Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2	Mohamad Abdi et.al.	2410.12686	null
2024-10-16	3DIS: Depth-Driven Decoupled Instance Synthesis for Text-to-Image Generation	Dewei Zhou et.al.	2410.12669	link
2024-10-16	Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models	Shicheng Xu et.al.	2410.12662	null
2024-10-16	Evaluating Morphological Compositional Generalization in Large Language Models	Mete Ismayilzada et.al.	2410.12656	null
2024-10-16	Beyond Speech and More: Investigating the Emergent Ability of Speech Foundation Models for Classifying Physiological Time-Series Signals	Orchid Chetia Phukan et.al.	2410.12645	null
2024-10-15	GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation	Fei Tang et.al.	2410.11841	link
2024-10-15	A Hitchhiker's Guide to Scaling Law Estimation	Leshem Choshen et.al.	2410.11840	link
2024-10-15	MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding	Yue Cao et.al.	2410.11829	link
2024-10-15	Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws	Yiding Jiang et.al.	2410.11820	link
2024-10-15	Improving Long-Text Alignment for Text-to-Image Diffusion Models	Luping Liu et.al.	2410.11817	link
2024-10-15	SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing	Zhiyuan Zhang et.al.	2410.11815	null
2024-10-15	NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models	Han Han et.al.	2410.11805	link
2024-10-15	FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting	Zhe Li et.al.	2410.11802	null
2024-10-15	Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability	Tsz Ting Chung et.al.	2410.11786	null
2024-10-15	Latent BKI: Open-Dictionary Continuous Mapping in Visual-Language Latent Spaces with Quantifiable Uncertainty	Joey Wilson et.al.	2410.11783	link
2024-10-15	G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks	Guibin Zhang et.al.	2410.11782	null
2024-10-15	Language Models Encode Numbers Using Digit Representations in Base 10	Amit Arnold Levy et.al.	2410.11781	link
2024-10-15	MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation	Chenxi Wang et.al.	2410.11779	link
2024-10-15	Time-Series Foundation Model for Value-at-Risk	Anubha Goel et.al.	2410.11773	link
2024-10-15	Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models	Kai Yao et.al.	2410.11772	link
2024-10-15	SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding	Ying Chen et.al.	2410.11761	null
2024-10-15	Latent Action Pretraining from Videos	Seonghyeon Ye et.al.	2410.11758	null
2024-10-15	Personas with Attitudes: Controlling LLMs for Diverse Data Annotation	Leon Fröhling et.al.	2410.11745	link
2024-10-15	DySpec: Faster Speculative Decoding with Dynamic Token Tree Structure	Yunfan Xiong et.al.	2410.11744	null
2024-10-15	Light-Weight Fault Tolerant Attention for Large Language Model Training	Yuhang Liang et.al.	2410.11720	null
2024-10-14	DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads	Guangxuan Xiao et.al.	2410.10819	link
2024-10-14	Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free	Ziyue Li et.al.	2410.10814	link
2024-10-14	LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory	Di Wu et.al.	2410.10813	link
2024-10-14	Local and Global Decoding in Text Generation	Daniel Gareev et.al.	2410.10810	link
2024-10-14	Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning	Aakanksha et.al.	2410.10801	null
2024-10-14	Towards Foundation Models for 3D Vision: How Close Are We?	Yiming Zuo et.al.	2410.10799	link
2024-10-15	MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling	Jian Yang et.al.	2410.10798	null
2024-10-14	Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance	Sachin Goyal et.al.	2410.10796	link
2024-10-15	LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content	Nimrod Shabtay et.al.	2410.10783	link
2024-10-14	When Attention Sink Emerges in Language Models: An Empirical View	Xiangming Gu et.al.	2410.10781	link
2024-10-14	Focused ReAct: Improving ReAct through Reiterate and Early Stop	Shuoqiu Li et.al.	2410.10779	null
2024-10-14	AFlow: Automating Agentic Workflow Generation	Jiayi Zhang et.al.	2410.10762	link
2024-10-14	Denial-of-Service Poisoning Attacks against Large Language Models	Kuofeng Gao et.al.	2410.10760	link
2024-10-14	SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization	Akrit Mudvari et.al.	2410.10759	null
2024-10-14	Use Random Selection for Now: Investigation of Few-Shot Selection Strategies in LLM-based Text Augmentation for Classification	Jan Cegin et.al.	2410.10756	link
2024-10-14	NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models	Yanbiao Ji et.al.	2410.10743	null
2024-10-14	SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing	Pengrui Quan et.al.	2410.10741	link
2024-10-14	Balancing Continuous Pre-Training and Instruction Fine-Tuning: Optimizing Instruction-Following in LLMs	Ishan Jindal et.al.	2410.10739	null
2024-10-14	Embedding Self-Correction as an Inherent Ability in Large Language Models for Enhanced Mathematical Reasoning	Kuofeng Gao et.al.	2410.10735	null
2024-10-14	Towards LLM-guided Efficient and Interpretable Multi-linear Tensor Network Rank Selection	Giorgos Iacovides et.al.	2410.10728	null
2024-10-11	Unraveling and Mitigating Safety Alignment Degradation of Vision-Language Models	Qin Liu et.al.	2410.09047	null
2024-10-11	AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation	Zijun Wang et.al.	2410.09040	link
2024-10-11	Semi-Supervised Learning of Noisy Mixture of Experts Models	Oh-Ran Kwon et.al.	2410.09039	null
2024-10-11	SimpleStrat: Diversifying Language Model Generation with Stratification	Justin Wong et.al.	2410.09038	null
2024-10-11	Mentor-KD: Making Small Language Models Better Multi-step Reasoners	Hojae Lee et.al.	2410.09037	link
2024-10-11	PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents	Xiangyu Yin et.al.	2410.09034	link
2024-10-11	MedMobile: A mobile-sized language model with expert-level clinical capabilities	Krithik Vishwanath et.al.	2410.09019	link
2024-10-11	Parameter-Efficient Fine-Tuning of State Space Models	Kevin Galim et.al.	2410.09016	link
2024-10-11	The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals	Xiaofeng Wu et.al.	2410.09013	null
2024-10-11	Software Engineering and Foundation Models: Insights from Industry Blogs Using a Jury of Foundation Models	Hao Li et.al.	2410.09012	link
2024-10-11	SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights	Ling Yang et.al.	2410.09008	link
2024-10-11	From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts	Zhuohao Jerry Zhang et.al.	2410.09006	null
2024-10-11	DA-Ada: Learning Domain-Aware Adapter for Domain Adaptive Object Detection	Haochen Li et.al.	2410.09004	null
2024-10-11	Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference	Grace Proebsting et.al.	2410.08996	null
2024-10-11	The structure of the token space for large language models	Michael Robinson et.al.	2410.08993	null
2024-10-11	Science is Exploration: Computational Frontiers for Conceptual Metaphor Theory	Rebecca M. M. Hicke et.al.	2410.08991	link
2024-10-11	SubZero: Random Subspace Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning	Ziming Yu et.al.	2410.08989	link
2024-10-11	Towards Trustworthy Knowledge Graph Reasoning: An Uncertainty Aware Perspective	Bo Ni et.al.	2410.08985	null
2024-10-11	NoVo: Norm Voting off Hallucinations with Attention Heads in Large Language Models	Zheng Yi Ho et.al.	2410.08970	null
2024-10-11	Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements	Jingyu Zhang et.al.	2410.08968	null
2024-10-10	DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models	Xiaoxiao He et.al.	2410.08207	null
2024-10-10	Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training	Gen Luo et.al.	2410.08202	null
2024-10-10	Adam Exploits $\ell_\infty$ -geometry of Loss Landscape via Coordinate-wise Adaptivity	Shuo Xie et.al.	2410.08198	link
2024-10-10	From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions	Changle Qu et.al.	2410.08197	link
2024-10-10	MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code	Zimu Lu et.al.	2410.08196	link
2024-10-10	Features are fate: a theory of transfer learning in high-dimensional regression	Javan Tahir et.al.	2410.08194	null
2024-10-10	GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment	Yuancheng Xu et.al.	2410.08193	null
2024-10-10	MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models	Wenbo Hu et.al.	2410.08182	null
2024-10-10	Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models	Qingni Wang et.al.	2410.08174	null
2024-10-10	On the Evaluation of Generative Robotic Simulations	Feng Chen et.al.	2410.08172	null
2024-10-10	Visual Scratchpads: Enabling Global Reasoning in Vision	Aryo Lotfi et.al.	2410.08165	null
2024-10-10	Agent S: An Open Agentic Framework that Uses Computers Like a Human	Saaket Agashe et.al.	2410.08164	link
2024-10-10	The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading	Keren Gruteke Klein et.al.	2410.08162	link
2024-10-10	DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation	Jiatao Gu et.al.	2410.08159	null
2024-10-10	Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning	Amrith Setlur et.al.	2410.08146	null
2024-10-10	Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs	Xiaoyuan Liu et.al.	2410.08145	link
2024-10-10	DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory	Yutong Wang et.al.	2410.08143	link
2024-10-10	Steering Masked Discrete Diffusion Models via Discrete Denoising Posterior Prediction	Jarrid Rector-Brooks et.al.	2410.08134	null
2024-10-10	Think Beyond Size: Dynamic Prompting for More Effective Reasoning	Kamesh R et.al.	2410.08130	null
2024-10-10	Mars: Situated Inductive Reasoning in an Open-World Environment	Xiaojuan Tang et.al.	2410.08126	null
2024-10-09	MM-Ego: Towards Building Egocentric Multimodal LLMs	Hanrong Ye et.al.	2410.07177	null
2024-10-09	Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models	Fei Wang et.al.	2410.07176	null
2024-10-09	Do better language models have crisper vision?	Jona Ruthardt et.al.	2410.07173	null
2024-10-09	One Initialization to Rule them All: Fine-tuning via Explained Variance Adaptation	Fabian Paischer et.al.	2410.07170	link
2024-10-09	Sylber: Syllabic Embedding Representation of Speech from Raw Audio	Cheol Jun Cho et.al.	2410.07168	link
2024-10-09	Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate	Qidong Huang et.al.	2410.07167	link
2024-10-09	Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making	Manling Li et.al.	2410.07166	link
2024-10-09	Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning	Chongyu Fan et.al.	2410.07163	link
2024-10-09	Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis	Bohan Zeng et.al.	2410.07155	link
2024-10-09	Towards Interpreting Visual Information Processing in Vision-Language Models	Clement Neo et.al.	2410.07149	link
2024-10-09	Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling	Yingfa Chen et.al.	2410.07145	null
2024-10-09	Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates	Xiaosen Zheng et.al.	2410.07137	link
2024-10-10	EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models	Rui Zhao et.al.	2410.07133	link
2024-10-09	Mental Disorders Detection in the Era of Large Language Models	Gleb Kuzmin et.al.	2410.07129	null
2024-10-09	Exploring the Readiness of Prominent Small Language Models for the Democratization of Financial Literacy	Tagore Rao Kosireddy et.al.	2410.07118	link
2024-10-09	Personalized Visual Instruction Tuning	Renjie Pi et.al.	2410.07113	link
2024-10-09	VHELM: A Holistic Evaluation of Vision Language Models	Tony Lee et.al.	2410.07112	link
2024-10-09	I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy	Gian Maria Campedelli et.al.	2410.07109	link
2024-10-09	Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context	Sangwon Yu et.al.	2410.07103	null
2024-10-09	MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering	Jun Shern Chan et.al.	2410.07095	link
2024-10-07	Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia	Mohammad Fahes et.al.	2410.05270	link
2024-10-07	Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models	Fei Wang et.al.	2410.05269	null
2024-10-07	PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs	Mengzhao Chen et.al.	2410.05265	link
2024-10-07	TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles	Qingchen Yu et.al.	2410.05262	link
2024-10-07	TextHawk2: A Large Vision-Language Model Excels in Bilingual OCR and Grounding with 16x Fewer Tokens	Ya-Qi Yu et.al.	2410.05261	null
2024-10-07	Differential Transformer	Tianzhu Ye et.al.	2410.05258	link
2024-10-07	GLEE: A Unified Framework and Benchmark for Language-based Economic Environments	Eilam Shapira et.al.	2410.05254	link
2024-10-07	Causal Micro-Narratives	Mourad Heddaya et.al.	2410.05252	null
2024-10-07	SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe	Yuxin Xiao et.al.	2410.05248	null
2024-10-07	Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents	Boyu Gou et.al.	2410.05243	link
2024-10-08	TuneVLSeg: Prompt Tuning Benchmark for Vision-Language Segmentation Models	Rabin Adhikari et.al.	2410.05239	link
2024-10-07	GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models	Iman Mirzadeh et.al.	2410.05229	null
2024-10-07	Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates	Avanika Narayan et.al.	2410.05224	null
2024-10-07	Precise Model Benchmarking with Only a Few Observations	Riccardo Fogliato et.al.	2410.05222	null
2024-10-07	Density estimation with LLMs: a geometric investigation of in-context learning trajectories	Toni J. B. Liu et.al.	2410.05218	null
2024-10-07	Organizing Unstructured Image Collections using Natural Language	Mingxuan Liu et.al.	2410.05217	null
2024-10-07	Preserving Multi-Modal Capabilities of Pre-trained VLMs for Improving Vision-Linguistic Compositionality	Youngtaek Oh et.al.	2410.05210	link
2024-10-07	RevisEval: Improving LLM-as-a-Judge via Response-Adapted References	Qiyuan Zhang et.al.	2410.05193	null
2024-10-07	Understanding Warmup-Stable-Decay Learning Rates: A River Valley Loss Landscape Perspective	Kaiyue Wen et.al.	2410.05192	null
2024-10-07	LADEV: A Language-Driven Testing and Evaluation Platform for Vision-Language-Action Models in Robotic Manipulation	Zhijie Wang et.al.	2410.05191	null
2024-10-04	Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models	Zhuochun Li et.al.	2410.03663	null
2024-10-04	Unraveling Cross-Modality Knowledge Conflict in Large Vision-Language Models	Tinghui Zhu et.al.	2410.03659	link
2024-10-04	RAFT: Realistic Attacks to Fool Text Detectors	James Wang et.al.	2410.03658	link
2024-10-04	Aligning LLMs with Individual Preferences via Interaction	Shujin Wu et.al.	2410.03642	link
2024-10-04	Conditional Enzyme Generation Using Protein Language Models with Adapters	Jason Yang et.al.	2410.03634	null
2024-10-04	Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation	Jie Xiao et.al.	2410.03613	null
2024-10-04	TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation	Jonathan Cook et.al.	2410.03608	null
2024-10-04	LeLaN: Learning A Language-Conditioned Navigation Policy from In-the-Wild Videos	Noriaki Hirose et.al.	2410.03603	null
2024-10-04	Efficiently Identifying Watermarked Segments in Mixed-Source Texts	Xuandong Zhao et.al.	2410.03600	null
2024-10-04	Understanding Reasoning in Chain-of-Thought from the Hopfieldian View	Lijie Hu et.al.	2410.03595	null
2024-10-04	Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models	Xin Zou et.al.	2410.03577	link
2024-10-04	Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs)	Abrar Rahman et.al.	2410.03568	null
2024-10-04	Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding	Wei Wu et.al.	2410.03553	null
2024-10-04	Re-examining Sexism and Misogyny Classification with Annotator Attitudes	Aiqi Jiang et.al.	2410.03543	null
2024-10-04	No Need to Talk: Asynchronous Mixture of Language Models	Anastasiia Filippova et.al.	2410.03529	null
2024-10-04	Steering Large Language Models between Code Execution and Textual Reasoning	Yongchao Chen et.al.	2410.03524	null
2024-10-04	A Probabilistic Perspective on Unlearning and Alignment for Large Language Models	Yan Scholten et.al.	2410.03523	null
2024-10-04	CliMedBench: A Large-Scale Chinese Benchmark for Evaluating Medical Large Language Models in Clinical Scenarios	Zetian Ouyang et.al.	2410.03502	link
2024-10-04	FedStein: Enhancing Multi-Domain Federated Learning Through James-Stein Estimator	Sunny Gupta et.al.	2410.03499	link
2024-10-04	Towards Reproducible LLM Evaluation: Quantifying Uncertainty in LLM Benchmark Scores	Robert E. Blackwell et.al.	2410.03492	null
2024-10-03	Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations	Nick Jiang et.al.	2410.02762	link
2024-10-03	FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models	Zhipei Xu et.al.	2410.02761	link
2024-10-03	Erasing Conceptual Knowledge from Language Models	Rohit Gandikota et.al.	2410.02760	link
2024-10-03	Loong: Generating Minute-level Long Videos with Autoregressive Language Models	Yuqing Wang et.al.	2410.02757	null
2024-10-03	SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost	Jifan Zhang et.al.	2410.02755	null
2024-10-03	Training Language Models on Synthetic Edit Sequences Improves Code Synthesis	Ulyana Piterbarg et.al.	2410.02749	link
2024-10-03	CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation	Han He et.al.	2410.02748	link
2024-10-03	Contrastive Localized Language-Image Pre-Training	Hong-You Chen et.al.	2410.02746	null
2024-10-03	Neutral residues: revisiting adapters for model extension	Franck Signe Talla et.al.	2410.02744	null
2024-10-03	MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions	Yekun Chai et.al.	2410.02743	link
2024-10-03	Grounding Large Language Models In Embodied Environment With Imperfect World Models	Haolan Liu et.al.	2410.02742	null
2024-10-03	Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization	Lei Xu et.al.	2410.02741	link
2024-10-03	Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models	Zhengfeng Lai et.al.	2410.02740	null
2024-10-04	Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge	Jiayi Ye et.al.	2410.02736	null
2024-10-03	DivScene: Benchmarking LVLMs for Object Navigation with Diverse Scenes and Objects	Zhaowei Wang et.al.	2410.02730	link
2024-10-03	Unified Multi-Modal Interleaved Document Representation for Information Retrieval	Jaewoo Lee et.al.	2410.02729	null
2024-10-03	Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation	Rohin Manvi et.al.	2410.02725	null
2024-10-03	Large Language Models as Markov Chains	Oussama Zekri et.al.	2410.02724	null
2024-10-03	Domain-Specific Retrieval-Augmented Generation Using Vector Stores, Knowledge Graphs, and Tensor Factorization	Ryan C. Barron et.al.	2410.02721	null
2024-10-03	UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation	Zixuan Li et.al.	2410.02719	null
2024-10-02	Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads	Yuxiang Huang et.al.	2410.01805	link
2024-10-02	Efficient $1$ -bit tensor approximations	Alex W. Neal Riasanovsky et.al.	2410.01799	null
2024-10-02	Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models	Joseph Lee et.al.	2410.01795	link
2024-10-02	When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1	R. Thomas McCoy et.al.	2410.01792	null
2024-10-02	Investigating on RLHF methodology	Alexey Kutalev et.al.	2410.01789	null
2024-10-02	OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models	Heng Yang et.al.	2410.01784	link
2024-10-02	Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models	Shayekh Bin Islam et.al.	2410.01782	link
2024-10-03	Quantifying Generalization Complexity for Large Language Models	Zhenting Qi et.al.	2410.01769	link
2024-10-02	Integrating Protein Sequence and Expression Level to Analysis Molecular Characterization of Breast Cancer Subtypes	Hossein Sholehrasa et.al.	2410.01755	null
2024-10-03	Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks	Mengzhao Jia et.al.	2410.01744	link
2024-10-02	VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models	Kailai Feng et.al.	2410.01738	link
2024-10-02	Visual Perception in Text Strings	Qi Jia et.al.	2410.01733	link
2024-10-02	Automated Knowledge Concept Annotation and Question Representation Learning for Knowledge Tracing	Yilmazcan Ozyurt et.al.	2410.01727	link
2024-10-02	Auto-Demo Prompting: Leveraging Generated Outputs as Demonstrations for Enhanced Batch Prompting	Longyu Feng et.al.	2410.01724	null
2024-10-02	Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective	Zeyu Gan et.al.	2410.01720	link
2024-10-02	Examining the Role of Relationship Alignment in Large Language Models	Kristen M. Altenburger et.al.	2410.01708	null
2024-10-02	Interpretable Contrastive Monte Carlo Tree Search Reasoning	Zitian Gao et.al.	2410.01707	link
2024-10-02	An Exploration of Self-Supervised Mutual Information Alignment for Multi-Task Settings	Soham Govande et.al.	2410.01704	link
2024-10-02	CreDes: Causal Reasoning Enhancement and Dual-End Searching for Solving Long-Range Reasoning Problems using LLMs	Kangsheng Wang et.al.	2410.01696	null
2024-10-02	U-shaped and Inverted-U Scaling behind Emergent Abilities of Large Language Models	Tung-Yu Wu et.al.	2410.01692	null
2024-09-30	MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning	Haotian Zhang et.al.	2409.20566	null
2024-09-30	LaMMA-P: Generalizable Multi-Agent Long-Horizon Task Allocation and Planning with LM-Driven PDDL Planner	Xiaopan Zhang et.al.	2409.20560	null
2024-09-30	Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos	Md Mohaiminul Islam et.al.	2409.20557	null
2024-09-30	UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models	Qiaojun Yu et.al.	2409.20551	null
2024-09-30	LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation	Ziyao Zhang et.al.	2409.20550	null
2024-09-30	Robi Butler: Remote Multimodal Interactions with Household Robot Assistant	Anxing Xiao et.al.	2409.20548	null
2024-09-30	Uncertainty-Informed Screening for Safer Solvents Used in the Synthesis of Perovskite via Language Models	Arpan Mukherjee et.al.	2409.20512	null
2024-09-30	COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models	Divyanshu Daiya et.al.	2409.20502	null
2024-09-30	A Weakly Supervised Data Labeling Framework for Machine Lexical Normalization in Vietnamese Social Media	Dung Ha Nguyen et.al.	2409.20467	null
2024-09-30	Robot Navigation Using Physically Grounded Vision-Language Models in Outdoor Environments	Mohamed Elnoor et.al.	2409.20445	null
2024-10-01	Instance-adaptive Zero-shot Chain-of-Thought Prompting	Xiaosong Yuan et.al.	2409.20441	null
2024-09-30	HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding	Fan Yuan et.al.	2409.20429	link
2024-09-30	World to Code: Multi-modal Data Generation via Self-Instructed Compositional Captioning and Filtering	Jiacong Wang et.al.	2409.20424	link
2024-09-30	Anti-stereotypical Predictive Text Suggestions Do Not Reliably Yield Anti-stereotypical Writing	Connor Baumler et.al.	2409.20390	null
2024-09-30	Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation	Shan Chen et.al.	2409.20385	null
2024-09-30	Word-wise intonation model for cross-language TTS systems	Tomilov A. A. et.al.	2409.20374	null
2024-09-30	The Perfect Blend: Redefining RLHF with Mixture of Judges	Tengyu Xu et.al.	2409.20370	null
2024-09-30	VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs	Ruotong Liao et.al.	2409.20365	link
2024-09-30	Efficient Driving Behavior Narration and Reasoning on Edge Device Using Large Language Models	Yizhou Huang et.al.	2409.20364	null
2024-09-30	Rotated Runtime Smooth: Training-Free Activation Smoother for accurate INT4 inference	Ke Yi et.al.	2409.20361	null
2024-09-27	Exploring Token Pruning in Vision State Space Models	Zheng Zhan et.al.	2409.18962	null
2024-09-27	LML: Language Model Learning a Dataset for Data-Augmented Prediction	Praneeth Vadlapati et.al.	2409.18957	link
2024-09-27	Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models	Jiaming Li et.al.	2409.18943	link
2024-09-27	From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding	Heqing Zou et.al.	2409.18938	link
2024-09-27	Social Media Bot Policies: Evaluating Passive and Active Enforcement	Kristina Radivojevic et.al.	2409.18931	null
2024-09-27	AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow	Huizi Yu et.al.	2409.18924	null
2024-09-27	Soft Measures for Extracting Causal Collective Intelligence	Maryam Berijanian et.al.	2409.18911	link
2024-09-27	Improving Visual Object Tracking through Visual Prompting	Shih-Fang Chen et.al.	2409.18901	link
2024-09-27	IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation	Fan Lin et.al.	2409.18892	link
2024-09-27	Suicide Phenotyping from Clinical Notes in Safety-Net Psychiatric Hospital Using Multi-Label Classification with Pre-Trained Language Models	Zehan Li et.al.	2409.18878	null
2024-09-27	Predicting and analyzing memorization within fine-tuned Large Language Models	Jérémie Dentan et.al.	2409.18858	null
2024-09-27	Mitigating Selection Bias with Node Pruning and Auxiliary Options	Hyeong Kyu Choi et.al.	2409.18857	null
2024-09-27	LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis	Hamed Babaei Giglou et.al.	2409.18812	link
2024-09-27	Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs	Yanyuan Qiao et.al.	2409.18794	null
2024-09-27	A Survey on the Honesty of Large Language Models	Siheng Li et.al.	2409.18786	link
2024-09-27	Enhancing Explainability in Multimodal Large Language Models Using Ontological Context	Jihen Amara et.al.	2409.18753	null
2024-09-27	OpenObject-NAV: Open-Vocabulary Object-Oriented Navigation Based on Dynamic Carrier-Relationship Scene Graph	Yujie Tang et.al.	2409.18743	null
2024-09-27	Scalable Cross-Entropy Loss for Sequential Recommendations with Large Item Catalogs	Gleb Mezentsev et.al.	2409.18721	link
2024-09-27	Read Over the Lines: Attacking LLMs and Toxicity Detection Systems with ASCII Art to Mask Profanity	Sergey Berezin et.al.	2409.18708	link
2024-09-27	Beyond Single-Audio: Advancing Multi-Audio Processing in Audio Large Language Models	Yiming Chen et.al.	2409.18680	link
2024-09-26	EgoLM: Multi-Modal Language Model of Egocentric Motions	Fangzhou Hong et.al.	2409.18127	null
2024-09-26	Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction	Jing He et.al.	2409.18124	null
2024-09-26	Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography	Yuexi Du et.al.	2409.18119	null
2024-09-26	E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding	Ye Liu et.al.	2409.18111	link
2024-09-26	Open-World Evaluation for Retrieving Diverse Perspectives	Hung-Ting Chen et.al.	2409.18110	null
2024-09-26	MALPOLON: A Framework for Deep Species Distribution Modeling	Theo Larcher et.al.	2409.18102	link
2024-09-26	SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation	Xin Li et.al.	2409.18082	null
2024-09-26	Infer Human's Intentions Before Following Natural Language Instructions	Yanming Wan et.al.	2409.18073	link
2024-09-26	Infering Alt-text For UI Icons With Large Language Models During App Development	Sabrina Haque et.al.	2409.18060	null
2024-09-26	DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving	Dingrui Wang et.al.	2409.18053	link
2024-09-26	EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions	Kai Chen et.al.	2409.18042	null
2024-09-26	Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective	Yotam Wolf et.al.	2409.18028	null
2024-09-26	An Adversarial Perspective on Machine Unlearning for AI Safety	Jakub Łucki et.al.	2409.18025	link
2024-09-26	DARE: Diverse Visual Question Answering with Robustness Evaluation	Hannah Sterz et.al.	2409.18023	null
2024-09-26	Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles	Lewei He et.al.	2409.18014	null
2024-09-26	Control Industrial Automation System with Large Language Models	Yuchen Xia et.al.	2409.18009	link
2024-09-26	Multilingual Evaluation of Long Context Retrieval and Reasoning	Ameeta Agrawal et.al.	2409.18006	link
2024-09-26	Enhancing Tourism Recommender Systems for Sustainable City Trips Using Retrieval-Augmented Generation	Ashmi Banerjee et.al.	2409.18003	null
2024-09-26	Extracting Affect Aggregates from Longitudinal Social Media Data with Temporal Adapters for Large Language Models	Georg Ahnert et.al.	2409.17990	link
2024-09-26	LLM4Brain: Training a Large Language Model for Brain Video Understanding	Ruizhe Zheng et.al.	2409.17987	null
2024-09-25	Attention Prompting on Image for Large Vision-Language Models	Runpeng Yu et.al.	2409.17143	link
2024-09-25	FineZip : Pushing the Limits of Large Language Models for Practical Lossless Text Compression	Fazal Mittu et.al.	2409.17141	link
2024-09-25	Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents	Junting Lu et.al.	2409.17140	null
2024-09-25	Blox-Net: Generative Design-for-Robot-Assembly Using VLM Supervision, Physics Simulation, and a Robot with Reset	Andrew Goldberg et.al.	2409.17126	null
2024-09-25	Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale	Fan Zhou et.al.	2409.17115	link
2024-09-25	Unveiling Ontological Commitment in Multi-Modal Foundation Models	Mert Keser et.al.	2409.17109	null
2024-09-25	Accumulator-Aware Post-Training Quantization	Ian Colbert et.al.	2409.17092	null
2024-09-25	Can Vision Language Models Learn from Visual Demonstrations of Ambiguous Spatial Reasoning?	Bowen Zhao et.al.	2409.17080	link
2024-09-25	VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models	Yifei Liu et.al.	2409.17066	link
2024-09-25	Benchmarking Domain Generalization Algorithms in Computational Pathology	Neda Zamanitajeddin et.al.	2409.17063	link
2024-09-25	Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia	Azmul Asmar Irfan et.al.	2409.17054	null
2024-09-25	GeoBiked: A Dataset with Geometric Features and Automated Labeling Techniques to Enable Deep Generative Models in Engineering Design	Phillip Mueller et.al.	2409.17045	null
2024-09-25	How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not	Francesco Verdini et.al.	2409.17044	null
2024-09-25	Counterfactual Token Generation in Large Language Models	Ivi Chatzi et.al.	2409.17027	link
2024-09-25	LLM-CARD: Towards a Description and Landscape of Large Language Models	Shengwei Tian et.al.	2409.17011	link
2024-09-25	Models Can and Should Embrace the Communicative Nature of Human-Generated Math	Sasha Boguraev et.al.	2409.17005	null
2024-09-26	INT-FlashAttention: Enabling Flash Attention for INT8 Quantization	Shimao Chen et.al.	2409.16997	link
2024-09-25	Harnessing Diversity for Important Data Selection in Pretraining Large Language Models	Chi Zhang et.al.	2409.16986	null
2024-09-25	AXCEL: Automated eXplainable Consistency Evaluation using LLMs	P Aditya Sreekar et.al.	2409.16984	null
2024-09-25	Decoding Large-Language Models: A Systematic Overview of Socio-Technical Impacts, Constraints, and Emerging Questions	Zeyneb N. Kaya et.al.	2409.16974	null
2024-09-24	Semantic Refocused Tuning for Open-Vocabulary Panoptic Segmentation	Yong Xien Chng et.al.	2409.16278	null
2024-09-24	LLM Echo Chamber: personalized and automated disinformation	Tony Ma et.al.	2409.16241	link
2024-09-24	EuroLLM: Multilingual Language Models for Europe	Pedro Henrique Martins et.al.	2409.16235	null
2024-09-24	Fine-Tuning is Fine, if Calibrated	Zheda Mai et.al.	2409.16223	link
2024-09-24	Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models	Omar Mussa et.al.	2409.16220	link
2024-09-24	LLMCount: Enhancing Stationary mmWave Detection with Multimodal-LLM	Boyan Li et.al.	2409.16209	null
2024-09-25	CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data	Qian-Wen Zhang et.al.	2409.16202	link
2024-09-24	Leveraging Estimated Transferability Over Human Intuition for Model Selection in Text Ranking	Jun Bai et.al.	2409.16198	link
2024-09-24	HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models	Haoran Que et.al.	2409.16191	link
2024-09-24	Expert-level vision-language foundation model for real-world radiology and comprehensive evaluation	Xiaohong Liu et.al.	2409.16183	null
2024-09-24	SDFit: 3D Object Pose and Shape by Fitting a Morphable SDF to a Single Image	Dimitrije Antić et.al.	2409.16178	null
2024-09-24	Cyber Knowledge Completion Using Large Language Models	Braden K Webb et.al.	2409.16176	null
2024-09-24	Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering	Ziyu Zhao et.al.	2409.16167	null
2024-09-24	EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges	Talor Abramovich et.al.	2409.16165	link
2024-09-24	ComiCap: A VLMs pipeline for dense captioning of Comic Panels	Emanuele Vivoli et.al.	2409.16159	link
2024-09-24	Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework	Lu Chen et.al.	2409.16146	link
2024-09-24	Evaluation of state-of-the-art ASR Models in Child-Adult Interactions	Aditya Ashvin et.al.	2409.16135	null
2024-09-24	MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents	Ming Zhu et.al.	2409.16120	link
2024-09-25	Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration	Pin-Jui Ku et.al.	2409.16117	link
2024-09-24	Exploring Hint Generation Approaches in Open-Domain Question Answering	Jamshid Mozafari et.al.	2409.16096	link
2024-09-20	Gender Representation and Bias in Indian Civil Service Mock Interviews	Somonnoy Banerjee et.al.	2409.12194	null
2024-09-18	Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution	Peng Wang et.al.	2409.12191	link
2024-09-18	To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning	Zayne Sprague et.al.	2409.12183	link
2024-09-23	A Controlled Study on Long Context Extension and Generalization in LLMs	Yi Lu et.al.	2409.12181	link
2024-09-18	Finetuning Language Models to Emit Linguistic Expressions of Uncertainty	Arslan Chaudhry et.al.	2409.12180	null
2024-09-18	Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference	Najmeh Forouzandehmehr et.al.	2409.12150	null
2024-09-18	MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning	Justin Chih-Yao Chen et.al.	2409.12147	link
2024-09-18	MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion	Kalakonda Sai Shashank et.al.	2409.12140	null
2024-09-24	Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models	Sijing Chen et.al.	2409.12139	null
2024-09-18	GRIN: GRadient-INformed MoE	Liyuan Liu et.al.	2409.12136	null
2024-09-18	Linguini: A benchmark for language-agnostic linguistic reasoning	Eduardo Sánchez et.al.	2409.12126	link
2024-09-18	Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement	An Yang et.al.	2409.12122	null
2024-09-18	Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference	Edresson Casanova et.al.	2409.12117	null
2024-09-18	Measuring Human and AI Values based on Generative Psychometrics with Large Language Models	Haoran Ye et.al.	2409.12106	link
2024-09-19	Skill matching at scale: freelancer-project alignment for efficient multilingual candidate retrieval	Warren Jouanneau et.al.	2409.12097	null
2024-09-19	The Impact of Element Ordering on LM Agent Performance	Wayne Chi et.al.	2409.12089	link
2024-09-18	Dual-Layer Training and Decoding of Large Language Model with Simultaneously Thinking and Speaking	Ningyuan Xi et.al.	2409.12059	null
2024-09-19	Using Large Language Models to Generate Clinical Trial Tables and Figures	Yumeng Yang et.al.	2409.12046	null
2024-09-18	All-in-one foundational models learning across quantum chemical levels	Yuxinxin Chen et.al.	2409.12015	link
2024-09-18	Mixture of Prompt Learning for Vision Language Models	Yu Du et.al.	2409.12011	null
2024-09-17	AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs	Basel Mousi et.al.	2409.11404	null
2024-09-17	NVLM: Open Frontier-Class Multimodal LLMs	Wenliang Dai et.al.	2409.11402	null
2024-09-17	Says Who? Effective Zero-Shot Annotation of Focalization	Rebecca M. M. Hicke et.al.	2409.11390	null
2024-09-17	Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement	Simon Yu et.al.	2409.11378	link
2024-09-17	Towards Time Series Reasoning with LLMs	Winnie Chow et.al.	2409.11376	null
2024-09-17	Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification	Fatema-E- Jannat et.al.	2409.11375	null
2024-09-17	Learning Spatially-Aware Language and Audio Embedding	Bhavika Devnani et.al.	2409.11369	null
2024-09-17	CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration	Jiahui Gao et.al.	2409.11365	null
2024-09-17	CORE-Bench: Fostering the Credibility of Published Research Through a Computational Reproducibility Agent Benchmark	Zachary S. Siegel et.al.	2409.11363	link
2024-09-17	AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances	Dhruv Agarwal et.al.	2409.11360	null
2024-09-17	THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models	Mengfei Liang et.al.	2409.11353	link
2024-09-17	LPT++: Efficient Training on Mixture of Long-tailed Experts	Bowen Dong et.al.	2409.11323	null
2024-09-17	SOAP: Improving and Stabilizing Shampoo using Adam	Nikhil Vyas et.al.	2409.11321	link
2024-09-17	Beyond LoRA: Exploring Efficient Fine-Tuning Techniques for Time Series Foundational Models	Divij Gupta et.al.	2409.11302	null
2024-09-17	Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5	Marcel Lamott et.al.	2409.11282	null
2024-09-17	P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task	Weiye Xu et.al.	2409.11279	null
2024-09-17	Hackphyr: A Local Fine-Tuned LLM Agent for Network Security Environments	Maria Rigaki et.al.	2409.11276	null
2024-09-17	Task Arithmetic for Language Expansion in Speech Translation	Yao-Fei Cheng et.al.	2409.11274	null
2024-09-17	LOLA -- An Open-Source Massively Multilingual Large Language Model	Nikit Srivastava et.al.	2409.11272	link
2024-09-17	Bio-Inspired Mamba: Temporal Locality and Bioplausible Learning in Selective State Space Models	Jiahao Qin et.al.	2409.11263	null
2024-09-16	RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval	Di Liu et.al.	2409.10516	link
2024-09-16	Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models	Momoko Shiraishi et.al.	2409.10506	null
2024-09-16	DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction	John Wu et.al.	2409.10504	null
2024-09-16	Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles	Kulin Shah et.al.	2409.10502	link
2024-09-16	Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models	Shaznin Sultana et.al.	2409.10490	null
2024-09-16	Do Pre-trained Vision-Language Models Encode Object States?	Kaleb Newman et.al.	2409.10488	null
2024-09-16	XLM for Autonomous Driving Systems: A Comprehensive Review	Sonda Fourati et.al.	2409.10484	null
2024-09-16	Schrodinger's Memory: Large Language Models	Wei Wang et.al.	2409.10482	null
2024-09-16	Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face	Adekunle Ajibode et.al.	2409.10472	null
2024-09-16	LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning	Jicong Ao et.al.	2409.10444	link
2024-09-16	CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera	Jingpei Lu et.al.	2409.10441	null
2024-09-16	HiFi-CS: Towards Open Vocabulary Visual Grounding For Robotic Grasping Using Vision-Language Models	Vineet Bhat et.al.	2409.10419	null
2024-09-16	A Large-Scale Privacy Assessment of Android Third-Party SDKs	Mark Huasong Meng et.al.	2409.10411	null
2024-09-16	A Knowledge-Enhanced Disease Diagnosis Method Based on Prompt Learning and BERT Integration	Zhang Zheng et.al.	2409.10403	null
2024-09-17	Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot	Bhuvan Sachdeva et.al.	2409.10354	null
2024-09-16	Large Language Model Enhanced Hard Sample Identification for Denoising Recommendation	Tianrui Song et.al.	2409.10343	null
2024-09-16	The 20 questions game to distinguish large language models	Gurvan Richardeau et.al.	2409.10338	null
2024-09-16	MGSA: Multi-granularity Graph Structure Attention for Knowledge Graph-to-Text Generation	Shanshan Wang et.al.	2409.10294	null
2024-09-16	ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework	Jiahao Yuan et.al.	2409.10289	link
2024-09-16	ComplexCodeEval: A Benchmark for Evaluating Large Code Models on More Complex Code	Jia Feng et.al.	2409.10280	link
2024-09-13	Agents in Software Engineering: Survey, Landscape, and Vision	Yanxian Huang et.al.	2409.09030	link
2024-09-13	Contri(e)ve: Context + Retrieve for Scholarly Question Answering	Kanchan Shivashankar et.al.	2409.09010	null
2024-09-13	Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance	Lucio La Cava et.al.	2409.08963	null
2024-09-13	Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions	Zahra Ashktorab et.al.	2409.08937	null
2024-09-13	SynSUM -- Synthetic Benchmark with Structured and Unstructured Medical Records	Paloma Rabaey et.al.	2409.08936	link
2024-09-13	LLM-based Weak Supervision Framework for Query Intent Classification in Video Search	Farnoosh Javadi et.al.	2409.08931	null
2024-09-13	Affective Computing Has Changed: The Foundation Model Disruption	Björn Schuller et.al.	2409.08907	null
2024-09-13	AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models	Yifei Yao et.al.	2409.08904	link
2024-09-13	A Market for Lemons? Strategic Directions for a Vigilant Application of Artificial Intelligence in Entrepreneurship Research	Martin Obschonka et.al.	2409.08890	null
2024-09-13	Visual Language Tracking with Multi-modal Interaction: A Robust Benchmark	Xuchen Li et.al.	2409.08887	null
2024-09-13	Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies	Zhiqiang Zhong et.al.	2409.08864	null
2024-09-13	FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition	Zhenhua Xu et.al.	2409.08846	null
2024-09-13	AIPO: Improving Training Objective for Iterative Preference Optimization	Yaojie Shen et.al.	2409.08845	link
2024-09-13	A RAG Approach for Generating Competency Questions in Ontology Engineering	Xueli Pan et.al.	2409.08820	null
2024-09-13	Your Weak LLM is Secretly a Strong Teacher for Alignment	Leitian Tao et.al.	2409.08813	null
2024-09-13	Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task	Shao Zhang et.al.	2409.08811	null
2024-09-13	LLaQo: Towards a Query-Based Coach in Expressive Music Performance Assessment	Huan Zhang et.al.	2409.08795	link
2024-09-13	Optimizing Ingredient Substitution Using Large Language Models to Enhance Phytochemical Content in Recipes	Luis Rita et.al.	2409.08792	null
2024-09-13	Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling	Jialu Tang et.al.	2409.08788	null
2024-09-13	Uncertainty and Generalizability in Foundation Models for Earth Observation	Raul Ramos-Pollan et.al.	2409.08744	null
2024-09-12	Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale	Rogerio Bonatti et.al.	2409.08264	link
2024-09-12	OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering	Jiahao Nick Li et.al.	2409.08250	null
2024-09-12	Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources	Alisia Lupidi et.al.	2409.08239	null
2024-09-12	LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems	Hakan T. Otal et.al.	2409.08234	link
2024-09-12	Adaptive Language-Guided Abstraction from Contrastive Explanations	Andi Peng et.al.	2409.08212	null
2024-09-12	ComAlign: Compositional Alignment in Vision-Language Models	Ali Abdollah et.al.	2409.08206	null
2024-09-12	What Makes a Maze Look Like a Maze?	Joy Hsu et.al.	2409.08202	null
2024-09-12	AudioBERT: Audio Knowledge Augmented Language Model	Hyunjong Ok et.al.	2409.08199	link
2024-09-12	Fine-tuning Large Language Models for Entity Matching	Aaron Steiner et.al.	2409.08185	link
2024-09-12	On the Role of Context in Reading Time Prediction	Andreas Opedal et.al.	2409.08160	link
2024-09-12	Faster Speech-LLaMA Inference with Multi-token Prediction	Desh Raj et.al.	2409.08148	null
2024-09-12	LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models	Zhengliang Liu et.al.	2409.08147	null
2024-09-12	Towards a graph-based foundation model for network traffic analysis	Louis Van Langendonck et.al.	2409.08111	null
2024-09-12	The Faetar Benchmark: Speech Recognition in a Very Under-Resourced Language	Michael Ong et.al.	2409.08103	null
2024-09-12	The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal	Huiyuan Xie et.al.	2409.08098	null
2024-09-12	Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks	Benji Peng et.al.	2409.08087	null
2024-09-12	SimMAT: Exploring Transferability from Vision Foundation Models to Any Image Modality	Chenyang Lei et.al.	2409.08083	link
2024-09-12	SoVAR: Building Generalizable Scenarios from Accident Reports for Autonomous Driving Testing	An Guo et.al.	2409.08081	link
2024-09-12	TravelAgent: An AI Assistant for Personalized Travel Planning	Aili Chen et.al.	2409.08069	null
2024-09-12	An Evaluation Framework for Attributed Information Retrieval using Large Language Models	Hanane Djeddal et.al.	2409.08014	link
2024-09-11	"My Grade is Wrong!": A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays	Shengxin Hong et.al.	2409.07453	null
2024-09-11	StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos	Sijie Zhao et.al.	2409.07447	null
2024-09-11	SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories	Ben Bogin et.al.	2409.07440	link
2024-09-11	A Suite for Acoustic Language Model Evaluation	Gallil Maimon et.al.	2409.07437	link
2024-09-11	Synthetic continued pretraining	Zitong Yang et.al.	2409.07431	link
2024-09-11	Agent Workflow Memory	Zora Zhiruo Wang et.al.	2409.07429	link
2024-09-11	CLNX: Bridging Code and Natural Language for C/C++ Vulnerability-Contributing Commits Identification	Zeqing Qin et.al.	2409.07407	null
2024-09-11	AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge	Han Wang et.al.	2409.07394	link
2024-09-11	Awaking the Slides: A Tuning-free and Knowledge-regulated AI Tutoring System via Language Model Coordination	Daniel Zhang-Li et.al.	2409.07372	null
2024-09-11	Demo: SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code	Khiem Ton et.al.	2409.07368	null
2024-09-11	Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation	SeongYeub Chu et.al.	2409.07355	link
2024-09-11	Securing Vision-Language Models with a Robust Encoder Against Jailbreak and Adversarial Attacks	Md Zarif Hossain et.al.	2409.07353	link
2024-09-11	Explanation, Debate, Align: A Weak-to-Strong Framework for Language Model Generalization	Mehrdad Zakershahrak et.al.	2409.07335	null
2024-09-11	Learning to Compress Contexts for Efficient Knowledge-based Visual Question Answering	Weixi Weng et.al.	2409.07331	null
2024-09-11	MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications	Praveen K Kanithi et.al.	2409.07314	null
2024-09-11	Exploring User-level Gradient Inversion with a Diffusion Prior	Zhuohang Li et.al.	2409.07291	null
2024-09-11	STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM	Qijiong Liu et.al.	2409.07276	null
2024-09-11	MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving	Enming Zhang et.al.	2409.07267	link
2024-09-11	Alignment of Diffusion Models: Fundamentals, Challenges, and Future	Buhua Liu et.al.	2409.07253	link
2024-09-11	PiTe: Pixel-Temporal Alignment for Large Video-Language Model	Yang Liu et.al.	2409.07239	link
2024-09-10	Benchmarking Sub-Genre Classification For Mainstage Dance Music	Hongzhi Shu et.al.	2409.06690	null
2024-09-10	E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning	Zihan Liao et.al.	2409.06679	null
2024-09-10	LLaMA-Omni: Seamless Speech Interaction with Large Language Models	Qingkai Fang et.al.	2409.06666	link
2024-09-10	Human Perception of LLM-generated Text Content in Social Media Environments	Kristina Radivojevic et.al.	2409.06653	null
2024-09-10	Optimal Workload Placement on Multi-Instance GPUs	Bekir Turkkan et.al.	2409.06646	null
2024-09-10	EyeCLIP: A visual-language foundation model for multi-modal ophthalmic image analysis	Danli Shi et.al.	2409.06644	null
2024-09-11	Segmenting sea ice floes in close-range optical imagery with active contour and foundation models	Giulio Passerotti et.al.	2409.06641	null
2024-09-10	TeXBLEU: Automatic Metric for Evaluate LaTeX Format	Kyudan Jung et.al.	2409.06639	link
2024-09-10	MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders	Wenyu Zhang et.al.	2409.06635	null
2024-09-10	A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio	Ningyuan Xi et.al.	2409.06624	null
2024-09-10	Exploring Italian sentence embeddings properties through multi-tasking	Vivi Nastase et.al.	2409.06622	link
2024-09-10	Alleviating Hallucinations in Large Language Models with Scepticism Modeling	Yetao Wu et.al.	2409.06601	null
2024-09-10	GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering	Sacha Muller et.al.	2409.06595	link
2024-09-10	Quantifying and Enabling the Interpretability of CLIP-like Models	Avinash Madasu et.al.	2409.06579	null
2024-09-10	Exploring syntactic information in sentence embeddings through multilingual subject-verb agreement	Vivi Nastase et.al.	2409.06567	null
2024-09-10	MAPS: Energy-Reliability Tradeoff Management in Autonomous Vehicles Through LLMs Penetrated Science	Mahdieh Aliazam et.al.	2409.06558	null
2024-09-10	Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games	Juhwan Choi et.al.	2409.06518	link
2024-09-10	Aligning Machine and Human Visual Representations across Abstraction Levels	Lukas Muttenthaler et.al.	2409.06509	null
2024-09-10	Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding	Xiaoyu Liang et.al.	2409.06485	null
2024-09-10	Multimodal Large Language Model Driven Scenario Testing for Autonomous Vehicles	Qiujing Lu et.al.	2409.06450	null
2024-09-09	MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct	Run Luo et.al.	2409.05840	null
2024-09-09	Are Large Language Models a Threat to Programming Platforms? An Exploratory Study	Md Mustakim Billah et.al.	2409.05824	null
2024-09-09	VFA: Vision Frequency Analysis of Foundation Models and Human	Mohammad-Javad Darvishi-Bayazi et.al.	2409.05817	null
2024-09-09	Improving Pretraining Data Using Perplexity Correlations	Tristan Thrush et.al.	2409.05816	null
2024-09-09	Benchmarking Chinese Knowledge Rectification in Large Language Models	Tianhe Lu et.al.	2409.05806	link
2024-09-09	Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models	Emily Cheng et.al.	2409.05771	null
2024-09-09	Model Input Verification of Large Scale Simulations	Rumyana Neykova et.al.	2409.05768	null
2024-09-09	A Novel Idea Generation Tool using a Structured Conversational AI (CAI) System	B. Sankar et.al.	2409.05747	null
2024-09-09	LLMs Will Always Hallucinate, and We Need to Live With This	Sourav Banerjee et.al.	2409.05746	null
2024-09-09	A System and Benchmark for LLM-based Q&A on Heterogeneous Data	Achille Fokoue et.al.	2409.05735	null
2024-09-09	Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach	Meng Zhou et.al.	2409.05732	null
2024-09-09	The Influence of Task and Group Disparities over Users' Attitudes Toward Using Large Language Models for Psychotherapy	Qihang He et.al.	2409.05703	null
2024-09-09	Segmentation by Factorization: Unsupervised Semantic Segmentation for Pathology by Factorizing Foundation Model Features	Jacob Gildenblat et.al.	2409.05697	null
2024-09-09	Zero-shot Outlier Detection via Prior-data Fitted Networks: Model Selection Bygone!	Yuchen Shen et.al.	2409.05672	null
2024-09-09	Revisiting English Winogender Schemas for Consistency, Coverage, and Grammatical Case	Vagrant Gautam et.al.	2409.05653	link
2024-09-10	MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery	Hongjin Qian et.al.	2409.05591	link
2024-09-09	Leveraging Content and Acoustic Representations for Efficient Speech Emotion Recognition	Soumya Dutta et.al.	2409.05566	link
2024-09-09	CauseJudger: Identifying the Cause with LLMs for Abductive Logical Reasoning	Jinwei He et.al.	2409.05559	null
2024-09-09	SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning	Alireza Ghafarollahi et.al.	2409.05556	link
2024-09-09	Harmonic Reasoning in Large Language Models	Anna Kruspe et.al.	2409.05521	null
2024-09-06	VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation	Yecheng Wu et.al.	2409.04429	link
2024-09-06	Exploring Foundation Models for Synthetic Medical Imaging: A Study on Chest X-Rays and Fine-Tuning Techniques	Davide Clode da Silva et.al.	2409.04424	null
2024-09-06	RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs	Jiaxing Wu et.al.	2409.04421	null
2024-09-06	Question-Answering Dense Video Events	Hangyu Qin et.al.	2409.04388	null
2024-09-06	Learning vs Retrieval: The Role of In-Context Examples in Regression with LLMs	Aliakbar Nafar et.al.	2409.04318	link
2024-09-06	An optically accelerated extreme learning machine using hot atomic vapors	Pierre Azam et.al.	2409.04312	null
2024-09-06	Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets	Desiree Heim et.al.	2409.04286	null
2024-09-06	Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models	Yuxiao Huang et.al.	2409.04270	null
2024-09-06	An overview of domain-specific foundation model: key technologies, applications and challenges	Haolong Chen et.al.	2409.04267	null
2024-09-06	UniDet3D: Multi-dataset Indoor 3D Object Detection	Maksim Kolodiazhnyi et.al.	2409.04234	link
2024-09-06	Fast Forwarding Low-Rank Training	Adir Rahamim et.al.	2409.04206	null
2024-09-06	Residual Stream Analysis with Multi-Layer SAEs	Tim Lawson et.al.	2409.04185	link
2024-09-06	GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding	Ziyin Zhang et.al.	2409.04183	null
2024-09-06	Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering	Larissa Pusch et.al.	2409.04181	null
2024-09-06	From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks	Andreas Stephan et.al.	2409.04168	null
2024-09-06	Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation	Luis Mayer et.al.	2409.04164	null
2024-09-06	Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering	Jan Hofmann et.al.	2409.04122	null
2024-09-06	Multi-Programming Language Ensemble for Code Generation in Large Language Model	Tengfei Xue et.al.	2409.04114	link
2024-09-06	Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers	Chenglei Si et.al.	2409.04109	link
2024-09-06	UI-JEPA: Towards Active Perception of User Intent through Onscreen User Activity	Yicheng Fu et.al.	2409.04081	null
2024-09-05	Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding	Yunze Man et.al.	2409.03757	link
2024-09-05	Foundation Model or Finetune? Evaluation of few-shot semantic segmentation for river pollution	Marga Don et.al.	2409.03754	link
2024-09-05	Attention Heads of Large Language Models: A Survey	Zifan Zheng et.al.	2409.03752	link
2024-09-05	LLM-CI: Assessing Contextual Integrity Norms in Language Models	Yan Shvartzshnaider et.al.	2409.03735	null
2024-09-05	Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry	Meena Jagadeesan et.al.	2409.03734	null
2024-09-05	Planning In Natural Language Improves LLM Search For Code Generation	Evan Wang et.al.	2409.03733	link
2024-09-06	RAG based Question-Answering for Contextual Response Prediction System	Sriram Veturi et.al.	2409.03708	null
2024-09-05	LAST: Language Model Aware Speech Tokenization	Arnon Turetzky et.al.	2409.03701	null
2024-09-05	TRACE-cs: Trustworthy Reasoning for Contrastive Explanations in Course Scheduling Problems	Stylianos Loukas Vasileiou et.al.	2409.03671	link
2024-09-05	A Fused Large Language Model for Predicting Startup Success	Abdurahman Maarouf et.al.	2409.03668	null
2024-09-05	The representation landscape of few-shot learning and fine-tuning in large language models	Diego Doimo et.al.	2409.03662	link
2024-09-06	LLM-based multi-agent poetry generation in non-cooperative environments	Ran Zhang et.al.	2409.03659	link
2024-09-05	On the Limited Generalization Capability of the Implicit Reward Model Induced by Direct Preference Optimization	Yong Lin et.al.	2409.03650	null
2024-09-05	Text-Guided Mixup Towards Long-Tailed Image Categorization	Richard Franklin et.al.	2409.03583	link
2024-09-05	FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation	Xi Chen et.al.	2409.03525	null
2024-09-05	Have Large Vision-Language Models Mastered Art History?	Ombretta Strafforello et.al.	2409.03521	null
2024-09-05	Tissue Concepts: supervised foundation models in computational pathology	Till Nicke et.al.	2409.03519	link
2024-09-05	From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents	Jifan Yu et.al.	2409.03512	null
2024-09-05	LLM-based event abstraction and integration for IoT-sourced logs	Mohsen Shirali et.al.	2409.03478	link
2024-09-05	How Much Data is Enough Data? Fine-Tuning Large Language Models for In-House Translation: Performance Evaluation Across Multiple Dataset Sizes	Inacio Vieira et.al.	2409.03454	null
2024-09-04	RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version)	Yao Mu et.al.	2409.02920	null
2024-09-04	Can LVLMs Obtain a Driver's License? A Benchmark Towards Reliable AGI for Autonomous Driving	Yuhang Lu et.al.	2409.02914	null
2024-09-04	Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling	Kaiwen Zheng et.al.	2409.02908	null
2024-09-05	LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA	Jiajie Zhang et.al.	2409.02897	link
2024-09-04	LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture	Xidong Wang et.al.	2409.02889	link
2024-09-04	CanvOI, an Oncology Intelligence Foundation Model: Scaling FLOPS Differently	Jonathan Zalach et.al.	2409.02885	null
2024-09-04	Benchmarking Spurious Bias in Few-Shot Image Classifiers	Guangtao Zheng et.al.	2409.02882	link
2024-09-04	Configurable Foundation Models: Building LLMs from a Modular Perspective	Chaojun Xiao et.al.	2409.02877	null
2024-09-04	Historical German Text Normalization Using Type- and Token-Based Language Modeling	Anton Ehrmanntraut et.al.	2409.02841	null
2024-09-04	Exploring Sentiment Dynamics and Predictive Behaviors in Cryptocurrency Discussions by Few-Shot Learning with Large Language Models	Moein Shahiki Tash et.al.	2409.02836	null
2024-09-04	CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models	Wentao Liu et.al.	2409.02834	link
2024-09-04	ExpLLM: Towards Chain of Thought for Facial Expression Recognition	Xing Lan et.al.	2409.02828	null
2024-09-04	Design Contradictions: Help or Hindrance?	Aron E. Owen et.al.	2409.02823	null
2024-09-04	Language Understanding as a Constraint on Consensus Size in LLM Societies	Giordano De Marzo et.al.	2409.02822	null
2024-09-04	Towards a Unified View of Preference Learning for Large Language Models: A Survey	Bofei Gao et.al.	2409.02795	link
2024-09-05	Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?	Yixuan Tang et.al.	2409.02727	link
2024-09-04	Pre-training data selection for biomedical domain adaptation using journal impact metrics	Mathieu Laï-king et.al.	2409.02725	null
2024-09-04	Alignment-Aware Model Extraction Attacks on Large Language Models	Zi Liang et.al.	2409.02718	link
2024-09-04	Creating a Gen-AI based Track and Trace Assistant MVP (SuperTracy) for PostNL	Mohammad Reshadati et.al.	2409.02711	null
2024-09-04	LLM-Assisted Visual Analytics: Opportunities and Challenges	Maeve Hutchinson et.al.	2409.02691	null
2024-08-30	SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists	Raoyuan Zhao et.al.	2408.17437	link
2024-08-30	DARES: Depth Anything in Robotic Endoscopic Surgery with Self-supervised Vector-LoRA of the Foundation Model	Mona Sheikh Zeinoddin et.al.	2408.17433	link
2024-08-30	Advancing Multi-talker ASR Performance with Large Language Models	Mohan Shi et.al.	2408.17431	null
2024-08-30	CLOCR-C: Context Leveraging OCR Correction with Pre-trained Language Models	Jonathan Bourne et.al.	2408.17428	link
2024-09-03	Open-vocabulary Temporal Action Localization using VLMs	Naoki Wake et.al.	2408.17422	null
2024-08-30	Getting Inspiration for Feature Elicitation: App Store- vs. LLM-based Approach	Jialiang Wei et.al.	2408.17404	link
2024-08-30	EMPOWER: Embodied Multi-role Open-vocabulary Planning with Online Grounding and Execution	Francesco Argenziano et.al.	2408.17379	null
2024-08-30	NDP: Next Distribution Prediction as a More Broad Target	Junhao Ruan et.al.	2408.17377	null
2024-08-30	Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain	Francesca Grasso et.al.	2408.17362	link
2024-08-30	Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage	Md Rafi Ur Rashid et.al.	2408.17354	null
2024-09-02	LSMS: Language-guided Scale-aware MedSegmentor for Medical Image Referring Segmentation	Shuyi Ouyang et.al.	2408.17347	null
2024-08-30	Investigating Neuron Ablation in Attention Heads: The Case for Peak Activation Centering	Nicholas Pochinkov et.al.	2408.17322	link
2024-08-30	Bridging Domain Knowledge and Process Discovery Using Large Language Models	Ali Norouzifar et.al.	2408.17316	link
2024-08-30	Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts	Rhui Dih Lee et.al.	2408.17280	null
2024-08-30	Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach	Tong Nie et.al.	2408.17258	null
2024-08-30	VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters	Mouxiang Chen et.al.	2408.17253	link
2024-08-30	Improving Extraction of Clinical Event Contextual Properties from Electronic Health Records: A Comparative Study	Shubham Agarwal et.al.	2408.17181	null
2024-08-30	Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model	Zhen Ye et.al.	2408.17175	link
2024-08-30	Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning	Xiaoye Qu et.al.	2408.17150	link
2024-08-30	Reasoning AI Performance Degradation in 6G Networks with Large Language Models	Liming Huang et.al.	2408.17097	null
2024-08-29	PromptSmooth: Certifying Robustness of Medical Vision-Language Models via Prompt Learning	Noor Hussein et.al.	2408.16769	link
2024-08-29	How Far Can Cantonese NLP Go? Benchmarking Cantonese Capabilities of Large Language Models	Jiyue Jiang et.al.	2408.16756	link
2024-08-29	Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models	Alec Solway et.al.	2408.16753	null
2024-08-29	A Gradient Analysis Framework for Rewarding Good and Penalizing Bad Examples in Language Models	Yi-Lin Tuan et.al.	2408.16751	null
2024-08-29	Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge	Beidi Dong et.al.	2408.16749	null
2024-08-29	Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models	Jiří Milička et.al.	2408.16740	null
2024-08-29	Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling	Hritik Bansal et.al.	2408.16737	null
2024-08-29	VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation	Shiwei Wu et.al.	2408.16730	null
2024-08-30	Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming	Zhifei Xie et.al.	2408.16725	link
2024-08-29	GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models	Moreno D'Incà et.al.	2408.16700	link
2024-08-29	Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity	Ziniu Li et.al.	2408.16673	null
2024-08-29	Space3D-Bench: Spatial 3D Question Answering Benchmark	Emilia Szymanska et.al.	2408.16662	null
2024-08-29	DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving	Yongjie Fu et.al.	2408.16647	null
2024-08-29	Examination of Code generated by Large Language Models	Robin Beer et.al.	2408.16601	link
2024-08-29	Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies	Zhiyang Qi et.al.	2408.16586	null
2024-08-29	WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling	Shengpeng Ji et.al.	2408.16532	link
2024-08-29	CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues	Rena Gao et.al.	2408.16518	link
2024-08-29	LLMs vs Established Text Augmentation Techniques for Classification: When do the Benefits Outweight the Costs?	Jan Cegin et.al.	2408.16502	null
2024-08-29	CogVLM2: Visual Language Models for Image and Video Understanding	Wenyi Hong et.al.	2408.16500	link
2024-08-29	A Survey on Evaluating Large Language Models in Code Generation Tasks	Liguo Chen et.al.	2408.16498	null
2024-08-28	Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders	Min Shi et.al.	2408.15998	link
2024-08-29	Spatio-Temporal Context Prompting for Zero-Shot Action Detection	Wei-Jhe Huang et.al.	2408.15996	null
2024-08-28	Perceive-IR: Learning to Perceive Degradation Better for All-in-One Image Restoration	Xu Zhang et.al.	2408.15994	null
2024-08-28	BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems	Wei Wang et.al.	2408.15971	null
2024-08-28	More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding	Yuan Tang et.al.	2408.15966	link
2024-08-28	Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games	Nicholas R. Waytowich et.al.	2408.15950	null
2024-08-28	DeMoBot: Deformable Mobile Manipulation with Vision-based Sub-goal Retrieval	Yuying Zhang et.al.	2408.15919	null
2024-08-28	Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models	Yuncheng Yang et.al.	2408.15915	link
2024-08-28	Decentralized LLM Inference over Edge Networks with Energy Harvesting	Aria Khoshsirat et.al.	2408.15907	null
2024-08-28	LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments	Ruirui Chen et.al.	2408.15903	null
2024-08-28	Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts	Nikolas Gritsch et.al.	2408.15901	null
2024-08-28	Bias in LLMs as Annotators: The Effect of Party Cues on Labelling Decision by Large Language Models	Sebastian Vallejo Vera et.al.	2408.15895	null
2024-08-28	LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation	Fangxun Shu et.al.	2408.15881	link
2024-08-28	Persuasion Games using Large Language Models	Ganesh Prasath Ramani et.al.	2408.15879	null
2024-08-28	Retrieval-Augmented Instruction Tuning for Automated Process Engineering Calculations : A Tool-Chaining Problem-Solving Framework with Attributable Reflection	Sagar Srinivas Sakhinana et.al.	2408.15866	null
2024-08-28	Benchmarking foundation models as feature extractors for weakly-supervised computational pathology	Peter Neidlinger et.al.	2408.15823	null
2024-08-28	Visual Prompt Engineering for Medical Vision Language Models in Radiology	Stefan Denner et.al.	2408.15802	null
2024-08-28	Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization	Léo Hemamou et.al.	2408.15801	null
2024-08-28	Evaluating Named Entity Recognition Using Few-Shot Prompting with Large Language Models	Hédi Zhegidi et.al.	2408.15796	link
2024-08-28	Efficient LLM Scheduling by Learning to Rank	Yichao Fu et.al.	2408.15792	link
2024-08-27	Generative Verifiers: Reward Modeling as Next-Token Prediction	Lunjun Zhang et.al.	2408.15240	null
2024-08-27	The Mamba in the Llama: Distilling and Accelerating Hybrid Models	Junxiong Wang et.al.	2408.15237	link
2024-08-27	Into the Unknown Unknowns: Engaged Human Learning through Participation in Language Model Agent Conversations	Yucheng Jiang et.al.	2408.15232	null
2024-08-27	LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet	Nathaniel Li et.al.	2408.15221	null
2024-08-27	Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks	Shide Zhou et.al.	2408.15207	null
2024-08-27	Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation	Jian Hu et.al.	2408.15205	link
2024-08-27	Can Unconfident LLM Annotations Be Used for Confident Conclusions?	Kristina Gligorić et.al.	2408.15204	link
2024-08-27	Infusing Acoustic Pause Context into Text-Based Dementia Assessment	Franziska Braun et.al.	2408.15188	null
2024-08-27	Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement	Longshen Ou et.al.	2408.15176	null
2024-08-27	X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation	Hanjia Lyu et.al.	2408.15172	null
2024-08-27	Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation	N. E. Kriman et.al.	2408.15171	null
2024-08-27	How transformers learn structured data: insights from hierarchical filtering	Jerome Garnier-Brun et.al.	2408.15138	link
2024-08-27	CLIP-AGIQA: Boosting the Performance of AI-Generated Image Quality Assessment with CLIP	Zhenchen Tang et.al.	2408.15098	null
2024-08-27	Relation Also Knows: Rethinking the Recall and Editing of Factual Associations in Auto-Regressive Transformer Language Models	Xiyu Liu et.al.	2408.15091	null
2024-08-27	BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline	Guosheng Dong et.al.	2408.15079	null
2024-08-27	Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models	Ned Cooper et.al.	2408.15066	null
2024-08-27	The Benefits of Balance: From Information Projections to Variance Reduction	Lang Liu et.al.	2408.15065	null
2024-08-28	DocLayLLM: An Efficient and Effective Multi-modal Extension of Large Language Models for Text-rich Document Understanding	Wenhui Liao et.al.	2408.15045	null
2024-08-28	A Survey of Large Language Models for European Languages	Wazir Ali et.al.	2408.15040	null
2024-08-27	Speech Recognition Transformers: Topological-lingualism Perspective	Shruti Singh et.al.	2408.14991	null
2024-08-26	A Practitioner's Guide to Continual Multimodal Pretraining	Karsten Roth et.al.	2408.14471	link
2024-08-27	Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models	Aradhye Agarwal et.al.	2408.14470	link
2024-08-26	Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos	Qirui Chen et.al.	2408.14469	null
2024-08-26	Explicit Inductive Inference using Large Language Models	Tianyang Liu et.al.	2408.14467	null
2024-08-26	Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study	Liuchang Xu Shuo Zhao et.al.	2408.14438	null
2024-08-26	Social perception of faces in a vision-language model	Carina I. Hausladen et.al.	2408.14435	link
2024-08-26	CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models	Shubham Bharti et.al.	2408.14419	null
2024-08-26	MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues	Kuluhan Binici et.al.	2408.14418	null
2024-08-26	Hyperdimensional Computing Empowered Federated Foundation Model over Wireless Networks for Metaverse	Yahao Ding et.al.	2408.14416	null
2024-08-26	Language-specific Calibration for Pruning Multilingual Language Models	Simon Kurz et.al.	2408.14398	null
2024-08-26	Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning	Sakhinana Sagar Srinivas et.al.	2408.14387	null
2024-08-26	Probing Causality Manipulation of Large Language Models	Chenyang Zhang et.al.	2408.14380	link
2024-08-26	An Embedding is Worth a Thousand Noisy Labels	Francesco Di Salvo et.al.	2408.14358	link
2024-08-26	SWE-bench-java: A GitHub Issue Resolving Benchmark for Java	Daoguang Zan et.al.	2408.14354	link
2024-08-26	Assessing Contamination in Large Language Models: Introducing the LogProber method	Nicolas Yax et.al.	2408.14352	null
2024-08-26	Foundation Models for Music: A Survey	Yinghao Ma et.al.	2408.14340	link
2024-08-26	Claim Verification in the Age of Large Language Models: A Survey	Alphaeus Dmonte et.al.	2408.14317	null
2024-08-26	LLM-3D Print: Large Language Models To Monitor and Control 3D Printing	Yayati Jadhav et.al.	2408.14307	null
2024-08-26	Investigating the Effectiveness of Bayesian Spam Filters in Detecting LLM-modified Spam Mails	Malte Josten et.al.	2408.14293	link
2024-08-26	Predictability and Causality in Spanish and English Natural Language Generation	Andrea Busto-Castiñeira et.al.	2408.14283	null
2024-08-23	MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?	Yi-Fan Zhang et.al.	2408.13257	null
2024-08-23	Domain-specific long text classification from sparse relevant information	Célia D'Cruz et.al.	2408.13253	null
2024-08-23	Foundational Model for Electron Micrograph Analysis: Instruction-Tuning Small-Scale Language-and-Vision Assistant for Enterprise Adoption	Sakhinana Sagar Srinivas et.al.	2408.13248	null
2024-08-23	Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time	Yingyu Liang et.al.	2408.13233	null
2024-08-23	EUR-USD Exchange Rate Forecasting Based on Information Fusion with Large Language Models and Deep Learning Methods	Hongcheng Ding et.al.	2408.13214	null
2024-08-23	DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation	Qiming Zhu et.al.	2408.13204	null
2024-08-23	Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning	Hourui Deng et.al.	2408.13184	null
2024-08-23	IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models	Zhihao Yu et.al.	2408.13073	link
2024-08-23	Guiding IoT-Based Healthcare Alert Systems with Large Language Models	Yulan Gao et.al.	2408.13071	null
2024-08-23	SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks	Kai-Wei Chang et.al.	2408.13040	null
2024-08-23	VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models	Wentao Wu et.al.	2408.13031	link
2024-08-23	In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting	Haowei Du et.al.	2408.13028	null
2024-08-23	A Web-Based Solution for Federated Learning with LLM-Based Automation	Chamith Mawela et.al.	2408.13010	null
2024-08-23	Systematic Evaluation of LLM-as-a-Judge in LLM Alignment Tasks: Explainable Metrics and Diverse Prompt Templates	Hui Wei et.al.	2408.13006	link
2024-08-23	CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution	Ruiyang Xu et.al.	2408.13001	null
2024-08-23	Open Llama2 Model for the Lithuanian Language	Artūras Nakvosas et.al.	2408.12963	null
2024-08-23	Multimodal Contrastive In-Context Learning	Yosuke Miyanishi et.al.	2408.12959	null
2024-08-23	Image Segmentation in Foundation Model Era: A Survey	Tianfei Zhou et.al.	2408.12957	link
2024-08-23	E-code: Mastering Efficient Code Generation through Pretrained Models and Expert Encoder Group	Yue Pan et.al.	2408.12948	null
2024-08-23	Causal-Guided Active Learning for Debiasing Large Language Models	Zhouhao Sun et.al.	2408.12942	link
2024-08-22	Controllable Text Generation for Large Language Models: A Survey	Xun Liang et.al.	2408.12599	link
2024-08-23	Non-Homophilic Graph Pre-Training and Prompt Learning	Xingtong Yu et.al.	2408.12594	link
2024-08-22	RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment	Xiaohan Wang et.al.	2408.12579	null
2024-08-22	MuMA-ToM: Multi-modal Multi-Agent Theory of Mind	Haojun Shi et.al.	2408.12574	link
2024-08-22	Jamba-1.5: Hybrid Transformer-Mamba Models at Scale	Jamba Team et.al.	2408.12570	null
2024-08-22	ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation	Lujia Zhong et.al.	2408.12561	link
2024-08-22	Towards Evaluating and Building Versatile Large Language Models for Medicine	Chaoyi Wu et.al.	2408.12547	link
2024-08-22	Show-o: One Single Transformer to Unify Multimodal Understanding and Generation	Jinheng Xie et.al.	2408.12528	null
2024-08-22	MEDCO: Medical Education Copilots Based on A Multi-Agent Framework	Hao Wei et.al.	2408.12496	null
2024-08-22	GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models	Kunsheng Tang et.al.	2408.12494	link
2024-08-23	Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese	Khang T. Doan et.al.	2408.12480	null
2024-08-22	Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition	Bozheng Li et.al.	2408.12475	null
2024-08-22	DLCRec: A Novel Approach for Managing Diversity in LLM-Based Recommender Systems	Jiaju Chen et.al.	2408.12470	link
2024-08-22	Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning	Mushui Liu et.al.	2408.12469	null
2024-08-22	Enhancing Multi-hop Reasoning through Knowledge Erasure in Large Language Model Editing	Mengqi Zhang et.al.	2408.12456	null
2024-08-22	Positional Description for Numerical Normalization	Deepanshu Gupta et.al.	2408.12430	null
2024-08-22	FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing	Jue Wang et.al.	2408.12429	link
2024-08-22	Enhanced Infield Agriculture with Interpretable Machine Learning Approaches for Crop Classification	Sudi Murindanyi et.al.	2408.12426	null
2024-08-22	Unlearning Trojans in Large Language Models: A Comparison Between Natural Language and Source Code	Mahdi Kazemi et.al.	2408.12416	null
2024-08-22	Generalized SAM: Efficient Fine-Tuning of SAM for Variable Input Image Sizes	Sota Kato et.al.	2408.12406	link
2024-08-21	Great Memory, Shallow Reasoning: Limits of $k$ NN-LMs	Shangyi Geng et.al.	2408.11815	link
2024-08-21	SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs	Yuanyang Yin et.al.	2408.11813	null
2024-08-21	EmbodiedSAM: Online Segment Any 3D Thing in Real Time	Xiuwei Xu et.al.	2408.11811	null
2024-08-21	Approaching Deep Learning through the Spectral Dynamics of Weights	David Yunis et.al.	2408.11804	link
2024-08-21	Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models	Yuzhou Huang et.al.	2408.11801	null
2024-08-21	PermitQA: A Benchmark for Retrieval Augmented Generation in Wind Siting and Permitting domain	Rounak Meyur et.al.	2408.11800	null
2024-08-21	Practical token pruning for foundation models in few-shot conversational virtual assistant systems	Haode Qi et.al.	2408.11799	null
2024-08-21	EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model	Feipeng Ma et.al.	2408.11795	null
2024-08-21	Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design	Nathaniel H. Park et.al.	2408.11793	null
2024-08-21	Critique-out-Loud Reward Models	Zachary Ankner et.al.	2408.11791	link
2024-08-21	DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework	Zhifei Xie et.al.	2408.11788	null
2024-08-21	Personality Alignment of Large Language Models	Minjun Zhu et.al.	2408.11779	link
2024-08-21	Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards	Omar Erak et.al.	2408.11775	link
2024-08-21	Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks	Yiyi Chen et.al.	2408.11749	link
2024-08-21	DH-Bench: Probing Depth and Height Perception of Large Visual-Language Models	Shehreen Azad et.al.	2408.11748	link
2024-08-21	Open-Ended 3D Point Cloud Instance Segmentation	Phuc D. A. Nguyen et.al.	2408.11747	null
2024-08-21	Mixed Sparsity Training: Achieving 4 $\times$ FLOP Reduction for Transformer Pretraining	Pihe Hu et.al.	2408.11746	null
2024-08-21	FocusLLM: Scaling LLM's Context by Parallel Decoding	Zhenyu Li et.al.	2408.11745	link
2024-08-21	MARLIN: Mixed-Precision Auto-Regressive Parallel Inference on Large Language Models	Elias Frantar et.al.	2408.11743	link
2024-08-21	CluMo: Cluster-based Modality Fusion Prompt for Continual Learning in Visual Question Answering	Yuliang Cai et.al.	2408.11742	link
2024-08-20	Prompt-Guided Image-Adaptive Neural Implicit Lookup Tables for Interpretable Image Enhancement	Satoshi Kosugi et.al.	2408.11055	link
2024-08-20	Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks	Nathaniel Pinckney et.al.	2408.11053	link
2024-08-20	FLAME: Learning to Navigate with Multimodal LLM in Urban Environments	Yunzhe Xu et.al.	2408.11051	link
2024-08-20	MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding	Jian Chen et.al.	2408.11049	link
2024-08-20	Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders	Yuan Xin et.al.	2408.11046	null
2024-08-20	Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research	Sreyoshi Bhaduri et.al.	2408.11043	null
2024-08-20	Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model	Chunting Zhou et.al.	2408.11039	null
2024-08-20	Scaling Law with Learning Rate Annealing	Howe Tissue et.al.	2408.11029	null
2024-08-20	Athena: Safe Autonomous Agents with Verbal Contrastive Learning	Tanmana Sadhu et.al.	2408.11021	null
2024-08-20	While GitHub Copilot Excels at Coding, Does It Ensure Responsible Output?	Wen Cheng et.al.	2408.11006	link
2024-08-20	SenPa-MAE: Sensor Parameter Aware Masked Autoencoder for Multi-Satellite Self-Supervised Pretraining	Jonathan Prexl et.al.	2408.11000	link
2024-08-20	CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models	Michael Reinisch et.al.	2408.10995	null
2024-08-20	Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models	Yuyan Chen et.al.	2408.10947	null
2024-08-20	Large Language Model Driven Recommendation	Anton Korikov et.al.	2408.10946	null
2024-08-20	HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models in Resource-Constrained Environments	Kazi Hasan Ibn Arif et.al.	2408.10945	link
2024-08-20	SysBench: Can Large Language Models Follow System Messages?	Yanzhao Qin et.al.	2408.10943	link
2024-08-20	Proxona: Leveraging LLM-Driven Personas to Enhance Creators' Understanding of Their Audience	Yoonseo Choi et.al.	2408.10937	null
2024-08-20	LBC: Language-Based-Classifier for Out-Of-Variable Generalization	Kangjun Noh et.al.	2408.10923	link
2024-08-21	BEYOND DIALOGUE: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model	Yeyong Yu et.al.	2408.10903	link
2024-08-20	Soda-Eval: Open-Domain Dialogue Evaluation in the age of LLMs	John Mendonça et.al.	2408.10902	link
2024-08-19	SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP	Yusuke Hirota et.al.	2408.10202	null
2024-08-19	Demystifying the Communication Characteristics for Distributed Transformer Models	Quentin Anthony et.al.	2408.10197	null
2024-08-19	Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models	Aviv Bick et.al.	2408.10189	null
2024-08-19	LongVILA: Scaling Long-Context Visual Language Models for Long Videos	Fuzhao Xue et.al.	2408.10188	link
2024-08-19	SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models	Anke Tang et.al.	2408.10174	link
2024-08-19	Customizing Language Models with Instance-wise LoRA for Sequential Recommendation	Xiaoyu Kong et.al.	2408.10159	link
2024-08-19	Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models	Amey Hengle et.al.	2408.10151	link
2024-08-19	In-Context Learning with Representations: Contextual Generalization of Trained Transformers	Tong Yang et.al.	2408.10147	null
2024-08-19	Instruction Finetuning for Leaderboard Generation from Empirical AI Research	Salomon Kabongo et.al.	2408.10141	null
2024-08-19	Rhyme-aware Chinese lyric generator based on GPT	Yixiao Yuan et.al.	2408.10130	null
2024-08-19	Video Object Segmentation via SAM 2: The 4th Solution for LSVOS Challenge VOS Track	Feiyu Pan et.al.	2408.10125	null
2024-08-19	Molecular Graph Representation Learning Integrating Large Language Models with Domain-specific Small Models	Tianyu Zhang et.al.	2408.10124	link
2024-08-19	Geometry Informed Tokenization of Molecules for Language Model Generation	Xiner Li et.al.	2408.10120	null
2024-08-19	GLIMMER: Incorporating Graph and Lexical Features in Unsupervised Multi-Document Summarization	Ran Liu et.al.	2408.10115	link
2024-08-20	PLUTUS: A Well Pre-trained Large Unified Transformer can Unveil Financial Time Series Regularities	Yuanjian Xu et.al.	2408.10111	null
2024-08-19	ARMADA: Attribute-Based Multimodal Data Augmentation	Xiaomeng Jin et.al.	2408.10086	null
2024-08-19	Personalizing Reinforcement Learning from Human Feedback with Variational Preference Learning	Sriyash Poddar et.al.	2408.10075	null
2024-08-19	FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant	Zhengchao Huang et.al.	2408.10072	link
2024-08-19	Privacy Checklist: Privacy Violation Detection Grounding on Contextual Integrity Theory	Haoran Li et.al.	2408.10053	null
2024-08-19	Defense Priorities in the Open-Source AI Debate: A Preliminary Assessment	Masao Dahlgren et.al.	2408.10026	null
2024-08-16	SAM2-UNet: Segment Anything 2 Makes Strong Encoder for Natural and Medical Image Segmentation	Xinyu Xiong et.al.	2408.08870	link
2024-08-16	PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars	Sumanth Prabhu et.al.	2408.08869	null
2024-08-16	A Hassle-free Algorithm for Private Learning in Practice: Don't Use Tree Aggregation, Use BLTs	H. Brendan McMahan et.al.	2408.08868	null
2024-08-16	Visual Agents as Fast and Slow Thinkers	Guangyan Sun et.al.	2408.08862	link
2024-08-16	DPA: Dual Prototypes Alignment for Unsupervised Adaptation of Vision-Language Models	Eman Ali et.al.	2408.08855	null
2024-08-16	GeoTransformer: Enhancing Urban Forecasting with Geospatial Attention Mechanisms	Yuhao Jia et.al.	2408.08852	null
2024-08-16	ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis	Yubao Zhao et.al.	2408.08849	link
2024-08-16	PsychoLex: Unveiling the Psychological Mind of Large Language Models	Mohammad Amin Abbasi et.al.	2408.08848	null
2024-08-16	FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats	Xuanliang Zhang et.al.	2408.08841	link
2024-08-16	EasyRec: Simple yet Effective Language Models for Recommendation	Xubin Ren et.al.	2408.08821	link
2024-08-16	Retrieval-augmented Few-shot Medical Image Segmentation with Foundation Models	Lin Zhao et.al.	2408.08813	null
2024-08-16	Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors	Felipe A. Csaszar et.al.	2408.08811	null
2024-08-16	Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge	Ravi Raju et.al.	2408.08808	null
2024-08-16	CIKMar: A Dual-Encoder Approach to Prompt-Based Reranking in Educational Dialogue Systems	Joanito Agili Lopo et.al.	2408.08805	null
2024-08-16	A Disease-Specific Foundation Model Using Over 100K Fundus Images: Release and Validation for Abnormality and Multi-Disease Classification on Downstream Tasks	Boa Jang et.al.	2408.08790	link
2024-08-16	EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics	Chenwei Wan et.al.	2408.08782	link
2024-08-16	Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions	Chenming Tang et.al.	2408.08780	null
2024-08-16	DAC: Decomposed Automation Correction for Text-to-SQL	Dingzirui Wang et.al.	2408.08779	link
2024-08-16	Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused	Dingwei Chen et.al.	2408.08769	null
2024-08-16	Rethinking Generative Semantic Communication for Multi-User Systems with Multi-Modal LLM	Wanting Yang et.al.	2408.08765	null
2024-08-15	Can Large Language Models Understand Symbolic Graphics Programs?	Zeju Qiu et.al.	2408.08313	null
2024-08-15	ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws	Ruihang Li et.al.	2408.08310	null
2024-08-15	Towards Flexible Visual Relationship Segmentation	Fangrui Zhu et.al.	2408.08305	null
2024-08-15	Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors	Usman Syed et.al.	2408.08302	null
2024-08-15	VLPG-Nav: Object Navigation Using Visual Language Pose Graph and Object Localization Probability Maps	Senthil Hariharan Arul et.al.	2408.08301	null
2024-08-15	HELP: Hierarchical Embeddings-based Log Parsing	Andy Xu et.al.	2408.08300	null
2024-08-15	The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community	Shachar Don-Yehiya et.al.	2408.08291	null
2024-08-15	Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model	Jin Wang et.al.	2408.08282	null
2024-08-15	BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts	Qizhen Zhang et.al.	2408.08274	null
2024-08-15	DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System	Xihong Yang et.al.	2408.08231	null
2024-08-15	RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science	David Farr et.al.	2408.08217	null
2024-08-15	Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models	Javier González et.al.	2408.08210	null
2024-08-15	LLM4DSR: Leveraing Large Language Model for Denoising Sequential Recommendation	Bohao Wang et.al.	2408.08208	null
2024-08-15	Heavy Labels Out! Dataset Distillation with Label Space Lightening	Ruonan Yu et.al.	2408.08201	null
2024-08-15	Scaling Up Natural Language Understanding for Multi-Robots Through the Lens of Hierarchy	Shaojun Xu et.al.	2408.08188	null
2024-08-15	General-purpose Clothes Manipulation with Semantic Keypoints	Yuhong Deng et.al.	2408.08160	null
2024-08-15	EmBARDiment: an Embodied AI Agent for Productivity in XR	Riccardo Bovo et.al.	2408.08158	null
2024-08-15	DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search	Huajian Xin et.al.	2408.08152	link
2024-08-15	P/D-Serve: Serving Disaggregated Large Language Model at Scale	Yibo Jin et.al.	2408.08147	null
2024-08-15	KOALA: Enhancing Speculative Decoding for LLM via Multi-Layer Draft Heads with Adversarial Learning	Kaiqi Zhang et.al.	2408.08146	null
2024-08-14	The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models	Karime Maamari et.al.	2408.07702	null
2024-08-15	Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities	Enneng Yang et.al.	2408.07666	link
2024-08-14	Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models	Yi-Cheng Lin et.al.	2408.07665	link
2024-08-14	Alignment-Enhanced Decoding:Defending via Token-Level Adaptive Refining of Probability Distributions	Quan Liu et.al.	2408.07663	link
2024-08-14	WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs	Weijian Xie et.al.	2408.07611	null
2024-08-14	Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey	Hamza Kheddar et.al.	2408.07583	null
2024-08-15	MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark	Minxuan Zhou et.al.	2408.07543	link
2024-08-15	Usefulness of data flow diagrams and large language models for security threat validation: a registered report	Winnie Bahati Mbaka et.al.	2408.07537	null
2024-08-14	Development of a Multi-Agent Clinical Decision Support System for Korean Triage and Acuity Scale (KTAS)-Based Triage and Treatment Planning in Emergency Departments	Seungjun Han et.al.	2408.07531	null
2024-08-14	Large Language Models Know What Makes Exemplary Contexts	Quanyu Long et.al.	2408.07505	null
2024-08-14	Cross-Platform Video Person ReID: A New Benchmark Dataset and Adaptation Approach	Shizhou Zhang et.al.	2408.07500	link
2024-08-14	QirK: Question Answering via Intermediate Representation on Knowledge Graphs	Jan Luca Scheerer et.al.	2408.07494	null
2024-08-14	Training Overhead Ratio: A Practical Reliability Metric for Large Language Model Training Systems	Ning Lu et.al.	2408.07482	null
2024-08-14	Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization	Yuxin Jiang et.al.	2408.07471	link
2024-08-14	Domain-invariant Representation Learning via Segment Anything Model for Blood Cell Classification	Yongcheng Li et.al.	2408.07467	link
2024-08-14	Large Language Models Prompting With Episodic Memory	Dai Do et.al.	2408.07465	null
2024-08-14	From Brazilian Portuguese to European Portuguese	João Sanches et.al.	2408.07457	null
2024-08-14	Fact or Fiction? Improving Fact Verification with Knowledge Graphs through Simplified Subgraph Retrievals	Tobias A. Opsahl et.al.	2408.07453	link
2024-08-15	BAPLe: Backdoor Attacks on Medical Foundational Models using Prompt Learning	Asif Hanif et.al.	2408.07440	link
2024-08-14	Beyond Inter-Item Relations: Dynamic Adaptive Mixture-of-Experts for LLM-Based Sequential Recommendation	CanYi Liu et.al.	2408.07427	null
2024-08-13	Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents	Kexun Zhang et.al.	2408.07060	null
2024-08-13	LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs	Yushi Bai et.al.	2408.07055	link
2024-08-13	Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models	Chun Jie Chong et.al.	2408.07004	null
2024-08-13	LLMs can Schedule	Henrik Abgaryan et.al.	2408.06993	link
2024-08-13	DyG-Mamba: Continuous State Space Modeling on Dynamic Graphs	Dongyuan Li et.al.	2408.06966	null
2024-08-13	Towards Holistic Disease Risk Prediction using Small Language Models	Liv Björkdahl et.al.	2408.06943	null
2024-08-13	OpenResearcher: Unleashing AI for Accelerated Scientific Research	Yuxiang Zheng et.al.	2408.06941	link
2024-08-13	The advantages of context specific language models: the case of the Erasmian Language Model	João Gonçalves et.al.	2408.06931	link
2024-08-13	Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas	Louis Kwok et.al.	2408.06929	link
2024-08-13	SceneGPT: A Language Model for 3D Scene Understanding	Shivam Chandhok et.al.	2408.06926	null
2024-08-13	Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives	Zhihu Wang et.al.	2408.06904	null
2024-08-13	Leveraging Language Models for Emotion and Behavior Analysis in Education	Kaito Tanaka et.al.	2408.06874	null
2024-08-13	LoRA $^2$ : Multi-Scale Low-Rank Approximations for Fine-Tuning Large Language Models	Jia-Chen Zhang et.al.	2408.06854	null
2024-08-13	Causal Agent based on Large Language Model	Kairong Han et.al.	2408.06849	link
2024-08-13	DracoGPT: Extracting Visualization Design Preferences from Large Language Models	Huichen Will Wang et.al.	2408.06845	null
2024-08-13	How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts	Huichen Will Wang et.al.	2408.06837	null
2024-08-13	Efficient Search for Customized Activation Functions with Gradient Descent	Lukas Strack et.al.	2408.06820	link
2024-08-13	MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty	Yongjin Yang et.al.	2408.06816	null
2024-08-13	HLSPilot: LLM-based High-Level Synthesis	Chenwei Xiong et.al.	2408.06810	link
2024-08-13	Layerwise Recurrent Router for Mixture-of-Experts	Zihan Qiu et.al.	2408.06793	link
2024-08-12	FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection	Yufei Huang et.al.	2408.06333	link
2024-08-12	Animate, or Inanimate, That is the Question for Large Language Models	Leonardo Ranaldi et.al.	2408.06332	null
2024-08-12	Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example	Yanan Chen et.al.	2408.06318	null
2024-08-12	Long-Form Answers to Visual Questions from Blind and Low Vision People	Mina Huh et.al.	2408.06303	null
2024-08-12	The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery	Chris Lu et.al.	2408.06292	link
2024-08-12	MovieSum: An Abstractive Summarization Dataset for Movie Screenplays	Rohit Saxena et.al.	2408.06281	link
2024-08-13	Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation	Jieyong Kim et.al.	2408.06276	link
2024-08-12	FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data	Haoran Sun et.al.	2408.06273	link
2024-08-12	A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution	Sampath Rajapaksha et.al.	2408.06272	null
2024-08-12	Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment	Karel D'Oosterlinck et.al.	2408.06266	link
2024-08-12	Context-aware Visual Storytelling with Visual Prefix Tuning and Contrastive Learning	Yingjin Song et.al.	2408.06259	null
2024-08-12	On Effects of Steering Latent Representation for Large Language Model Unlearning	Dang Huu-Tien et.al.	2408.06223	link
2024-08-12	Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers	Zhenting Qi et.al.	2408.06195	link
2024-08-12	FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework	Lukas Meyer et.al.	2408.06190	link
2024-08-12	Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting	Halley Young et.al.	2408.06186	null
2024-08-12	OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning	Mushui Liu et.al.	2408.06158	link
2024-08-12	LipidBERT: A Lipid Language Model Pre-trained on METiS de novo Lipid Library	Tianhao Yu et.al.	2408.06150	null
2024-08-12	Self-Supervised Learning on MeerKAT Wide-Field Continuum Images	Erica Lastufka et.al.	2408.06147	link
2024-08-12	Med42-v2: A Suite of Clinical LLMs	Clément Christophe et.al.	2408.06142	null
2024-08-12	Utilize Transformers for translating Wikipedia category names	Hoang-Thang Ta et.al.	2408.06124	null
2024-08-10	Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions	Michele Miranda et.al.	2408.05212	link
2024-08-09	VITA: Towards Open-Source Interactive Omni Multimodal LLM	Chaoyou Fu et.al.	2408.05211	link
2024-08-09	Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners	Michael Vaccaro Jr et.al.	2408.05204	null
2024-08-09	TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning	Yujie Feng et.al.	2408.05200	link
2024-08-09	ECG-FM: An Open Electrocardiogram Foundation Model	Kaden McKeen et.al.	2408.05178	link
2024-08-09	Weak-Annotation of HAR Datasets using Vision Foundation Models	Marius Bock et.al.	2408.05169	link
2024-08-09	AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset	Pritam Deka et.al.	2408.05149	null
2024-08-09	A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning	Ye Yuan et.al.	2408.05141	null
2024-08-09	Is ChatGPT a Good Software Librarian? An Exploratory Study on the Use of ChatGPT for Software Library Recommendations	Jasmine Latendresse et.al.	2408.05128	null
2024-08-09	Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media	Petre Breazu et.al.	2408.05126	null
2024-08-09	Sportify: Question Answering with Embedded Visualizations and Personified Narratives for Sports Video	Chunggi Lee et.al.	2408.05123	null
2024-08-09	A Survey of NL2SQL with Large Language Models: Where are we, and where are we going?	Xinyu Liu et.al.	2408.05109	link
2024-08-09	Depth Helps: Improving Pre-trained RGB-based Policy with Depth Information Injection	Xincheng Pang et.al.	2408.05107	null
2024-08-09	How Well Do LLMs Identify Cultural Unity in Diversity?	Jialin Li et.al.	2408.05102	link
2024-08-09	Hyperbolic Learning with Multimodal Large Language Models	Paolo Mandica et.al.	2408.05097	null
2024-08-09	Unlocking Decoding-time Controllability: Gradient-Free Multi-Objective Alignment with Contrastive Prompts	Tingchen Fu et.al.	2408.05094	null
2024-08-09	Order Matters in Hallucination: Reasoning Order as Benchmark and Reflexive Prompting for Large-Language-Models	Zikai Xie et.al.	2408.05093	link
2024-08-09	Generating novel experimental hypotheses from language models: A case study on cross-dative generalization	Kanishka Misra et.al.	2408.05086	link
2024-08-09	RT-Surv: Improving Mortality Prediction After Radiotherapy with Large Language Model Structuring of Large-Scale Unstructured Electronic Health Records	Sangjoon Park et.al.	2408.05074	null
2024-08-09	Examining the Behavior of LLM Architectures Within the Framework of Standardized National Exams in Brazil	Marcelo Sartori Locatelli et.al.	2408.05035	null
2024-08-08	Better Alignment with Instruction Back-and-Forth Translation	Thao Nguyen et.al.	2408.04614	null
2024-08-08	Code-switching in text and speech reveals information-theoretic audience design	Debasmita Bhattacharya et.al.	2408.04596	null
2024-08-09	Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models	Qirui Jiao et.al.	2408.04594	link
2024-08-08	Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness	Xiaojing Fan et.al.	2408.04585	null
2024-08-08	SAM2-Adapter: Evaluating & Adapting Segment Anything 2 in Downstream Tasks: Camouflage, Shadow, Medical Image Segmentation, and More	Tianrun Chen et.al.	2408.04579	null
2024-08-08	SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals	Haoran Zheng et.al.	2408.04575	null
2024-08-08	Learning Fine-Grained Grounded Citations for Attributed Large Language Models	Lei Huang et.al.	2408.04568	link
2024-08-08	Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models	Yupeng Chang et.al.	2408.04556	link
2024-08-08	Depth Any Canopy: Leveraging Depth Foundation Models for Canopy Height Estimation	Daniele Rege Cambrin et.al.	2408.04523	link
2024-08-08	Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models	Fabio Pernisi et.al.	2408.04522	null
2024-08-08	What You Need is What You Get: Theory of Mind for an LLM-Based Code Understanding Assistant	Jonan Richards et.al.	2408.04477	null
2024-08-08	Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate	Yiqun Zhang et.al.	2408.04472	link
2024-08-08	RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents	Zihao Zhu et.al.	2408.04449	link
2024-08-08	Large Language Models for cross-language code clone detection	Micheline Bénédicte Moumoula et.al.	2408.04430	null
2024-08-08	Recognizing Emotion Regulation Strategies from Human Behavior with Large Language Models	Philipp Müller et.al.	2408.04420	null
2024-08-08	Enhancing Robustness of Retrieval-Augmented Language Models with In-Context Learning	Seong-Il Park et.al.	2408.04414	null
2024-08-08	Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers	Moritz Scherer et.al.	2408.04413	null
2024-08-08	Exploring Reasoning Biases in Large Language Models Through Syllogism: Insights from the NeuBAROCO Dataset	Kentaro Ozeki et.al.	2408.04403	link
2024-08-08	Automated Educational Question Generation at Different Bloom's Skill Levels using Large Language Models: Strategies and Evaluation	Nicy Scaria et.al.	2408.04394	link
2024-08-08	Open-domain Implicit Format Control for Large Language Model Generation	Yiqun Yao et.al.	2408.04392	link
2024-08-07	How Well Can Vision Language Models See Image Details?	Chenhui Gou et.al.	2408.03940	null
2024-08-07	SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature	Vinícius Di Oliveira et.al.	2408.03936	null
2024-08-07	CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases	Xiangyan Liu et.al.	2408.03910	link
2024-08-07	Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models	Shachi H Kumar et.al.	2408.03907	null
2024-08-07	Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond	Beomseok Lee et.al.	2408.03900	link
2024-08-07	Simplifying Scholarly Abstracts for Accessible Digital Libraries	Haining Wang et.al.	2408.03899	link
2024-08-07	From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems	Leixian Shen et.al.	2408.03876	null
2024-08-07	PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training	Haoran Xu et.al.	2408.03865	null
2024-08-07	GAIA -- A Large Language Model for Advanced Power Dispatch	Yuheng Cheng et.al.	2408.03847	null
2024-08-07	MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models	Yuchen Dong et.al.	2408.03841	null
2024-08-07	WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models	Prannaya Gupta et.al.	2408.03837	link
2024-08-07	Target Prompting for Information Extraction with Vision Language Model	Dipankar Medhi et.al.	2408.03834	null
2024-08-07	Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning	Simret Araya Gebreegziabher et.al.	2408.03819	null
2024-08-07	Generative Language Models with Retrieval Augmented Generation for Automated Short Answer Scoring	Zifan Wang et.al.	2408.03811	null
2024-08-07	'Finance Wizard' at the FinLLM Challenge Task: Financial Text Summarization	Meisin Lee et.al.	2408.03762	null
2024-08-07	MMSummary: Multimodal Summary Generation for Fetal Ultrasound Video	Xiaoqing Guo et.al.	2408.03761	null
2024-08-07	Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation	Jingjing Xie et.al.	2408.03735	link
2024-08-07	Question Rephrasing for Quantifying Uncertainty in Large Language Models: Applications in Molecular Chemistry Tasks	Zizhang Chen et.al.	2408.03732	null
2024-08-07	A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models	Pengxiang Zhao et.al.	2408.03728	null
2024-08-07	Local Topology Measures of Contextual Language Model Latent Spaces With Applications to Dialogue Term Extraction	Benjamin Matthias Ruppik et.al.	2408.03706	null
2024-08-06	CoverBench: A Challenging Benchmark for Complex Claim Verification	Alon Jacovi et.al.	2408.03325	null
2024-08-06	Segment Anything in Medical Images and Videos: Benchmark and Deployment	Jun Ma et.al.	2408.03322	link
2024-08-06	TextIM: Part-aware Interactive Motion Synthesis from Text	Siyuan Fan et.al.	2408.03302	null
2024-08-06	KaPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models	Ruizhe Zhang et.al.	2408.03297	null
2024-08-06	Biomedical SAM 2: Segment Anything in Biomedical Images and Videos	Zhiling Yan et.al.	2408.03286	link
2024-08-07	StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation	Boxi Cao et.al.	2408.03281	link
2024-08-06	Compress and Compare: Interactively Evaluating Efficiency and Behavior Across ML Model Compression Experiments	Angie Boggust et.al.	2408.03274	null
2024-08-06	Synthesizing Text-to-SQL Data from Weak and Strong LLMs	Jiaxi Yang et.al.	2408.03256	null
2024-08-06	Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons	Yifei Wang et.al.	2408.03247	link
2024-08-06	Making Long-Context Language Models Better Multi-Hop Reasoners	Yanyang Li et.al.	2408.03246	link
2024-08-06	Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi	Pranita Deshmukh et.al.	2408.03172	null
2024-08-06	Conditioning LLMs with Emotion in Neural Machine Translation	Charles Brazier et.al.	2408.03150	null
2024-08-06	Leveraging Entity Information for Cross-Modality Correlation Learning: The Entity-Guided Multimodal Summarization	Yanghai Zhang et.al.	2408.03149	link
2024-08-06	Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations	Leo Donisch et.al.	2408.03130	null
2024-08-06	Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation	Artur Guimarães et.al.	2408.03127	link
2024-08-06	Evaluating the Translation Performance of Large Language Models Based on Euas-20	Yan Huang et.al.	2408.03119	null
2024-08-06	Topic Modeling with Fine-tuning LLMs and Bag of Sentences	Johannes Schneider et.al.	2408.03099	link
2024-08-07	TestART: Improving LLM-based Unit Test via Co-evolution of Automated Generation and Repair Iteration	Siqi Gu et.al.	2408.03095	null
2024-08-06	500xCompressor: Generalized Prompt Compression for Large Language Models	Zongqian Li et.al.	2408.03094	link
2024-08-06	Extend Model Merging from Fine-Tuned to Pre-Trained Large Language Models via Weight Disentanglement	Le Yu et.al.	2408.03092	link
2024-08-05	Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining	Dongyang Liu et.al.	2408.02657	link
2024-08-05	Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models?	Mohammad Bahrami Karkevandi et.al.	2408.02651	null
2024-08-05	Command-line Obfuscation Detection using Small Language Models	Vojtech Outrata et.al.	2408.02637	null
2024-08-05	SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models	Muxi Diao et.al.	2408.02632	null
2024-08-05	Language Model Can Listen While Speaking	Ziyang Ma et.al.	2408.02622	null
2024-08-05	Progressively Selective Label Enhancement for Language Model Alignment	Biao Liu et.al.	2408.02599	null
2024-08-05	Modelling Visual Semantics via Image Captioning to extract Enhanced Multi-Level Cross-Modal Semantic Incongruity Representation with Attention for Multimodal Sarcasm Detection	Sajal Aggarwal et.al.	2408.02595	null
2024-08-05	Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization	Ankan Mullick et.al.	2408.02584	null
2024-08-05	DanModCap: Designing a Danmaku Moderation Tool for Video-Sharing Platforms that Leverages Impact Captions	Siying Hu et.al.	2408.02574	null
2024-08-05	Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information	Yauwai Yim et.al.	2408.02559	null
2024-08-05	Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-context Learning	Hao Zhou et.al.	2408.02549	null
2024-08-05	RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation	Daniel Fleischer et.al.	2408.02545	link
2024-08-05	Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions	Xinbei Ma et.al.	2408.02544	link
2024-08-05	Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph	Zhao Kaichen et.al.	2408.02535	null
2024-08-05	Practical Attacks against Black-box Code Completion Engines	Slobodan Jenko et.al.	2408.02509	null
2024-08-05	UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model	Zhaowei Li et.al.	2408.02503	link
2024-08-05	Context Conquers Parameters: Outperforming Proprietary LLM in Commit Message Generation	Aaron Imani et.al.	2408.02502	null
2024-08-05	A First Look at License Compliance Capability of LLMs in Code Generation	Weiwei Xu et.al.	2408.02487	link
2024-08-05	Exploring Conditional Multi-Modal Prompts for Zero-shot HOI Detection	Ting Lei et.al.	2408.02484	link
2024-08-05	From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future	Haolin Jin et.al.	2408.02479	null
2024-08-02	Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting	Xiangyu Zhao et.al.	2408.01423	null
2024-08-02	Mission Impossible: A Statistical Perspective on Jailbreaking LLMs	Jingtong Su et.al.	2408.01420	null
2024-08-02	DebateQA: Evaluating Question Answering on Debatable Knowledge	Rongwu Xu et.al.	2408.01419	link
2024-08-02	Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs	Yilun Hua et.al.	2408.01417	null
2024-08-02	Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer	Yu Yang et.al.	2408.01402	null
2024-08-02	Coalitions of Large Language Models Increase the Robustness of AI Agents	Prattyush Mangal et.al.	2408.01380	null
2024-08-02	Toward Automatic Relevance Judgment using Vision--Language Models for Image--Text Retrieval Evaluation	Jheng-Hong Yang et.al.	2408.01363	null
2024-08-02	Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs	Peng Ding et.al.	2408.01355	link
2024-08-02	MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code	Kaiwen Ning et.al.	2408.01354	link
2024-08-02	Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks	Anders Giovanni Møller et.al.	2408.01346	null
2024-08-02	MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models	Benno Weck et.al.	2408.01337	link
2024-08-02	A Backbone for Long-Horizon Robot Task Understanding	Xiaoshuai Chen et.al.	2408.01334	null
2024-08-02	FANNO: Augmenting High-Quality Instruction Data with Open-Sourced LLMs Only	He Zhu et.al.	2408.01323	null
2024-08-02	A Comprehensive Review of Multimodal Large Language Models: Performance and Challenges Across Different Tasks	Jiaqi Wang et.al.	2408.01319	null
2024-08-02	Reconsidering Token Embeddings with the Definitions for Pre-trained Language Models	Ying Zhang et.al.	2408.01308	null
2024-08-02	The Mismeasure of Man and Models: Evaluating Allocational Harms in Large Language Models	Hannah Chen et.al.	2408.01285	null
2024-08-02	RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework	Kunlun Zhu et.al.	2408.01262	link
2024-08-02	The Phantom Menace: Unmasking Privacy Leakages in Vision-Language Models	Simone Caldarella et.al.	2408.01228	null
2024-08-02	High-Throughput Phenotyping of Clinical Text Using Large Language Models	Daniel B. Hier et.al.	2408.01214	null
2024-08-02	Misinforming LLMs: vulnerabilities, challenges and opportunities	Bo Zhou et.al.	2408.01168	null
2024-08-01	AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation	Mengkang Hu et.al.	2408.00764	link
2024-08-01	UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model	Xiangyu Fan et.al.	2408.00762	null
2024-08-01	Tamper-Resistant Safeguards for Open-Weight LLMs	Rishub Tamirisa et.al.	2408.00761	link
2024-08-01	Thermal Conductivity Predictions with Foundation Atomistic Models	Balázs Póta et.al.	2408.00755	link
2024-08-01	Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model	Benlin Liu et.al.	2408.00754	null
2024-08-01	Collaborative Vision-Text Representation Optimizing for Open-Vocabulary Segmentation	Siyu Jiao et.al.	2408.00744	link
2024-08-01	DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency	Jovan Stojkovic et.al.	2408.00741	null
2024-08-01	Virchow 2: Scaling Self-Supervised Mixed Magnification Models in Pathology	Eric Zimmermann et.al.	2408.00738	null
2024-08-01	Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions	Guangzhi Xiong et.al.	2408.00727	link
2024-08-01	An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models	Yangzhen Wu et.al.	2408.00724	null
2024-08-01	Pathway to Secure and Trustworthy 6G for LLMs: Attacks, Defense, and Opportunities	Sunder Ali Khowaja et.al.	2408.00722	null
2024-08-01	SAM 2: Segment Anything in Images and Videos	Nikhila Ravi et.al.	2408.00714	link
2024-08-01	Point-supervised Brain Tumor Segmentation with Box-prompted MedSAM	Xiaofeng Liu et.al.	2408.00706	null
2024-08-01	Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning	Trapoom Ukarapol et.al.	2408.00690	link
2024-08-01	Can Developers Prompt? A Controlled Experiment for Code Documentation Generation	Hans-Alexander Kruse et.al.	2408.00686	null
2024-08-01	ExpertAF: Expert Actionable Feedback from Video	Kumar Ashutosh et.al.	2408.00672	null
2024-08-01	AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models	Daqin Luo et.al.	2408.00665	link
2024-08-01	Disentangling Dense Embeddings with Sparse Autoencoders	Charles O'Neill et.al.	2408.00657	null
2024-08-02	SentenceVAE: Faster, Longer and More Accurate Inference with Next-sentence Prediction for Large Language Models	Hongjun An et.al.	2408.00655	link
2024-08-01	Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning	Xuri Ge et.al.	2408.00644	null
2024-07-31	Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey	Atsuyuki Miyai et.al.	2407.21794	null
2024-07-31	Vision-Language Model Based Handwriting Verification	Mihir Chauhan et.al.	2407.21788	null
2024-07-31	Large Language Monkeys: Scaling Inference Compute with Repeated Sampling	Bradley Brown et.al.	2407.21787	link
2024-07-31	The Llama 3 Herd of Models	Abhimanyu Dubey et.al.	2407.21783	null
2024-07-31	Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs	Shi Liu et.al.	2407.21771	null
2024-07-31	MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts	Xi Victoria Lin et.al.	2407.21770	null
2024-07-31	ReplanVLM: Replanning Robotic Tasks with Visual Language Models	Aoran Mei et.al.	2407.21762	null
2024-07-31	Learning Video Context as Interleaved Multimodal Sequences	Kevin Qinghong Lin et.al.	2407.21757	link
2024-07-31	A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation	Mothilal Asokan et.al.	2407.21739	null
2024-07-31	Open-Vocabulary Audio-Visual Semantic Segmentation	Ruohao Guo et.al.	2407.21721	null
2024-07-31	Adaptive Retrieval-Augmented Generation for Conversational Systems	Xi Wang et.al.	2407.21712	null
2024-07-31	CEAR: Automatic construction of a knowledge graph of chemical entities and roles from scientific literature	Stefan Langer et.al.	2407.21708	null
2024-07-31	TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities	Ming Zhang et.al.	2407.21693	link
2024-07-31	Synth-Empathy: Towards High-Quality Synthetic Empathy Data	Hao Liang et.al.	2407.21669	link
2024-08-01	Defending Jailbreak Attack in VLMs via Cross-modality Information Detector	Yue Xu et.al.	2407.21659	link
2024-07-31	MTA-CLIP: Language-Guided Semantic Segmentation with Mask-Text Alignment	Anurag Das et.al.	2407.21654	null
2024-07-31	Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation	Xiang Luo et.al.	2407.21633	link
2024-07-31	TAROT: Task-Oriented Authorship Obfuscation Using Policy Optimization Methods	Gabriel Loiseau et.al.	2407.21630	link
2024-07-31	LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows	Lukas Teufelberger et.al.	2407.21593	null
2024-07-31	A Performance Study of LLM-Generated Code on Leetcode	Tristan Coignion et.al.	2407.21579	null
2024-07-30	ThinK: Thinner Key Cache by Query-Driven Pruning	Yuhui Xu et.al.	2407.21018	null
2024-07-30	CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning	Yuexi Du et.al.	2407.21011	link
2024-07-30	GABInsight: Exploring Gender-Activity Binding Bias in Vision-Language Models	Ali Abdollahi et.al.	2407.21001	link
2024-07-30	MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning	Yupeng Chen et.al.	2407.20999	null
2024-07-30	From Feature Importance to Natural Language Explanations Using LLMs with RAG	Sule Tekkesinoglu et.al.	2407.20990	link
2024-07-30	Large Language Models (LLMs) for Semantic Communication in Edge-based IoT Networks	Alakesh Kalita et.al.	2407.20970	null
2024-07-30	MMTrail: A Multimodal Trailer Video Dataset with Language and Music Descriptions	Xiaowei Chi et.al.	2407.20962	link
2024-07-30	UniProcessor: A Text-induced Unified Low-level Image Processor	Huiyu Duan et.al.	2407.20928	link
2024-07-30	SSPA: Split-and-Synthesize Prompting with Gated Alignments for Multi-Label Image Recognition	Hao Tan et.al.	2407.20920	null
2024-07-30	Automated Review Generation Method Based on Large Language Models	Shican Wu et.al.	2407.20906	link
2024-07-30	Faithful and Plausible Natural Language Explanations for Image Classification: A Pipeline Approach	Adam Wojciechowski et.al.	2407.20899	link
2024-07-30	ThinkRepair: Self-Directed Automated Program Repair	Xin Yin et.al.	2407.20898	link
2024-07-30	Effective Black Box Testing of Sentiment Analysis Classification Networks	Parsa Karbasizadeh et.al.	2407.20884	null
2024-07-30	Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification	Boyang Zhang et.al.	2407.20859	null
2024-07-30	Learn by Selling: Equipping Large Language Models with Product Knowledge for Context-Driven Recommendations	Sarthak Anand et.al.	2407.20856	null
2024-07-30	Large Language Model (LLM)-enabled Graphs in Dynamic Networking	Geng Sun et.al.	2407.20840	null
2024-07-30	How to Measure the Intelligence of Large Language Models?	Nils Körber et.al.	2407.20828	null
2024-07-30	Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning	Norman Di Palo et.al.	2407.20798	null
2024-07-30	Interpretable Pre-Trained Transformers for Heart Time-Series Data	Harry J. Davies et.al.	2407.20775	link
2024-07-30	OmniBal: Towards Fast Instruct-tuning for Vision-Language Models via Omniverse Computation Balance	Yongqiang Yao et.al.	2407.20761	link
2024-07-29	Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing	Ekaterina Iakovleva et.al.	2407.20232	null
2024-07-29	Improving 2D Feature Representations by 3D-Aware Fine-Tuning	Yuanwen Yue et.al.	2407.20229	null
2024-07-29	FlexAttention for Efficient High-Resolution Vision-Language Models	Junyan Li et.al.	2407.20228	null
2024-07-29	Can Editing LLMs Inject Harm?	Canyu Chen et.al.	2407.20224	null
2024-07-29	SANGRIA: Surgical Video Scene Graph Optimization for Surgical Workflow Prediction	Çağhan Köksal et.al.	2407.20214	null
2024-07-29	QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval	Hongming Tan et.al.	2407.20207	null
2024-07-29	MindSearch: Mimicking Human Minds Elicits Deep AI Searcher	Zehui Chen et.al.	2407.20183	link
2024-07-29	Theia: Distilling Diverse Vision Foundation Models for Robot Learning	Jinghuan Shang et.al.	2407.20179	link
2024-07-29	AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs	Feiyang Kang et.al.	2407.20177	link
2024-07-29	Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning	Xingchen Zeng et.al.	2407.20174	link
2024-07-29	Diffusion Feedback Helps CLIP See Better	Wenxuan Wang et.al.	2407.20171	link
2024-07-29	Language-Conditioned Offline RL for Multi-Robot Navigation	Steven Morad et.al.	2407.20164	null
2024-07-29	rLLM: Relational Table Learning with LLMs	Weichen Li et.al.	2407.20157	link
2024-07-29	ByteCheckpoint: A Unified Checkpointing System for LLM Development	Borui Wan et.al.	2407.20143	null
2024-07-29	Strong Copyright Protection for Language Models via Adaptive Model Fusion	Javier Abad et.al.	2407.20105	null
2024-07-29	Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models	Zhe Li et.al.	2407.20053	null
2024-07-29	Exploring Large Language Models to generate Easy to Read content	Paloma Martínez et.al.	2407.20046	null
2024-07-29	MaskInversion: Localized Embeddings via Optimization of Explainability Maps	Walid Bousselham et.al.	2407.20034	null
2024-07-29	Efficient Training of Large Language Models on Distributed Infrastructures: A Survey	Jiangfei Duan et.al.	2407.20018	null
2024-07-29	Rosetta Statements: Lowering the Barrier for Semantic Parsing and Increasing the Cognitive Interoperability of Knowledge Graphs	Lars Vogt et.al.	2407.20007	null
2024-07-26	Wolf: Captioning Everything with a World Summarization Framework	Boyi Li et.al.	2407.18908	null
2024-07-26	SHIC: Shape-Image Correspondences with no Keypoint Supervision	Aleksandar Shtedritski et.al.	2407.18907	null
2024-07-26	A Flexible and Scalable Approach for Collecting Wildlife Advertisements on the Web	Juliana Barbosa et.al.	2407.18898	link
2024-07-26	Small Molecule Optimization with Large Language Models	Philipp Guevorguian et.al.	2407.18897	link
2024-07-26	Human-artificial intelligence teaming for scientific information extraction from data-driven additive manufacturing research using large language models	Mutahar Safdar et.al.	2407.18827	null
2024-07-26	Automatic Detection of Moral Values in Music Lyrics	Vjosa Preniqi et.al.	2407.18787	link
2024-07-26	The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs	Aleix Sant et.al.	2407.18786	null
2024-07-26	Foundation Models for the Digital Twin Creation of Cyber-Physical Systems	Shaukat Ali et.al.	2407.18779	null
2024-07-26	TAGIFY: LLM-powered Tagging Interface for Improved Data Findability on OGD portals	Kevin Kliimask et.al.	2407.18764	null
2024-07-26	Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery	Yuni Susanti et.al.	2407.18752	link
2024-07-26	Towards Effective and Efficient Continual Pre-training of Large Language Models	Jie Chen et.al.	2407.18743	null
2024-07-26	Towards Generalized Offensive Language Identification	Alphaeus Dmonte et.al.	2407.18738	null
2024-07-26	LLASP: Fine-tuning Large Language Models for Answer Set Programming	Erica Coppolillo et.al.	2407.18723	null
2024-07-26	Neurosymbolic AI for Enhancing Instructability in Generative AI	Amit Sheth et.al.	2407.18722	null
2024-07-26	Cluster-norm for Unsupervised Probing of Knowledge	Walter Laurito et.al.	2407.18712	link
2024-07-26	Adaptive Contrastive Search: Uncertainty-Guided Decoding for Open-Ended Text Generation	Esteban Garces Arias et.al.	2407.18698	link
2024-07-26	Collaborative Evolving Strategy for Automatic Data-Centric Development	Xu Yang et.al.	2407.18690	null
2024-07-26	The BIAS Detection Framework: Bias Detection in Word Embeddings and Language Models for European Languages	Alexandre Puttick et.al.	2407.18689	link
2024-07-26	Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift	Seongho Son et.al.	2407.18676	null
2024-07-26	Every Part Matters: Integrity Verification of Scientific Figures Based on Multimodal Large Language Models	Xiang Shi et.al.	2407.18626	link
2024-07-25	Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning	Tianduo Wang et.al.	2407.18248	link
2024-07-25	LoRA-Pro: Are Low-Rank Adapters Properly Optimized?	Zhengbo Wang et.al.	2407.18242	link
2024-07-25	Recursive Introspection: Teaching Language Model Agents How to Self-Improve	Yuxiao Qu et.al.	2407.18219	null
2024-07-26	Exploring Scaling Trends in LLM Robustness	Nikolaus Howe et.al.	2407.18213	null
2024-07-25	AsEP: Benchmarking Deep Learning Methods for Antibody-specific Epitope Prediction	Chunan Liu et.al.	2407.18184	link
2024-07-25	Gene Regulatory Network Inference from Pre-trained Single-Cell Transcriptomics Transformer with Joint Graph Learning	Sindhura Kommu et.al.	2407.18181	null
2024-07-25	Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models	Sanae Lotfi et.al.	2407.18158	null
2024-07-25	$\mathbb{X}$ -Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs	Vlad Sobal et.al.	2407.18134	null
2024-07-25	Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic	Fakhraddin Alwajih et.al.	2407.18129	null
2024-07-25	Efficient Inference of Vision Instruction-Following Models with Elastic Cache	Zuyan Liu et.al.	2407.18121	link
2024-07-25	Multi-Resolution Histopathology Patch Graphs for Ovarian Cancer Subtyping	Jack Breen et.al.	2407.18105	link
2024-07-25	Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow	Tian Guo et.al.	2407.18103	null
2024-07-25	PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization	Christopher Clarke et.al.	2407.18078	link
2024-07-25	C2P: Featuring Large Language Models with Causal Reasoning	Abdolmahdi Bagheri et.al.	2407.18069	null
2024-07-25	ComPeer: A Generative Conversational Agent for Proactive Peer Support	Tianjian Liu et.al.	2407.18064	link
2024-07-25	Audio Entailment: Assessing Deductive Reasoning for Audio Understanding	Soham Deshmukh et.al.	2407.18062	link
2024-07-25	Difficulty Estimation and Simplification of French Text Using LLMs	Henri Jamet et.al.	2407.18061	null
2024-07-25	The Geometry of Queries: Query-Based Innovations in Retrieval-Augmented Generation	Eric Yang et.al.	2407.18044	null
2024-07-25	RestoreAgent: Autonomous Image Restoration Agent via Multimodal Large Language Models	Haoyu Chen et.al.	2407.18035	null
2024-07-25	GermanPartiesQA: Benchmarking Commercial Large Language Models for Political Bias and Sycophancy	Jan Batzner et.al.	2407.18008	null
2024-07-24	I Could've Asked That: Reformulating Unanswerable Questions	Wenting Zhao et.al.	2407.17469	link
2024-07-24	WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries	Wenting Zhao et.al.	2407.17468	null
2024-07-24	CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models	Jiawei Gu et.al.	2407.17467	null
2024-07-24	$VILA^2$ : VILA Augmented VILA	Yunhao Fang et.al.	2407.17453	null
2024-07-24	Fluent Student-Teacher Redteaming	T. Ben Thompson et.al.	2407.17447	link
2024-07-24	Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?	Michael-Andrei Panaitescu-Liess et.al.	2407.17417	null
2024-07-24	(PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork	Tianjin Huang et.al.	2407.17412	null
2024-07-24	Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models	Yida Zhao et.al.	2407.17406	link
2024-07-24	Grammar-based Game Description Generation using Large Language Models	Tsunehiko Tanaka et.al.	2407.17404	link
2024-07-24	3D Question Answering for City Scene Understanding	Penglei Sun et.al.	2407.17398	null
2024-07-24	PERSONA: A Reproducible Testbed for Pluralistic Alignment	Louis Castricato et.al.	2407.17387	null
2024-07-24	A Comprehensive Approach to Misspelling Correction with BERT and Levenshtein Distance	Amirreza Naziri et.al.	2407.17383	null
2024-07-24	MMRA: A Benchmark for Multi-granularity Multi-image Relational Association	Siwei Wu et.al.	2407.17379	link
2024-07-24	ViPer: Visual Personalization of Generative Models via Individual Preference Learning	Sogand Salehi et.al.	2407.17365	null
2024-07-24	Gradient-based inference of abstract task representations for generalization in neural networks	Ali Hummos et.al.	2407.17356	null
2024-07-24	Scalify: scale propagation for efficient low-precision LLM training	Paul Balança et.al.	2407.17353	link
2024-07-24	Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching	Yuyang Ding et.al.	2407.17349	link
2024-07-24	DexGANGrasp: Dexterous Generative Adversarial Grasping Synthesis for Task-Oriented Manipulation	Qian Feng et.al.	2407.17348	null
2024-07-24	Label Alignment and Reassignment with Generalist Large Language Model for Enhanced Cross-Domain Named Entity Recognition	Ke Bao et.al.	2407.17344	null
2024-07-24	How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?	Leo Yu-Ho Lo et.al.	2407.17291	null
2024-07-23	PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects	Junyi Li et.al.	2407.16696	link
2024-07-23	Stress-Testing Long-Context Language Models with Lifelong ICL and Task Haystack	Xiaoyue Xu et.al.	2407.16695	link
2024-07-23	Can Large Language Models Automatically Jailbreak GPT-4V?	Yuanwei Wu et.al.	2407.16686	null
2024-07-23	SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation	Pengfei Chen et.al.	2407.16682	null
2024-07-23	RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent	Huiyu Xu et.al.	2407.16667	null
2024-07-23	Course-Correction: Safety Alignment Using Synthetic Preferences	Rongwu Xu et.al.	2407.16637	link
2024-07-23	Lawma: The Power of Specialization for Legal Tasks	Ricardo Dominguez-Olmedo et.al.	2407.16615	null
2024-07-23	Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?	Jonathan Hayase et.al.	2407.16607	link
2024-07-23	Shared Imagination: LLMs Hallucinate Alike	Yilun Zhou et.al.	2407.16604	null
2024-07-23	A Comparative Study on Patient Language across Therapeutic Domains for Effective Patient Voice Classification in Online Health Discussions	Giorgos Lysandrou et.al.	2407.16593	null
2024-07-23	Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs	Yifan Xia et.al.	2407.16576	null
2024-07-23	TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback	Eunseop Yoon et.al.	2407.16574	link
2024-07-23	Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models	Ioana Buhnila et.al.	2407.16565	link
2024-07-23	Patched RTC: evaluating LLMs for diverse software development tasks	Asankhaya Sharma et.al.	2407.16557	link
2024-07-24	MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues	Liyun Zhang et.al.	2407.16552	null
2024-07-23	Quantifying the Role of Textual Predictability in Automatic Speech Recognition	Sean Robertson et.al.	2407.16537	null
2024-07-23	Imperfect Vision Encoders: Efficient and Robust Tuning for Vision-Language Models	Aristeidis Panos et.al.	2407.16526	null
2024-07-23	AMONGAGENTS: Evaluating Large Language Models in the Interactive Text-Based Social Deduction Game	Yizhou Chi et.al.	2407.16521	null
2024-07-23	Language-Based Security for Low-Level MPC	Christian Skalka et.al.	2407.16504	null
2024-07-23	Machine Translation Hallucination Detection for Low and High Resource Languages using Large Language Models	Kenza Benkirane et.al.	2407.16470	link
2024-07-22	AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description	Junyu Xie et.al.	2407.15850	link
2024-07-22	LLMmap: Fingerprinting For Large Language Models	Dario Pasquini et.al.	2407.15847	link
2024-07-22	SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models	Mingze Xu et.al.	2407.15841	link
2024-07-22	MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity	Yangzhou Liu et.al.	2407.15838	link
2024-07-22	dMel: Speech Tokenization made Simple	He Bai et.al.	2407.15835	null
2024-07-22	J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling	Wataru Nakata et.al.	2407.15828	null
2024-07-22	Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight	Ziyuan Huang et.al.	2407.15819	null
2024-07-22	Perceptions of Linguistic Uncertainty by Language Models and Humans	Catarina G Belem et.al.	2407.15814	link
2024-07-22	AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection	Yunkang Cao et.al.	2407.15795	link
2024-07-22	CLIP with Generative Latent Replay: a Strong Baseline for Incremental Learning	Emanuele Frascaroli et.al.	2407.15793	link
2024-07-22	Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach	Rian Dolphin et.al.	2407.15788	null
2024-07-22	Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels	Zhuorui Ye et.al.	2407.15786	null
2024-07-22	Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning	Kaiwen Wang et.al.	2407.15762	null
2024-07-22	MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation	Marco Simoni et.al.	2407.15748	null
2024-07-22	OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context	Steffen Kleinle et.al.	2407.15736	null
2024-07-22	TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON	John Chong Min Tan et.al.	2407.15734	link
2024-07-22	Zero-Shot Embeddings Inform Learning and Forgetting with Vision-Language Encoders	Laura Niss et.al.	2407.15731	null
2024-07-22	SAM2CLIP2SAM: Vision Language Model for Segmentation of 3D CT Scans for Covid-19 Detection	Dimitrios Kollias et.al.	2407.15728	null
2024-07-22	DStruct2Design: Data and Benchmarks for Data Structure Driven Generative Floor Plan Design	Zhi Hao Luo et.al.	2407.15723	link
2024-07-22	Do Large Language Models Have Compositional Ability? An Investigation into Limitations and Scalability	Zhuoyan Xu et.al.	2407.15720	link
2024-07-19	Internal Consistency and Self-Feedback in Large Language Models: A Survey	Xun Liang et.al.	2407.14507	link
2024-07-19	On Pre-training of Multimodal Language Models Customized for Chart Understanding	Wan-Cyuan Fan et.al.	2407.14506	null
2024-07-19	PD-TPE: Parallel Decoder with Text-guided Position Encoding for 3D Visual Grounding	Chenshu Hou et.al.	2407.14491	null
2024-07-19	Evaluating the Reliability of Self-Explanations in Large Language Models	Korbinian Randl et.al.	2407.14487	link
2024-07-19	Data-Centric Human Preference Optimization with Rationales	Hoang Anh Just et.al.	2407.14477	link
2024-07-19	Contrastive Learning with Counterfactual Explanations for Radiology Report Generation	Mingjie Li et.al.	2407.14474	null
2024-07-19	Check-Eval: A Checklist-based Approach for Evaluating Text Quality	Jayr Pereira et.al.	2407.14467	null
2024-07-19	Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier	Zachary Wojtowicz et.al.	2407.14452	null
2024-07-19	Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding	Renshan Zhang et.al.	2407.14439	link
2024-07-19	Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders	Senthooran Rajamanoharan et.al.	2407.14435	null
2024-07-19	Mixture of Experts with Mixture of Precisions for Tuning Quality of Service	HamidReza Imani et.al.	2407.14417	null
2024-07-19	System-1.x: Learning to Balance Fast and Slow Planning with Language Models	Swarnadeep Saha et.al.	2407.14414	link
2024-07-19	DEAL: Disentangle and Localize Concept-level Explanations for VLMs	Tang Li et.al.	2407.14412	link
2024-07-19	The Vision of Autonomic Computing: Can LLMs Make It a Reality?	Zhiyang Zhang et.al.	2407.14402	null
2024-07-19	Frontiers of Deep Learning: From Novel Application to Real-World Deployment	Rui Xie et.al.	2407.14386	null
2024-07-19	Open Artificial Knowledge	Vadim Borisov et.al.	2407.14371	null
2024-07-19	Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models	Xuenan Xu et.al.	2407.14355	link
2024-07-19	Improving Retrieval in Sponsored Search by Leveraging Query Context Signals	Akash Kumar Mohankumar et.al.	2407.14346	null
2024-07-19	LLMs left, right, and center: Assessing GPT's capabilities to label political bias from web domains	Raphael Hernandes et.al.	2407.14344	null
2024-07-19	Multimodal Misinformation Detection using Large Vision-Language Models	Sahar Tahmasebi et.al.	2407.14321	null
2024-07-18	Latent Causal Probing: A Formal Perspective on Probing with Causal Models of Data	Charles Jin et.al.	2407.13765	null
2024-07-18	SegPoint: Segment Any Point Cloud via Large Language Model	Shuting He et.al.	2407.13761	null
2024-07-18	Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models	Zhuo Chen et.al.	2407.13757	null
2024-07-18	CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications	Mirza Masfiqur Rahman et.al.	2407.13742	null
2024-07-18	Baba Is AI: Break the Rules to Beat the Benchmark	Nathan Cloos et.al.	2407.13729	null
2024-07-18	CoDefeater: Using LLMs To Find Defeaters in Assurance Cases	Usman Gohar et.al.	2407.13717	link
2024-07-18	Understanding Reference Policies in Direct Preference Optimization	Yixin Liu et.al.	2407.13709	link
2024-07-18	A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice	Shaina Raza et.al.	2407.13699	null
2024-07-18	Benchmark Agreement Testing Done Right: A Guide for LLM Benchmark Evaluation	Yotam Perlitz et.al.	2407.13696	link
2024-07-18	Prover-Verifier Games improve legibility of LLM outputs	Jan Hendrik Kirchner et.al.	2407.13692	null
2024-07-18	Shaded Route Planning Using Active Segmentation and Identification of Satellite Images	Longchao Da et.al.	2407.13689	null
2024-07-18	FuLG: 150B Romanian Corpus for Language Model Pretraining	Vlad-Andrei Bădoiu et.al.	2407.13657	null
2024-07-18	COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization	Skyler Grandel et.al.	2407.13648	null
2024-07-18	Weak-to-Strong Reasoning	Yuqing Yang et.al.	2407.13647	link
2024-07-18	Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies	Chaofan Tao et.al.	2407.13623	link
2024-07-18	KNOWNET: Guided Health Information Seeking from LLMs via Knowledge Graph Integration	Youfu Yan et.al.	2407.13598	null
2024-07-18	PLANTS: A Novel Problem and Dataset for Summarization of Planning-Like (PL) Tasks	Vishal Pallagani et.al.	2407.13597	null
2024-07-18	EarthMarker: A Visual Prompt Learning Framework for Region-level and Point-level Remote Sensing Imagery Comprehension	Wei Zhang et.al.	2407.13596	link
2024-07-18	Robust Calibration of Large Vision-Language Adapters	Balamurali Murugesan et.al.	2407.13588	link
2024-07-18	Towards Zero-Shot Multimodal Machine Translation	Matthieu Futeral et.al.	2407.13579	link
2024-07-17	LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models	Kaichen Zhang et.al.	2407.12772	link
2024-07-17	EchoSight: Advancing Visual-Language Models with Wiki Knowledge	Yibin Yan et.al.	2407.12735	null
2024-07-17	NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model	Zhongqun Zhang et.al.	2407.12727	null
2024-07-17	Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?	Ben Yao et.al.	2407.12725	null
2024-07-17	The Future of Learning: Large Language Models through the Lens of Students	He Zhang et.al.	2407.12723	null
2024-07-17	MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models	Leyang Shen et.al.	2407.12709	link
2024-07-17	Subgraph-Aware Training of Text-based Methods for Knowledge Graph Completion	Youmin Ko et.al.	2407.12703	link
2024-07-17	Patch-Level Training for Large Language Models	Chenze Shao et.al.	2407.12665	link
2024-07-17	Zero-shot Text-guided Infinite Image Synthesis with LLM guidance	Soyeong Kwon et.al.	2407.12642	null
2024-07-17	Domain-specific or Uncertainty-aware models: Does it really make a difference for biomedical text classification?	Aman Sinha et.al.	2407.12626	null
2024-07-17	Harnessing the Power of Artificial Intelligence to Vitalize Endangered Indigenous Languages: Technologies and Experiences	Claudio Pinhanez et.al.	2407.12620	null
2024-07-17	AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism	William Brannon et.al.	2407.12613	link
2024-07-17	VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding	Ofir Abramovich et.al.	2407.12594	null
2024-07-18	Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks	Antoni Kowalczuk et.al.	2407.12588	**[link](https://github.com/la

Name		Name	Last commit message	Last commit date
Latest commit History 792 Commits
.github/workflows		.github/workflows
docs		docs
README.md		README.md
config.yaml		config.yaml
daily_arxiv.py		daily_arxiv.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Updated on 2025.02.04

Single Object & Visual Language Tracking

Large Language Model

About

Releases

Packages

Languages

Xuchen-Li/cv-arxiv-daily

Folders and files

Latest commit

History

Repository files navigation

Updated on 2025.02.04

Single Object & Visual Language Tracking

Large Language Model

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages