While Tsinghua has cultivated generations of AI talent, it now faces the challenge of elevating them globally.
To address this issue, in this paper, we propose a novel generalist model, i.e., Video-3D LLM, for 3D scene understanding. By treating 3D scenes as dynamic videos and incorporating 3D position ...
2412.10443 null 2024-12-11 COEF-VQ: Cost-Efficient Video Quality Understanding through a Cascaded Multimodal LLM Framework Xin Dong et al. 2412.10435 null 2024-12-13 Apollo: An Exploration of Video ...
SLMs can offer a more streamlined, cost-effective alternative, with the added benefit of being easier to fine-tune and ...
Retrieval-Augmented Generation (RAG) is a transformative approach in artificial intelligence (AI) that enhances the ...
Abstract: Mental health has attracted substantial attention in recent years, and large language models (LLMs) can be an effective technology for alleviating this problem owing to their capability in text ...
The path forward will be marked by collaboration between fields like neuroscience, computer science, and cognitive psychology ...
Understanding exactly how the human brain processes speech and language can accelerate ... For example, a 2024 PNAS study by ...