Welcome

OpenMoss Lab

Welcome to our lab! The OpenMoss Lab, led by Prof. Xipeng Qiu at Fudan University, originated as part of the FudanNLP group, with a strong foundation in Natural Language Processing. Over time, the lab has broadened its scope significantly and now encompasses large language models, multimodal learning, embodied AI, and beyond.

Our mission is to advance the theory, methods, and applications of large-scale AI systems—from pretraining and reasoning to multimodal and embodied intelligence—while grounding our research in real-world applications and products that make a lasting impact.

Achievements

  • Pioneering LLM development in China: Released MOSS, one of the earliest open-source conversational LLMs in China.
  • Influential open-source contributions: Developed widely adopted NLP toolkits such as FudanNLP, FastNLP, and the CoLLiE framework for efficient LLM fine-tuning.
  • Strong industry collaborations: Joint projects with Huawei, Honor, ByteDance, and other leading companies in large-scale model training and deployment.
  • Global recognition: Lab alumni have continued their studies at top universities, including MIT, UC Berkeley, and CMU, or joined leading companies such as ByteDance, Alibaba, AWS, Optiver, and other top quant firms.
  • Academic excellence: Publications at top-tier venues such as NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL, and ICCV, with several receiving Outstanding Paper Awards.
  • Talent cultivation: Multiple PhD graduates have secured faculty positions at Fudan University, Shanghai Jiao Tong University, and Shanghai AI Lab, or been selected as national-level young talents. Others have become entrepreneurs and executives (CEO/CTO) in LLM startups.

Research Interests

Our work covers a broad range of topics in large-scale AI and contextual intelligence, including but not limited to:

AI Infrastructure

  • Optimizers (e.g., LOMO, AdaLomo)
  • LLM fine-tuning frameworks (e.g., CoLLiE)
  • Inference optimization for scalable deployment
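To give a flavor of the optimizer work: the core idea behind fused-update optimizers such as LOMO is to apply each gradient to its parameter the moment it is computed, rather than materializing gradients for all parameters and then taking an optimizer step. The toy sketch below illustrates only that general idea on a separable quadratic; all names are hypothetical and this is not the actual LOMO implementation.

```python
# Illustrative sketch of the fused gradient/update idea: only one
# gradient is alive in memory at a time, because each parameter is
# updated immediately and its gradient discarded.

def fused_sgd_step(params, grad_fn, lr=0.1):
    """Update each parameter in place as soon as its gradient is available."""
    for i in range(len(params)):
        g = grad_fn(params, i)   # gradient for parameter i only
        params[i] -= lr * g      # immediate in-place update; grad discarded

# Toy objective: f(p) = sum(p_i^2), so df/dp_i = 2 * p_i.
def quad_grad(params, i):
    return 2.0 * params[i]

params = [4.0, -2.0, 1.0]
for _ in range(100):
    fused_sgd_step(params, quad_grad, lr=0.1)
print(params)  # all entries shrink toward 0
```

In a real training loop the per-parameter gradient would come from a backward hook rather than a closed-form formula; the memory saving comes from never holding the full gradient vector at once.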

Multimodal Large Models

  • Speech-language models (SpeechGPT, SpeechTokenizer, SpeechAlign)
  • Vision-language understanding and reasoning (AnyGPT, VisuoThink, UnifiedVisual)
  • Unified multimodal generation and alignment

Reinforcement Learning & Deep Reasoning in LLMs

  • Reasoning-enhanced dialogue and search agents (Exchange-of-Thought, ConvSearch-R1)
  • Exploration of test-time scaling and implicit reward optimization

Tools and Agents

  • Tool-augmented LLMs (UnifiedToolHub, FamilyTool, R3-RAG)
  • Real-world simulation environments (VehicleWorld)

Embodied Intelligence

  • Vision-language-action reasoning (VLA-bench, D²PO)
  • Task planning with embodied agents

New Architectures

  • Diffusion-based LLMs (Sparse-dLLM, LongLLaDA)
  • Long-context modeling (LongWanjuan, LongSafety)
  • Memory-efficient KV-cache methods and transformer variants
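As a concrete (and deliberately simplified) illustration of memory-efficient KV-cache work, one simple policy is a sliding window that keeps a few "sink" entries from the start of the sequence plus only the most recent key/value pairs. The sketch below is a hypothetical toy, not the cache handling used in any of the projects above.

```python
from collections import deque

# Hypothetical sliding-window KV cache: a fixed number of leading
# "sink" entries are always kept, and a bounded deque holds the most
# recent entries, silently evicting the oldest as decoding proceeds.

class WindowedKVCache:
    def __init__(self, window=4, sinks=1):
        self.sinks = []                      # always-kept leading entries
        self.window = deque(maxlen=window)   # recent entries, oldest evicted
        self.max_sinks = sinks

    def append(self, kv):
        if len(self.sinks) < self.max_sinks:
            self.sinks.append(kv)
        else:
            self.window.append(kv)           # deque drops the oldest itself

    def contents(self):
        return self.sinks + list(self.window)

cache = WindowedKVCache(window=3, sinks=1)
for t in range(7):                           # simulate 7 decoding steps
    cache.append(("k%d" % t, "v%d" % t))
print(cache.contents())  # sink token 0 plus the last 3 tokens
```

Memory stays constant in sequence length (here 1 sink + 3 window slots), which is the property these methods trade against exact attention over the full history.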

Mechanistic Interpretability

  • Dictionary learning & circuits
  • Attention decomposition and low-rank analysis
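The dictionary-learning view mentioned above treats a model activation as a sparse combination of a few learned "features." The toy sketch below shows that decomposition step via greedy matching pursuit over an overcomplete dictionary; the dictionary and activation values are made-up illustrations, not artifacts of any lab codebase.

```python
# Hypothetical sketch: express an activation vector as a sparse
# combination of unit-norm dictionary atoms via greedy matching pursuit.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(x, dictionary, k=2):
    """Greedily pick k atoms and coefficients that explain x."""
    residual = list(x)
    code = []
    for _ in range(k):
        # choose the atom most correlated with what remains unexplained
        coeffs = [dot(residual, d) for d in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(coeffs[i]))
        c = coeffs[best]
        code.append((best, c))
        residual = [r - c * d for r, d in zip(residual, dictionary[best])]
    return code

# Overcomplete dictionary: 4 unit-length atoms in a 2-D activation space.
D = [[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071], [0.7071, -0.7071]]
activation = [3.0, 1.0]
code = matching_pursuit(activation, D, k=2)
print(code)  # a few (feature index, coefficient) pairs
```

Interpretability work then asks what each frequently-activated dictionary feature corresponds to in model behavior, and how features compose into circuits.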

Pretraining & Post-training of LLMs

  • Large-scale pretraining (MOSS, InternLM series)
  • Data synthesis and decontamination
  • Weak-to-strong generalization strategies

Want to work with us? Check out our join page or explore career opportunities!