I am a Ph.D. student in Computer Science and Technology at Beihang University, advised by Prof. Xianglong Liu and Prof. Ruihao Gong. My research focuses on efficient training technologies for world models, with interests in scalable training, model compression, and system-algorithm co-design for generative AI.
I received my B.S. in Computer Science and Technology from Beihang University, ranking in the top 10% of my cohort. My recent work includes a survey on low-bit LLMs, Triton-based low-bit FlashAttention operators, and near-lossless 4-bit LLM training.
My research centers on efficient training for world models, spanning scalable optimization, data and systems efficiency, compression-aware training, and hardware-conscious implementation.
Current topics: world model training, efficient generative modeling, training system optimization, and model compression.

Jinyang Du, Ruihao Gong#, Linghan Ai, Zining Wang, Yunke Peng, Yao Wang, Lei Yan, Xuefei Wang, Yaoyuan Wang, Jinyang Guo, Dahua Lin, Xianglong Liu (# corresponding author)
Findings of the Association for Computational Linguistics (ACL Findings), 2026. First author.
Half-S revisits FP4 scaling for heavy-tailed LLM tensors and proposes a minimal scale correction that improves quantization grid utilization for practical near-lossless 4-bit training.
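To illustrate the problem the paper targets (this is not the paper's method): under abs-max scaling, a few outliers in a heavy-tailed tensor claim the top of the FP4 (E2M1) grid and collapse the bulk of values into the lowest bins. The sketch below, in plain PyTorch, counts how many of the eight magnitude levels are actually occupied; the 0.25 shrink factor is a hypothetical stand-in for the paper's scale correction.

```python
import torch

# Non-negative FP4 (E2M1) magnitudes: the eight levels a 4-bit float can encode.
FP4_GRID = torch.tensor([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def fp4_levels(x, scale):
    """Nearest-neighbor FP4 quantization; returns the grid index of each element."""
    mag = (x.abs() / scale).clamp(max=6.0)  # saturate at the largest FP4 magnitude
    return (mag.unsqueeze(-1) - FP4_GRID).abs().argmin(dim=-1)

def utilization(idx):
    """Fraction of the eight magnitude levels actually occupied."""
    return idx.unique().numel() / FP4_GRID.numel()

# Heavy-tailed tensor: a handful of outliers dominate the abs-max.
x = torch.randn(4096) * 0.1
x[:4] = 8.0

absmax_scale = x.abs().max() / 6.0
print("abs-max utilization:  ", utilization(fp4_levels(x, absmax_scale)))

# Hypothetical shrunken scale (the paper's actual correction differs):
# clipping the outliers buys finer resolution for the bulk of the tensor.
print("corrected utilization:", utilization(fp4_levels(x, absmax_scale * 0.25)))
```

On this toy tensor the shrunken scale occupies noticeably more of the eight levels; the price is clipping the outliers, which is the trade-off any such correction must manage.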

Jinyang Du, Jinyang Guo, Yifu Ding, Xianglong Liu
IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2025. First author.
This work designs low-bit FlashAttention operators with Triton, using operator fusion and mixed-precision execution to improve long-context quantized inference efficiency.
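The code below is a minimal sketch of the fusion idea only, not the paper's operators: a Triton kernel that computes an attention-score tile S = QKᵀ directly on int8 inputs and applies per-row dequantization scales inside the kernel, rather than materializing fp16 copies of Q and K in global memory. The tensor names, block size, and per-row scaling scheme are illustrative assumptions.

```python
import torch
import triton
import triton.language as tl

@triton.jit
def int8_qk_kernel(q_ptr, k_ptr, q_scale_ptr, k_scale_ptr, s_ptr,
                   M, N, K,
                   BLOCK_M: tl.constexpr, BLOCK_N: tl.constexpr, BLOCK_K: tl.constexpr):
    # One program computes a BLOCK_M x BLOCK_N tile of S = dequant(Q) @ dequant(K)^T.
    pid_m = tl.program_id(0)
    pid_n = tl.program_id(1)
    rm = pid_m * BLOCK_M + tl.arange(0, BLOCK_M)  # query rows of this tile
    rn = pid_n * BLOCK_N + tl.arange(0, BLOCK_N)  # key rows of this tile
    rk = tl.arange(0, BLOCK_K)
    acc = tl.zeros((BLOCK_M, BLOCK_N), dtype=tl.int32)
    for k0 in range(0, K, BLOCK_K):
        q = tl.load(q_ptr + rm[:, None] * K + (k0 + rk)[None, :])   # int8 tile of Q
        kt = tl.load(k_ptr + rn[:, None] * K + (k0 + rk)[None, :])  # int8 tile of K
        acc += tl.dot(q, tl.trans(kt))  # int8 x int8 -> int32 accumulate
    # Fused dequantization: apply per-row scales once, on the int32 result.
    qs = tl.load(q_scale_ptr + rm)
    ks = tl.load(k_scale_ptr + rn)
    s = acc.to(tl.float32) * qs[:, None] * ks[None, :]
    tl.store(s_ptr + rm[:, None] * N + rn[None, :], s)

def int8_qk(q_int8, k_int8, q_scale, k_scale, block=32):
    # Assumes contiguous row-major int8 tensors with shapes divisible by `block`.
    M, K = q_int8.shape
    N = k_int8.shape[0]
    s = torch.empty((M, N), device=q_int8.device, dtype=torch.float32)
    int8_qk_kernel[(M // block, N // block)](
        q_int8, k_int8, q_scale, k_scale, s, M, N, K,
        BLOCK_M=block, BLOCK_N=block, BLOCK_K=block)
    return s
```

A full FlashAttention operator additionally fuses the online softmax and the PV product into the same kernel; the tile above corresponds only to the score-computation step of that pipeline.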

Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Yong Yang, Shiqiao Gu, Haotong Qin, Jinyang Guo, Dahua Lin, Michele Magno, Xianglong Liu
Neural Networks, 2025. Journal article.
This survey reviews low-bit quantization for large language models across number formats, system support, and algorithmic strategies, connecting today's practical toolchains with future directions for efficient LLM deployment.
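As background for the formats the survey covers, here is a minimal sketch of symmetric uniform quantization, the textbook baseline that most low-bit methods refine; the per-tensor abs-max scaling is an illustrative choice.

```python
import torch

def fake_quant_int(x: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Symmetric uniform 'fake' quantization: round to an integer grid,
    then map back to floats to measure the approximation error."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 7 for int4, 127 for int8
    scale = x.abs().max() / qmax          # per-tensor abs-max scaling
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax)
    return q * scale                      # dequantized approximation of x

w = torch.randn(256, 256)
print((fake_quant_int(w, bits=8) - w).abs().max())  # int8: small error
print((fake_quant_int(w, bits=4) - w).abs().max())  # int4: visibly larger
```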
Led field practice trips to Zhongyang, Shanxi and Yu County, Hebei.
Helped organize more than 10 activities and completed over 250 hours of volunteer service.
Served as organization committee member, cohort academic representative, and class coordinator.