Overview#
Shanshan Shen | 申杉杉
September 18th, 1998
I am currently a software engineer at Huawei Ascend, focusing on LLM/VLM inference and GPU/NPU computing. I have contributed to popular open source projects such as vLLM to help build an easy-to-use software ecosystem for Ascend. Before this, I was a student at Beijing Jiao Tong University (BSc/MSc), majoring in communication engineering. You can see my projects on GitHub and my posts on Zhihu.
Experiences#
Huawei
Software Engineer
July '23 - Present
Working at Huawei Ascend on the vLLM inference engine; previously in the Quality and Process IT Department.
Beijing Jiao Tong University
Postgraduate Student
September '21 - June '23
Studied at the School of Electronic and Information Engineering, focusing on NAS-layer security in wireless communication.
Beijing Jiao Tong University
Undergraduate Student
September '16 - June '20
Studied at the School of Electronic and Information Engineering, majoring in communication engineering.
Open Source Contributions#
Through my open source work, I have become:
- Outside collaborator of vllm: a high-throughput and memory-efficient inference and serving engine for LLMs.
- Core contributor to vllm-ascend: a community-maintained hardware plugin for vLLM on Ascend.
- Contributor to vllm-omni: a framework for efficient model inference with omni-modality models.
In addition, I have contributed 131 PRs to the vLLM ecosystem, focusing on multi-modal inference, structured output and elastic scaling. Find more details about all PRs I have contributed here.
Projects#
cs-self-learning is one of my projects, a repo for archiving my notes, code and materials from CS (Computer Science) learning.
Programming Skills#
Multilingual Skills#
- English: IELTS overall band score of 6.5.
Contact me#
Feel free to drop me an email; both Gmail and QQ Mail are available.
You can also add me on WeChat via the picture shown below:

NOTE: Remember to include a short description of your intentions when sending a request.