Niu Wenxu

AI Software/Algorithm Engineer

Tell me and I forget, teach me and I may remember, involve me and I learn.

Education

UESTC

Bachelor of Engineering 2015-2019

GPA: 3.62/4.00; TOEFL 98/120; GRE: 314+2.5; First Prize Scholarship; MCM/ICM Meritorious Winner;

Project: Real-time object detection on Jetson TX2;

Internship: Virtuos Chengdu: Participated in AAA game development (Beyond: Two Souls PC port);

Internship: China Mobile Research: Edge AI development and SaaS management system;

HKUST

MSc ICDE 2019-2021

Project: Algorithm development for an HLS-based FPGA human face counting system;

Internship: SlowFast action recognition; end-to-end data collection and processing engine;

Internship: CRAFT + CRNN handwritten Chinese character recognition, deployed on cloud with a mini version on edge;

Key Skills

  • Computer Vision
  • AI/ML Algorithms
  • TinyML
  • Embedded dev
  • AI Operator dev
  • AI on edge
  • AI Core Arch
  • SIMT, SIMD, Systolic Array

Work Experience

PengCheng Lab / Jide Tech Co.

2020.02 - 2021.06

  • Atlas 800 AI HPC performance suite development; Atlas 300 face detection/recognition and a serverless, cloud-native deployment platform;
  • uAISS TinyML inference on edge platform, 50% lower latency than TensorFlow Lite Micro (TFLM);
  • AI operator development on the RISC-V vector extension;
  • Worked with the MLCommons community to submit to MLPerf Tiny v0.5 on a custom RISC-V chip; paper accepted to NeurIPS 2021;

Huawei / HiSilicon

2021.06 - Present

  • Face detection and landmark detection deployment on Hi3796, Allwinner T5, and Hi3751 mobile-class GPUs; algorithm fine-tuning;
  • Lane detection deployment on NXP i.MX8; road object detection algorithm deployment on HiSilicon SD3403 NPU; IR and RGB fusion with UVC and MIPI cameras;
  • Conv3D stereo matching and Livox lidar 3D object detection algorithm development and deployment on Jetson Xavier;
  • AI ASIC hardware architecture design, operator development, and architecture verification case development; experience coding for SIMT, SIMD, and systolic-array-based matrix core architectures; experience working with AI cores across power levels, from mW-class MCUs to autonomous driving to HPC.