I'm a computer vision learner and builder, currently focused on AI scene understanding for real-world camera systems. I care about model performance, and just as much about whether a vision system works reliably in messy, practical environments.
I have hands-on experience with core vision tasks such as object detection, instance segmentation, visual perception, and scene understanding. I'm also interested in vision-language models (VLMs) and end-to-end vision-language-action (VLA) systems, especially how frontier models connect perception, reasoning, and interaction. I enjoy turning research ideas into usable prototypes and tools.
- Visual AI systems for cameras and surveillance scenarios
- Object detection, segmentation, visual perception, and scene reasoning
- VLM / VLA applications, multimodal agents, and end-to-end perception-to-action models
- CNN / Transformer / vision foundation model applications
- PyTorch training & inference, OpenCV pipelines, and desktop AI tools
- Building clean demos, evaluation scripts, and practical engineering workflows
- Windows-Face-Hello — RGB webcam face unlock experiment for Windows, combining face recognition, liveness detection, and system integration.
- Owen-Studio — My technical blog for computer vision and deep learning notes.
- VIT — Vision model experiments covering classification, detection, segmentation, and training workflows.
- lane-vehicle-counter — Lane-level vehicle detection and counting based on OpenCV.
- VLM-Workbench — A desktop prototype for exploring structured visual understanding with VLMs.
Thanks for visiting — feel free to explore my projects or reach me by email. 🚀