|
🖐️ I'm interested in Document AI, multimodal tasks (CV + NLP), OCR, RL, and ML
🖥️ Skills: Python, PyTorch, OpenCV, Docker, Triton, TensorFlow
- Synthetic Document Generator
- AI_Homework :: YOLO, FRCNN Character Detection
- AI_Homework :: Seq2Seq Video Captioning
- Landmark based VLN
- Joint Multimodal Embedding based VLN
- Places365 with Misaeng
- PAPA app Project
- Pancreas-CT Segmentation (with U-Net)
- Messenger Robot System
- Deep Reinforcement Learning for Visual Dialogue Agents-2018.05, KIPS Conference
- Deep Reinforcement Learning for Optimizing Visual Questions-2018.09, Journal of ICROS
- Real-Time Visual Grounding for Natural Language Instructions with Deep Neural Network-2019.05, KIPS Conference
- LVLN : A Landmark-Based Deep Neural Network Model for Vision-and-Language Navigation-2019.09, Journal of KIPS(KTSDE)
- Landmark-based Search for Vision-and-Language Navigation-2019.12, KSC Conference
- AnoVid: A Deep Neural Network-Based Tool for Video Annotation-2020.08, Journal of KMMS
- Multimodal Joint Embedding and Backtracking Search for Vision-and-Language Navigation-Master's thesis
- Joint Multimodal Embedding and Backtracking Search in Vision-and-Language Navigation-2021.02, Journal of Sensors(SCIE)