About Me

Strive to be a great data miner, to uncover business insights, and to build practical applications.

I'm Yingchi, born in China and currently working as a data scientist at Indeed, Singapore. I specialize in data mining and machine learning, and enjoy working with big data techniques and environments.

Topics of interest: text mining, recommendation systems, neural networks, and more.

I love taekwondo, piano and ice cream. And I'm keen to learn, experience and share.




  • Apr 2019


    Data Scientist

    • Work on salary estimation projects based on textual (unstructured) and numerical (structured) data.

  • Jul 2018
    Jan 2019


    Data Scientist

    • Part of the btc.com team.

  • Jul 2017
    Jul 2018


    Data Scientist

    • Part of the product team.

    • Research footfall analytics on telco data using machine learning algorithms. Implement and productionize those algorithms in our data analytics platform.

    • Develop intellectual property for the company by submitting research papers.

    • Design and develop the network planning application for telco operators to reduce upgrade costs while improving customer experience. Build the application in Scala and deploy it in a big data environment.

  • Dec 2016
    Jan 2017


    Data Analytics Intern

    • Established the internal metrics reporting pipeline by studying the raw data, the current data management system, and the requirements of various team leaders

    • Produced dashboards on system and business performance to enable stakeholders to make effective decisions, using Chartio and SQL

    • Assisted engineering teams in database design

  • Jun 2016
    Nov 2016


    Data Science Intern

    • Part of the production team at DataSpark

    • Worked in a big data environment with Hadoop as the data management platform and Apache Spark as the data processing engine

    • Conducted several geolocation data analysis projects which added new features to our data solutions and improved the model accuracy

    • Applied models such as Maximum A Posteriori (MAP) Estimation, Naive Bayes, Multinomial Logistic Regression, and Support Vector Machines (SVM)

    • Implemented reproducible code using R Markdown and Python for the projects

    • Built interactive data visualizations (web apps) for the consulting team using JavaScript, NodeJS and ReactJS

  • May 2015
    Jul 2015

    Millward Brown

    Assistant Research Executive Intern

    • Worked on a long-term marketing research project for Budweiser

    • Prepared the 2015 Q1 report for Budweiser on time with another team member; the report received a great response from the client

    • Collected and compiled the consumer survey data weekly using SPSS Survey Reporter

    • Detected several problems and initiated deep dive research to find explanations

    • Explored third-party data to find the causes of problems we had identified across various data sources


  • 2018

    National University of Singapore

    Master of Computer Science

    Main courses taken:
    Neural Networks and Deep Learning (CS5242)
    Big-Data Analytics Technology (CS5344)
    Phenomena and Theories of Human-Computer Interaction (CS4249)
    Text Mining (CS5246)
    Knowledge Discovery and Data Mining (CS5228)
    Uncertainty Modelling in AI (CS5340)

  • 2013

    National University of Singapore

    Bachelor's Degree in Business Analytics, 4.91/5.0

    Honours with Highest Distinction
    Winner of the Lee Kuan Yew Gold Medal
    On the Dean's List for 5 semesters

    Main courses taken:
    Mining Web Data for Business Insights (BT4222) | Search Engine Optimization & Analytics (BT4212)
    Data Mining (ST4240) | Business Intelligence Systems (IS4240)
    Stochastic Models in Management (DSC3215) | Computational Methods for Business Analytics (BT3102) | Statistical Methods for Finance (ST4245)
    Social Media Network Analysis (IS4241) | Simulation (ST3247)
    Stochastic Process (ST3236) | Regression Analysis (ST3131)

  • 2016

    CFA Institute

    Passed Level I of the CFA Program


Footfall Count Estimation Techniques Using Mobile Data

2017 IEEE 18th International Conference on Mobile Data Management (MDM)


RNN Chinese Novel Generator

A Chinese text generator built with RNN (Recurrent Neural Network) and LSTM (Long Short-Term Memory) layers. The training text is Modu 《默读》, a popular Chinese web novel.

Flask Calendar Integrated with Plotly Charts

A concise calendar (FullCalendar) built with the Flask framework and integrated with plotly.js to show interactive charts of the data.


Somewhere in Singapore


