Strive to be a great data miner, to uncover business insights, and to build practical applications.
I'm Yingchi, born in China and currently working as a data scientist at DataSpark, Singapore. I specialize in data mining and machine learning, and enjoy working with big data techniques and environments.
Topics of interest: regression, classification, text mining, neural networks and more...
I love taekwondo, piano and ice cream. And I'm keen to learn, experience and share.
• Part of the product team.
• Work on geoanalytics and devops.
• Established the pipeline of internal metrics reporting by understanding the raw data, current data management system and the requirements from various team leaders
• Produced dashboards on system and business performance to enable stakeholders to make effective decisions, using Chartio and SQL
• Assisted engineering teams in database design
• Part of the production team at DataSpark as a Data Science Intern
• Worked in a big data environment with Hadoop as the data management platform and Apache Spark as the data processing engine
• Conducted several geolocation data analysis projects which added new features to our data solutions and improved the model accuracy
• Applied models such as Maximum A Posteriori (MAP) Estimation, Naive Bayes, Multinomial Logit Regression, and Support Vector Machine (SVM)
• Implemented reproducible code using R Markdown and Python for the projects
• Worked on a long-term marketing research project for Budweiser
• Prepared the 2015 Q1 report for Budweiser on time with another team member; the report received a great response from the client
• Collected and compiled the consumer survey data weekly using SPSS Survey Reporter
• Detected several problems and initiated deep dive research to find explanations
• Explored third-party data to find the causes of problems we had identified across various data sources
Main courses taken:
Mining Web Data for Business Insights (BT4222) | Search Engine Optimization & Analytics (BT4212)
Data Mining (ST4240) | Business Intelligence Systems (IS4240)
Stochastic Models in Management (DSC3215) | Computational Methods for Business Analytics (BT3102)
Statistical Methods for Finance (ST4245)
Social Media Network Analysis (IS4241) | Simulation (ST3247)
Stochastic Process (ST3236) | Regression Analysis (ST3131)
2017 IEEE 18th International Conference on Mobile Data Management (MDM)
A Chinese text generator using an RNN (Recurrent Neural Network) with LSTM (Long Short-Term Memory) layers. The training text is Modu 《默读》, a popular Chinese web novel.
Somewhere in Singapore