projects

modelscope

Embodied AI

Modelscope's Embodied AI Open-source Community

04/2024 - present

I initiated and managed Modelscope’s Embodied Intelligence open-source community; designed and promoted development processes using open-source models to popularize embodied intelligence. The homepage attracted more than 30,000 views.

Modelscope Homepage

Embodied AI Education

08/2024 - present

I gave a presentation on applying multimodal large language models to embodied AI. I also wrote survey articles on embodied AI, published on DataWhale and WayToAGI, to popularize the concepts of embodied AI.

Article 1 on DataWhale Article 2 on DataWhale

Article 1 on WayToAGI Article 2 on WayToAGI

python

Kpop Data Analysis

02/2021 - 04/2021

Part.1 Kpop Explained by Data

Analyzed data on all K-pop idols from the genre's beginnings to 2021, covering the K-pop industry, artists, and companies

pt.1 code

Part.2 Kpop Companies Explained by Data

Visualized the business performance of public K-pop companies and analyzed their artist management and international marketing strategies

pt.2 code

Here are interactive visualizations of the revenue and net income of Kpop agencies from 2016 to 2020. If you hover your pointer over the lines for a given year, the chart shows a hover box with the revenue or net income of all companies that year.

Chart of Revenue Chart of Net Income

Part.3 International Kpop Artists

In my last Kpop data analysis project, I realized that the dataset contained some mistakes in the nationalities of Kpop artists. I corrected the data and made a clearer visualization of international Kpop artists using Python and Plotly. This is an interactive choropleth of the nationalities of Kpop stars from outside South Korea. If you hover your pointer over the map, an information box shows how many Kpop stars are from that country.

pt.3 code Interactive Map

Part.4 Kpop On YouTube Explained by Data

As Kpop becomes increasingly international, YouTube plays a pivotal role as the digital platform where Kpop idols share their music videos with audiences all over the world. The view count is a key metric reflecting a music video's international popularity. I extracted the data on all Kpop music videos from Kpop Database and scraped the view counts of all 4262 music videos from YouTube as of 04/05/2021.

pt.4 code

Part.5 Why Do Kpop Groups Have So Many Members?

On average, Kpop groups have 5.5 members, and the 5-member group is the most common format. But why do some Kpop groups become so big? The largest Kpop group, NCT, has 23 members. I did an exploratory data analysis of how Kpop group sizes changed over time.

pt.5 code

python

Machine Learning in Python

Sentiment Analysis of Movie Reviews

01/2021

When you have a large number of movie reviews, how can you tell whether they are compliments or criticisms? In this project, I used natural language processing tools to classify the sentiment of text with both shallow learning and deep learning, and ran a sentiment analysis on a dataset of IMDb reviews.

pt.1-Basics pt.2-LSA pt.3-Ngram pt.4-BERT

Dimension Reduction with PCA

01/2021

In this article, I use Principal Component Analysis to showcase dimension reduction on the 'banknote authentication' dataset.
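To give a feel for what the dimension reduction does, here is a minimal sketch in plain Python. It is not the code from the article: the function name `first_principal_component` is made up for illustration, and it uses power iteration on the covariance matrix instead of a library routine. It assumes the data has non-zero variance and that the all-ones starting vector is not orthogonal to the leading direction.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, 1-D projections) for a list of equal-length rows.

    Hypothetical helper for illustration only.
    """
    n, d = len(rows), len(rows[0])

    # Center each feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize, which converges to the direction of largest variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project every centered row onto that direction: d features become 1.
    projected = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, projected


# Made-up points lying on the line y = 2x, so one component captures everything.
direction, coords = first_principal_component([(0, 0), (1, 2), (2, 4), (3, 6)])
```

Keeping more than one component would require removing the first direction from the covariance matrix and repeating the iteration, which a library implementation handles for you.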

Social Media Analytics

11/2020

Analyzed Trump's and Biden's recent tweets by scraping the tweets, investigating people's responses, and inspecting the tweets' contents

Calculating π by Monte-Carlo Simulation

07/2021

As we learn more math, we find more ways to calculate π. In computational statistics, there is a brute-force way to estimate π: Monte Carlo simulation. In this article, I run a simple Monte Carlo simulation to estimate π from the area of a circle. The same method can be applied to estimate the area of any geometric shape.
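The idea can be sketched in a few lines of Python. This is a minimal illustration, not the code from the article; the function name `estimate_pi` and its `seed` parameter are assumptions made here. It samples random points in the unit square and counts how many fall inside the quarter circle of radius 1.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate pi from random points in the unit square (illustrative sketch)."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # The point lies inside the quarter circle when x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Quarter-circle area / square area = pi / 4, so scale the hit ratio by 4.
    return 4.0 * inside / num_samples


print(estimate_pi(200_000, seed=0))
```

The estimate is noisy: its error shrinks roughly with the square root of the number of samples, so each extra digit of accuracy costs about 100 times more points. Swapping the inside-the-circle check for another shape's membership test gives the area of that shape instead.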

Bayesian Spam Filter

05/2020

In this project, I use a Naive Bayes classifier and a bag-of-words model to implement a Bayesian spam filter. This article walks you through implementation, training, and testing.
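As a rough sketch of how these pieces fit together (not the code from the article), the example below trains on a tiny made-up dataset. The names `TOY_MESSAGES`, `tokenize`, `train`, and `classify` are hypothetical, the tokenizer is a plain whitespace split, and Laplace (add-one) smoothing is an assumption chosen here so that unseen word and class combinations do not produce zero probabilities.

```python
import math
from collections import Counter

# Made-up training data for illustration only.
TOY_MESSAGES = [
    ("win cash prize now", "spam"),
    ("free prize click now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch tomorrow with the team", "ham"),
]


def tokenize(text):
    """Bag-of-words tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def train(messages):
    """Count words per class; assumes both classes appear in the data."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the class with the higher log posterior score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        # Log prior: fraction of training messages in this class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothed log likelihood of the word given the class.
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TOY_MESSAGES)
print(classify("free cash prize", model))
print(classify("lunch meeting tomorrow", model))
```

Working with log probabilities turns the product of many small word probabilities into a sum, which avoids numerical underflow on longer messages.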