As I pointed out in Part 1 of Kpop Data Analysis, Kpop groups have 5.5 members on average, and the 5-member group is the most common form. A Kpop group usually consists of a main vocalist, a main rapper, a main dancer, and a visual (the gorgeous member who usually does acting and modeling as solo activities). That is all it takes to form a small Kpop group like BLACKPINK or aespa. Larger groups add positions such as lead vocalists, sub-vocalists, lead dancers, and lead rappers. But why do some Kpop groups become so big? The largest Kpop group, NCT, has 23 members. To find out, I did an exploratory data analysis of Kpop group sizes by debut year.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams['figure.figsize'] = [12, 6]
boys = pd.read_csv('kpop_idols_boy_groups.csv')
girls = pd.read_csv('kpop_idols_girl_groups.csv')
all_groups = pd.concat([boys, girls])
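# Keep only the debut year from the full debut date
all_groups['Debut'] = pd.to_datetime(all_groups['Debut']).dt.year
# Treat empty company names as missing (assign back instead of using inplace on a column)
all_groups['Company'] = all_groups['Company'].replace('', np.nan)
# Split multi-company entries such as "A, B" into lists
all_groups['Company'] = all_groups['Company'].str.split(", ")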
average = round(all_groups.groupby("Debut")['Orig. Memb.'].mean(),1)
average.plot(title="Average Group Size by Debut Year", ylabel="Members", xlabel="Debut Year")
average
Debut
1995    2.0
1996    5.0
1997    4.7
1998    6.0
1999    3.5
2001    4.0
2003    5.0
2004    4.0
2005    6.0
2006    4.5
2007    5.4
2008    4.8
2009    5.0
2010    5.5
2011    5.3
2012    5.7
2013    6.2
2014    5.3
2015    6.2
2016    6.1
2017    6.7
2018    6.6
2019    5.6
2020    5.9
2021    7.0
Name: Orig. Memb., dtype: float64
As the statistics above show, the average Kpop group size has generally increased over the years. Too few Kpop groups debuted before 2000 for those years to be a reliable sample, so we can just look at the data after 2000. From 2001 to 2021, the average group size rose from 4 to 7. In particular, there is a small peak in 2005. That is the year Super Junior, the first large-size (male) Kpop group, debuted.
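The sample-size caveat above can be checked by putting the number of debuts next to each yearly average. This is a minimal sketch, assuming the same `Debut` and `Orig. Memb.` columns as the dataset; the function name `yearly_size_summary`, the `min_groups` cutoff of 3, and the toy rows are made up for illustration and are not part of the original analysis.

```python
import pandas as pd

def yearly_size_summary(groups: pd.DataFrame, min_groups: int = 3) -> pd.DataFrame:
    """Mean group size and number of debuts per year.

    Years with fewer than `min_groups` debuts are dropped, because an
    average over one or two groups says little about the trend.
    """
    summary = (
        groups.groupby("Debut")["Orig. Memb."]
        .agg(mean_size="mean", n_groups="count")
        .round({"mean_size": 1})
    )
    return summary[summary["n_groups"] >= min_groups]

# Toy rows, made up for illustration (not taken from the real dataset).
toy = pd.DataFrame({
    "Debut": [1995, 2016, 2016, 2016, 2017, 2017, 2017, 2017],
    "Orig. Memb.": [2, 4, 7, 9, 5, 6, 7, 12],
})
summary = yearly_size_summary(toy)
print(summary)
```

On the real data, `yearly_size_summary(all_groups)` would show which years rest on only a handful of groups. The right cutoff is a judgment call.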
If we consider a Kpop group with 2-4 people a small-size group and one with 5-7 people a medium-size group, then a group with more than 7 people is a large-size group. Super Junior debuted with a whopping 12 members and had 13 members at their peak. Super Junior's agency, SM Entertainment, also debuted the first large-size female Kpop group in 2007: Girls' Generation (SNSD), with 9 members. This is the reason for the other small peak in 2007 in the chart above.
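The size categories defined above can be written as a small binning step. This is only a sketch: the names `SIZE_BINS`, `SIZE_LABELS`, and `size_category` are made up for illustration, and the example sizes are plain numbers chosen to hit the category boundaries.

```python
import pandas as pd

# Bin edges follow the definition in the text:
# 2-4 members = small, 5-7 = medium, more than 7 = large.
# pd.cut uses right-closed intervals by default: (1, 4], (4, 7], (7, inf].
# A size of 1 falls outside the bins and comes out as NaN.
SIZE_BINS = [1, 4, 7, float("inf")]
SIZE_LABELS = ["small", "medium", "large"]

def size_category(members: pd.Series) -> pd.Series:
    """Map member counts to 'small', 'medium', or 'large'."""
    return pd.cut(members, bins=SIZE_BINS, labels=SIZE_LABELS)

categories = size_category(pd.Series([2, 4, 5, 7, 8, 12]))
print(categories.tolist())
# ['small', 'small', 'medium', 'medium', 'large', 'large']
```

Applied to the dataset, `size_category(all_groups["Orig. Memb."]).value_counts()` would give the number of groups in each category.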
Producing large-size Kpop groups was a bold and innovative experiment for SM Entertainment at first. Having so many people on stage might seem excessive. But eventually, Super Junior and Girls' Generation both became some of the most successful Kpop groups, because their activities were arranged differently from those of smaller groups. Here are the advantages of big groups:
Because of the early big groups' success, Kpop companies went on to produce more and more big groups. So far, there are 59 large-size groups, making up 16.21% of all Kpop groups in the dataset.
big = all_groups[all_groups["Orig. Memb."]>7]
big.shape[0]
59
round(big.shape[0]/all_groups.shape[0],4)
0.1621
big = big.sort_values(by=['Debut'])
big.groupby('Debut')['Name'].count()
Debut
2005     1
2007     1
2010     2
2011     1
2012     2
2013     2
2014     1
2015     4
2016     7
2017    12
2018    14
2019     3
2020     8
2021     1
Name: Name, dtype: int64
Here is the timeline of big groups only. We can see that big groups started with the 2nd Generation Kpop groups (2004 - 2011). This was the period when Kpop groups developed quickly and produced the first batch of successful big groups, like Super Junior and Girls' Generation, which I mentioned above. But at most 1 or 2 big groups debuted per year from 2005 to 2014. From 2015 onward, among the 3rd Generation Kpop groups, the number of big groups debuting each year increased drastically and reached a maximum of 14 groups in 2018.
A feature of this era is the craze for survival reality shows, where trainees compete with each other for the chance to debut in a big group, decided by audience votes. In 2015, JYP Entertainment put 16 of its female trainees on its survival reality show and debuted the 9-member group TWICE. TWICE later became one of the most successful Kpop groups of the time. From 2016 to 2019, Mnet made 4 seasons of the survival reality show Produce 101. The show gathered 101 trainees each season from different Kpop companies. For Season 3, Produce 48, the show even gathered Jpop idols from the AKB48 family of groups. The Produce 101 series produced three 11-member groups (I.O.I, Wanna One, and X1) and the 12-member IZ*ONE. Because the shows themselves were popular and the members were selected by the audience, these big groups from survival reality shows also became extremely successful.
big.groupby('Debut')['Name'].count().plot.bar(title="Big Groups by Debut Year", xlabel="Debut Year",rot=0)
[Figure: bar chart "Big Groups by Debut Year"; x-axis: Debut Year]
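Raw counts can rise simply because more groups debut overall, so it may also help to look at big groups as a share of each year's debuts. This is a minimal sketch, assuming the same `Debut` and `Orig. Memb.` columns; the function name `big_group_share` and the toy rows are made up for illustration, and this step was not part of the original analysis.

```python
import pandas as pd

def big_group_share(groups: pd.DataFrame, threshold: int = 7) -> pd.Series:
    """Percentage of each year's debuts with more than `threshold` members."""
    is_big = groups["Orig. Memb."] > threshold
    # The mean of a boolean column is the fraction of True values.
    return (is_big.groupby(groups["Debut"]).mean() * 100).round(1)

# Toy rows, made up for illustration (not taken from the real dataset).
toy = pd.DataFrame({
    "Debut": [2016, 2016, 2016, 2017, 2017, 2017, 2017],
    "Orig. Memb.": [4, 7, 9, 5, 6, 9, 12],
})
share = big_group_share(toy)
print(share)
```

On the real data, `big_group_share(all_groups).plot.bar()` would show whether the rise in big groups outpaced the overall growth in debuts.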
However, after 2018, the number of big groups debuting decreased significantly, from 14 in 2018 to 3 in 2019, with a partial rebound to 8 in 2020. I could not find an evidence-backed explanation for that. If you have any thoughts, please let me know. I can only speculate that the market for big groups may have reached saturation by then. Although big groups have great advantages that make them successful, there are also disadvantages: