Principal Component Analysis on the Twitter Data in the Restaurant Industry

Social Networking Services (SNS) have spread rapidly in Japan in recent years. Facebook, mixi and Twitter are among the most popular. Together with convenient tools such as smartphones, they are used in various fields of life. In this paper, principal component analysis and cluster analysis are carried out in order to clarify the relationship between corporate performance and the state of SNS utilization. We focus on the restaurant industry and the convenience store industry, where marketing competition that uses SNS to reach consumers is fierce. Marketing applications are then derived. A review of past research shows some related papers, but they do not use these analysis techniques. Moreover, little research has been done on the theme stated above. Some interesting results were obtained.


Introduction
Social Networking Services (SNS) have spread rapidly in Japan in recent years. Facebook, mixi and Twitter are among the most popular. In particular, the number of users has been increasing year by year, reaching 328 million as of September 2017. Together with convenient tools such as smartphones, these services are used in various fields of life. Twitter is widely used in the marketing activities of companies. They run campaigns through SNS, which have become very popular in Japan. It is reported that many companies have improved their corporate performance by utilizing SNS. In this paper, principal component analysis and cluster analysis are carried out in order to clarify the relationship between corporate performance and the state of SNS utilization. We focus on the restaurant industry and the convenience store industry, where marketing competition that uses SNS to reach consumers is fierce. Marketing applications are then derived. A review of past research shows some related papers, but they do not use these analysis techniques. Moreover, little research has been done on the theme stated above. Some interesting results were obtained.
Next, a plot is shown in Figure 1, where the X axis is the 1st principal component and the Y axis is the 2nd principal component. The 1st principal component reflects scale, corporate performance, number of tweets, number of followers, etc., and is therefore interpreted as "Scale". The 2nd principal component has large values for retweets, tweets retweeted and tweets favorited, and is therefore interpreted as "Diffusion".
Next, a plot is shown in Figure 2 for the 3rd principal component (X axis) and the 4th principal component (Y axis). Table 3 shows the results up to the 4th principal component. Next, a plot is shown in Figure 3 (the 1st principal component on the X axis and the 2nd principal component on the Y axis). We can observe the following 5 large clusters.

Right: Seven & i Holdings, Lawson
This is a group with high corporate performance and high-frequency SNS utilization.

Left Upper: McDonald's Holdings Company (Japan)
This company forms a group by itself. It scores high on the retweet-related items. It runs many campaigns and communicates well with consumers.

In another cluster, the scale is rather small and the number of tweets is rather low. These companies either do not put much effort into SNS, or their efforts do not attract much attention.
Next, a plot is shown in Figure 4, where the 3rd principal component is on the X axis and the 4th principal component is on the Y axis. Thus we obtained fruitful results by using principal component analysis.
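As an illustration of the procedure described above, the following is a minimal sketch of principal component analysis written with NumPy. It is not the software or the data used in this paper: the data matrix is randomly generated, the function name `pca` is hypothetical, and standardizing the variables (that is, working from the correlation matrix) is an assumption made here because items such as follower counts and reply counts differ greatly in magnitude.

```python
import numpy as np

def pca(X, n_components=4):
    """PCA by eigendecomposition of the correlation matrix.

    X is a (companies x variables) array.
    Returns (scores, loadings, explained_ratio).
    """
    X = np.asarray(X, dtype=float)
    # Standardize each variable to zero mean and unit (sample) variance.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    scores = Z @ eigvecs                    # company coordinates on each component
    loadings = eigvecs * np.sqrt(eigvals)   # correlation of variables with components
    explained_ratio = eigvals / R.shape[0]  # share of total variance per component
    return scores, loadings, explained_ratio

# Hypothetical data: 10 companies x 6 variables (e.g. sales, followers, tweets, ...).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=8.0, sigma=1.0, size=(10, 6))
scores, loadings, explained_ratio = pca(X, n_components=4)
```

Plotting `scores[:, 0]` against `scores[:, 1]` would give a chart of the kind described for the 1st and 2nd principal components, and `loadings` is what one would inspect to name a component, for example "Scale" or "Diffusion".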

Cluster Analysis
Cluster analysis is carried out in order to confirm the relationship (closeness) among companies. The data used are the same as those used in the principal component analysis. First, the cluster agglomeration process is shown in Table 4. Distances are calculated as squared Euclidean distances. The dendrogram obtained by Ward's method is shown in Figure 5.
Principal component analysis carries more information than cluster analysis because it retains the distance information in the plotting plane. Principal component analysis and cluster analysis have rarely been used together so far, because their methods and objectives are quite different. Nevertheless, we obtained remarkable results as stated above. This relationship should be examined in various cases.
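The combination of squared Euclidean distance and Ward's method mentioned above can be sketched as follows. This is a simple, unoptimized illustration under stated assumptions, not the software used in this paper: the function name `ward_merges`, the five points and their labels A to E are all hypothetical. At each step it joins the two clusters whose merger increases the within-cluster sum of squares the least, which is Ward's criterion.

```python
import numpy as np

def ward_merges(X, labels):
    """Agglomerative clustering with Ward's criterion.

    For clusters A and B the merge cost is
    |A||B| / (|A| + |B|) * ||mean(A) - mean(B)||^2 (squared Euclidean distance).
    Returns the merge history as (members_a, members_b, cost) tuples.
    """
    X = np.asarray(X, dtype=float)
    clusters = {i: [i] for i in range(len(X))}
    history = []
    next_id = len(X)
    while len(clusters) > 1:
        best = None
        ids = sorted(clusters)
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                A, B = X[clusters[a]], X[clusters[b]]
                diff = A.mean(axis=0) - B.mean(axis=0)
                cost = len(A) * len(B) / (len(A) + len(B)) * float(diff @ diff)
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        history.append(([labels[i] for i in clusters[a]],
                        [labels[i] for i in clusters[b]], cost))
        # Replace the two merged clusters with a new one.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return history

# Hypothetical coordinates for five companies labeled A to E.
points = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]]
labels = ["A", "B", "C", "D", "E"]
history = ward_merges(points, labels)
for members_a, members_b, cost in history:
    print(members_a, "+", members_b, f"cost={cost:.2f}")
```

The printed merge history plays the role of an agglomeration table, and drawing the merges with their costs as heights gives a dendrogram. The merge costs always add up to the total sum of squares around the overall mean, which is a convenient check on any implementation.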

Convenience Store Industry
We found that Seven & i Holdings and Lawson belong to the group with high corporate performance and high-frequency SNS utilization. They have more than 2 million Twitter followers and stand apart from the other companies.
MINISTOP has a relatively small following of 380 thousand, but its number of likes, 2,811, is the highest in the convenience store industry. Its total number of retweets, 966,527, is also the highest in the industry. MINISTOP appears to use some device to encourage consumers to retweet. Looking at its retweets in detail, MINISTOP posts tweets announcing that consumers who follow the account and retweet can win a coupon by lottery. As a result, many consumers retweet.
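To put the MINISTOP figures above on a per-follower basis, a short calculation such as the following may help. It uses only the numbers quoted in the text; the follower count is a rounded value, so the ratios are approximate, and the variable names are chosen here for illustration.

```python
# MINISTOP figures as quoted in the text; the follower count is rounded.
followers = 380_000
likes = 2_811
retweets = 966_527

retweets_per_follower = retweets / followers
likes_per_thousand_followers = likes / followers * 1_000

print(f"retweets per follower: {retweets_per_follower:.2f}")
print(f"likes per 1,000 followers: {likes_per_thousand_followers:.1f}")
```

Normalizing by followers in this way is one simple means of comparing companies whose follower counts differ widely.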

Restaurant Industry
From Figure 3, we can observe that McDonald's stands out overwhelmingly in the retweet-related items. An example of a McDonald's campaign designed to stimulate retweets is as follows.
If consumers follow the McDonald's account (@McDonalds Japan) and retweet the tweet to be posted at 20:00 on May 23, five people are selected by lottery and are given "Suitable burger" according to the number of followers.
From Figure 4, we can observe that KFC Holdings Japan is in a high-communication group. It has 540 thousand followers, which is 1/4 of McDonald's, but its number of tweets is 240 thousand, which is 30 times that of McDonald's, the number of accounts it follows is 6 thousand, which is 15 times, and its number of replies is 3 thousand, which is 30 times that of McDonald's. KFC Holdings Japan uses the following device.
Consumers can get KFC's LINE stamp free of charge simply by following the account, even if they do not retweet.
Thus, each company is making every effort to sharpen its competitive edge.
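The comparison between KFC Holdings Japan and McDonald's above is stated as ratios. As a rough check, the McDonald's values implied by those ratios can be worked out as follows. The inputs are the rounded figures quoted in the text, so the results are only approximate implied values, not measured data, and the dictionary names are chosen here for illustration.

```python
# KFC Holdings Japan figures as quoted in the text (rounded values).
kfc = {"followers": 540_000, "tweets": 240_000, "following": 6_000, "replies": 3_000}
# KFC-to-McDonald's ratios as quoted in the text.
ratio_to_mcdonalds = {"followers": 1 / 4, "tweets": 30, "following": 15, "replies": 30}

# Implied McDonald's values, derived only from the rounded figures above.
implied_mcdonalds = {key: kfc[key] / ratio_to_mcdonalds[key] for key in kfc}

for key, value in implied_mcdonalds.items():
    print(f"{key}: KFC {kfc[key]:,} vs. McDonald's about {value:,.0f}")
```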

Conclusion
Social Networking Services (SNS) have spread rapidly in Japan in recent years. Facebook, mixi and Twitter are among the most popular. Together with convenient tools such as smartphones, they are used in various fields of life. In this paper, principal component analysis and cluster analysis were carried out in order to clarify the relationship between corporate performance and the state of SNS utilization. We focused on the restaurant industry and the convenience store industry, where marketing competition that uses SNS to reach consumers is fierce. Marketing applications were then derived.
The main results of the principal component analysis are as follows.
In the chart of the 1st principal component (X axis) and the 2nd principal component (Y axis), we can observe the following 5 large clusters.

Right: Seven & i Holdings, Lawson
This is a group with high corporate performance and high-frequency SNS utilization.

Left Upper: McDonald's Holdings Company (Japan)
This company forms a group by itself. It scores high on the retweet-related items. It runs many campaigns and communicates well with consumers.

Lower Right: KFC Holdings Japan, SKYLARK (GUSTO), FamilyMart
In this cluster, corporate performance and scale are rather large, and the retweet-related items are slightly low.

In another cluster, the scale is rather small and the number of tweets is rather low. These companies either do not put much effort into SNS, or their efforts do not attract much attention.
Cluster analysis was carried out in order to confirm the relationship (closeness) among companies. The data used were the same as those used in the principal component analysis.
In the principal component analysis for the 1st and 2nd principal components, we could observe 5 large clusters as stated above. The cluster analysis coincided entirely with these results. This is a truly striking result. Principal component analysis and cluster analysis have rarely been used together so far, because their methods and objectives are quite different. Nevertheless, we obtained remarkable results as stated above. This relationship should be examined in various cases.
These results can be used to build more effective and useful SNS marketing plans. Although this study is limited in the number of cases examined, fruitful results were obtained. Confirming the findings with new, successive records remains a topic for future work.
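When this relationship is examined in further cases, the degree to which the groups read from the principal component plot coincide with the clusters from cluster analysis could be expressed as a number. One common choice, which was not used in this paper and is introduced here only as a suggestion, is the adjusted Rand index. The sketch below uses only the Python standard library; the group assignments are hypothetical and do not reproduce the results of this paper.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two groupings of the same items.

    1.0 means the groupings are identical (up to renaming of groups);
    values near 0 mean agreement no better than chance.
    """
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two label lists of equal length >= 2")
    # Count pairs of items placed together in both, in A only, in B only.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        # Both groupings are trivial (all in one group or all singletons).
        return 1.0
    return (together_both - expected) / (maximum - expected)

# Hypothetical assignments for eight companies.
from_pca_plot = ["right", "right", "upper_left", "lower_right",
                 "lower_right", "lower_right", "other", "other"]
from_ward_same = [0, 0, 1, 2, 2, 2, 3, 3]
from_ward_diff = [0, 0, 1, 2, 2, 3, 3, 3]

ari_same = adjusted_rand_index(from_pca_plot, from_ward_same)
ari_diff = adjusted_rand_index(from_pca_plot, from_ward_diff)
print(ari_same, ari_diff)
```

An index of 1.0 corresponds to the complete coincidence reported above, while lower values indicate that some companies are grouped differently by the two methods.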