[MGT6705] Time Series Data Analysis and Forecasting Final-term Project
Project Overview
This project forecasts monthly Japanese tourist arrivals to Korea and examines how trends in Korea-related search keywords relate to travel behavior. Using Google Trends data and actual tourist arrival statistics, we developed interpretable models to improve prediction accuracy and inform tourism policy. I served as team leader, responsible for the topic proposal, data analysis, and report writing.
Methodology
- Dataset: Monthly Japanese tourist arrivals (2010–2023) + 9 Korea-related search terms (e.g., "K-pop", "Samgyeopsal", "Myeongdong", searched in Japanese)
- Model: SARIMA & SARIMAX (with lagged external regressors)
- Time Series Decomposition: STL (Seasonal-Trend decomposition using Loess) to separate seasonal and trend components
- Model Selection: Grid search over 15,625 SARIMAX lag combinations
- Evaluation Metric: Root Mean Squared Error (RMSE)
- Final Selected Variables: Samgyeopsal, K-pop, Myeongdong, Hangang, Dakgalbi (with optimized lags)
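The lag grid search and RMSE evaluation described above can be sketched roughly as follows. This is a minimal, standard-library illustration and not the project's actual code: the helper names (`rmse`, `lag_series`, `lag_grid`, `grid_search`) and the `score` callback are hypothetical. In a real run, `score` would fit a SARIMAX model on the lagged regressors (for example with a library such as statsmodels) and return its validation RMSE. The breakdown of the 15,625 combinations is not stated in this summary; six regressors with five candidate lags each (5^6 = 15,625) is one possibility and is used here purely as an assumption.

```python
from itertools import product
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def lag_series(values, lag):
    """Shift a series so that x[t - lag] lines up with y[t].

    The first `lag` positions have no history and are filled with None;
    the output has the same length as the input.
    """
    n = len(values)
    lag = min(lag, n)
    return [None] * lag + list(values[: n - lag])


def lag_grid(names, candidate_lags):
    """Yield one {regressor_name: lag} dict per lag combination."""
    for combo in product(candidate_lags, repeat=len(names)):
        yield dict(zip(names, combo))


def grid_search(names, candidate_lags, score):
    """Return (best_lags, best_score); a lower score is better.

    `score` is a caller-supplied function mapping a lags dict to a
    validation error (hypothetically, the RMSE of a fitted SARIMAX model).
    """
    best_lags, best_score = None, float("inf")
    for lags in lag_grid(names, candidate_lags):
        s = score(lags)
        if s < best_score:
            best_lags, best_score = lags, s
    return best_lags, best_score


# Assumed grid shape: 6 placeholder regressors x 5 candidate lags (0..4 months).
placeholder_names = ["x1", "x2", "x3", "x4", "x5", "x6"]
n_combinations = sum(1 for _ in lag_grid(placeholder_names, range(5)))
```

Keeping the scorer as a callback separates the enumeration of lag combinations from model fitting, so the same loop works whether each candidate is scored by a full SARIMAX fit or by a cheaper proxy.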
Key Findings
- Samgyeopsal search interest precedes increases in tourist arrivals by about 3 months, making it a strong longer-lead predictor.
- Myeongdong and Hangang show a short-lag response (0–1 month), which suits time-sensitive campaigns.
- SARIMAX models outperformed the SARIMA baselines in forecast accuracy (lower RMSE).
- Seasonal lag patterns provide insight into how cultural and culinary interests shape Japanese tourist flow.
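One simple way to explore lead times like those above is a lagged correlation scan: correlate search interest at month t - k with arrivals at month t for several values of k and see which k gives the strongest relationship. The sketch below is an illustration only, not the procedure used in the project (the lags there came from the SARIMAX grid search). The function names are hypothetical and the data are synthetic, constructed so that arrivals copy search interest 3 months later.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lead_lag_correlations(search, arrivals, max_lag):
    """Return {k: corr(search[t - k], arrivals[t])} for k = 0..max_lag."""
    if len(search) != len(arrivals):
        raise ValueError("series must have the same length")
    n = len(arrivals)
    return {k: pearson(search[: n - k], arrivals[k:]) for k in range(max_lag + 1)}


def best_lead(search, arrivals, max_lag):
    """Lag (in months) at which search interest correlates most with arrivals."""
    corrs = lead_lag_correlations(search, arrivals, max_lag)
    return max(corrs, key=corrs.get)


# Synthetic example: arrivals repeat the search series with a 3-month delay.
search = [(i * 7) % 11 for i in range(24)]
arrivals = [0, 0, 0] + search[:-3]
corrs = lead_lag_correlations(search, arrivals, max_lag=6)
lead = best_lead(search, arrivals, max_lag=6)
```

On real monthly data, shared seasonality and trend can inflate raw correlations, so a scan like this is usually run on differenced or deseasonalized series and treated as exploratory rather than conclusive.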
Policy Recommendations
- Run short-term digital ad campaigns for locations like Myeongdong about 1 month ahead of the target travel period.
- Launch food-themed cultural promotions (e.g., Samgyeopsal Weeks) 2–3 months before peak seasons.
- Leverage K-pop concerts (e.g., Waterbomb) to stimulate visits in summer, traditionally a low-demand season.
These strategies allow tourism stakeholders to act on behavioral lead times reflected in online search trends.