Two Models for Electricity Demand using Keyword Search Volume and Panel Artificial Neural Network
- Other Titles
- 전력 수요 분석을 위한 키워드 검색량 모형과 패널 인공신경망 모형
- Jinsoo Kim
- Issue Date
- Hanyang University
- Big data analysis and machine learning are increasingly important tools in data analysis. Big data refers to the collection and maintenance of very large amounts of raw data for field-specific analysis, and machine learning is the main analytical tool for handling such data. This study investigates the applicability of keyword search volume and develops an ANN (Artificial Neural Network) model using panel data to analyze electricity demand and forecast prices. Keyword search volume has not been used in econometric analysis, especially in energy economics, so this study builds a new electricity demand model around it. In addition, since no model-building study has applied panel data, this study constructs a novel panel ANN model. The study consists of two essays: the development of a panel analysis model and the development of a panel ANN model.
The first essay derives the relationship between US household electricity consumption and renewable energy. For this purpose, keyword search volume is used to introduce new explanatory factors into the analysis of economic indicators. The model considers three keywords related to electricity consumption: “renewable,” “weather forecast,” and “temperature.” Because there has been no way to quantify household renewable energy consumption, no studies have analyzed the correlation between renewable energy and US household electricity consumption. Such consumption is difficult to estimate, and it is harder to measure than consumption in other major sectors such as commerce and industry because of issues related to personal information collection and the cost of measurement. This study therefore analyzes the correlation with household electricity consumption by constructing a model that captures interest in renewable energy through keyword search volume.
The model, which analyzes the impact of these keywords, consists of three regression equations based on the static energy demand model. Households use a variety of renewable energy sources, but it is difficult to derive the economic implications of that use because it is not converted into a quantifiable value. This study therefore uses the search keyword “renewable” to estimate the impact of renewable energy. “Weather forecast” and “temperature” were also selected as Internet search keywords because temperature is one of the important factors determining household electricity consumption.
The results show that all the variables are stationary, and the Hausman test indicates that fixed effects estimation is more appropriate than random effects estimation. In the model that uses the keyword “renewable” as an explanatory variable, all the variables except the price variable are statistically significant at the 1% level, and the search term has a negative correlation with household electricity consumption. Household electricity consumption decreases by 16.017 million kWh for every one-unit increase in the search volume for “renewable.” “Temperature” also has a negative coefficient, similar to that of heating degree days.
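The fixed effects estimation mentioned above can be illustrated with a minimal sketch of the within estimator. This is not the thesis's code or data: the variable names (`search`, `hdd`), the synthetic panel, and the coefficient values are all assumptions chosen only to show how demeaning by state removes state-specific effects before ordinary least squares is applied.

```python
import numpy as np

def within_estimator(y, X, groups):
    """Fixed-effects (within) estimator: demean y and X by entity, then run OLS."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    y_dm = y.copy()
    X_dm = X.copy()
    for g in np.unique(groups):
        mask = groups == g
        # Subtracting the entity mean removes the time-invariant state effect.
        y_dm[mask] -= y[mask].mean()
        X_dm[mask] -= X[mask].mean(axis=0)
    beta, *_ = np.linalg.lstsq(X_dm, y_dm, rcond=None)
    return beta

# Synthetic panel (hypothetical values, not the thesis data).
rng = np.random.default_rng(0)
n_states, n_periods = 5, 40
groups = np.repeat(np.arange(n_states), n_periods)
state_effect = rng.normal(0.0, 10.0, n_states)[groups]  # unobserved heterogeneity
search = rng.normal(50.0, 10.0, n_states * n_periods)   # stand-in for search volume
hdd = rng.normal(300.0, 50.0, n_states * n_periods)     # stand-in for heating degree days
X = np.column_stack([search, hdd])
true_beta = np.array([-2.0, 0.5])                        # assumed coefficients
y = state_effect + X @ true_beta + rng.normal(0.0, 0.1, len(groups))

beta_hat = within_estimator(y, X, groups)
print(beta_hat)
```

In this sketch the estimated coefficients recover the assumed values even though each state has its own intercept, which is the property that makes the fixed effects estimator attractive when state effects may be correlated with the regressors.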
The correlation between these two variables, which intuitively appear to be unrelated, could be meaningful. When people search for “renewable” in the context of their household, they probably have a clear purpose. When electricity consumption is excessive or electricity bills are high, households search for alternatives that reduce electricity consumption. For households equipped with renewable energy facilities, power consumption decreases in proportion to the installed capacity, which is consistent with the estimation results.
This study finds that the correlation coefficient of the “renewable” variable is the highest, and that the “temperature” variable also has a significant correlation with household electricity consumption. The “renewable” keyword has a large negative correlation with household electricity consumption, which can be interpreted as a result of the growing interest in renewable energy. Although household electricity consumption patterns are influenced by many variables, this study suggests that interest in renewable energy should also be included as a major factor.
The second essay predicts electricity prices using ANNs, which are already used as prediction tools in various fields. In economic analysis, ANNs have generally been used for short-term forecasting, because prediction accuracy decreases sharply as the forecast horizon lengthens. On the same dataset, long-term forecasts are therefore less accurate than short-term forecasts. This study uses panel data to compensate for that decline in ANN forecasting accuracy over long horizons.
Panel data contains information that time series data does not. It carries the trend information of time series data as well as state or country characteristics. However, very few studies in economics have used panel data for prediction with ANNs, and the existing ones use panel data without differentiating between entities in the model structure. Those panel ANN studies did not distinguish between entities such as states or countries and did not train on them independently, much like the pooled OLS method. This study therefore constructs a panel ANN structure that exploits the advantages of panel data and analyzes how its accuracy changes with the forecasting period. The model is intended to improve the accuracy of predicted values by learning the unobserved heterogeneity contained in the panel data of each state. The analysis assumes that it is possible to learn not only time series information but also country or state information.
Conventional panel analysis removes the cross-sectional dependence in the unobserved heterogeneity of the panel data. In contrast, this study constructs a model structure that learns the unobserved heterogeneity of such data. Learning is conducted separately for each state, and two or three hidden layers are used. After forecasting 6, 12, 18, and 24 months ahead, the total RMSE (root mean squared error) and MAPE (mean absolute percentage error) are computed and the optimal model is selected.
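The two accuracy measures named above can be sketched as follows. The abstract does not say how the "total" figures are aggregated, so pooling all states and all forecast months into one score per horizon is an assumption here, as are the function name `errors_by_horizon` and the synthetic arrays.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def errors_by_horizon(actual, predicted, horizons=(6, 12, 18, 24)):
    """Score the first h forecast months for each horizon h.

    Both arrays have shape (n_states, n_months); errors are pooled over
    states, which is an assumed reading of "total" RMSE and MAPE.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return {
        h: (rmse(actual[:, :h], predicted[:, :h]),
            mape(actual[:, :h], predicted[:, :h]))
        for h in horizons
    }

# Synthetic example (hypothetical values): 3 states, 24 forecast months,
# with every prediction 5% above the actual price.
rng = np.random.default_rng(1)
actual = 10.0 + rng.random((3, 24))
predicted = actual * 1.05
scores = errors_by_horizon(actual, predicted)
for h, (r, m) in scores.items():
    print(h, round(r, 4), round(m, 4))
```

Comparing these per-horizon scores between a time series model and a panel model is one way to reproduce the kind of horizon-by-horizon comparison the study reports.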
For the empirical analysis, this study uses panel data on US electricity prices by state. Natural gas prices are also predicted as an additional check on the model. For electricity prices, the 6- and 12-month forecasts are more accurate with time series data than with panel data, whereas the 18- and 24-month forecasts are much more accurate with panel data. For natural gas Citygate prices, the time series model is better only for the 6-month forecast, and the panel data model is more accurate at every longer horizon. Notably, panel data models tend to become relatively more accurate as the forecast period lengthens. Although the horizon at which the improvement appears differs, in both cases the panel data model outperforms in long-term predictions.
According to the results, when only a small number of values are predicted, the trend of the time series data greatly influences the result and a time series model produces better predictions. On the other hand, the longer the forecast period, the better the panel data model performs, because it learns from the unobserved heterogeneity of the states rather than from the trends alone. Since weights are updated without affecting each layer, it can be said that the model learns by considering the heterogeneity of each state. Compared with a time series model in which only the trend is learned, the panel data model uses more information to improve accuracy by learning both the trends and the heterogeneity of each state.
In this study, electricity consumption is analyzed using panel data and electricity price prediction is performed. The electricity consumption analysis suggests a new approach based on the model considered in household electricity consumption literature that incorporates data drawn from keyword search volume. This study used keyword search volume as a substitute variable to analyze the phenomenon that was impossible to explain due to the lack of quantitative data. This study shows that variables that have not been used hitherto, as they are not quantifiable or statistically significant, can be analyzed through keyword search volume.
In the electricity price forecasting analysis, a novel panel ANN model is proposed to compensate for the decrease in forecasting accuracy as the forecasting period lengthens. Depending on the type of panel data, the panel ANN can be applied to anything from hourly and daily forecasts to long-term trends spanning several years. For long-term trend analysis, a neural network model could also be constructed to replace large-scale simulation models such as NEMS (National Energy Modeling System) and WEM (World Energy Model). This model can therefore be applied in fields ranging from hourly price forecasts for the next day's electricity market to the long-term trend of CO2 emissions.
This study analyzed electricity consumption using panel data and performed electricity price forecasting. The panel analysis of electricity consumption proposes a new model by combining keyword search volume data with the model considered in the residential electricity consumption literature. Although electricity consumption is affected not only by the installed capacity of new and renewable energy facilities but also by interest in renewable energy, this effect could not previously be explained because no quantitative data existed; this study analyzes it using keyword search volume as a proxy variable. In conclusion, this study shows that variables that have not been used until now, because they are not quantifiable or statistically significant, can be analyzed through keyword search volume.
- Appears in Collections:
- GRADUATE SCHOOL[S](대학원) > EARTH RESOURCES AND ENVIRONMENTAL ENGINEERING(자원환경공학과) > Theses (Ph.D.)