Abstract
Sequential pattern mining (SPM) is widely applied in online business, e-commerce, bioinformatics, and related fields. Traditional SPM approaches struggle to mine very large volumes of data accurately. The proposed work therefore employs a deep learning-based sequential mining model to reduce the complexity of handling such data. Application areas such as online retailing, finance, and e-commerce see their data change dynamically, which makes the data non-stationary. The proposed work therefore uses discrete wavelet analysis to convert the non-stationary data into time series. In the proposed SPM, a reformed hybrid combination of a convolutional neural network (CNN) and long short-term memory (LSTM) is designed to discover customer behavior and purchasing patterns over time. The CNN finds the frequent itemsets of interest at the end of the pattern, and the LSTM finds the time interval between each pair of successive itemsets. The proposed work mines sequential patterns from a progressive database, which removes obsolete data. Finally, the accuracy of the proposed work is compared with that of several traditional algorithms to demonstrate its robustness.
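The idea of a progressive database that discards obsolete data, together with the time interval between successive itemsets, can be illustrated with a minimal Python sketch. It is not the CNN-LSTM model described in the abstract: it only shows a sliding window over timestamped transactions. The class name `ProgressiveDatabase`, the `poi` (period of interest) parameter, and the helper `intervals` are hypothetical names chosen for illustration, and the sketch assumes transactions arrive in non-decreasing timestamp order.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProgressiveDatabase:
    """Keeps only transactions inside a sliding period of interest (POI).

    Hypothetical illustration: `poi` is the window length, expressed in the
    same units as the timestamps.
    """
    poi: int
    events: deque = field(default_factory=deque)  # (timestamp, customer, itemset)

    def add(self, timestamp, customer, itemset):
        """Append a new transaction, then drop transactions older than the POI."""
        self.events.append((timestamp, customer, frozenset(itemset)))
        # Assumes timestamps are non-decreasing, so obsolete data sits at the front.
        while self.events and self.events[0][0] <= timestamp - self.poi:
            self.events.popleft()

    def sequences(self):
        """Group surviving transactions into one time-ordered sequence per customer."""
        grouped = {}
        for timestamp, customer, itemset in self.events:
            grouped.setdefault(customer, []).append((timestamp, itemset))
        return grouped


def intervals(sequence):
    """Time gaps between each pair of successive itemsets in one sequence."""
    return [later[0] - earlier[0] for earlier, later in zip(sequence, sequence[1:])]


db = ProgressiveDatabase(poi=5)
db.add(1, "c1", {"a"})
db.add(3, "c1", {"b"})
db.add(4, "c2", {"a"})
db.add(7, "c1", {"c"})  # the transaction at timestamp 1 is now obsolete

current = db.sequences()
gaps_c1 = intervals(current["c1"])
```

In this sketch, a learned model would consume the per-customer sequences and interval lists; here they are computed directly so that the windowing behaviour is easy to inspect.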
Funding
There is no funding for this study.
Author information
Contributions
All the authors have participated in writing the manuscript and have revised the final version. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants and/or animals performed by any of the authors.
Informed consent
There is no informed consent for this study.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Jamshed, A., Mallick, B. & Kumar, P. Deep learning-based sequential pattern mining for progressive database. Soft Comput 24, 17233–17246 (2020). https://doi.org/10.1007/s00500-020-05015-2
DOI: https://doi.org/10.1007/s00500-020-05015-2