Big data in cybersecurity: a survey of applications and future trends

With over 4.57 billion people using the Internet in 2020, more than 2.5 quintillion bytes of data are being generated per day. This rapid growth in data generation has pushed the applications of big data to new heights, and cybersecurity is one of them. This paper presents a thorough survey of the use of big data analytics in building, improving, or defeating cybersecurity systems, covering state-of-the-art research across the different areas of application. It groups these applications into intrusion and anomaly detection, spamming and spoofing detection, malware and ransomware detection, code security, and cloud security, along with a further category covering other directions of research in big data and cybersecurity. The paper concludes by pointing to possible future directions for research on big data applications in cybersecurity.

Author information

Authors and Affiliations

Mohammed M. Alani, Senior Member of the ACM, Toronto, Canada

Appendix A: Literature summary tables

See Tables 1, 2, 3, 4, 5 and 6.

Table 1 Summary of papers in intrusion and anomaly detection section
Table 2 Summary of papers in spamming, spoofing, and phishing detection section
Table 3 Summary of papers in malware and ransomware section
Table 4 Summary of papers in code security section
Table 5 Summary of papers in cloud security section
Table 6 Summary of papers in other applications section

About this article

Cite this article

Alani, M.M. Big data in cybersecurity: a survey of applications and future trends. J Reliable Intell Environ 7, 85–114 (2021). https://doi.org/10.1007/s40860-020-00120-3
