A survey on big data analytics using social media data

Author(s):  
P. Victer Paul ◽  
K. Monica ◽  
M. Trishanka
2021 ◽  
Vol 12 ◽  
Author(s):  
Muhammad Usman Tariq ◽  
Muhammad Babar ◽  
Marc Poulin ◽  
Akmal Saeed Khattak ◽  
Mohammad Dahman Alshehri ◽  
...  

Intelligent big data analysis is an evolving pattern in the age of big data science and artificial intelligence (AI). Analysis of organized data has been very successful, but analyzing human behavior using social media data remains challenging. Social media data comprises vast, unstructured data sources that can include likes, comments, tweets, shares, and views. Analytics on social media data has become a challenging task for companies, such as Dailymotion, that have billions of daily users and vast numbers of comments, likes, and views. Social media data is created in significant amounts and at a tremendous pace, so a very high volume must be stored, sorted, processed, and carefully studied to support decisions. This article proposes an architecture using a big data analytics mechanism to efficiently and logically process the huge social media datasets. The proposed architecture is composed of three layers. The main objective of the project is to demonstrate Apache Spark parallel processing and distributed framework technologies with other storage and processing mechanisms. The social media data generated from Dailymotion is used in this article to demonstrate the benefits of this architecture. The project utilized the application programming interface (API) of Dailymotion, allowing it to incorporate functions suitable to fetch and view information. An API key is generated to fetch public channel data in the form of text files. The Hive storage mechanism is utilized with Apache Spark for efficient data processing. The effectiveness of the proposed architecture is also highlighted.
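
The partition-then-merge idea behind this kind of parallel processing can be illustrated with a minimal sketch. It is not the authors' architecture and does not use Apache Spark, Hive, or the Dailymotion API: it relies only on the Python standard library, and the sample records, field names, and function names (`RECORDS`, `partition`, `summarize`, `aggregate`) are hypothetical, chosen for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical rows standing in for channel data fetched as text files.
RECORDS = [
    {"channel": "news", "likes": 10, "comments": 2, "views": 100},
    {"channel": "music", "likes": 5, "comments": 1, "views": 50},
    {"channel": "news", "likes": 3, "comments": 0, "views": 20},
    {"channel": "music", "likes": 7, "comments": 4, "views": 80},
]


def partition(records, n):
    """Split the records into n interleaved chunks (one per worker)."""
    return [records[i::n] for i in range(n)]


def summarize(chunk):
    """Map step: total engagement per channel within a single chunk."""
    totals = Counter()
    for row in chunk:
        totals[row["channel"]] += row["likes"] + row["comments"] + row["views"]
    return totals


def aggregate(records, workers=2):
    """Reduce step: merge the per-chunk totals into one dictionary."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(summarize, partition(records, workers)))
    merged = Counter()
    for part in partials:
        merged.update(part)  # Counter.update adds counts together
    return dict(merged)


if __name__ == "__main__":
    print(aggregate(RECORDS))
```

In a distributed framework, the chunks would live on different machines and the merge would happen across the cluster; the thread pool here only mimics that split on one machine.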


2021 ◽  
Author(s):  
Steven F. Lehrer ◽  
Tian Xie

There exists significant hype regarding how much machine learning and incorporating social media data can improve forecast accuracy in commercial applications. To assess if the hype is warranted, we use data from the film industry in simulation experiments that contrast econometric approaches with tools from the predictive analytics literature. Further, we propose new strategies that combine elements from each literature in a bid to capture richer patterns of heterogeneity in the underlying relationship governing revenue. Our results demonstrate the importance of social media data and value from hybrid strategies that combine econometrics and machine learning when conducting forecasts with new big data sources. Specifically, although both least squares support vector regression and recursive partitioning strategies greatly outperform dimension reduction strategies and traditional econometrics approaches in forecast accuracy, there are further significant gains from using hybrid approaches. Further, Monte Carlo experiments demonstrate that these benefits arise from the significant heterogeneity in how social media measures and other film characteristics influence box office outcomes. This paper was accepted by J. George Shanthikumar, big data analytics.


Author(s):  
Philip Habel ◽  
Yannis Theocharis

In the last decade, big data, and social media in particular, have seen increased popularity among citizens, organizations, politicians, and other elites, which in turn has created new and promising avenues for scholars studying long-standing questions of communication flows and influence. Studies of social media play a prominent role in our evolving understanding of the supply and demand sides of the political process, including the novel strategies adopted by elites to persuade and mobilize publics, as well as the ways in which citizens react, interact with elites and others, and utilize platforms to persuade audiences. While recognizing some challenges, this chapter speaks to the myriad of opportunities that social media data afford for evaluating questions of mobilization and persuasion, ultimately bringing us closer to a more complete understanding of Lasswell’s (1948) famous maxim: “who, says what, in which channel, to whom, [and] with what effect.”


2021 ◽  
pp. 074391562199967
Author(s):  
Raffaello Rossi ◽  
Agnes Nairn ◽  
Josh Smith ◽  
Christopher Inskip

The internet raises substantial challenges for policy makers in regulating gambling harm. The proliferation of gambling advertising on Twitter is one such challenge. However, the sheer scale renders it extremely hard to investigate using conventional techniques. In this paper the authors present three UK Twitter gambling advertising studies using both Big Data analytics and manual content analysis to explore the volume and content of gambling adverts, the age and engagement of followers, and compliance with UK advertising regulations. They analyse 890k organic adverts from 417 accounts along with data on 620k followers and 457k engagements (replies and retweets). They find that around 41,000 UK children follow Twitter gambling accounts, and that two-thirds of gambling advertising Tweets fail to fully comply with regulations. Adverts for eSports gambling are markedly different from those for traditional gambling (e.g. on soccer, casinos and lotteries) and appear to have strong appeal for children, with 28% of engagements with eSports gambling ads from under 16s. The authors make six policy recommendations: spotlight eSports gambling advertising; create new social-media-specific regulations; revise regulation on content appealing to children; use technology to block under-18s from seeing gambling ads; require ad-labelling of organic gambling Tweets; and deploy better enforcement.


2018 ◽  
Vol 03 (03) ◽  
pp. 1850003 ◽  
Author(s):  
Jared Oliverio

Big Data is a very popular term today. Everywhere you turn, companies and organizations are talking about their Big Data solutions and Analytic applications. The source of the data used in these applications varies. However, one type of data is of great interest to most organizations: Social Media Data. Social Media applications are used by a large percentage of the world’s population. The ability to instantly connect and reach other people and companies over long distances is an important part of today’s society. Social Media applications allow users to share comments, opinions, ideas, and media with friends, family, businesses, and organizations. The data contained in these comments, ideas, and media are valuable to many types of organizations. Through Data Mining and Analysis, it is possible to predict specific behavior in users of the applications. Currently, several technologies aid in collecting, analyzing, and displaying this data. These technologies allow users to apply this data to solve different problems in different organizations, including the finance, medicine, environmental, education, and advertising industries. This paper aims to highlight the current technologies used in Data Mining and Analyzing Social Media data, the industries using this data, as well as the future of this field.


2018 ◽  
Vol 5 (2) ◽  
pp. 205395171880773 ◽  
Author(s):  
Cheryl Cooky ◽  
Jasmine R Linabary ◽  
Danielle J Corple

Social media offers an attractive site for Big Data research. Access to big social media data, however, is controlled by companies that privilege corporate, governmental, and private research firms. Additionally, Institutional Review Boards’ regulative practices and slow adaptation to emerging ethical dilemmas in online contexts create challenges for Big Data researchers. We examine these challenges in the context of a feminist qualitative Big Data analysis of the hashtag event #WhyIStayed. We argue power, context, and subjugated knowledges must each be central considerations in conducting Big Data social media research. In doing so, this paper offers a feminist practice of holistic reflexivity in order to help social media researchers navigate and negotiate this terrain.


2016 ◽  
Author(s):  
Jonathan Mellon

This chapter discusses the use of large quantities of incidentally collected data (ICD) to make inferences about politics. This type of data is sometimes referred to as “big data” but I avoid this term because of its conflicting definitions (Monroe, 2012; Ward & Barker, 2013). ICD is data that was created or collected primarily for a purpose other than analysis. Within this broad definition, this chapter focuses particularly on data generated through user interactions with websites. While ICD has been around for at least half a century, the Internet greatly expanded the availability and reduced the cost of ICD. Examples of ICD include data on Internet searches, social media data, and user data from civic platforms. This chapter briefly explains some sources and uses of ICD and then discusses some of the potential issues of analysis and interpretation that arise when using ICD, including the different approaches to inference that researchers can use.

