Application of Machine Learning For Social Media Analysis

Published: May 29, 2024


Machine learning is used within social media to let systems decide which advertisements are presented to which audiences. It gathers data from multiple users, analyses it, and identifies their preferences so that advertisements match their core interests. This dissertation shows how machine learning capabilities such as image recognition benefit social media analysis. The machine learning tool WEKA will be used to implement the mining techniques.


  1. Background

This research proposal concerns the application of machine learning to the analysis of social media. The subject matters because machine learning helps produce accurate insights from online engagement. Machine learning is a sub-field of Artificial Intelligence concerned with building systems that learn from data to solve problems. It can show organizations which marketing campaigns perform best, and it supports effective social media monitoring (Kharwal, 2021). Sentiment analysis can be applied to social media by analysing social conversations with machine learning, and image recognition can be carried out in the same way. The application of machine learning in social media has increased online interactions between consumers and organizations (Laurell et al., 2019).

In recent years, social media platforms have been used by numerous users every day. As a result, gathering and maintaining the data collected from many platforms for analysis is not an easy task. Machine learning helps systems detect spam content and bad backlinks on social media platforms, which are considered severe threats to the data on those platforms and to the companies associated with them. Machine learning is therefore applied to mitigate these challenges and to improve the overall social media analytics process (Yao et al., 2018).

Machine learning is a subset of Artificial Intelligence (A.I.) that applies algorithms so that enormous amounts of relevant data can be critically analysed. These algorithms can run continuously and apply the same criteria to every record, computing combinations of data that help explain it. Machine learning applications also recognize the boundaries of the information, which is essential. Applications based on machine learning algorithms can therefore improve the processes of social media analytics (Laurell et al., 2019).

Social media analytics assists the users in understanding the contents of the social media platforms, which can successfully drive more user acceptance. In addition, the social media analytics processes can also simplify the data from numerous networks successfully. The behaviour of the social media user can be calculated successfully. In this case, the machine learning technology within social media platforms allows the devices and networks to decide which advertisement or content will be shown to which audience. Machine learning applications gather relevant information from the users and analyse the information to detect and understand their preferences. These can assist in presenting suitable advertisements according to the users' interests (Hu et al., 2019).

The number of social media users had grown to more than 3 billion by 2019. The average number of social media accounts per person has reached 8.9, with an average usage of 2 hours 16 minutes per day (Yao et al., 2018). The content on social media, which is primarily generated by users, produces six primary types of big data: service data, disclosed data, entrusted data, arbitrary data, behavioural data, and derived data. Applications based on machine learning algorithms have made a noteworthy contribution to social media analytics, and their efficiency makes the domain more attractive. Useful results can therefore be derived from machine learning applications (Jimenez-Marquez et al., 2019).

    1. Statement of Problem

Social media analytics contributes significantly to business decision-making within organizations. Machine learning tools play a crucial role here because they support continuous improvement and the identification of trends and patterns, which greatly enhances data analytics. The application of machine learning also helps mitigate the existing challenges within social media analytics (Jimenez-Marquez et al., 2019). Some of the major problems are the lack of consistent metrics across networks, the proliferation of social networks, limited control over content topics and delivery, and difficulties in aggregating data across properties. Applications based on Artificial Intelligence (A.I.) and Machine Learning (ML) technology are used to solve these problems (Yao et al., 2018).

    1. Scope

Machine learning can be used in the scenarios where we need to figure out the sentiments from the data available on social media platforms.

Here, the scope is to apply sentiment analysis to help an organization keep its reputation high, develop quality products, improve customer service and media perception, discover new marketing strategies, and improve crisis management (Chancellor et al., 2019).

    1. Research aim

The aim of this dissertation is to find out the role of Machine Learning (ML) applications in social media analysis for proper optimization and channelization of resources, aiming towards maximization of output.

    1. Objectives

The prime objective is to apply sentiment analysis to gauge public opinion, identify how to improve sentiment towards a product or business, and recommend ways for the organization to provide better customer service. In a nutshell, the objectives are to find out how machine learning can improve the public image of companies on social media.

      • To analyse the application of machine learning in social media platforms.
      • To enlist the benefits obtained from the machine learning tools to carry out successful business activities.
      • To enhance machine learning in social media for conducting online interactions between customers and organizations.
      • To find out how machine learning optimizes the method of marketing to match the proper market with the appropriate product.
      • To develop proper know-how of the working method of the algorithm in maximizing organic outgrowth.
      • To identify and explore the pros and the cons related to this advanced marketing tool.

    1. Research questions

  1. What are the various applications of machine learning that need to be introduced in social media?
  2. What are the various types of machine learning algorithms that are introduced in social media?
  3. What are the advantages of machine learning tools used by organizations?
  4. What is the importance of machine learning application in social media analytics?
  5. How is the advent of machine learning in social media marketing impacting the sales record?
  6. What are the current challenges of implementing machine learning in social media analytics?

  2. Literature Review

According to Jimenez-Marquez et al. (2019), sentiment analysis examines customer opinions, feedback, and emotions expressed in several written languages. It supports decision-making by determining the positive and negative characteristics of a product. An essential part of the analysis is recognizing negation and categorizing the positive and negative sentiments received from multiple users in a social group. With sentiment analysis, a user can learn what feedback a product has received before purchasing it (Jimenez-Marquez et al., 2019).
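
The categorization of positive and negative sentiment described above can be illustrated with a minimal lexicon-based sketch. The word lists, the `polarity` function, and the rule that a negator flips only the next word are all assumptions made for illustration; they are not taken from the cited work, and a real system would use a curated lexicon or a trained classifier.

```python
# Hypothetical toy lexicons, chosen only for this illustration.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}
NEGATORS = {"not", "no", "never"}


def polarity(review: str) -> str:
    """Classify a review as 'positive', 'negative', or 'neutral'."""
    score = 0
    negate = False
    for token in review.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate = True  # flip the polarity of the next word
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise
        value = (word in POSITIVE) - (word in NEGATIVE)
        if value:
            score += -value if negate else value
        negate = False  # assumption: a negator affects only the next word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, `polarity("This is not good")` is classified as negative because the negator flips the positive word that follows it.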





Figure 1: Sentiment Analysis Architecture

Source: (Mahendran and Mekala, 2018).

This analysis is a language-processing task that determines how people feel about specific products. With machine learning, sentiment analysis involves several processes, including the following.

    1. Pre-processing of Data

This step removes noisy, inconsistent, and incomplete data. Pre-processing should be completed before any mining is performed.
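
A minimal sketch of such a pre-processing step is shown below. The function name `clean_posts` and the specific cleaning rules (dropping missing posts, removing URLs, mentions and hashtags, keeping letters only, and removing duplicates) are assumptions chosen for illustration rather than the procedure used in the cited studies.

```python
import re


def clean_posts(posts):
    """Drop incomplete posts, strip noise, and remove duplicates."""
    seen = set()
    cleaned = []
    for post in posts:
        if post is None:
            continue  # incomplete record: no text at all
        text = re.sub(r"https?://\S+", " ", post)      # remove URLs
        text = re.sub(r"[@#]\w+", " ", text)            # remove mentions and hashtags
        text = re.sub(r"[^a-z\s]", " ", text.lower())   # keep letters only
        text = " ".join(text.split())                   # collapse whitespace
        if text and text not in seen:                   # skip empty and duplicate posts
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

Removing hashtags entirely is a design choice here; some analyses would instead keep the hashtag word as a feature.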

According to Arasu et al. (2020), most marketers use Artificial Intelligence to turn data into valuable customer insights. Data collection involves identifying the advantages of online marketing and gathering and reviewing customer feedback. Several studies show that effective classification and prediction algorithms can be identified and applied using the machine learning tool WEKA. This tool offers a large number of algorithms for different scenarios and prediction tasks. Machine Learning is a sub-discipline of Artificial Intelligence. Concepts from Artificial Intelligence help identify market issues and help screen for and recover from crises in online business. Machine learning-based social media analytics can remove a considerable part of the human workload. The online business market faces multiple challenges, including:

      • Utilizing social media analytics
      • The capability to make and leverage different shopper perceptions.
      • Enhance the effectiveness of analytical abilities.




Figure 2: Machine learning algorithm in social media analysis

Source: (Arasu et al., 2020).

Social media analytics includes natural language processing, predictive analysis, behavioural analysis, text analysis, and statistical analysis. Mining techniques are used specifically for text analysis. The machine learning tool WEKA has been used to implement these mining techniques. WEKA provides classifiers, prediction algorithms, clustering tools, and data visualization tools for comparing and visualizing results.


2.2. Study of Social Data Analysis by Utilizing Machine Learning

Natural Language Processing enables an Artificial Intelligence system to examine human language and derive the meaning of product reviews and of posts on Facebook, Twitter, and Instagram. Many retail organizations now use Machine Learning technologies as tools for addressing market problems. Prediction techniques are used to forecast sales data and shelf-out scenarios. Clustering algorithms are effective for customer segmentation and for personalized communication and advertisements. Machine Learning is also used for listing and ranking products for advertising (Arasu et al., 2020). Most organizations perform analyses such as sentiment analysis to understand customer expectations in online marketing. In the future, business organizations are expected to adopt multiple mining techniques and Machine Learning (ML) tools for analysing dynamic data. Machine Learning combines statistics and Artificial Intelligence: its techniques learn from input data and use the resulting knowledge to make decisions on unfamiliar data. Artificial Intelligence and Machine Learning tools are among the primary tools in social media analytics. Automatic face recognition, description, pattern grouping, and classification are among the most significant problem areas in online marketing.





Figure 3: Techniques of Data Mining


Source: (Arasu et al., 2020).


    1. Machine Learning Associated Social Media Marketing and Performance Analysis

The machine learning associated approach of social media marketing involves text mining, Machine Learning analysis by utilizing WEKA and ML associated with social media marketing.

    1. Text Mining

Text mining occupies an influential position in several research fields. Only a small share of the data on the web is organized; the vast majority is unstructured. The primary goal of content mining is to extract high-quality data that generates value for the business (Salloum et al., 2021). Text mining examines unstructured data to surface the most significant patterns as quickly as possible. Sentences posted on web-based networking media such as Twitter, Facebook, and WhatsApp often contain errors. Content mining techniques are used to recover well-formed sentences and valid language from this data so that it can be analysed (Fan et al., 2020).


    1. The Approach of Machine Learning Integrated Social Media Marketing

Machine Learning is highly relevant for building decision support systems. One of the significant innovations in digital marketing strategy is the use of Artificial Intelligence-based tools to streamline marketing and make the business more effective and efficient. Most organizations use Machine Learning outcomes to obtain a clear, detailed understanding of customer perceptions and to optimize their marketing strategy. Machine Learning tools are useful for digital marketers because they make it possible to explore and understand data properly. For example, LinkedIn uses Artificial Intelligence (A.I.) along with Machine Learning (ML) across its products. LinkedIn applies several algorithms to predict which users fit a given role, and it can highlight candidates who are likely to respond and who are seeking new opportunities. Twitter has also launched an update to its services that uses Artificial Intelligence to crop images based on face detection (Salloum et al., 2021).





Figure 4: Machine learning-based approach to enhance social media


Source: (Salloum et al., 2021).


    1. Machine Learning Social Media Marketing Analysis by Utilizing WEKA

WEKA is effective for data analysis and for generating results that support efficient marketing (Fan et al., 2020). This form of data analysis provides an understanding of consumer behaviour when purchasing a product. The tool combines Machine Learning algorithms for carrying out multiple data mining tasks (Arasu et al., 2020).


    1. Making Dataset for WEKA

WEKA accepts data sets in the Attribute-Relation File Format (ARFF).


      • WEKA is a Java-based open-source data mining tool that brings together a collection of machine learning and data mining algorithms.
      • It is platform-independent software.
      • The software is powerful and is useful for developing new machine learning schemes.
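
As a rough illustration of what such a data set looks like, the sketch below renders labelled reviews as an ARFF document with a header (`@relation`, `@attribute`) followed by a `@data` section. The helper `to_arff`, the attribute names `text` and `sentiment`, and the sample rows are hypothetical and are not taken from the cited study.

```python
def to_arff(relation, rows, labels):
    """Render (text, label) pairs as an ARFF document string."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",
        "@attribute sentiment {" + ",".join(labels) + "}",
        "",
        "@data",
    ]
    for text, label in rows:
        # Quote the string value and escape backslashes and single quotes.
        escaped = text.replace("\\", "\\\\").replace("'", "\\'")
        lines.append(f"'{escaped}',{label}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data for illustration.
sample = to_arff(
    "reviews",
    [("great phone", "positive"), ("don't buy", "negative")],
    ["positive", "negative"],
)
print(sample)
```

The resulting text can be saved with an `.arff` extension and opened in WEKA; a string attribute such as `text` would typically need to be converted to word features before most classifiers can use it.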





Figure 5: WEKA Interfaces


Source: (Arasu et al., 2020).


According to Harfoushi et al. (2018), sentiment analysis is a widely used technique in natural language processing. It covers the study of people's feelings, opinions, and attitudes towards a product, and it can be used to assess the reviews that people post online. Microsoft Azure Machine Learning is effective for performing data analytics. Two of the machine learning algorithms it offers, Support Vector Machine and Logistic Regression, can be used for sentiment analysis (Arasu et al., 2020).

    1. Microsoft Azure Machine Learning

Microsoft Azure encompasses cloud services that enable developers to create, deploy, and manage applications through Microsoft's global network of data centers. This cloud computing model emphasizes platform features such as scalability, agility, and flexibility. Azure Machine Learning can be used to calculate a score for a user's contribution based on social media metrics. It supports several Machine Learning algorithms for classification, regression, and clustering, and it allows models to be customized using R and Python. It also offers drag-and-drop modules and datasets for tasks such as feature selection, applying Machine Learning algorithms, and pre-processing (Welikala et al., 2018).

In social media analytics, Azure Machine Learning requires users to carry out several operations manually, including data processing, exploration, method selection, and validation of model results. It supports almost a hundred techniques that address anomaly detection, text analysis, regression, classification, and recommendation (Welikala et al., 2018).

    1. Algorithm of Machine Learning

Examples of machine learning algorithms are support vector machine, decision forest, logistic regression, and network regression. These algorithms have different functions in social media analytics. Logistic Regression is a linear Machine Learning algorithm. It is effective for classification tasks and can be used as a prediction model. The Support Vector Machine is a supervised learning approach that is effective for many classification problems. It takes labelled training data and finds hyperplanes that maximize the margin between classes in a high-dimensional space. The Decision Forest algorithm is an ensemble learning method for classification. It builds multiple decision trees, each of which produces a classification (Sharmin and Zaman, 2017).
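
To make the idea of a linear classifier used as a prediction model concrete, here is a minimal from-scratch sketch of logistic regression trained with stochastic gradient descent. The two features (counts of positive and negative words in a post), the learning rate, and the number of epochs are assumptions chosen for illustration; this is not the configuration used in the cited work.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, lr=0.1, epochs=500):
    """Fit weights and bias by stochastic gradient descent on log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = p - y  # gradient of the log loss w.r.t. the linear score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5 else 0


# Hypothetical features: (positive-word count, negative-word count).
samples = [(2, 0), (1, 0), (0, 1), (0, 2)]
labels = [1, 1, 0, 0]  # 1 = positive post, 0 = negative post
weights, bias = train_logistic(samples, labels)
print(predict(weights, bias, (3, 0)), predict(weights, bias, (0, 3)))
```

The learned weights define the separating line directly, which is what makes the model linear: a positive weight on the first feature and a negative weight on the second.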



Figure 6: Machine learning algorithm

Source: (Sharmin and Zaman, 2017).

According to Wang et al., several social researchers leverage Machine Learning Technology to construct a specific social model of social media analytics that can identify abnormal behaviours from the social media platform or post.

    1. Generative Adversarial Networks





Figure 7: Generative Adversarial Networks


Source: (Huang et al., 2020).


Generative adversarial networks are a class of Artificial Intelligence (A.I.) algorithms used in unsupervised Machine Learning. The generator is a neural network trained to produce fake data that fools the discriminator network. The discriminator, in turn, is a neural network trained to distinguish original data samples from synthesized samples. Through this competition, the generator learns to produce better synthetic data, while the discriminator becomes more skilled at detecting it (Huang et al., 2020).
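
The contest between the two networks is commonly written as a minimax objective. The formulation below is the standard one from the general GAN literature rather than something stated in the cited source, and the symbols are introduced here for illustration: $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real samples, and $p_z$ the noise distribution fed to the generator.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

The discriminator raises $V$ by scoring real samples near 1 and generated samples near 0, while the generator lowers $V$ by producing samples that the discriminator scores near 1.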

    1. Way of Utilizing Machine Learning in Facial Detection

The facial recognition industry is maturing rapidly because of advances in Artificial Intelligence, Machine Learning, and Deep Learning technologies. Face recognition is a technology that can recognize an individual based on the face. It uses machine learning algorithms to capture, store, find, and analyse facial features in order to match them with individuals (Dhaoui et al., 2017).

    1. Face Identification

In social media analytics, Machine Learning plays a significant role in detecting individual faces. Image recognition uses machine learning to train computers to recognize specific brand logos or photos. This ability is valuable for a business when its customers upload photos to social media platforms. Chatbots, on the other hand, are an application of Artificial Intelligence that can mimic human conversation (Liu et al., 2018). Chatbots can be embedded in websites, including online stores, and in third-party apps like Facebook, Messenger, Twitter, and Instagram. They allow a business to automate customer service without human interaction. In social media marketing, Artificial Intelligence can be one of the most effective tools for moving a business forward (Liu et al., 2018).

According to Al-Garadi et al., social media is a platform that offers a huge opportunity to build online communities, share information, and exchange content. One of the most common ways to build a cyberbullying prediction model is the text classification approach, which trains machine learning classifiers on labelled text instances. The primary reason is that text on social media websites is written in an unstructured manner, which makes it difficult for a lexicon-based approach to identify cyberbullying. Lexicons can nevertheless be used to obtain features that serve as inputs to machine learning algorithms. In social media analytics, the machine learning field aims to develop and apply computer algorithms that improve with experience. The primary objective of machine learning is to identify patterns and correlations in data.

Social media analytics plays a crucial role in the business environment, both in developing business processes and in understanding competitors.






Figure 8: Framework of the social media analysis


Source: (Tay et al., 2020).

Sentiment analysis is instrumental in social media for monitoring purposes in business. It gives the user an overview of public opinion on particular social media marketing topics. Sentiment analysis of text is less complex than sentiment analysis of video (Sangaiah et al., 2020). Dual sentiment analysis considers two different sides of one review. The experiment is carried out on two separate tasks: the first is polarity classification, and the second is positive-negative sentiment classification. The experiment is evaluated using nine sentiment datasets (Tay et al., 2020).












Approach | Strength | Challenge
Support Vector Machine | The accuracy of the analysis is always better by utilizing the Support Vector Machine (Nadikattu, 2020). | Creating and following different public sentiments is not beneficial for fast choice making.
Speaker specific speech data | It helps to understand the sentiment of the speaker's conversations properly. | This is not suitable if two users talk at the same time.
 | Provide better performance in the forecast accuracy. | Every attribute in the image must be labelled (Nadikattu, 2020).

Table 1: Summarization of the challenge and strength of the sentiment analysis in social media.

    1. Social Media Cyberbullying Risk

According to Al-Garadi et al. (2019), cyberbullying is a harmful practice that takes different forms, such as insulting an individual, sending oppressive messages, and publishing scornful posts, generally through internet-based communication platforms, in other words social media. It is now considered one of the fastest-spreading forms of aggressive and oppressive behaviour. The popularity and omnipresence of social media have accelerated online bullying. A large number of users are victims of social media bullying because of the very features that make social media websites useful (Al-Garadi et al., 2019). Because social media makes it easy to establish friendships and to connect with large numbers of people regardless of geographic boundaries, it extends the scope of cyberbullying beyond any single area. Social media also allows anonymous users to access its platforms, which is considered the primary reason for the increase in bullying, and it allows the covert and continuous spread of aggressive text and non-text content.

    1. Utilizing Machine Learning in Cyberbullying Risk Prediction

Social media is an online platform for sharing information and content and for creating online communities, and it generates a large set of data every day. This massive daily volume comes mainly from interactions among users, organizations, and products. Social media analytics is the tool used to analyse the structured and unstructured data generated in this way. Social media has now become a versatile big data system (Subroto et al., 2019).

A cyberbullying risk prediction model can be constructed using the text classification approach with a machine learning classifier. An alternative model can be built on a lexicon that is used to score a document. The lexicon-based approach is limited in cyberbullying prediction because text on social media is largely unstructured, which makes cyberbullying hard to detect. However, a lexicon can be used to extract features that serve as inputs to machine learning algorithms. In a machine learning-based cyberbullying prediction model, the features and their combinations play a crucial role.
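
The sketch below illustrates how lexicon matches can be turned into numeric features for a classifier. The lexicon contents, the feature names, and the function `lexicon_features` are hypothetical choices made for illustration; a real study would rely on a curated lexicon and a richer feature set.

```python
# Hypothetical abusive-term lexicon, for illustration only.
LEXICON = {"idiot", "loser", "stupid", "ugly"}
SECOND_PERSON = {"you", "your", "u"}


def lexicon_features(message):
    """Turn a message into numeric features based on lexicon matches."""
    tokens = [t.strip(".,!?;:").lower() for t in message.split()]
    tokens = [t for t in tokens if t]  # drop tokens that were only punctuation
    hits = sum(t in LEXICON for t in tokens)
    return {
        "lexicon_hits": hits,                                # abusive terms found
        "hit_ratio": hits / len(tokens) if tokens else 0.0,  # share of abusive terms
        "second_person": sum(t in SECOND_PERSON for t in tokens),  # direct address
    }
```

A feature dictionary like this can be passed to any supervised classifier alongside other features, so the lexicon informs the model without making the final decision on its own.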

As per the research paper, machine learning is concerned with constructing computer algorithms that improve through experience. Supervised machine learning algorithms are used to build a social media bullying risk prediction model. No single algorithm is best for solving all problems.

According to Subroto et al. (2019), the following machine learning algorithms are helpful in detecting cyberbullying.

      • Support Vector Machine: It is a supervised machine learning technique used for text classification and is grounded in statistical learning theory. The aim of SVM is to reduce classification error. It identifies the optimal separating hyperplane, the one with the maximum distance (margin) between the hyperplane and the nearest data points, which helps minimize the risk of misclassification. SVM can map non-linear features into a high-dimensional space. The SVM algorithm is valued for its scalability, its predictive capability, and its ability to update patterns dynamically. The model is effective in detecting bullying activities in social media (Park et al., 2018).
      • Naive Bayes algorithm: The Naive Bayes algorithm applies Bayes' theorem and is used in text classification. It can deal with a large number of categorical features. The algorithm is widely used in machine learning for social media applications because it is fast at predicting risk.
      • Random Forest Model: The random forest model combines decision trees with ensemble learning. It builds several classification trees and collects a prediction from each of them. Each tree is divided into nodes that lead to an output. The trees vote for the most popular class, and the class with the most votes is taken as the prediction.
      • K-nearest Neighbour: It is a non-parametric machine learning technique that classifies unknown instances based on their nearest neighbours. An unknown instance is assigned to the class that receives the majority of votes from its nearest neighbours. This simple classification algorithm is helpful in building an effective cyberbullying prediction model (Park et al., 2018).
      • Logistic Regression: The regression model is adopted from statistics. It fits a separating hyperplane between two classes and generates its predictions of social media cyberbullying as probabilities.
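
As one concrete instance of the classifiers listed above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one (Laplace) smoothing. The function names, the whitespace tokenization, and the tiny training set are assumptions made for illustration; they do not come from the cited studies.

```python
import math
from collections import Counter, defaultdict


def train_nb(documents):
    """Count classes and per-class words from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in documents:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def classify_nb(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled messages for illustration.
training = [
    ("you are a loser", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("see you at lunch", "neutral"),
    ("great game today", "neutral"),
]
model = train_nb(training)
print(classify_nb(model, "you loser"))
```

Working in log probabilities keeps the product of many small word probabilities from underflowing, which matters for longer messages.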

    1. Utilizing Machine Learning in Predicting Social Media Consumer Behaviour

According to Chaudhary et al. (2021), social media is now the biggest platform where people trade and purchase all the time. It is becoming the easiest way to promote brands and businesses of any scale, including individuals' businesses. As the biggest online platform for transactions and for exchanging a variety of trading information, it generates a vast amount of labelled and unlabelled data on a daily basis.

Neural network algorithms in ML group items that share features, qualities, and characteristics. This supports accurate decision-making, that is, prediction. Standard neural networks are trained on labelled data.

A recurrent neural network (RNN) is another algorithm that helps build a prediction model of consumer behaviour. It connects earlier data in a sequence with the data that follows. However, a basic RNN is a limited learning algorithm that struggles with long-term dependencies (Chaudhary et al., 2021).
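
The way a recurrent network links earlier inputs to later ones can be written as a recurrence. The form below is the standard textbook (Elman-style) formulation rather than one given in the cited source; the symbols are introduced here for illustration, with $x_t$ the input at step $t$, $h_t$ the hidden state, $y_t$ the output, and $W_{xh}$, $W_{hh}$, $W_{hy}$, $b_h$, $b_y$ the learned parameters.

```latex
h_t = \tanh\bigl(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\bigr), \qquad
y_t = W_{hy}\, h_t + b_y
```

Because $h_t$ depends on $h_{t-1}$, information from early steps reaches later steps only through repeated multiplication by $W_{hh}$ and the $\tanh$ nonlinearity, which is why gradients tend to shrink over long sequences and long-term dependencies are hard to learn.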

As per the research paper, the data obtained from social media is of great significance: this big data can be analysed with machine learning algorithms to predict consumer behaviour. Big data analytics also helps in understanding consumers' attitudes towards social media and their perceptions.

    1. Benefits of Big data analytics for SM consumers

Big data analytics of social media supports the prediction of consumers' future behaviour.

  • Artificial intelligence allows data from social media websites and other channels to be processed, so collecting users' browsing history helps in predicting their future behaviour.
  • Users' recent activity, such as the pages they like or comment on and the ads they have viewed, reflects their current interests through real-time interactions.
  • As users' demands and choices change, big data analytics in social media predicts their future choices (Arasu et al., 2020).

2.14 Gap in Literature

The literature review above has been developed around the application of machine learning in social media analytics. Every source has been evaluated against the main objectives of the dissertation, including the benefits and challenges of machine learning in social media analytics (Sangaiah et al., 2020). The review has focused on the different machine learning algorithms and on mitigation strategies for problems in social media analytics, has discussed areas in which machine learning can be applied, and has addressed the scope and impact of machine learning in this field. However, the review is deficient in its coverage of the challenges of machine learning and of a proper monitoring strategy. This information gap is treated as the main gap of this dissertation and needs to be addressed in the data collection section (Sangaiah et al., 2020).

Batch-based Active Learning in Social Media Data Application

Active learning is a classification approach that collects labels for the most informative items and uses them to label unlabelled data. Social media lets anyone share an opinion from anywhere at any time. As stated by (Pohl, 2018), data from social platforms has been highly useful for some time. Active learning improves the accuracy of a classifier and comes in two types: pool-based and stream-based. In social media data analytics, the stream-based type is very useful because the learner receives one sample at a time and decides whether to query its label. It is natural in active learning, and in stream-based active learning in particular, to use external feedback from the environment. The data stream is partitioned into segments, and clustering is then applied to each segment while the homogeneity of the data is checked. According to (Bouchachia, 2018), batch-based active learning gives its best results on social media data when OBAL (Online Batch-based Active Learning) is applied. OBAL is most effective when applied to the classification or labelling of textual data. With several uncertainty criteria operating on the unlabelled data, it is effective at selecting which items to query so that fewer labels are requested. The machine learning algorithms used repeatedly in this approach are KNN and SVM. Both are widely used because they work quickly on text items that have been transformed into vectors with tf-idf (term frequency-inverse document frequency). OBAL follows these steps:

  • A data item is selected first.
  • The labelling budget is then checked.
  • The item is queried when the uncertainty criterion is triggered.
  • A new input is then taken for processing.
  • The boundary condition of the item is analysed.
  • Once the condition is satisfied, the item is added to the boundary pool for sorting.
  • The item is sorted under a label.
  • The label is assigned according to the nature of the sample.
  • The classifier is then trained.
  • The classifier is updated with the new pool of data.
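
The steps above can be sketched in a few lines of Python. This is a minimal illustration of stream-based active learning with a labelling budget and a KNN-style uncertainty check, not the OBAL implementation itself: it omits batching and clustering, and the names `knn_votes`, `is_uncertain`, `stream_active_learning` and the `oracle` function (standing in for a human annotator) are assumptions made for this example.

```python
import math
from collections import Counter

def knn_votes(x, labelled, k=3):
    """Count the labels of the k labelled items nearest to x."""
    nearest = sorted(labelled, key=lambda item: math.dist(x, item[0]))[:k]
    return Counter(label for _, label in nearest)

def is_uncertain(votes):
    """An item is uncertain when its neighbours do not all agree on one label."""
    top_count = votes.most_common(1)[0][1]
    return top_count < sum(votes.values())

def stream_active_learning(stream, oracle, seed, budget, k=3):
    """Process one item at a time; query the oracle only for uncertain items
    while the labelling budget lasts (assumed simplification of the workflow)."""
    labelled = list(seed)          # pool of labelled (boundary) items
    queries = 0
    predictions = []
    for x in stream:
        votes = knn_votes(x, labelled, k)
        if queries < budget and is_uncertain(votes):
            label = oracle(x)      # ask for the true label
            queries += 1
            labelled.append((x, label))   # update the classifier's pool
        else:
            label = votes.most_common(1)[0][0]
        predictions.append(label)
    return predictions, queries, labelled

# Hypothetical one-dimensional data: values of 5 or more are "pos".
seed = [((0.0,), "neg"), ((1.0,), "neg"), ((2.0,), "neg"),
        ((8.0,), "pos"), ((9.0,), "pos"), ((10.0,), "pos")]
stream = [(0.5,), (9.5,), (4.6,), (5.4,)]

def oracle(x):
    return "pos" if x[0] >= 5 else "neg"

predictions, queries, labelled = stream_active_learning(stream, oracle, seed, budget=2)
```

With a budget of two, only the two items near the class boundary are sent to the oracle; the clearly separated items are labelled by the classifier alone. Lowering the budget to one leaves the last boundary item to a majority vote, which shows how the budget trades labelling cost against accuracy.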


Figure 9: Workflow in OBAL

Source: - (Hellwagner et al., 2018)

Each classifier uses the boundary items differently when classifying data. With KNN, an item is treated as uncertain when no label holds a clear majority among the boundary items in its neighbourhood. SVM, by contrast, is trained on the known boundary items and is updated whenever new boundary items are selected. An item is queried only if SVM is uncertain about its label, and the identified boundary items are then used to train the classifier (Hellwagner et al., 2018). In social media analytics, OBAL is used consistently for its robustness. It classifies synthetic datasets well, which suggests that it is also suitable for social media datasets. It not only classifies the valid data items but also discards irrelevant and unnecessary data. OBAL also has a labelling budget that can be altered during use, and it has built-in labelling strategies. It has very good discrimination power, which is why it is so extensively used.

Using KNN and SVM, OBAL is well integrated and fast. It is also highly effective for binary classification. It establishes better communication, supports longer running times and provides feedback to developers. It is a unified piece of software with parameter settings, synthetic and real-world datasets, and uncertainty strategies that are useful for gathering information. Its purpose is to separate relevant from irrelevant data and to select what the platform needs. Ongoing research on this technology is expected to extend its capabilities further.

Analysis of Social Media with Big Data through InfoSphere BigInsights and Apache Flume

Alongside labelling, promoting and using data, it is important to examine the sentiment of consumers and providers towards social media platforms (Bohanec et al., 2017). The first step in obtaining and processing such data is sentiment analysis. Companies and organizations now face a growing set of challenges, and social media offers many ways to improve the quality and quantity of their output. Social media adds value by helping an organization deal with the gaps and variations it faces in the current situation (Borštnar, 2017).

InfoSphere BigInsights

This platform offers innovative ways to store and use very large volumes of data. According to (Bohanec et al., 2017), IBM InfoSphere BigInsights allows a system to analyse large volumes of data from different sources and in different formats in order to gather specific information about a topic that has not received much attention before. This makes it easier to analyse the challenges, search for a function and arrive at a highly compatible solution. Its main features are as follows:

  • Initiating and speeding up deployments with innovative programs
  • Using existing expertise and solutions
  • Initiating user-oriented investigation and equipment

Apache Flume

This program was first introduced as Flume and was later known as Flume NG (Next Generation). It works as a unified system for collecting data, storing it for the short term and conveying it to a target (Robnik-Šikonja, 2017). It is distributed, highly reliable and easily configurable. Its components are as follows:

  • Retrieving messages - this is the source
  • Storing data temporarily - this is the channel
  • Delivering data to the target - this is the sink
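
The flow through the three components listed above can be illustrated with a small Python sketch. It is only a conceptual model of a bounded buffer sitting between a producer and a consumer and does not use the Flume API or its configuration format; the `Channel` class and the `run_pipeline` function are names invented for this example.

```python
from collections import deque

class Channel:
    """Bounded in-memory buffer standing in for the temporary storage stage."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        """Store an event; return False if the buffer is full."""
        if len(self._events) >= self.capacity:
            return False
        self._events.append(event)
        return True

    def take(self):
        """Remove and return the oldest event, or None if the buffer is empty."""
        return self._events.popleft() if self._events else None

def run_pipeline(messages, channel, target):
    """Push each retrieved message into the channel and deliver it to the target."""
    for message in messages:                 # retrieving messages
        while not channel.put(message):      # buffer full: deliver one event first
            target.append(channel.take())
    event = channel.take()
    while event is not None:                 # deliver whatever is still buffered
        target.append(event)
        event = channel.take()
    return target

delivered = run_pipeline(["post 1", "post 2", "post 3"], Channel(capacity=2), [])
```

The buffer decouples the rate at which messages arrive from the rate at which they are delivered, and the events reach the target in the order in which they were retrieved.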

It is valuable to acquire such a large amount of data and to use it from different perspectives. These two programs are particularly useful for combining data for sentiment analysis, which supports organisations in several fields. In this work, social media data is collected with Apache Flume and then analysed and visualized with InfoSphere BigInsights. The programs are used not only to process, stream and visualize data from social platforms but also to apply data from varied sources across various platforms. They are also fast and agile, the data is mostly integrated, and descriptions are always provided. There are many traditional programs for Big Data, but at this stage these two are the most advanced and most widely used of them. Big Data has been in use for a long time and remains useful whenever the need arises.

  1. Methodology
    1. Overview of Methodology

The research method section is one of the crucial sections of this dissertation because it determines how the most relevant data is collected and used. A properly chosen research methodology allows the required research work to be conducted appropriately. The methodology consists of a research plan, research method, research onion, research approach and research technique. The limitations of the research and the ethical considerations are also provided in this section. These parts make it easier to explain every section of the research paper. Methods of collecting information related to a research topic can be divided into major categories, including primary and secondary research methodology. These techniques are very useful for collecting relevant data from research materials. The research methodology is therefore one of the most critical sections of any research paper, as it provides a clear vision and understanding of how the selected topic is developed.

    1. Research Onion

The research onion is a framework that evaluates the collected data at each layer of the onion according to different research operations. The framework directs the flow from the outer layers to the inner ones, with each layer revealing the next.





Figure 10: Research Onion

Source: - (Melnikovas, 2018)

The outermost layer of the research onion framework is primarily associated with the research philosophy. On the other hand, the innermost layer of this particular framework is associated with the basic structure and the types of information. The research onion is composed of different operational layers. Those operational layers are highlighted in the above picture (Melnikovas, 2018).

    1. Research Philosophy

The research philosophy is situated at the research onion's outermost layer. This particular research activity follows the positivism research philosophy, which tends to constitute the evaluation of various research scenarios with the assistance of various scientific methods. The research activity for this particular dissertation is primarily associated with critically analysing the machine learning application's role in improving social media analytics operations. Because of that, the analysis will consider real-life incidents, including different facts and figures that can provide the necessary help to obtain the desired outcome. So, a scientific research approach is followed, primarily supported by the positivist philosophy that concerns the present research study. This will effectively help in terms of evaluating the problem statement of this research with primary data employment.

    1. Research Method

The research methodology can be categorized into two parts: the type of data collection and the type of investigation. For this dissertation, secondary research methodology has been used to collect relevant data on the research topic from various research materials, namely journals, articles, governmental websites and previous research papers. Using secondary research makes data collection less costly and less time-consuming. In addition, qualitative research methodology has been used, which helps to enrich the paper with valuable and relevant information. There are also several types of software development methodology (Sangaiah et al., 2020).

Scrum models represent the progress of the project through a series of sprints. Scrum acts as an agile development methodology utilized within software development based on incremental and iterative processes.

On the other hand, the waterfall model represents the development procedures within a linear and sequential flow. This model is straightforward to understand as well as to utilize. Every phase needs to be executed before starting the next phase in this model, as overlapping the phases is prohibited.

Figure 12: Waterfall model

Source: - (Kramer, 2018)

For this dissertation, the waterfall model has been utilized. With this model, the phases of the research work are processed and completed one at a time. The process and the results are also well documented in the waterfall model.

In addition, the Python programming language and a machine learning application are used to improve the collection of relevant information (Sangaiah et al., 2020). Because of Python's extensive library support, information can easily be collected from social media platforms and analysed to identify trends and solve existing issues.

    1. Research Approach

The experimental research design has been utilized for this dissertation. It has helped to obtain crucial and relevant information and favourable outcomes, and it assists in summarizing the valuable findings from numerous literary sources. In addition, an inductive research approach has been followed throughout this dissertation.

    1. Research Technique

To adequately explain the problems and the research question of this dissertation, the descriptive research technique has been utilized. With the proper assistance of the research techniques, collecting large amounts of information from the previously published research papers becomes straightforward.

    1. Type of Data Collection

Two major types of data collection can be used when writing a dissertation: primary data collection and secondary data collection. For this dissertation, secondary data collection has been used to collect data from journals, articles and research papers related to the research topic.

    1. Data Analysis

For this dissertation, qualitative data analysis has been utilized. In the literature review, previously published research papers related to this research topic have been analysed, along with educational journals, articles, governmental websites, and so on.

    1. Validity and Reliability

Every approach to gathering relevant information needs to be adequately validated. In addition, no data has been modified, which preserves the dissertation's authenticity and reliability.

    1. Ethical Consideration

Numerous problems were faced when gathering relevant information for the dissertation, such as data plagiarism and data loss. The confidentiality and privacy of the data have also been maintained in line with the relevant guidelines.

    1. Research Limitation

The research limitations faced during the research work are lack of information, weak technical implementation, and poor data handling. Lack of support and lack of funds have also been faced while conducting the research work.

  1. Implementation

Importing the necessary packages





The above image shows the source code in which NumPy is imported. NumPy is a general-purpose array-processing package for Python and one of the basic packages for scientific computing in Python. NumPy arrays support advanced mathematical and other operations on large amounts of data.

Reading the data file


Finding the top words used across the entire dataset. The above image shows the raw data: the headline text and the publish date of each entry.







The above image shows how a count vectorizer is applied and how the word counts are plotted. The code defines a helper function that returns the top words from the word vectors produced when the vectorizer is fitted to the headlines.





The above image shows the count_vectorizer call, which supplies the words and their counts for the headline dataset, and the plotting code, which rotates the tick labels vertically. The ax.set functions label the plot with the title (top words in the headline dataset), the word and the number of occurrences.





The above graph shows the number of occurrences of the top words in the headline dataset.

It shows the 15 most frequent words in the headline dataset together with the number of occurrences of each. The highest value is between 35000 and 40000.




In the above image, we can see that nltk is imported. The word and sentence tokenize functions are then used to split the text into words and sentences. This step imports everything that the following code needs.





In the above source code, PoS (part-of-speech) tags are applied to the headlines in the dataset. PoS tagging is a well-known natural language processing technique that assigns each word in a text to a particular part of speech. The PoS function is used here to tag the headlines, and the image above shows the resulting PoS tags.





Drawing histogram for headline word lengths

The above graph is the histogram of headline word lengths. It is produced by the source code above with a figure size of (18, 8) and a range of (1, 14). A histogram represents the distribution of numerical data; here it shows the distribution of headline word lengths.

The above images also show the PoS tagging for the headline corpus. This function is used to sort the headlines.
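
As a hedged illustration, the binning behind such a histogram can be reproduced with NumPy. The sketch assumes that "word length" means the number of words in each headline and uses the range (1, 14) mentioned above; the sample headlines are invented for the example.

```python
import numpy as np

# Hypothetical headlines standing in for the real dataset.
sample_headlines = [
    "rain",
    "storm hits coast",
    "storm hits coast again today",
]

# Number of words in each headline.
headline_word_lengths = [len(headline.split()) for headline in sample_headlines]

# 13 bins of width 1 covering the range 1 to 14.
counts, bin_edges = np.histogram(headline_word_lengths, bins=13, range=(1, 14))
```

Plotting `counts` against `bin_edges` (for example with matplotlib's `hist` on a figure of size 18 by 8) would give a chart of the same kind as the one described.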








The above image shows the code for computing the monthly, daily and yearly counts. The code uses the reindex function, which conforms a Series to a new index with optional filling logic.
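
A minimal sketch of this kind of counting with pandas is given below. The dates and headlines are invented, and grouping by month with `to_period` is one possible way to obtain the counts, not necessarily the one used in the original code.

```python
import pandas as pd

# Hypothetical publish dates and headlines.
publish_dates = pd.to_datetime(["2004-01-03", "2004-01-20", "2004-03-15"])
headlines = pd.Series(["headline a", "headline b", "headline c"], index=publish_dates)

# Count headlines per calendar month; months without headlines are absent here.
monthly_counts = headlines.groupby(headlines.index.to_period("M")).count()

# Reindex onto the full range of months, filling the missing ones with zero.
all_months = pd.period_range("2004-01", "2004-03", freq="M")
monthly_counts = monthly_counts.reindex(all_months, fill_value=0)
```

Without the reindex step, February 2004 would be missing from the result; with it, the month appears with a count of zero, so a line plot over time has no gaps.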








The above images are the graph of monthly, yearly and daily counts of words of 9 years from 2004 to 2020.

The vectorizer takes a nested sequence of objects or NumPy arrays as input and returns a single NumPy array. The above image describes the topic vectorization: a count vectorizer is used to construct the features to be processed.

Selecting the no of topics to be included





Applying LSA





LSA (latent semantic analysis) is a technique in natural language processing. Its core idea is to factorize the document-term matrix. The resulting topic matrix helps to predict the type of each headline. LSA analyses the relationships between a set of documents and the terms they contain by producing a set of concepts related to them.
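
One common way to apply LSA in Python is a truncated singular value decomposition of the document-term matrix. The sketch below uses scikit-learn's `TruncatedSVD` on four invented headlines with two topics; the variable names and the choice of taking the largest component as the predicted topic are assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical headlines standing in for the real dataset.
sample_headlines = [
    "police investigate house fire",
    "fire crews battle house blaze",
    "council approves city budget",
    "city council debates budget plan",
]
n_topics = 2

# Build the document-term matrix with a count vectorizer.
document_term_matrix = CountVectorizer(stop_words="english").fit_transform(sample_headlines)

# Factorize it; each row of the result describes one headline in topic space.
lsa_model = TruncatedSVD(n_components=n_topics, random_state=0)
lsa_topic_matrix = lsa_model.fit_transform(document_term_matrix)

# Assign each headline to the topic with the largest component.
lsa_keys = lsa_topic_matrix.argmax(axis=1)
```

Each entry of `lsa_keys` is the index of the topic assigned to the corresponding headline, which is the kind of integer list that can then be counted per topic.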

It gives the predicted topics for topic modelling







The above image shows the integer list produced by the application. This code shows how the application manages the counts, which helps to count the items in each category.

Finding the most frequent words in each topic





The above images show the code for the term matrix: the word vectors are used to build the matrix, and an if condition on the topic key is used when summing the vectors.








The above images show some of the topics that this application helps to find.

Plotting the bar plot for top news among 8 topics





The above image shows the graph of the top news among 8 topics. The graph shows the number of headlines counted for each topic.

Applying dimensionality reduction technique t-SNE





The above code is used to assess the LSA model.








Applying LDA










The above images use the latent Dirichlet allocation (LDA) function. LDA is used to extract topics. It treats each document as a probabilistic mixture of topics and each topic as a probability distribution over words. LDA in this sense should not be confused with linear discriminant analysis, a different technique that finds a linear combination of features to separate multiple classes of objects.
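
A minimal sketch of latent Dirichlet allocation with scikit-learn follows. The four headlines, the number of topics and the variable names are assumptions for illustration; the original code and dataset are not reproduced here.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical headlines standing in for the real dataset.
sample_headlines = [
    "police investigate house fire",
    "fire crews battle house blaze",
    "council approves city budget",
    "city council debates budget plan",
]

document_term_matrix = CountVectorizer(stop_words="english").fit_transform(sample_headlines)

# Fit LDA; each row of the result is a headline's distribution over topics.
lda_model = LatentDirichletAllocation(n_components=2, random_state=0)
lda_topic_matrix = lda_model.fit_transform(document_term_matrix)

# Assign each headline to its most probable topic.
lda_keys = lda_topic_matrix.argmax(axis=1)
```

Unlike the LSA components, every row of `lda_topic_matrix` is a probability distribution that sums to one, which makes the topic assignments easier to interpret.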

LDA topic counts



The graph shows the counts for each LDA topic.




In the above image, t-SNE is applied for dimensionality reduction. It is used for exploring high-dimensional data. It converts the similarities between data points to joint probabilities and minimizes the divergence between the joint probabilities of the low-dimensional embedding and those of the high-dimensional data.

LDA works better than LSA at separating out the clusters
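
For illustration, t-SNE can be applied with scikit-learn as sketched below. The input is a random matrix standing in for a documents-by-topics matrix with 8 topics, and the perplexity value is an assumption chosen to suit the small sample size.

```python
import numpy as np
from sklearn.manifold import TSNE

# Random stand-in for a (documents x 8 topics) matrix; not real model output.
rng = np.random.default_rng(0)
topic_matrix = rng.random((40, 8))

# Reduce the 8 topic dimensions to 2 so that the documents can be plotted.
tsne_model = TSNE(n_components=2, perplexity=5, init="random", random_state=0)
tsne_vectors = tsne_model.fit_transform(topic_matrix)
```

Each row of `tsne_vectors` gives the two-dimensional coordinates of one document, which can be drawn as a scatter plot coloured by topic to inspect how well the clusters separate.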






Scaling up the performance

The above code concerns scaling up the performance. Scaling up means making something larger in size or amount; here it helps to increase the capacity of the application.










Plotting the heat map for above 8 topics






The above graph is the heat map. A heat map is a graphical representation of data that uses a system of colour coding to represent different values. Heat maps are used in different forms of analytics but are most commonly used to show user behaviour on a specific webpage or webpage template.

  1. Discussion
    1. Applications of Machine Learning in Social Media Analytics

Social Media Analytics

In this digital era, people use social media and other digital platforms to obtain every type of service, and social media analytics has become an essential tool for many applications. Social media analytics is the process of gathering information from social media platforms; this data benefits many applications, especially business decisions. Organizations benefit because the data helps them to act and decide in line with user needs and current trends. Social media analytics also helps to identify users' activity and gives a better understanding of the target audience's preferences, behaviours and other important characteristics. It is therefore important to measure social media analytics, and machine learning, together with Artificial Intelligence and deep learning, has become increasingly popular for this purpose (Camacho et al., 2021).

Machine Learning and AI in Social Media Analytics

A large amount of data is easy to obtain from social media, and AI and machine learning systems make that data useful for many applications, especially digital marketing.

      • Provide accurate insights - social media insights guide digital marketers on current trends and target audiences' behaviours. This information helps marketers to make decisions, act on customers' needs and offer products and services that match market demand. Doing so requires analysing large volumes of conversation, and this data can be complex, noisy and unstructured. Machine learning and AI help to collect, categorize and sort it. Manual analysis is impractical because the process is large and time-consuming, whereas machine learning can analyse very large amounts of social media data (Sun et al., 2019).
      • Identifying the conversation - identifying trends in social media requires analysing the audience's conversations. Current trends are valuable to digital marketers and are the most important part of social media analytics. Artificial intelligence helps by highlighting the posts that are valuable for finding current trends. Machine learning helps to analyse and differentiate posts and to identify the parts that are most useful to the digital marketer. Machine learning models are trained to distinguish the patterns or images available on a social media platform. By applying machine learning in social media analytics, the patterns of a conversation can be recognized and the results provided to marketers with better accuracy.
      • Identification of the trends - identifying current trends in social media is an essential part of social media analytics, and it can be done by applying machine learning. Machine learning can identify patterns or images available on social media platforms and, from those patterns, identify new topics that are becoming popular among audiences. Marketers can use this to offer services or products that match popular trends. Trend detection typically relies on unsupervised techniques, which find and highlight new structure in the data. These algorithms can detect new trends and topics beyond the existing ones (Stieglitz et al., 2018).
      • Social media monitoring - social media analytics and monitoring are among the most popular tools for digital businesses, and they require managing several social media platforms. Popular platforms that provide social media analytics include Twitter and Instagram. Their built-in analytics tools measure insights such as reach to the target audience and likes on posts; these insights are useful because they provide data on audiences' preferences, behaviours and more. AI helps here by providing recommendations about customers' behaviour based on social media algorithms. Artificial Intelligence can thus support digital marketers with social media analytics and with information for social media monitoring.
      • Sentiment analysis - machine learning and AI support sentiment analysis, which judges social media text and clarifies the opinion it expresses. For this, machine learning is paired with Natural Language Processing to assign text to predefined sentiment categories, which helps to understand the sentiment of messages and texts. Digital marketers can apply sentiment analysis to understand customer feedback and improve their service as needed.
      • Image recognition - machine learning can recognize images, and a model can be trained to identify brand logos and other useful images. When a user uploads a picture of a product and mentions the brand name or the product details, artificial intelligence can identify the images and share the posts (Grove et al., 2019). In this way, digital marketers can identify potential and loyal customers and engage them further with the business's posts.
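
To make the sentiment analysis item above concrete, the following is a minimal sketch of a Naive Bayes text classifier written in plain Python. The four training posts, the two sentiment labels and the function names `train_naive_bayes` and `predict_sentiment` are invented for illustration; a real system would be trained on far more data and would usually rely on an established library.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocabulary = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocabulary.update(tokens)
    return {"word_counts": word_counts,
            "label_counts": label_counts,
            "vocabulary": vocabulary}

def predict_sentiment(model, text):
    """Return the label with the highest log-probability for the text."""
    tokens = [t for t in text.lower().split() if t in model["vocabulary"]]
    total_documents = sum(model["label_counts"].values())
    vocabulary_size = len(model["vocabulary"])
    scores = {}
    for label, document_count in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(document_count / total_documents)   # class prior
        for token in tokens:
            # Laplace smoothing avoids zero probability for unseen words.
            score += math.log((counts[token] + 1) / (total_words + vocabulary_size))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labelled posts.
training_posts = [
    ("love this phone great camera", "positive"),
    ("great service very happy", "positive"),
    ("terrible battery very disappointed", "negative"),
    ("bad service never again", "negative"),
]
sentiment_model = train_naive_bayes(training_posts)
```

Calling `predict_sentiment(sentiment_model, "great camera")` returns the label whose training posts make those words most likely, which is how customer feedback can be sorted into predefined sentiment categories.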
    1. Machine Learning Algorithms Useful In Social Media Analytics

Social media analytics implements different machine learning and deep learning algorithms. The different algorithms provide different sets of advantages and different limitations. Social media analytics identifies the right algorithms to be used to analyse the tons of data collected from the several social media platforms. Social media analytics includes numerous tasks such as behavioural analysis, statistical analysis, content analysis, predictive analysis, text analysis, natural language processing, and many more, that require the use of different algorithms (Arasu et al., 2020). Different algorithms can be categorised in different types such as clustering algorithms, prediction algorithms, classification algorithms, and data visualization. Some of the most common machine learning algorithms used in social media analytics include Logistic Regression, Support Vector Machine, and Naive Bayes Theorem. These algorithms are used for extracting useful information from the enormous chunks of data collected from the different social media platforms (KC, 2017).

The natural language processing algorithms are used in understanding the human language and analysing them which can be used for sentiment analysis, customer behaviour and attitude, and customer interests. Social media platforms are filled with billions of blogs, comments, reviews, tweets, status updates and posts which are such kinds of data that are not structured and are difficult for relational databases to store and analyse (Arasu et al., 2020). However, this unstructured data contains tons of information that can be used to obtain competitive advantages in businesses. Thus, for collecting such unstructured data, text mining algorithms are used (Arasu et al., 2020). The different text mining algorithms that are used in social media analytics include retrieval, extraction, and summarization, categorization, clustering, and filtering. The supervised learning algorithms of machine learning are used for the categorization tasks while the unsupervised learning algorithms such as K-means and term-based ontology, are used for clustering. The Support Vector Machine is the ML algorithm that is used for filtration tasks in the text mining process. Social media analytics implements clustering algorithms for segmenting customers based on common aspects. This helps in identifying target segments for target marketing or customised advertisements (Arasu et al., 2020).

Social media analytics uses several machine learning algorithms. They are usually classified into three types according to how they learn: supervised, unsupervised, and reinforcement learning algorithms. Supervised learning algorithms are given both the input data and the expected results; the model's predictions are compared with the expected results (Arasu et al., 2020), and the model is adjusted according to the differences so that its outcomes come to match the expected ones. Supervised algorithms used in social media analytics include Logistic Regression, K-Nearest Neighbour, Random Forest, and Decision Tree. They are used to predict future sales and revenue from data collected on social media platforms (Arasu et al., 2020). Unsupervised algorithms, on the other hand, are not given a target variable or expected results. They are used for clustering purposes, such as segmenting customers by demographics, interests, region, and so on (Arasu et al., 2020), which helps businesses target the right customer segment for a specific intervention. Unsupervised algorithms used in social media analytics include K-means and the Apriori algorithm. Reinforcement learning algorithms are used to support decision making. In reinforcement learning, the model is not given the expected results; it learns by trial and error, using large amounts of historical data to extract as much knowledge as possible and training itself to improve its results. Reinforcement learning problems are commonly formalised as a Markov Decision Process (Arasu et al., 2020).
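
As a rough illustration of the unsupervised case, the sketch below segments a handful of made-up customer records with a minimal K-means loop. The features (age, posts per week), the data, and the helper name `kmeans` are assumptions for illustration only. Using the first k points as starting centroids keeps the example deterministic; practical implementations usually choose the starting points more carefully.

```python
import math

def kmeans(points, k, iterations=20):
    """Minimal K-means: returns (centroids, labels) for a list of numeric tuples."""
    # Deterministic start: use the first k points as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(v) / len(members) for v in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels

# Hypothetical customers: (age, posts per week)
customers = [(19, 30), (45, 3), (22, 25), (51, 2), (20, 28), (48, 4)]
centroids, labels = kmeans(customers, k=2)
```

Here the younger, more active customers end up in one segment and the older, less active customers in the other, which is the kind of grouping a business could use for targeted advertising.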

Some of the algorithms used in social media analytics, and their uses, are listed below.

  • Regression algorithms: Regression algorithms are mainly used for prediction based on the data fed to the machine learning model. There are two kinds of regression, linear and logistic. Linear regression predicts the value of a dependent variable from the analysis of one or more independent variables. Logistic regression estimates the probability that the target variable takes a particular value. Data collected from different social media platforms is used by regression algorithms to predict business outcomes (Balaji et al., 2021).
  • Naive Bayes: The Naive Bayes algorithm determines whether newly collected data belongs to a specific class, based on Bayes' theorem. This helps to identify and describe the new data and process it accordingly. The algorithm assumes that each feature is independent of the other features given the class (Balaji et al., 2021).
  • Decision Tree: Social media analytics generally uses the decision tree algorithm for classification. The algorithm divides the dataset into two or more homogeneous sets based on distinct properties of the data. Decision trees can handle both classification and regression problems (Balaji et al., 2021). They are used for one of the most important purposes of social media analytics, sentiment analysis, which is the process of determining whether a piece of data is negative, positive, or neutral. This helps in understanding customers' sentiments towards a product or service sold by a business. The algorithm is used together with natural language processing to determine the sentiment of social media posts, product reviews, and other text from different social media platforms (Balaji et al., 2021).
  • K-Nearest Neighbours: Regression and classification problems can also be solved using the K-Nearest Neighbours (KNN) algorithm. It stores all the existing labelled data and decides which class new data belongs to by looking at its nearest neighbours. It can be used as an alternative to the Decision Tree algorithm because it serves the same purpose. In general, KNN is considered simpler to implement than decision trees (Balaji et al., 2021).
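
As one possible illustration of the algorithms listed above, the sketch below implements a very small Naive Bayes text classifier for sentiment. It is a minimal sketch under stated assumptions: words are split on whitespace, add-one (Laplace) smoothing is used, and the function names `train_naive_bayes` and `predict` and the training posts are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns the counts needed for prediction."""
    word_counts = defaultdict(Counter)   # label -> word -> count
    label_counts = Counter()             # label -> number of documents
    vocabulary = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return word_counts, label_counts, vocabulary

def predict(model, text):
    """Return the label with the highest log-posterior under the naive assumption."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # log P(label)
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # log P(word | label) with add-one (Laplace) smoothing
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labelled posts (illustrative data only)
training_posts = [
    ("love this phone great camera", "positive"),
    ("great service very happy", "positive"),
    ("terrible battery very disappointed", "negative"),
    ("awful support never again", "negative"),
]
model = train_naive_bayes(training_posts)
```

With this toy data, `predict(model, "great camera love it")` returns `"positive"` and `predict(model, "awful battery")` returns `"negative"`. The per-word probabilities are simply multiplied (added in log space), which is where the independence assumption enters.
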
    1. Advantages of Machine Learning in Social Media Analytics

Machine learning offers several advantages for social media analytics. These advantages include the following.

Sentiment Analysis

Sentiment analysis is the process of examining each audience comment to determine whether its intent is positive, negative, or neutral. The results help the organization understand how customers feel about its services or products. Performed manually, sentiment analysis is one of the most time-consuming processes, which is why machine learning is important for handling it adequately. Sentiment analysis tools can be trained on examples of different emotions so that they can classify the intent of new inputs (Hopwood and Bleidorn, 2019).

Managing Reputation

An organization cannot always expect positive mentions on social media. Most brands face customer complaints on social media, as well as fake news, which can damage the brand and its products. Machine learning is an effective way to tackle these problems. A number of tools are available for managing reputation, including Trackur, Social Mention, and Review Push. These machine learning tools are combined with artificial neural networks to help brands find each problem and react to it (Hopwood and Bleidorn, 2019).

Specific Content Offering

Instagram uses Artificial Intelligence (AI) to select content from an organization's Instagram account for the Explore page, depending on the user's most recent interactions on the platform. The platform uses an algorithm to determine suitable content based on the user's interests. Other social media platforms, such as Twitter and Facebook, use similar techniques to suggest connections, trends, and content. In addition, Twitter shows trends based on location (Hopwood and Bleidorn, 2019).

Targeting of Smart Audience

In social media analysis, most organizations try to reach as large an audience as possible in order to attract potential consumers. However, a massive audience does not by itself improve marketing ROI. For this reason, organizations need a more accurate and effective method of selecting the target audience. Nowadays, most organizations use machine learning to define the target audience and reach it at a crucial stage. A number of companies are switching to this machine learning-based approach to optimize their advertisements.

Smart audience targeting also helps to gather sufficient data on region, gender, position, and age. Machine learning in social media allows machines to choose the most suitable advertisements for each user: the technology gathers a large amount of information from users, examines it, and works out what users expect. This process helps organizations find new customers around the world.

Smart audience targeting uses artificial neural networks to target the interests of social media users. With this machine learning-based approach, an organization will be able to prioritize its audience on social media easily. Moreover, the technology can be used to provide the most relevant and useful ads and content, along with suitable posting times (Mahudeswaran et al., 2019).

Selecting the Proper Social Platform

Choosing the proper platform for the organization is a significant part of the marketing strategy. Demographics play an important role in identifying the right place to find the target audience. Machine learning-based tools can provide highly insightful data, and machine learning plays a significant role in examining this information to uncover the most useful patterns (Mahudeswaran et al., 2019).


Processing of Image

Tracking products and services is very important in social media marketing. One major difficulty is that users often do not mention the product's actual name when posting an image of it. For this reason, organizations should use image processing to interpret such social posts. Image processing is one of the most useful applications of machine learning, as it can be used to identify the content of images. Many organizations train computers to recognize specific logos or particular kinds of goods on different social media platforms (Agung and Darma, 2019).


Identifies Fraud Transactions

Machine learning is an effective technology for examining customer behavioural data. In a business setting, it tracks user actions and identifies the most unusual ones. Machine learning enables organizations to build algorithms that process massive datasets with many variables and that help find correlations between patterns of customer behaviour and fraudulent activity (Agung and Darma, 2019).

Another important reason for choosing machine learning to identify fraudulent transactions is that it provides automatic detection and much faster data processing than manual work.
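
The idea of automatically flagging unusual user actions can be sketched with a simple statistical rule. The example below is not a trained machine learning model; it is a deliberately simple stand-in that flags values lying far from the mean, measured in standard deviations (a z-score). The function name `flag_unusual`, the threshold of 2.0, and the transaction amounts are all assumptions for illustration.

```python
from statistics import mean, stdev

def flag_unusual(amounts, threshold=2.0):
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, x in enumerate(amounts) if abs(x - mu) / sigma > threshold]

# Hypothetical transaction amounts for one account (illustrative data only)
transactions = [20, 25, 22, 19, 24, 21, 23, 500]
suspicious = flag_unusual(transactions)
```

For this data only the last transaction (index 7) is flagged. Production fraud detection typically combines many behavioural variables and learned models, and a single-variable rule like this would serve at most as a baseline.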

Increased media quality

Media is a significant part of social media. The quality of videos, images, and photos can be improved automatically using machine learning techniques. For example, Twitter uses machine learning to enhance the quality of photos and videos and give users a good visual experience (Agung and Darma, 2019).

    1. Impact of Machine Learning in Sales Record

After analysing multiple research papers in the literature review section, it can be said that machine learning and artificial intelligence are two innovative technologies that play a vital role in social media analytics. Machine learning has a substantial impact on sales records in ten ways, which are listed below:

      1. Sales teams in many organizations have started to incorporate artificial intelligence and machine learning into their work. According to one research paper, these two technologies are used to analyse social media data. Once the relevant information has been collected, social media analytics applies machine learning, which plays a vital role in reducing costs and call time: operational costs can reportedly be reduced by 60% and call time by up to 70% (Bohanec et al., 2017).
      2. The sales department can adopt machine learning-based social media analytics to predict sales performance. Salespeople should consider these predictions and apply them in practice to achieve better sales results.
      3. Organizations can collect data from social media platforms such as Facebook, Instagram, and Twitter. The social media analytics tool applies machine learning algorithms to this data to predict future sales. To make the prediction reliable, the algorithm is applied to separate training and test datasets. Based on the prediction, organizations can plan their sales programs (Bohanec et al., 2017).
      4. According to a research paper, 30% of business-to-business organizations use artificial intelligence and machine learning for sales processing. These technologies play an essential role in pattern recognition and in identifying the organization's most valuable customers.
      5. Social media analytics uses machine learning to collect historical buying, pricing, and selling data. An effective machine learning algorithm improves the accuracy of predictions and forecasts. Sales planning and sales management applications can use these predictions to improve sales performance (Pavlyshenko, 2019).
      6. By analysing social media data with machine learning, the organization can develop a roadmap for enhancing customer lifetime value.
      7. Organizations need good time management to enhance sales performance, and social media analytics can support this. Machine learning algorithms analyse social media data and predict future sales performance in less time, which saves the sales management team and its members from collecting information manually (Pavlyshenko, 2019).
      8. It helps to personalize sales and marketing content by applying machine learning algorithms to social media data.
      9. Price optimization is one of the most essential elements of sales for any organization, and machine learning plays a vital role in developing a price optimization strategy from social media data. It helps the organization understand competitors' pricing strategies and strengthens its competitive advantage.
      10. CRM applications such as Salesforce can define a salesperson's schedule based on the value of each potential sale, combined with the strength of the sales lead as measured by its lead score. Artificial Intelligence (AI) and Machine Learning (ML) are used to optimize the salesperson's time when moving from one customer meeting to the next, so that time is dedicated to the most valuable prospects.
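
The use of training and test datasets for sales prediction, mentioned in the list above, can be sketched as follows. This is a minimal example under stated assumptions: a single predictor (weekly social media mentions), ordinary least squares as the model, and made-up figures. The function name `fit_line` and the data are hypothetical and are not drawn from Bohanec et al. (2017) or Pavlyshenko (2019).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical weekly data: (social media mentions, units sold)
data = [(100, 52), (150, 77), (200, 101), (250, 124), (300, 152), (350, 176)]

# Hold out the last two weeks as a test set; train on the rest.
train, test = data[:4], data[4:]
slope, intercept = fit_line([m for m, _ in train], [s for _, s in train])

# Mean absolute error on the unseen test weeks.
errors = [abs((slope * m + intercept) - s) for m, s in test]
mae = sum(errors) / len(errors)
```

On this toy data the fitted slope is about 0.48 units sold per mention, and the mean absolute error on the two held-out weeks is about 3.5 units. Evaluating on data the model has not seen is what makes the reported accuracy meaningful.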

5.5. Current Challenges of Implementing Machine Learning in Social Media Analytics

Machine learning brings many benefits to social media analytics, but some of its characteristics cannot be ignored because they cause problems of modification and adaptability. The challenges that hinder the implementation of machine learning are described below.

Challenge of fast moving data

A major challenge in social media analytics is that it has to deal with a large volume of fast-moving data. Such streaming data is very useful for tasks such as monetization and fraud detection, so it is becoming important for machine learning to adapt to and handle fast-moving data flows. Incremental learning can be used on large datasets: the model learns from existing data and applies that learning to future data.
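
A minimal sketch of incremental learning on a data stream is given below. It uses logistic regression updated one example at a time with stochastic gradient descent, which is one common way to learn from streaming data and is an assumption here, not a method prescribed by the sources. The class name `OnlineLogisticRegression`, the two features, and the stream itself are hypothetical.

```python
import math

class OnlineLogisticRegression:
    """Logistic regression updated one example at a time with stochastic gradient descent."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        """Probability that the example belongs to class 1."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """Learn from a single (features, label) pair; label is 0 or 1."""
        error = self.predict_proba(features) - label
        self.weights = [w - self.learning_rate * error * x
                        for w, x in zip(self.weights, features)]
        self.bias -= self.learning_rate * error

# Hypothetical stream: (links in post, exclamation marks) -> 1 if spam, else 0
stream = [((3, 5), 1), ((0, 0), 0), ((4, 6), 1),
          ((0, 1), 0), ((5, 4), 1), ((1, 0), 0)] * 50

model = OnlineLogisticRegression(n_features=2)
for features, label in stream:   # each item is processed once, then discarded
    model.update(features, label)
```

Because the model is updated per example, it never needs the whole dataset in memory, and it can keep adapting as new posts arrive. After processing this toy stream, `model.predict_proba((4, 5))` is above 0.5 and `model.predict_proba((0, 0))` is below 0.5.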

Challenges of noisy data

Denoising autoencoders are used in incremental learning. A denoising autoencoder is a type of autoencoder that extracts features from corrupted inputs. Although the inputs are noisy, the extracted features are robust and well suited to classification. In general, such machine learning algorithms use hidden layers to extract features and representations of the data. A denoising autoencoder has one hidden layer for feature extraction, and the number of hidden nodes equals the number of extracted features (Stieglitz et al., 2018).

Challenges of sample extraction

Samples that do not conform to the objective function are collected and used to add new nodes to the hidden layer. These nodes are initialized on the basis of the collected samples. Incoming data is then used to retrain all the features jointly. This incremental feature learning and mapping can improve the generative or discriminative objective function. However, routinely adding features can produce many redundant features and overfitting of the data. Similar features are therefore merged to create a more compact feature set (Sebei et al., 2018).

Incremental learning problem

The incremental feature learning method has been shown to converge to an optimal number of features in a massive online setting. This kind of incremental feature extraction is useful in applications where the data changes over time in massive online data flows. Incremental feature learning and extraction can be generalized to other machine learning algorithms such as the restricted Boltzmann machine (RBM), and it makes it possible to adapt to new inbound data streams. It also avoids costly cross-validation when selecting the number of features in massive datasets.

Problem of huge data-set

Effective results have been achieved with very large models trained on huge datasets, because a large number of model variables makes it possible to extract more complex features and representations. One line of work addresses the problem of training machine learning models with billions of variables using numerous CPU cores, in the context of speech recognition and computer vision. A software framework named DistBelief was developed to use clusters of thousands of machines to train large-scale models. The framework supports parallelism through multithreading (Kurilovas, 2019).

Informal data-set

The informal environment on social platforms encourages people to use colloquial and personal elements in their language. In many cases, this is beyond the understanding of automated sentiment analysis programs, which makes it difficult for them to interpret the context and sentiment surrounding the brand in question. Much of the data is misinterpreted because of noise or unwanted data.

Numbers that do not count

The number of likes on a brand page does not accurately reflect actual engagement or conversions. Yet most social media analysis activities ultimately aim at analysing and increasing the number of likes on the page. This generally results in a lack of tangible return on investment, even for otherwise successful activities. Many people also hold biased information about the company, which leads to a biased view of engagement (Morota et al., 2018).

Incomplete picture

People differ as much in the social media ecosystem as they do in the offline world. Although some people have an active voice and interact frequently, most people just browse. Participation may also vary with demographics and activities. Although you may hear more open opinions from younger audiences, larger groups of older people may choose to remain silent. As a result, despite the increased use of social media, many users never voice an opinion.

Data relevance and quality

The quality of the online data being analysed is always a concern for the company. Social media platforms are flooded with false and duplicate personal information, and most users do not provide their actual information. This makes it harder for machine learning to separate genuine information from false data. The issue is compounded by access restrictions on most profiles, which make it difficult to verify their validity. In addition, social channels provide little or inaccurate information about user journeys (Morota et al., 2018).

The challenges of implementing machine learning in social media analytics have been discussed above. Machine learning is a booming technology that is being applied to mundane and repetitive work such as social media analytics and data analytics. Social media analytics matters to businesses in many ways, chiefly as a source of data for decision making and for the company's marketing strategy.

6. Conclusion

With the help of machine learning applications, users can obtain numerous benefits in gathering relevant data. For example, data from WhatsApp, one of the most widely used social media platforms, can be analysed to understand what users are texting or which documents and files they send to others. Data collected from Instagram can be used to show more valuable and relevant content to more users. In addition, information collected from Facebook can help in analysing users' sentiments, and data gathered from YouTube can help identify users' points of interest, so that social media analytics can better understand targeted customers. Several use cases are relevant to the research topic of machine learning applications in social media analytics: the Instagram algorithm, Facebook post sentiment analysis, YouTube trending video analysis, WhatsApp chat analysis, and so on. Algorithms within social media platforms can be regarded as the technical support behind these use cases. Existing posts on these platforms can be sorted by relevance instead of by publish time. This presents more relevant content to users, which helps retain existing users and attract new ones to the social media applications and websites. In this way, machine learning significantly improves the entire process of gathering and analysing relevant data. The machine learning applications involved include image recognition, speech recognition, traffic recommendations, and product recommendations.

Reference list

  • Dias, D.S., Welikala, M.D. and Dias, N.G., 2018, September. Identifying racist social media comments in Sinhala language using text analytics models with machine learning. In 2018 18th International Conference on Advances in ICT for Emerging Regions (ICT) (pp. 1-6). IEEE. Available at: Dias/publication/330469194_Identifying_Racist_Social_Media_Comments_in_Sinhala_Language_Using_Text_Analytics_Models_with_Machine_Learning/links/5c70d9ca458515831f67cc33/Identifying-Racist-Social-Media-Comments-in-Sinhala-Language-Using-Text-Analytics-Models-with-Machine-Learning.pdf
  • Fan, C., Wu, F. and Mostafavi, A., 2020. A hybrid machine learning pipeline for automated mapping of events and locations from social media in disasters. IEEE Access, 8, pp.10478- 10490. Available at:
  • Grover, P., Kar, A.K. and Janssen, M., 2019. Diffusion of blockchain technology: Insights from academic literature and social media
  • Harfoushi, O., Hasan, D. and Obiedat, R., 2018. Sentiment analysis algorithms through azure machine learning: Analysis and comparison. Modern Applied Science, 12(7), p.49. Available at:
  • Hu, L., He, S., Han, Z., Xiao, H., Su, S., Weng, M. and Cai, Z., 2019. Monitoring housing rental prices based on social media: An integrated approach of machine-learning algorithms and hedonic modeling to inform equitable housing policies. Land use policy, 82, pp.657-673. Available at:
  • Tay, L., Woo, S.E., Hickman, L., and Saef, R.M., 2020. Psychometric and validity issues in machine learning approaches to personality assessment: A focus on social media text mining. European Journal of Personality, 34(5), pp.826-844. Available at:
  • Yao, Q., Wang, M., Chen, Y., Dai, W., Li, Y.F., Tu, W.W., Yang, Q. and Yu, Y., 2018. Taking human out of learning applications: A survey on automated machine learning. arXiv preprint arXiv:1810.13306. Available at:
