Paper on Natural Language Processing to Detect Fake News with Political Incentives

The problem of fake news has become an important topic of research in various fields, such as linguistics and computer science. The concept has been defined in many ways. In its most recent incarnation, and especially in the 2016 United States presidential election, fake news can be defined as the creation and spread of articles that either favored or attacked the candidates, Donald Trump and Hillary Clinton. More generally, the concept of fake news includes not only false and misleading information spread to deceive people but also the bias inherent in news produced by partisan individuals. This paper outlines the types of fake news, its spread and effects, approaches to the political fake news problem, text classification of fake news, and practices for determining political fake news.

Types of Fake News

People often differ in opinion when it comes to identifying the types of fake news. For online content, however, several categories of misleading news recur: clickbait, sloppy journalism, propaganda, misleading headlines, satire, and slanted or biased news. Clickbait consists of stories that are deliberately fabricated to attract new website visitors and generate advertising revenue; to achieve this, producers use sensational headlines to grab the public's attention (Tandoc Jr, Lim, & Ling, 2018). Sloppy journalism occurs when journalists publish stories containing unreliable information without cross-checking with the relevant government or state authorities. For instance, during the 2016 U.S. presidential election, the fashion retailer Urban Outfitters published a misleading election guide informing Americans that they were required to have their voter registration cards with them on election day. This was unreliable and misleading information, as no state requires a registration card for an American citizen to vote. Propaganda consists of stories created by partisan individuals to deliberately mislead the public or to advance a political agenda or cause. Misleading headlines are deliberately used to distort stories that are not completely false; these sensational headlines spread quickly on social media sites, where they appear in the public's news feeds. Satire consists of false stories produced by various social media accounts or websites for entertainment and parody (Rubin, Chen, & Conroy, 2015); well-known examples include the Daily Mash and Waterford Whispers. Biased news consists of stories that confirm our existing beliefs, and many people are drawn to such news, an opportunity some individuals exploit to spread deceptive content. Social media feeds surface news and articles based on personalized browsing history, reinforcing this tendency.

How Fake News Spread

What makes fake news most alarming in today's society is the speed with which it spreads. Technology, especially social media platforms such as Twitter, plays a major role in spreading misleading news (Vosoughi, Roy, & Aral, 2018). Twitter is perceived as one of the social media sites where false stories, especially politically related news, spread particularly fast. The speed of this spread is probably tied to two kinds of incentives: political and financial. The political incentives are clear, as voters and non-voters alike spread fake news either to advance and defend their political agenda or to help their preferred candidate win an election. The financial incentives are less obvious, but some producers of misleading news spread false information purely for financial gain. For instance, the widely discussed case of the "Macedonian teenagers" illustrates this (Allcott & Gentzkow, 2017). These teenagers created website content in many areas, such as sports and health, but found that U.S. politics paid the most in ad revenue and so switched to creating fake pro-Trump stories for profit.

Effects of Misleading News

The effects of misleading news in specific situations, such as the 2016 U.S. presidential election, are enormous. Social media was an important source of news for many Americans during the election campaign. Online surveys and analyzed browsing data attempted to quantify Americans' exposure to and engagement with false stories, and the surveys concluded that fake news largely served to reinforce readers' existing biases. As a result, many people believed stories that favored their preferred presidential candidate, which shows how fake news could have affected the outcome of the election. Deceptive news has also become a much more general problem: it instills a negative impression in people and makes it harder to trust stories posted online or aired in television and radio broadcasts. This erosion of trust is a danger to a country's democracy, as it undermines the general public's confidence in public institutions.

Approaches to Political Misleading News Problem

The causes, spread, and effects of fake news on the public are all complicated issues, so multiple approaches and efforts are needed to address this menace in today's society. These approaches should both empower individuals to evaluate potentially misleading political news they encounter and introduce the structural changes needed to curb the spread of political fake news. They include educating the public, performing manual checking, analyzing and limiting the spread of political deceptive news, and performing automatic fact-checking and classification.

Education efforts can begin at the school level by equipping students with media literacy and the general education needed to become responsible citizens. These efforts should concentrate on political news and its sources: the public should be educated and encouraged to examine the source of a political story and to read beyond the headline, making sure its content is not false before sharing it with others. Moreover, teaching journalists and the general public how to verify information, and providing them with tools to do so, is another essential strategy.

Manual checking of false news plays a significant role in containing the spread of politically misleading stories. It takes two broad forms: manual checking on social media sites and fact-checking websites. Social media sites such as Facebook have over the years responded to the widespread perception that they aid the spread of political fake news by hiring more manual fact-checkers. Human monitoring is desirable because it allows false claims and stories to be verified accurately before they spread further. It has a downside, however: its accuracy depends on the attention of the individuals performing the checking, and reviewers exhausted by heavy workloads may verify information inaccurately, allowing political deceptive news to spread again. Fact-checking websites, by contrast, verify information that they find or that users submit. They have an advantage over human manual checkers in that they can verify claims and new stories systematically (Shao, Ciampaglia, Varol, Flammini, & Menczer, 2017) and ensure that known political fake stories are not repeated, unlike human checkers, who may at times err due to fatigue.

Political fake news spreads faster and penetrates further across social media networks than credible news. This may be due to its ability to generate attention and its capacity to reconfirm a reader's preexisting biases. Part of the problem lies in filter bubbles and echo chambers: when people are repeatedly exposed to a certain point of view, they find it easy to believe stories that reaffirm it (Shu, Sliva, Wang, Tang, & Liu, 2017). It is therefore crucial to encourage online communication across ideological lines. Moreover, people tend to remember claims that have been mentioned consistently and repeatedly across social media networks, so it is hard to correct stories that have already reached the general public. It makes sense, then, to stop false information in its tracks before the public can read it. For instance, Twitter has put in place a model that uses fact-checks to verify claims and contain a hoax before it spreads to the public.

Automatic checking has clear benefits: it can be done at scale and can save moderators from having to read every unpleasant or false political story. This form of checking examines only the content of the story, not its source or rate of spread. Automatic checkers attempt to find unverified claims and check them against reliable sources (Pérez-Rosas, Kleinberg, Lefevre, & Mihalcea, 2017). One way to do this is to transform Wikipedia into a knowledge graph in which true statements appear as edges, while false statements remain unconnected in the graph. Another form of automatic checking assesses the language used in a story, looking for cues that may point to exaggerated claims.
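The knowledge-graph idea above can be sketched in a few lines: facts are stored as (subject, predicate, object) triples, and a claim counts as supported only if a matching edge exists. The `KnowledgeGraph` class and the sample facts are hypothetical illustrations, not part of any real fact-checking system; real systems such as the Wikipedia-based checker compute path-based truth scores rather than exact edge lookups.

```python
# Minimal sketch of knowledge-graph fact-checking: facts are edges of the
# form (subject, predicate, object); a claim is "supported" only if its
# triple appears in the graph. All facts below are illustrative.

class KnowledgeGraph:
    def __init__(self):
        self.edges = set()  # set of (subject, predicate, object) triples

    def add_fact(self, subject, predicate, obj):
        self.edges.add((subject, predicate, obj))

    def supports(self, subject, predicate, obj):
        """A claim is supported when its triple appears as an edge."""
        return (subject, predicate, obj) in self.edges


kg = KnowledgeGraph()
kg.add_fact("Washington, D.C.", "capital_of", "United States")
kg.add_fact("Ottawa", "capital_of", "Canada")

print(kg.supports("Washington, D.C.", "capital_of", "United States"))  # True
print(kg.supports("Toronto", "capital_of", "Canada"))                  # False: no such edge
```

Exact-match lookup is deliberately simplistic: it marks any unseen but true statement as unsupported, which is why production systems score claims by graph proximity instead.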

Text Classification for Political Deceptive News

An intuitive framing of the political deceptive news problem in natural language processing (NLP) asks how texts can be classified into fake and legitimate instances. This framing applies more readily to full texts than to the short tweets and headlines encountered on social media, because text classification relies on linguistic characteristics that are more evident in longer texts. Two broad approaches are used: feature-based methods and deep-learning methods. Feature-based approaches extract and analyze linguistic cues from a given piece of content to identify the target phenomena. They are powerful partly because their results are relatively interpretable; typical features include lexical-semantic classes and n-grams (Wang, 2017), which can be fed into traditional supervised learning algorithms for detecting political fake news. Deep-learning models, such as recurrent neural networks (RNNs) and convolutional neural networks (CNNs), are usually applied where large-scale training data is available. An RNN encodes sequential information and is well suited to modeling the semantics of short texts, while a CNN, built from convolutional and pooling layers, is well suited to classifying longer texts. Both feature-based approaches and deep-learning models are heavily data-driven (Vargo, Guo, & Amazeen, 2018), so the first requirement for text classification is training data, and quality training data should include a sufficiently diverse set of credible and fake political news articles.
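The feature-based route can be sketched concretely: word n-grams serve as features for a multinomial Naive Bayes classifier, a simple stand-in for the supervised algorithms mentioned above. The four-document corpus and its clickbait-style phrases are invented purely for illustration; real work requires a large, diverse labeled dataset.

```python
# Sketch of a feature-based approach: word unigrams and bigrams feed a
# multinomial Naive Bayes classifier (class prior omitted for brevity,
# since the toy corpus is balanced). The labeled examples are invented.
import math
from collections import Counter

def ngrams(text, n=2):
    """Extract word unigrams and bigrams as feature strings."""
    words = text.lower().split()
    feats = list(words)  # unigrams
    feats += [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]
    return feats

def train(docs):
    """docs: list of (text, label) pairs -> per-label feature counts."""
    counts = {}
    for text, label in docs:
        counts.setdefault(label, Counter()).update(ngrams(text))
    return counts

def classify(text, counts):
    """Return the label with the highest Laplace-smoothed log-likelihood."""
    vocab = {f for c in counts.values() for f in c}
    best_label, best_score = None, -math.inf
    for label, c in counts.items():
        total = sum(c.values())
        score = sum(
            math.log((c[f] + 1) / (total + len(vocab)))  # add-one smoothing
            for f in ngrams(text)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

corpus = [
    ("shocking secret they don't want you to know", "fake"),
    ("you won't believe this one weird trick", "fake"),
    ("the committee released its report on tuesday", "real"),
    ("officials confirmed the results of the audit", "real"),
]
model = train(corpus)
print(classify("you won't believe this shocking secret", model))  # fake
```

The interpretability claimed for feature-based methods is visible here: each n-gram's contribution to the score can be inspected directly, which a deep network does not offer.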

Practices on How to Determine Political Misleading News

Most insights into deception originate from disciplines without automated detection in mind. Fields such as computer-mediated communication, law enforcement and credibility assessment, interpersonal psychology, and NLP or text analytics can help to clarify what deception is and how to identify liars. In computer-mediated communication, emails and forum posts can be collected either through submission or through elicitation of pre-existing messages; study participants can then be invited to contribute stories and interact under truthful or deceptive conditions (Lazer et al., 2018), and the resulting deceptive interpersonal emails can be repurposed for text analytics. In law enforcement and credibility assessment, court proceedings, testimonies, and pleas can be used to distinguish credible from fake political stories, with the court ruling serving as the legitimate reference against which hearings and testimonies are verified. In interpersonal psychology, interviews and questionnaires can form the basis for analyzing the credibility of a given story (Rashkin et al., 2017): the reliability and consistency of cues help to separate true from fake political stories. This approach has a drawback, however, as it may be biased by the mental state of the individuals conducting the interviews or questionnaires (Chen, Conroy, & Rubin, 2015). In text analytics, readily available datasets can be used to determine the credibility of a particular story and to classify the source of the fake news.

Based on these practices, fake news data should meet several conditions. It should contain both truthful and deceptive instances, so that predictive methods can find patterns in each. It should be available in a digital textual format, since in NLP text is the preferred medium for evaluating and analyzing political fake and legitimate news sources. Finally, it should be collected within a pre-defined time frame; that is, a corpus of emails and forum posts has to be gathered within a well-defined period.

The fake news problem has proved to be a major challenge in today's society. It has instilled a negative impression in many people and has eroded their trust in various institutions. There are several types of false news and stories, including propaganda, sloppy journalism, and clickbait, among others. These forms of misleading news often spread fast and widely across social media networks. To stop or slow the spread of politically misleading news, various strategies should be put in place, such as educating the general public and using automatic checkers to fact-check claims and false stories before social media users share them. To understand what political deceptive news is, researchers can engage in activities such as interviews and questionnaires to determine the cues that distinguish legitimate from deceptive news sources.

 

References

Allcott, H., & Gentzkow, M. (2017). Social media and fake news in the 2016 election. Journal of Economic Perspectives, 31(2), 211-36. Retrieved from https://www.aeaweb.org/articles?id=10.1257/jep.31.2.211

Chen, Y., Conroy, N. J., & Rubin, V. L. (2015). News in an online world: The need for an “automatic crap detector”. Proceedings of the Association for Information Science and Technology, 52(1), 1-4. Retrieved from https://asistdl.onlinelibrary.wiley.com/doi/full/10.1002/pra2.2015.145052010081

Lazer, D. M., Baum, M. A., Benkler, Y., Berinsky, A. J., Greenhill, K. M., Menczer, F., & Schudson, M. (2018). The science of fake news. Science, 359(6380), 1094-1096. Retrieved from https://science.sciencemag.org/content/359/6380/1094

Pérez-Rosas, V., Kleinberg, B., Lefevre, A., & Mihalcea, R. (2017). Automatic detection of fake news. arXiv preprint arXiv:1708.07104. Retrieved from https://arxiv.org/abs/1708.07104

Rashkin, H., Choi, E., Jang, J. Y., Volkova, S., & Choi, Y. (2017, September). Truth of varying shades: Analyzing language in fake news and political fact-checking. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 2931-2937). Retrieved from https://www.aclweb.org/anthology/D17-1317/

Rubin, V. L., Chen, Y., & Conroy, N. J. (2015, November). Deception detection for news: Three types of fakes. In Proceedings of the 78th ASIS&T Annual Meeting: Information Science with Impact: Research in and for the Community (p. 83). American Society for Information Science. Retrieved from https://dl.acm.org/doi/10.5555/2857070.2857153

Shao, C., Ciampaglia, G. L., Varol, O., Flammini, A., & Menczer, F. (2017). The spread of fake news by social bots. arXiv preprint arXiv:1707.07592, 96-104. Retrieved from https://www.andyblackassociates.co.uk/wp-content/uploads/2015/06/fakenewsbots.pdf

Shu, K., Sliva, A., Wang, S., Tang, J., & Liu, H. (2017). Fake news detection on social media: A data mining perspective. ACM SIGKDD Explorations Newsletter, 19(1), 22-36. Retrieved from https://dl.acm.org/doi/10.1145/3137597.3137600

Tandoc Jr, E. C., Lim, Z. W., & Ling, R. (2018). Defining "fake news": A typology of scholarly definitions. Digital Journalism, 6(2), 137-153. Retrieved from https://www.tandfonline.com/doi/abs/10.1080/21670811.2017.1360143

Vargo, C. J., Guo, L., & Amazeen, M. A. (2018). The agenda-setting power of fake news: A big data analysis of the online media landscape from 2014 to 2016. New Media & Society, 20(5), 2028-2049. Retrieved from https://journals.sagepub.com/doi/abs/10.1177/1461444817712086

Vosoughi, S., Roy, D., & Aral, S. (2018). The spread of true and false news online. Science, 359(6380), 1146-1151. Retrieved from https://science.sciencemag.org/content/359/6380/1146

Wang, W. Y. (2017). "Liar, liar pants on fire": A new benchmark dataset for fake news detection. arXiv preprint arXiv:1705.00648. Retrieved from https://arxiv.org/abs/1705.00648