Social and Political Impacts of Web Search Techniques — An Overview

1.0 Introduction

2.0 Impact of Web Search Techniques

2.1 Social and Political Impact of Search Engines

Robertson et al. [2] quantified the partisan bias encountered by searchers following President Donald Trump’s inauguration. Partisan bias has been shown to influence voting behavior through newspapers, television (e.g., the “Fox News Effect”), social media (see also “digital gerrymandering”), and search engines (e.g., the “Search Engine Manipulation Effect (SEME)”). It has been found that partisan bias in election-related search rankings can sway the preferences of undecided voters by 20% or more. According to this study, the results placed toward the bottom of Google SERPs were more left-leaning than the results placed toward the top. The direction and magnitude of the overall lean varied widely by search query, component type, and other factors. Further, Google’s ranking algorithm shifted the average lean of SERPs slightly to the right of their unweighted average. Embedded tweets in Google’s search results likely amplified the reach of Donald Trump’s Twitter account because of their prominence near the top of search results [2].
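The difference between a rank-weighted and an unweighted average lean can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the lean scores are made up, negative values stand for left-leaning and positive values for right-leaning results, and the 1/rank weighting is a simple stand-in rather than the weighting used in the study [2].

```python
def unweighted_lean(scores):
    """Plain average of the lean scores, ignoring rank position."""
    return sum(scores) / len(scores)


def rank_weighted_lean(scores):
    """Average lean where higher-ranked results count more.

    The 1/rank weights are a hypothetical choice for illustration only.
    """
    weights = [1.0 / rank for rank in range(1, len(scores) + 1)]
    weighted_sum = sum(w * s for w, s in zip(weights, scores))
    return weighted_sum / sum(weights)


# Hypothetical SERP, listed top to bottom.
# Assumed convention: negative = left-leaning, positive = right-leaning.
serp = [0.2, 0.1, -0.1, -0.3]

print(unweighted_lean(serp))     # average that treats every rank equally
print(rank_weighted_lean(serp))  # average that emphasizes the top results
```

In this made-up example the left-leaning results sit near the bottom, so the rank-weighted average lands to the right of the unweighted one, which is the kind of shift the study describes.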

Though research on the spread of misinformation has focused primarily on social media, it has been observed that social media combined with trust in search engines could increase exposure to and consumption of misinformation. Metaxa et al. [3] coined the term “search media” for algorithmically curated content meant to be consumed as media by search engine users. The study highlights both the workings of the search algorithm and real-world events as factors affecting search media. Search media functions as “metamedia”, reflecting the state of the real-world media ecosystem. The algorithms used to curate search media are non-transparent and act as gatekeepers of information. The study strongly suggests that there is a high risk of users consuming search results as they would traditional media sources, resulting in the propagation of misinformation, political bias, and campaign agendas.

2.2 Impact of Search Engines on News

Trielli and Diakopoulos [8] found that the Top Stories box is more inclined to have left-leaning impressions than right-leaning ones, which could mean one of two things: (1) the Google algorithm is biased toward selecting left-leaning sources; or (2) more left/liberal news content is being published online. The algorithm also appears to favor more recent news as top-ranked results, which could mean that news sources that refresh their content more often receive better visibility, even though their news is not necessarily of better quality. The news sources in the Top Stories box are observed to receive significantly more traffic from Google than other sources.

An earlier study of Google’s knowledge panel component, conducted by Lurie and Mustafaraj [9], reports similar results on the impact that the search engine algorithm and human-computer interaction have on how search users receive their news information. SERPs influence users’ decision making and news literacy. Notably, Google’s SERP has become an arena where algorithms, humans, and publishers meet and try to influence one another [9]. Because users already place considerable trust in search platforms such as Google, these platforms play a crucial role in shaping users’ judgments about the authenticity and trustworthiness of a news source.

2.3 Impact of Search Engines on Health

2.4 Impact of Search Engines on Privacy

3.0 Observations and Discussion

· Rank Bias- The cognitive bias that leads search users to regard top-ranked results as more accurate and trustworthy. A disproportionate share of clicks and attention goes to the top results [1].

Another study indicated that college students trust Google’s ranking of SERP results and tend to click on the first couple of results even when more relevant links were ranked towards the bottom [9].

· Trust Bias- The unjustified trust search users place in the authenticity and accuracy of SERPs. Users have been observed to take biased search results as a reflection of real-life opinion. In particular, results can be interpreted as a broad consensus even when they reflect only one point of view [7].

· Source Bias- Search engines have a social obligation to offer users a range of perspectives, viewpoints, and socio-political positions; source bias arises when results draw on a narrow set of sources instead. Source bias is much more pronounced in the case of news sources, as observed in the previous section. SERPs seem to default to certain result sources, one prominent example being Wikipedia links.

· Misinformation- Search engines are inherently designed to return the documents that are algorithmically most relevant, irrespective of whether those documents contain correct or incorrect information. In news and politics, incorrect information translates to “fake news”; the consequences are far more dire when average users with little health knowledge seek life-altering medical treatments and information online. It has been found that users are highly influenced by misinformation, demonstrating the degree to which search biases can impact individual decision-making [7]. Google’s direct answer box has been shown to be prone to manipulation, and can thus transmit misleading and false information [9].

· Search Components/ Visual Markers- Search components such as Google’s knowledge component, embedded Twitter results, top stories box, people-ask, news-card, people-search, related-search and so on, together with markup elements that add semantic meaning, provide a good user experience and deliver information quickly and clearly. However, these elements have also been found to construct bias and to expose the user to a limited set of information sources. This was highlighted by Robertson et al. [1], where, across all component types, the top 20% of domains accounted for 96.1% of all the sources appearing in search components. This inequality is also paralleled within individual components [1]. In terms of news, publishers with articles in the Top Stories box were found to receive a significant boost in traffic (up to 1/6th more) compared with those placed in the organic results of the SERP [8].

· Personalization- It is in the nature of search engine recommendation algorithms to learn user behavior and interests in order to suggest content based on a user’s profile. This provides a tailored search experience for each user and helps to produce top results that may be more relevant to that user. Since information relevance is highly subjective and depends largely on how the user perceives the information retrieval system, search engines seek to obtain markers about users that help them increase the recall and precision of retrieved documents. However, this process may be counterproductive when the user is a learner and the goal of information retrieval is knowledge discovery.
It may also be speculated that personalization creates a “filter bubble”, where only supporting information is retrieved, resulting in selective exposure to information. This can be especially troublesome for health searchers. Schoenherr and White [5] highlighted that past user queries have a direct impact on producing search results that may be medically more concerning and serious. Political personalization can entrench users’ existing political beliefs by limiting exposure to cross-cutting information and alternative views and beliefs. The study [1] measures personalization with respect to political party inclination, ratings of President Trump, and Google account sign-in.
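The kind of source concentration described under Search Components/ Visual Markers can be measured with a short calculation. The sketch below is only an illustration: the domain names and counts are hypothetical, and `top_share` is a helper name introduced here, not a method taken from the cited study [1].

```python
from collections import Counter


def top_share(domains, fraction=0.2):
    """Share of all source appearances held by the top `fraction` of domains.

    `domains` has one entry per appearance of a source in a search component.
    """
    if not domains:
        raise ValueError("at least one domain appearance is required")
    counts = sorted(Counter(domains).values(), reverse=True)
    # Number of domains in the top group, rounded to a whole domain (minimum 1).
    k = max(1, round(fraction * len(counts)))
    return sum(counts[:k]) / sum(counts)


# Hypothetical data: five domains, one of which dominates.
appearances = ["a.example"] * 16 + ["b.example", "c.example", "d.example", "e.example"]

print(top_share(appearances))  # share held by the single most frequent domain
```

A value close to 1.0 means that a small group of domains supplies almost everything users see in these components.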

4.0 Critical Analysis

Beyond the choice of search engine, the surveys and audits have been conducted only on desktop browsers and capture desktop results only, despite evidence that the majority of user search activity takes place on handheld mobile devices. Further, there is limited research on how search activity performed through Internet of Things (IoT) devices such as smart assistants impacts search engine users, and on whether the audits and analyses of traditional search correlate with those of IoT devices. It can therefore be concluded that these studies draw on a restricted diversity of sources.

Major search engines like Google perform very high-level Information Retrieval that involves the execution of complex algorithms. Any attempt to capture the full functioning of these algorithms is difficult, and there is no standardized way of doing so. In addition, the non-transparency of Google’s source code and inner workings calls into question the reliability of the audits and studies conducted so far, which appear to have limited technical coverage.

There appears to be insufficient study of the relationship between social media and web search and of how they influence each other. With the increasing number of social media search components appearing on SERPs, it is important to study the algorithms behind their ranking and availability in order to better understand their implications for user search biases. For instance, there is no analysis of how results are ranked in Google’s Twitter-card component, or of what causes certain tweets to be given prominence over others. In addition, there is no formal study of how the visual design and placement of information within these search components affect user behavior on screen.

The data sample of any study plays a major role in determining its outcomes and can sometimes fail to present an accurate picture. For instance, in the Robertson et al. study [2], results were aggregated, the participant sample was imbalanced in terms of demographics and political preferences, and the data were collected at different times of day, even though web traffic can vary drastically over the course of a day. Similarly, studies performed around a major political event might yield different results from those performed under normal circumstances, and analysis of this difference is limited. Another key data point is the set of search terms used, which is chosen at the discretion of the researchers rather than by the general population.

One of the key factors in the personalization employed by search engines is the searcher’s location. Results for a search in one part of the world may differ vastly from those in another, even on the same search platform. Of all the studies discussed here, five [1,2,3,8,9] focused on the U.S. version of Google with U.S.-centric search terms. It is therefore unclear whether the results of these studies would hold across the world. In addition, there is no established way to ensure that searches are de-personalized; the Robertson et al. study [1], for example, relies on Chrome’s incognito mode for this.
In the Ghenai et al. study [7], the think-aloud method falls short when a user’s need is unconscious, since such a need may be affected by various factors beyond the user’s control.

5.0 Suggestions

From the point of view of search engine researchers, given the amount of misinformation prevalent in SERPs, more robust algorithms are highly warranted: algorithms that consider not only relevance but also the correctness, authenticity, authority, and truthfulness of results when evaluating pages. Just as non-relevant documents are given zero gain value, incorrect documents should be assigned negative gain so that their incorrectness shapes their ranking. An alternative approach is to use visual markup elements that add semantic meaning to results with respect to their correctness, in addition to their author and source, which might help mitigate some of the same problems. Lastly, tools can be designed to monitor the quality of SERPs with respect to social elements such as politics and news, in order to detect misinformation even before it spreads.
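The negative-gain idea can be sketched with a gain-based ranking metric. The example below uses discounted cumulative gain (DCG) as an illustrative metric; the text above does not name a specific metric, and the label names and the gain of -1.0 for incorrect documents are assumptions chosen purely for illustration.

```python
import math

# Hypothetical gain values: zero for non-relevant documents,
# negative for documents that contain incorrect information.
GAIN = {
    "relevant_correct": 1.0,
    "non_relevant": 0.0,
    "incorrect": -1.0,
}


def dcg(labels):
    """Discounted cumulative gain of a ranking, listed from top to bottom."""
    return sum(
        GAIN[label] / math.log2(rank + 1)
        for rank, label in enumerate(labels, start=1)
    )


misleading_first = ["incorrect", "relevant_correct", "non_relevant"]
correct_first = ["relevant_correct", "non_relevant", "incorrect"]

print(dcg(misleading_first))  # penalized: the incorrect document is at rank 1
print(dcg(correct_first))     # higher: the incorrect document is pushed down
```

Because the discount is smallest at the top of the ranking, an incorrect document costs the most there, so a ranker optimized against such a metric would be pushed to demote incorrect results rather than merely ignore them.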

Acronyms:

SERPs: search engine result pages

HUI: real-world healthcare utilization

IoT: Internet of Things

References:

2. Robertson, Ronald E., et al. “Auditing partisan audience bias within google search.” Proceedings of the ACM on Human-Computer Interaction 2.CSCW (2018): 1–22.

3. Metaxa, Danaë, et al. “Search media and elections: A longitudinal investigation of political search results.” Proceedings of the ACM on Human-Computer Interaction 3.CSCW (2019): 1–17.

4. Ghenai, Amira. “Health misinformation in search and social media.” Proceedings of the 2017 International Conference on Digital Health. 2017.

5. Schoenherr, Georg P., and Ryen W. White. “Interactions between health searchers and search engines.” Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval. 2014.

6. Puspitasari, Ira. “The impacts of consumer’s health topic familiarity in seeking health information online.” 2017 IEEE 15th International Conference on Software Engineering Research, Management and Applications (SERA). IEEE, 2017.

7. Ghenai, Amira, Mark D. Smucker, and Charles LA Clarke. “A Think-Aloud Study to Understand Factors Affecting Online Health Search.” Proceedings of the 2020 Conference on Human Information Interaction and Retrieval. 2020.

8. Trielli, Daniel, and Nicholas Diakopoulos. “Search as news curator: The role of Google in shaping attention to news information.” Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems. 2019.

9. Lurie, Emma, and Eni Mustafaraj. “Investigating the Effects of Google’s Search Engine Result Page in Evaluating the Credibility of Online News Sources.” Proceedings of the 10th ACM Conference on Web Science. 2018.

10. Punagin, Saraswathi, and Arti Arya. “Privacy and Personalization Perceptions of the Indian Demographic with respect to Online Searches.” Proceedings of the Third International Symposium on Women in Computing and Informatics. 2015.

Full Stack Web Developer | Graduate student of MS in Computer Science at The University of Texas at Rio Grande Valley
