[French] Algorithms: an interview and some reading suggestions

Prologue

A general-interest magazine published this week quoted me in a feature on algorithms. Titled "Les algorithmes veulent-ils notre peau?" ("Are algorithms out to get us?"), it gives no definitive answer to the question it raises, but approaches the subject from several angles, giving the floor to specialists from different fields.

The article was written shortly before the American elections, but its subject could hardly be more topical: among other burning issues (notably the responsibility of the media and their journalistic approach, the function of opinion polls, and the social factors that favor extremism and authoritarianism), the surprising outcome of this presidential election has also drawn attention to the role potentially played by online platforms and their algorithmic handling of news, true or fake.

For his feature in Femina, the journalist Nicolas Poinsot had asked me seven questions, of which only a small part of the answers made it into the final version, for lack of space. He kindly gave me permission to reproduce the interview in full, which you can read below. The influence of digital platforms on current politics is not addressed in it, but given the news cycle it seems worthwhile to add a few reading suggestions at the end of this post.

The interview

– Which areas of our lives are affected by algorithms?
AJ: As soon as we use the internet, a digital tool or simply an automated device, we interact directly with algorithmic systems. Added to this is the indirect influence of algorithms, for example the fact that we inhabit a world increasingly optimized for algorithmic management, whether we make use of it or not.

– Has the use of these algorithms been increasing over the past few years? And if so, why?
AJ: Yes, clearly, and it is linked to digitization. Two main characteristics are worth distinguishing: on the one hand, digital algorithms make it possible to automate a great number of tasks and processes at relatively low cost. On the other hand, there is optimization: thanks to the automatic processing of digital data, those data can be collected, stored and exploited exhaustively and in a highly targeted way.

– What developments and what excesses are possible with "deep learning"?
AJ: When algorithms are programmed to decide for themselves which data to process, and which processing procedure suits those data best, the results can be innovative and unexpected. But the process for obtaining those results has become opaque, which makes it very difficult to question them.

– On the web, don't algorithms tend to steer us, to condition us, even to box us in?
AJ: Of course the web's algorithms steer us, and that is not a bad thing, quite the contrary. Given the sheer mass of information, it would be difficult to find our way otherwise.
But it is also true that these algorithms are optimized for our digital profile and thus do indeed risk locking us into the bubble of what we already know and/or like.

– How much free will do we have left with these algorithms suggesting whom to follow, whom to see, whom to love (as on Tinder)?
AJ: As a sociologist I cannot speak to the cognitive aspects, but it is clear that we depend on the options offered by platforms and services. And often the trouble lies not so much in what these algorithms do as in what they do not do. For example, you have no way of knowing which suggestions (of purchases, contacts, search results, etc.) you did not receive.

– How can algorithms turn out to be sexist and harm women?
AJ: Generally speaking, any social prejudice or inequality can end up encoded in algorithms. Often this is because the data that inform the algorithms were collected in an unequal world. If, for example, you automate the first screening round of your recruitment on the basis of your previous hires, and the last people hired were all men between 30 and 40, there is little chance that a well-performing algorithm will keep a 45-year-old woman's file in its selection for the next round. One could cite an endless number of examples of this kind. However, we must keep in mind that algorithms do not invent discrimination on their own: there are human actions – individual decisions but also collective actions – that inform them.
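(To make that mechanism concrete outside the interview: below is a minimal sketch with entirely invented data – a toy model, not any real recruiting system – of how a screening rule "learned" from biased past hires reproduces the bias.)

```python
# Toy illustration (hypothetical data): a screening rule learned from
# biased historical hires reproduces that bias on new applicants.
from collections import Counter

# Historical hires: all men aged 30-40 -- the unequal world the data come from.
past_hires = [
    {"gender": "m", "age": 32},
    {"gender": "m", "age": 35},
    {"gender": "m", "age": 38},
]

def learn_profile(hires):
    """'Learn' the most common traits of past hires."""
    genders = Counter(h["gender"] for h in hires)
    ages = [h["age"] for h in hires]
    return {"gender": genders.most_common(1)[0][0],
            "age_range": (min(ages), max(ages))}

def passes_screening(applicant, profile):
    """Keep only applicants who resemble past hires."""
    lo, hi = profile["age_range"]
    return applicant["gender"] == profile["gender"] and lo <= applicant["age"] <= hi

profile = learn_profile(past_hires)
for applicant in [{"gender": "f", "age": 45}, {"gender": "m", "age": 33}]:
    verdict = "next round" if passes_screening(applicant, profile) else "rejected"
    print(applicant, "->", verdict)
# The 45-year-old woman is filtered out -- not because anyone programmed
# "reject women", but because the training data encode past inequality.
```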

– What will algorithms be able to do in ten years?
AJ: Unfortunately I don't have my crystal ball on me, but I believe there are, in principle, few limits to what is technically possible. So instead of wondering what algorithms will be able to do, I would rather reflect on what we want them to do.

Some additional reading suggestions

 

Google Autocomplete revisited

«Did Google Manipulate Search for [presidential candidate]?» was the title of a video that showed up in my facebook feed. In it, the video's host argued that upon entering a particular presidential candidate's name into Google's query bar, very specific autocomplete suggestions do not show up although – according to the host – they should.

I will address the problems with this claim later, but let's start by noting that the argument was quickly picked up (and sometimes transformed) by blogs and news outlets alike, inspiring titles such as «Google searches for [candidate] yield favorable autocomplete results, report shows», «Did [candidate]'s campaign boost her image with a Google bomb?», «Google is manipulating search results in favor of [candidate]», and «Google Accused of Rigging Search Results to Favor [candidate]». (Perhaps the most accurate title from the first wave of reporting comes from the Washington Times: «Google accused of manipulating searches, burying negative stories about [candidate]».)

I could not help noticing how some of the reporting shifted the focus from Google Autocomplete to Google Search results, and there is of course a link between the two. But it is important to keep in mind that manipulating autocomplete suggestions is not the same as manipulating search results, and careless sweeping statements are no help if we want to understand what is going on and what is at stake – which is what I first set out to do almost four years ago.

Indeed, Google Autocomplete is not a new topic. For me, it started in 2012, when my transition from entrepreneurship/consulting into academia was smoothed by a temporary appointment at the extremely dynamic, innovative DHLab. My supervising professor was a very rigorous mentor while giving me great freedom to explore the topics I cared about. Between his expertise in artificial intelligence and digital humanities and my background in sociology, political economy and information management, we identified a shared interest in researching Google's autocompletion algorithms. I presented the results of our preliminary study in Lincoln, NE at DH2013, the annual Digital Humanities conference. We argued that autocompletions can be considered a "linguistic prosthesis" because they mediate between our thoughts and how we express those thoughts in written language. Furthermore, we underlined how mediation by autocompletion algorithms acts in a particularly powerful way because it intervenes before we have finished formulating our thoughts in writing, and may therefore have the potential to influence the search queries actually entered. A great paper by Baker & Potts, published in 2013, comes to the same conclusion and questions "the extent to which such algorithms inadvertently help to perpetuate negative stereotypes".
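A rough illustration of what "linguistic prosthesis" means in practice – a toy sketch of my own, far simpler than any production system, over an invented mini query log: the completions a searcher sees mid-typing are drawn from other people's past formulations.

```python
# Toy model of autocompletion: rank logged past queries that share the
# searcher's prefix by frequency, and offer them before the query is done.
from collections import Counter

# Invented mini query log, standing in for millions of aggregated queries.
query_log = [
    "women should have equal rights",
    "women should have equal rights",
    "women should vote",
    "women in science",
]

def autocomplete(prefix, k=2):
    """Return the k most frequent logged queries starting with `prefix`."""
    matches = Counter(q for q in query_log if q.startswith(prefix))
    return [q for q, _ in matches.most_common(k)]

# The searcher has typed only "women should": the prosthesis steps in
# before the thought is fully formulated in writing.
print(autocomplete("women should"))
# -> ['women should have equal rights', 'women should vote']
```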

Back to the video and its claim that, upon entering a particular presidential candidate's name into Google's query bar, very specific autocomplete suggestions do not show up although they should. But why should they show up? The explanation offered in the video rests on two arguments: graphs of comparative search volume from the Google Trends tool, and a comparison with the autocomplete suggestions of the web search engines Yahoo and Bing.

However, Google Trends seems to have the same flaw as statistics: it can be very informative, but if you torture it long enough, it will confess to anything. Rhea Drysdale has published an informative piece that shows very clearly how manipulative it is to (mis)use Google Trends as «anecdotal evidence» for «two random queries out of literally millions of variations», the way the authors of the video have. I can only encourage you to read Drysdale's article. (One sentence resonates particularly with me because of what I am currently working on: «Let's see if mainstream media bothers to do their homework or simply picks up this completely bogus story spreading it further.» Previous experience suggests the latter.) She uses other queries and Google Trends to illustrate how a manipulation of search for another candidate could just as easily be "proved", and concludes that there is no manipulation with a political agenda, just Google's algorithms at work.

Another article, by Clayburn Griffin, comes to the same conclusion. He reminds us that «Google Autocomplete is more complicated than you think. It's not as simple as search volume, though that is an important factor.» But Griffin is convinced: «What I've seen, and what's been reported in the video claiming Google is biased, doesn't look like manipulation to me. It looks like Google Autocomplete working as intended.»

This is where it gets tricky, because I am as glad as the next person to learn more about how Google Autocomplete is intended to work. Then again, for the point I am trying to make there is no need to go into the anecdotal – no need to know which query, for which demographic, on which particular search engine, in fact does or does not prompt a particular autocomplete suggestion. There are also methodological issues: not only are the algorithmic suggestions based on profiling and personalization, but the algorithms themselves are ever-changing. More often than not, the focus on single trees has helped render the forest invisible. (Still, I must underline that in some great research the trees actually help illustrate the forest, or even the macrocosm – and in this regard: how fitting that just before starting to write this article I saw raving tweets about an ongoing presentation by Safiya Noble. Please check out her excellent work.)

… no manipulation with a political agenda, just Google's algorithms at work… But having no political agenda does not mean being apolitical. "Google Autocomplete working as intended" is necessarily political, for the very simple reason that algorithmic systems are not neutral.

And although no particular candidate or cause may indeed be favored: the very fact that we presume Google Autocomplete would be able to do so shows the position of power it holds.

To be very clear, that power is not necessarily one of manipulating people into holding one opinion rather than another, but rather a power of agenda setting. In this regard, it is similar to traditional media, which cannot necessarily dictate what people think but can certainly influence what people think about. The information we are given in the form of Google results may affect our opinions on certain topics, but it is Google Autocomplete that may influence which topics we seek out information on in the first place: the appearance of an autocomplete suggestion during the search process might make people decide to search for that suggestion even though they had no prior intention to.

It is impossible to address power and agenda setting in a digital context without drawing parallels to the controversy around Facebook Trends. Until recently, little was known about the logics that make a topic "trending" on facebook. And although a few researchers had been addressing the power of "trending" topics and the lack of knowledge about it, it was not necessarily considered an issue by journalists, politicians or the general public. But when some of these logics were suddenly revealed, discussions about their adequacy, neutrality and transparency were sparked (and even a Senate committee got involved). Tarleton Gillespie addresses important issues with regard to the Facebook Trends controversy, many of which are just as relevant for Google Autocomplete.

It is not surprising that the dynamics around Google Autocomplete have followed a rather typical pattern: almost no interest whatsoever in how it works as long as nothing is known; suddenly, by learning something about Autocomplete, people learn that there actually is something to know; then they want to know more; finally, they demand accountability. That "something to know" may have been ignited by the video claiming political manipulation – or rather: re-ignited. Already in 2013, an ad campaign by UN Women made people more aware of the sexism in our world – or rather, in Google's autocomplete function.

Of course, knowing more about how Google Autocomplete works is a good starting point, as it prevents us from confusing the (potential) manipulation of search queries with the manipulation of search results. As I wrote in 2013, it is interesting to learn that «autocompletion isn't entirely automated. Google influences ("censors", some say) autocompletion globally and locally through hardcoding, be it for commercial, legal or puritan reasons. (Bing does so, too.)» But ultimately, I am not convinced that yet another trial-and-error reverse-engineering attempt revealing whether a particular expression is or is not suggested in a certain context will contribute to a greater overall understanding.
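As a purely hypothetical illustration of what such hardcoding can look like in principle – my own sketch, not Google's actual pipeline, with invented entries – the algorithmic ranking is post-filtered against a human-curated blocklist, and reverse engineering can only ever observe the absences it produces:

```python
# Hypothetical sketch of "hardcoding": a human-curated blocklist is applied
# to the algorithmically ranked suggestions just before display.
HARDCODED_BLOCKLIST = {"candidate x scandal"}  # invented example entry

def filter_for_display(ranked_suggestions):
    """Drop any suggestion that appears on the hardcoded blocklist."""
    return [s for s in ranked_suggestions if s not in HARDCODED_BLOCKLIST]

ranked = ["candidate x speech", "candidate x scandal", "candidate x age"]
print(filter_for_display(ranked))
# -> ['candidate x speech', 'candidate x age']
# The blocked entry never reaches the user; from the outside, one can only
# observe that it is absent, not why.
```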

By the way, this is the main reason why a comparison of results/autocomplete suggestions/… between different web search engines has its limits: it only offers comparative insights (which, admittedly, might reveal some of what could be otherwise), and it mainly keeps feeding the erroneous idea that there is a single, self-explanatory standard for how technology should work.

As long as we hold on to the idea that a fair, neutral search engine (then again: fair and neutral for whom?) is possible and simply defined by the absence of manipulation, we have understood neither algorithmic systems, nor society, nor their intersection.

Euresearch: main slides and link list

Main slides

Link list

Social Media: the big picture + academic use

PEW Internet
Pew Research Center is an independent research institution based in the US. PEW Internet provides facts and figures related to the internet.

Use of social media by the library: current practices and future opportunities [pdf]
This white paper by Taylor & Francis contains a pertinent analysis of current practices and future opportunities. Although it addresses libraries in particular, most of it applies to any research-supporting institution.

‘Feeling Better Connected’: Academics’ Use of Social Media. [pdf]
Report by Deborah Lupton (2014). Canberra: News & Media Research Centre, University of Canberra

Oxford University Press: Social Media Guidelines
Filed under 'Marketing Resources for Authors', this website provides a great overview of potential uses (including helpful tips) of different SM platforms. (Simultaneously, it serves as cross-promotion for OUP's channels.)

@AcademicsSay: The Story Behind a Social-Media Experiment
This article describes a compelling case of academic use of SM. Bonus: many links to SM research within the text.

What will the scholarly profile page of the future look like? Provision of metadata is enabling experimentation
Very pertinent benchmark of different ‘scholarly profile’ platforms on the LSE blog by Lambert Heller.

Risks & potentials of social media, advantages, disadvantages

Swiss Reporting and Analysis Centre for Information Assurance
The website contains many checklists and instructions, providing “practical help on the safe use of information and communication technologies”.

A Comprehensive Approach to Managing Social Media Risk and Compliance [pdf]
This document by Accenture has been created with financial institutions in mind, but its content applies broadly.

Managing Risk in a Social Media – Driven Society
A general overview by Protiviti from 2011, still valid today.

FBI: Internet Social Networking Risks
A short guide by the FBI aimed at individuals.

YouTube

18 tips how to increase YouTube subscribers
This article helps you evaluate where you currently stand, and gives advice for the next steps.

The 20 most effective ways to distribute your YouTube video
Tips for better marketing of a video on YouTube.

YouTube Ranking Factors
A recent article containing very thorough instructions for better video ranking.

5 facts about online video
Compiled by Pew Research for YouTube's 10th birthday.

Twitter

99 Serious Twitter Tips for Academics (Updated) – by Best Colleges Online (content marketing)
A very helpful article, gathering many links to advice and how-tos in one place.

Twitter: Top tips for academia
University of Oxford’s Research Skills Toolkit regarding Twitter (links)

LinkedIn

Using Twitter & LinkedIn to promote your event
5 pieces of advice that can serve as a checklist

Nonprofit strategies for getting more out of LinkedIn
Tutorial for Non-Profits that has the advantage of not being centered on marketing and sales. Bonus: there is a great presentation embedded.

Other ‘social media’


Researching advertising algorithms

Almost two years ago, I published my personal contribution to the “Google’s Autocompletion algorithms discriminate against women” debate by adding some context about Google and about algorithms.

Today, I could write something very similar about the headlines informing us that, according to a recent study, Google's advertising algorithms discriminate against women. And it is probably a handy opportunity to let you know that my PhD research in the social sciences – still ongoing – deals precisely with interaction with Google's advertising algorithms…

However, this blog post is not going to be about my research. But when I saw the headlines about “discriminating advertising algorithms” I simply couldn’t *not* blog about it.

Luckily, WIRED has already taken care of asking the very same question I asked in my 2013 blog post about Google's autocompletion algorithms: who or what is to blame? In a short but discerning piece, WIRED explains the complex configuration of Google AdSense:

Who—or What’s—to Blame?
While the study’s findings would suggest Google is enabling discrimination, the situation is much more complicated.

Currently, Google allows advertisers to target their ads based on gender. That means it’s possible for an advertiser promoting high-paying job listings to directly target men. However, Google’s algorithm may have also determined that men are more relevant for the position and made the decision on its own. And then there’s the possibility that user behavior taught Google to serve ads in this manner. It’s impossible to know if one party here is to blame or if it’s a combination of account targeting from all sources at play.

This configuration has allowed powerful companies to present their services as 'platforms': phenomenal and simultaneously neutral vessels of communication, filled only by the actions of numerous individual users. The complexity of the algorithmic systems at hand – because it is never simply Google's algorithm (singular), lest we forget – makes locating accountability impossible as long as we keep looking for intentionality.
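To see how «user behavior taught Google to serve ads in this manner» could come about without anyone intending it, here is a deliberately simplified sketch of my own – not the study's method and not Google's actual system, with invented numbers – of a click-optimizing ad selector amplifying a small initial difference in click rates:

```python
# Hypothetical feedback loop: a greedy, click-optimizing ad selector turns a
# small behavioral gap into near-total exclusion of one group of users.
import random

random.seed(1)
TRUE_CTR = {"men": 0.06, "women": 0.05}  # invented click-through rates

shown = {"men": 1, "women": 1}    # counts start at 1 so that the
clicked = {"men": 1, "women": 1}  # estimated click rates begin equal

def should_show(group):
    """Show the high-paying-job ad to a group only if its observed
    click-through rate is at least as high as the best group's."""
    ctr = {g: clicked[g] / shown[g] for g in shown}
    return ctr[group] >= max(ctr.values())

for _ in range(10_000):
    group = random.choice(["men", "women"])
    if should_show(group):
        shown[group] += 1
        if random.random() < TRUE_CTR[group]:
            clicked[group] += 1

print(shown)  # one group ends up being shown the ad far more than the other
```

Run it with different seeds: the point is not which group ends up excluded, but that a greedy optimizer can turn a marginal, possibly accidental difference into a systematic one, with no discriminatory intention anywhere in the code.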

However, the authors of the "discriminating advertising algorithms" study argue that the effects they have uncovered, whether intended or not, are a matter of concern in any case:

… we are comfortable describing the results as “discrimination”. From a strictly scientific view point, we have shown discrimination in the non-normative sense of the word. Personally, we also believe the results show discrimination in the normative sense of the word. Male candidates getting more encouragement to seek coaching services for high-paying jobs could further the current gender pay gap. Thus, we do not see the found discrimination in our vision of a just society even if we are incapable of blaming any particular parties for this outcome.

Furthermore, we know of no justification for such customization of the ads in question. Indeed, our concern about this outcome does not depend upon how the ads were selected. Even if this decision was made solely for economic reasons, it would continue to be discrimination. In particular, we would remain concerned if the cause of the discrimination was an algorithm ran by Google and/or the advertiser automatically determining that males are more likely than females to click on the ads in question. The amoral status of an algorithm does not negate its effects on society.

Automated Experiments on Ad Privacy Settings: A Tale of Opacity, Choice, and Discrimination [emphasis mine]

Btw, the idea of starting with a focus on the "effects on society" and working backward has also been suggested in a recent Atlantic article about Google's search results (just ignore the arguable opposition of "expert" vs. "neutral" if you can). The article was brought to my attention by Philippe Wampfler, who explicitly suggests Google should take responsibility for the company's decisions by showing face rather than hiding behind the 'platform' discourse.

And before everyone turns – deservedly – to the Great Glitch of July 8, let me share three more links from my online advertising bookmark folder:

It goes without saying that the three articles are recommended reading. They are all related to online advertising and approach the topic from very different angles.

Then again: several issues of the current ‘advertising algorithm debate’ resemble what has already been discussed, e.g. in the context of other Google algorithms (poke: my 2013 piece on autocompletion and the links within).

And one day I might write more specifically about Google and big data and demographics and targeting and profiling…

 

Technology, innovation and society: five myths debunked

Recently, I gave a lecture about the digital transformation for the Franco-Swiss CAS/EMBA program in e-tourism. The tourism industry not being my specialty, and the "social media" aspects having been thoroughly covered by colleagues, I had been specifically asked to convey a big-picture view.

I chose to address some overall issues related to ICT (information & communication technology), innovation and society by debunking the following five myths:

  1. Ignoring the digital transformation is possible
  2. Technological progress is linear
  3. Connectivity is a given
  4. Virtual vs. “real” life
  5. Big Data – the answer to all our questions

Each of these points would deserve a treatise of its own, and I will not be able to go into much detail within the scope of this article. I nevertheless wanted to share some of the links and references mentioned during my lecture in relation to these issues. If you prefer reading the whole thing in French, please go to Enjeux technologiques et sociaux: cinq idées reçues à propos du numérique, the corresponding (though not literally translated) article in French.

Myth no. 1: Ignoring the digital transformation is possible

While discussions of online social networks have become mainstream, the digital transformation goes way beyond social media. It is about more than visible communication. It is about automation, computation, and algorithms. And as I have written before: algorithms are more than a technological issue, because they involve not only automated data analysis but also decision-making. As early as 1961, C.P. Snow said:

«Those who don’t understand algorithms, can’t understand how the decisions are made.»

To illustrate the vastness of computation and algorithmic automation, I mentioned Frédéric Kaplan's information mushroom ("champignon informationnel"), my explorations of Google Autocomplete, as well as the susceptibility of jobs to being made redundant in the near future by machine learning and mobile robotics (cf. this scientific working paper, or the interactive visualisation derived from it).

Myth no. 2: Technological progress is linear

This point included a little history, drawing on the sociology of knowledge and on innovation studies.



[French] Enjeux technologiques et sociaux: 5 idées reçues à propos du numérique

Exceptionally, this article was originally written in French. English-speaking readers might want to head over to Technology, innovation and society: five myths debunked.

This article sketches my contribution to an EMBA/CAS training module a few days ago. The goal was to sensitize participants to information technologies as a source of major innovations, and to draw their attention to some of the social stakes of ICT. So that such a broad overview would be at least somewhat digestible, I decided to present it in five chapters, each debunking a received idea about the digital:

  1. Ignoring the digital transformation is possible
  2. Technological progress is linear
  3. Connectivity is a given
  4. There is the virtual, and there is "real life"
  5. "Big data": the solution to everything

Below is the presentation, followed by a few explanatory sentences with links/references.

The presentation:

Myth no. 1: Ignoring the digital transformation is possible

The digital domain is often considered solely from a communication/marketing perspective, sometimes reduced to the topics of websites and online social networks alone. And while a company can, for instance, do without a facebook page in full coherence with its strategy, the same is not true of digital dynamics and evolution in the broad sense. That is because the digital revolution concerns far more than "social media". It encompasses every sort of algorithmic automation. A telling quote on this subject was uttered by C.P. Snow as early as 1961, and I had taken it up in a previous post (in English) two and a half years ago:

«Those who don’t understand algorithms, can’t understand how the decisions are made.»

To illustrate some of the stakes of algorithmic automation, I mentioned Frédéric Kaplan's "champignon informationnel" (information mushroom), my explorations of Google Autocomplete, and calculations of a job's "probability of replaceability" (taken from a scientific working paper and turned into an interactive visualisation) in light of advances in machine learning and mobile robotics.

Myth no. 2: Technological progress is linear

For this point, a short dive into the sociology of knowledge and of technology:


Google’s autocompletion: algorithms, stereotypes and accountability

"questions" by xkcd

Women need to be put in their place. Women cannot be trusted. Women shouldn’t have rights. Women should be in the kitchen. …

You might have come across the latest UN Women awareness campaign. Originally in print, it has been spreading online for almost two days. It shows four women, each “silenced” with a screenshot from a particular Google search and its respective suggested autocompletions.

Researching interaction with Google's algorithms for my PhD, I cannot help but add my two cents, along with further reading suggestions in the links…

Women should have the right to make their own decisions

Guess what people's most common reaction was?

They headed over to Google to check the "veracity" of the screenshots, testing the suggested autocompletions for "Women should …" and other expressions. I have seen this done all around me, on sociology blogs as well as by people I know.

In terms of an awareness campaign, this is a great success.

And more awareness is a good thing. As the video autofill: a gender study concludes, "The first step to solving a problem is recognizing there is one." However, people's reactions have reminded me, once again, how little the autocompletion function had been problematized, in general, before the UN Women campaign. Which, in turn, makes me realize how much of the knowledge related to web search engine research I have acquired over these last months I already take for granted… but I digress.

This awareness campaign has been very successful in making people more aware of the sexism in our world – or rather, in Google's autocomplete function.

Women need to be seen as equal

Google’s autocompletion algorithms

At DH2013, the annual Digital Humanities conference, I presented a paper I co-authored with Frédéric Kaplan about ongoing research at the DHLab on Google's autocompletion algorithms. In this paper, we explained why autocompletions are a "linguistic prosthesis": they mediate between our thoughts and how we express those thoughts in (written) language. So do related searches, or the suggestion "Did you mean …?" But of all the mediations by algorithms, the mediation by autocompletion algorithms acts in a particularly powerful way because it doesn't correct us afterwards. It intervenes before we have finished formulating our thoughts in writing. Before we hit ENTER.