Yesterday Adobe published an article summarising research it had conducted on the use of ChatGPT as a search engine.
Two key pieces of data jumped out at me: 77% of ChatGPT users in the United States use ChatGPT as a search engine, and 24% of those users go to ChatGPT first for an answer before looking anywhere else. The interesting thing about these two facts is that they aren't unique to any one generation. Adobe broke the figures down: 74% of Baby Boomer users, 80% of Gen X, 75% of Millennials, and 77% of Gen Z users all use the platform as a search engine.
If you have ever taken a course in public speaking, one little nugget of information you learn quite early on is that projecting confidence is more about how you say something than what you say - this is usually backed up with some percentage statistic plucked from thin air, like 20% what you say and 80% how you say it. No formal study supports those figures and they are something of an urban legend, but there is a grain of truth buried within them: people are more willing to believe you if you sound confident in your response.
In a few of my recent posts I mentioned how you can manipulate people if you know their depth of reasoning in advance - that is, how many follow-up questions someone will ask before they accept what you said as true. I mentioned how children have an effectively infinite depth of reasoning because they will keep asking "but why?" until you can't answer. Most people grow out of that quite quickly, but if we are to survive as a society, it might be something we have to preserve into adulthood rather than discourage.
Growing up in the 90s, when use of the Internet was spreading, I remember the oft-touted advice not to trust what you read online because anyone could edit it, and not to take what people said at face value because they could be pretending to be anyone they wanted. That distrust faded over time, almost in line with the rise of Google as a search engine, as the idea took hold that the Internet could give you definitive answers.
It didn't take long for those answers to be scrutinised, however; once Google's popularity passed a certain point, the regulators started taking notice. From its inception, Google was never intended to give you the right answer to a question, only the answer most people wanted. PageRank, the algorithm that drove Google's search results, was ultimately a measure of popularity, built on the assumption that the most popular result was the one you wanted. That changed somewhat when particularly contentious searches drew the ire of politicians because the answers that ranked first weren't to their liking.
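To make the "popularity, not truth" point concrete, here is a minimal sketch of the PageRank idea - a toy link graph and a power-iteration loop, not Google's actual implementation. The page names and numbers are purely illustrative; the only thing the algorithm measures is who links to whom.

```python
# A toy PageRank: each page's score is the probability that a "random
# surfer" (who mostly follows links, occasionally jumps anywhere) ends
# up on that page. Nothing in the score reflects whether a page is true.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with equal scores
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in pages}  # random-jump share
        for page, outgoing in links.items():
            if not outgoing:
                # Dangling page: spread its rank evenly across all pages.
                for p in pages:
                    new[p] += damping * rank[page] / n
            else:
                # A page passes its rank equally to everything it links to.
                for target in outgoing:
                    new[target] += damping * rank[page] / len(outgoing)
        rank = new
    return rank

# Hypothetical three-page web: everyone links to "home", so "home"
# ranks highest regardless of what any of the pages actually say.
scores = pagerank({
    "home": ["about"],
    "about": ["home"],
    "blog": ["home"],
})
```

The point of the sketch is that the heavily linked-to page wins purely on link structure - popularity stands in for relevance, which is exactly the assumption described above.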
Google made notional attempts to incorporate "trust" into its results, adding summary boxes to the top of its search results that displayed additional information relating to a query. But as someone who has been a community moderator for Google in the past, I can say most of those boxes were automatically generated and only amended if people complained; manual overrides were used reactively.
For a time Google also experimented with personal results, eventually abandoned because of the computational cost. In effect your results would be influenced by your own patterns of web browsing; you could even X out some of the results and block individual websites as sources, and Google would no longer present results from those sites. You could also index local files, and Google would show you results from your own computer mixed in with the web results. These features were killed off around the same time the axe fell on iGoogle, their personalised home page which let you set a background and add widgets to display information you had an interest in.
Google's latest effort to "improve" its search results has been the AI-powered summary box which displays a response from Gemini, Google's LLM. As contentious as all of this may be, one thing holds true: the information displayed is public. You can search, and I can search, and for the most part we will see almost exactly the same result, save for some regional variations and location-aware responses. The key point is that this is in effect public, because anyone can view it. ChatGPT is a different matter entirely.
Initially, when you signed up to use ChatGPT, you had per-conversation instances of the LLM: each separate conversation was an independent instance, with only limited information retained in your user data that instances could access globally. The latest version of ChatGPT changed this to instance-to-user pairing, meaning that all conversations feed into a single instance associated with you as a user. This acts as a smaller training set in addition to the main training sets used by the LLM. The key consequence of this change is that over time ChatGPT starts feeding you answers specific to you: it starts telling you what you want to hear, or telling you things in ways you would understand. The issue with this, as I have mentioned before, is veracity; right now there is no requirement for what it tells you to be truthful or accurate.
As Google grew in popularity as a search engine, it remained publicly visible: information that appeared on it could be collectively verified and criticised. With ChatGPT, because information is shared on a user-by-user basis, it is in effect a private search engine - one where not only what you ask is private, but the response you're given is private too. Whilst most people wouldn't want their Google search history to be public, they would at least want to know that they're being given the same answer as everyone else when they ask a question.
The concept of a private search engine is interesting; it's also dangerous. Defamation laws, libel laws, advertising standards, SEC regulations, and many other legal frameworks exist to regulate conventional publishing platforms and to prevent companies and individuals from intentionally misleading others. You can debate how effective they are at achieving these goals, but you can at least argue that the most egregious examples of outright lying in published works are somewhat curtailed. A personalised, private publishing platform that can't be monitored and can't be controlled simultaneously offers the promise of democratised information and the most effective means of communicating propaganda we have ever seen in human history.
The age of macro manipulation is dying, and the age of micro manipulation is taking its place. With a search engine that can learn who you are, what you do, what power and influence you have, and how to manipulate you, the question of how to control a population as a whole disappears, replaced by the question of who within that population has the most influence and what information you can feed them to manipulate them. You don't have to spend billions trying to manipulate a herd of sheep if you can identify the shepherd.
Take a business, for example: a local indie bakery that sells doughnuts. Right now, if you googled that business you would see a search result and a list of reviews - the same result and the same reviews everyone else sees. The business owner can see what people say and has the opportunity to respond publicly. With ChatGPT all of that changes. You have a platform feeding millions of people information, with no way of seeing or knowing what it is telling each one about your business. Someone asks privately; they get an answer privately. The amount of trust you must place in ChatGPT and its owners not to manipulate the results is immense, and the ease with which someone could discredit a business - and the cover the platform gives them to do it - is equally immense.