What is ChatGPT and how to use it?
What ChatGPT is, and why it might be the most important tool since the modern search engine
OpenAI has introduced a long-form question and answer AI called ChatGPT that can answer complex questions conversationally.
This is revolutionary technology because it is trained to understand what humans mean when they ask questions.
Many users were in awe of its ability to deliver human-quality responses, inspiring a feeling that it might eventually have the power to disrupt the way humans interact with computers and change how information is retrieved.
What is ChatGPT?
ChatGPT is a large language model chatbot developed by OpenAI based on GPT-3.5. It has the extraordinary ability to interact in a conversational format and provide surprisingly human responses.
Large language models perform the task of predicting the next word in a sequence of words.
Reinforcement Learning from Human Feedback (RLHF) is an additional training layer that uses human feedback to teach ChatGPT to follow instructions and generate responses that satisfy humans.
Who built ChatGPT?
ChatGPT was created by OpenAI, a San Francisco-based artificial intelligence company. OpenAI Inc. is the not-for-profit parent company of for-profit OpenAI LP.
OpenAI is also known for DALL·E, a deep learning model that generates images from text instructions called prompts.
The CEO is Sam Altman, who was previously president of Y Combinator.
Microsoft is a notable partner and investor, to the tune of $1 billion. Together they developed the Azure AI Platform.
Large language model
ChatGPT is a large language model (LLM). LLMs are trained on massive amounts of data to accurately predict which word comes next in a sentence.
It was found that increasing the amount of data improves the language model's ability to do more.
According to Stanford University:
"GPT-3 has 175 billion parameters and was trained on 570 GB of text. In comparison, its predecessor GPT-2 had 1.5 billion parameters, which is more than 100 times smaller.
This increase in size dramatically changes the behavior of the model—GPT-3 is able to perform tasks it was not explicitly trained on, such as translating sentences from English to French, with few training examples.
This behavior is almost non-existent in GPT-2. Furthermore, for some tasks GPT-3 outperforms models explicitly trained to solve these tasks, although on other tasks it falls short. "
An LLM predicts the next word in a series of words in a sentence, and the sentences that follow - a bit like autocomplete, but on a mind-bending scale.
This ability enables them to write paragraphs and full-page content.
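To make the "autocomplete at scale" idea concrete, here is a deliberately tiny sketch (not how ChatGPT actually works, which uses a neural network over billions of parameters): a bigram model that predicts the next word simply by counting which word most often follows the current one in its training text.

```python
from collections import Counter, defaultdict

# Toy training text; a real LLM trains on hundreds of gigabytes.
corpus = (
    "the cat sat on the mat and the cat ran to the door "
    "the dog sat on the rug"
).split()

# Count, for each word, how often each following word appears.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most frequently seen after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" - it follows "the" most often
```

Scaling this idea up from counting word pairs to modeling whole contexts with a deep neural network is, loosely speaking, what turns autocomplete into an LLM that can write paragraphs.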
But one limitation of LLMs is that they don't always understand exactly what a human wants.
This is where ChatGPT improves on existing technology through the aforementioned Reinforcement Learning with Human Feedback (RLHF) training.
How is ChatGPT trained?
GPT-3.5 was trained on massive amounts of data about code and information from the internet, including sources such as Reddit discussions, to help ChatGPT learn dialogue and attain a human style of responding.
ChatGPT is also trained using human feedback (a technique called human feedback reinforcement learning) so that the AI understands what humans expect when asking questions. Training an LLM in this way is revolutionary because it does more than just train the LLM to predict the next word.
A March 2022 research paper titled "Training Language Models to Follow Instructions with Human Feedback" explains why this is a groundbreaking approach:
"This work is motivated by our goal to increase the positive impact of large language models by training them to do what a given set of humans want them to do.
By default, language models optimize the next word prediction objective, which is only a proxy for what we want these models to do.
Our results indicate that our techniques hold promise for making language models more helpful, truthful, and harmless.
Making language models bigger does not inherently make them better at following a user's intent.
For example, large language models can generate outputs that are untruthful, toxic, or simply not helpful to the user.
In other words, these models are not aligned with their users."
The engineers who built ChatGPT hired contractors (called labelers) to rate the outputs of two systems, GPT-3 and the new InstructGPT (a "sibling model" of ChatGPT).
Based on the ratings, the researchers concluded the following:
"Labelers significantly prefer InstructGPT outputs over outputs from GPT-3.
InstructGPT models show improvements in truthfulness over GPT-3.
InstructGPT shows small improvements in toxicity over GPT-3, but not bias."
The research paper concludes that the results of InstructGPT are positive. However, it also noted that there is room for improvement.
"Overall, our results show that fine-tuning large language models using human preferences can significantly improve their behavior across a wide range of tasks, although much work remains to improve their safety and reliability."
ChatGPT differs from simple chatbots in that it is specifically trained to understand human intent in questions and provide answers that are useful, truthful, and harmless.
Because of this training, ChatGPT may challenge certain questions and reject parts of a question that don't make sense.
Another research paper related to ChatGPT shows how they trained artificial intelligence to predict human preferences.
The researchers noticed that the metrics used to rate the outputs of natural language processing AI resulted in machines that scored well on the metrics but produced output that didn't align with what humans expected.
Here's how researchers explain the problem:
"Many machine learning applications optimize for simple metrics that are only rough representations of the designer's intent. This can lead to problems, such as YouTube recommendations promoting clickbait."
So the solution they devised was to create an artificial intelligence that could output answers optimized for human preferences.
To do this, they trained the AI using a dataset of humans comparing different answers so that the machine could better predict answers that humans would find satisfactory.
The paper shares that training was done by summarizing Reddit posts and was tested on summarizing news.
The February 2022 research paper is titled "Learning to Summarize from Human Feedback."
The researchers wrote:
"In this work, we show that summary quality can be significantly improved by training models to optimize human preferences.
We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning. "
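The reward-modeling step the researchers describe can be sketched in miniature. The following toy code is an illustrative assumption, not OpenAI's actual implementation: each output is reduced to a small made-up feature vector (a real reward model is a neural network over text), and a linear reward model is fit with the pairwise logistic (Bradley-Terry) loss so that human-preferred outputs receive higher scores.

```python
import math
import random

def reward(weights, features):
    """Linear reward model: score = w · features."""
    return sum(w * f for w, f in zip(weights, features))

def train_reward_model(comparisons, steps=2000, lr=0.1):
    """Fit weights so reward(preferred) > reward(rejected).

    Minimizes the pairwise logistic (Bradley-Terry) loss:
        loss = -log(sigmoid(r_preferred - r_rejected))
    """
    weights = [0.0] * len(comparisons[0][0])
    for _ in range(steps):
        preferred, rejected = random.choice(comparisons)
        margin = reward(weights, preferred) - reward(weights, rejected)
        # Gradient step: sigmoid(-margin) shrinks as the model learns.
        grad_scale = 1.0 / (1.0 + math.exp(margin))
        for i in range(len(weights)):
            weights[i] += lr * grad_scale * (preferred[i] - rejected[i])
    return weights

# Each pair: (features of the human-preferred output, features of the other).
comparisons = [
    ([1.0, 0.2], [0.1, 0.9]),
    ([0.9, 0.1], [0.2, 0.8]),
    ([0.8, 0.3], [0.3, 1.0]),
]
random.seed(0)
weights = train_reward_model(comparisons)
# The learned reward now scores the preferred output of each pair higher.
print(reward(weights, [1.0, 0.2]) > reward(weights, [0.1, 0.9]))  # True
```

In the actual pipeline, a reward model trained this way then scores candidate outputs during reinforcement learning, steering the language model toward answers humans would prefer.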
What are the limitations of ChatGPT?
Limits on toxic responses
ChatGPT is specifically programmed not to provide toxic or harmful responses. So it avoids answering these kinds of questions.
Answer quality depends on the quality of the prompt
An important limitation of ChatGPT is that the quality of the output depends on the quality of the input. In other words, expert directions (prompts) generate better answers.
The answer is not always correct
Another limitation is that because it is trained to provide answers that feel right to humans, the answers can trick humans into believing that the output is correct.
Many users have found that ChatGPT can provide incorrect answers, including some very incorrect answers.
A moderator at the coding question-and-answer site Stack Overflow may have discovered an unintended consequence of answers that merely feel correct to humans.
Stack Overflow is filled with user responses generated from ChatGPT that appear to be correct, but many are wrong answers.
Thousands of answers overwhelmed the team of volunteer moderators, prompting administrators to issue a ban on any user who posted answers generated by ChatGPT.
The flood of responses from ChatGPT led to a post titled "Temporary policy: ChatGPT is banned":
“This is a temporary policy designed to slow the influx of answers and other content created using ChatGPT.
…The primary problem is that while the answers which ChatGPT produces have a high rate of being incorrect, they typically look like they might be good…"
The experience of Stack Overflow moderators using incorrect ChatGPT answers that appeared to be correct is something OpenAI, the makers of ChatGPT, were aware of and warned about when announcing the new technology.
OpenAI explains the limitations of ChatGPT
The OpenAI announcement offers this warning:
“ChatGPT sometimes writes answers that appear reasonable but are incorrect or nonsensical.
Solving this problem is challenging because:
(1) During RL training, there is currently no source of truth;
(2) training the model to be more cautious causes it to reject questions that it can answer correctly; and
(3) supervised training misleads the model because the ideal answer depends on what the model knows, rather than what the human demonstrator knows. "
Is ChatGPT free to use?
ChatGPT is currently free during Research Preview.
The chatbot is currently open for users to try out and provide feedback on responses so that the AI can better answer questions and learn from its mistakes.
The official announcement states that OpenAI is eager to receive feedback on bugs:
“While we work hard to have the model reject inappropriate requests, it sometimes responds to harmful instructions or exhibits biased behavior.
We are using the Moderation API to warn about or block certain types of unsafe content, but we expect it to have some false negatives and positives for now.
We're eager to collect user feedback to aid our ongoing work to improve the system. "
There is currently a contest offering $500 in API credits to encourage the public to rate the responses.
"Users are encouraged to provide feedback via the UI on problematic model outputs, as well as false positives/negatives from external content filters (also part of the interface).
We are particularly interested in feedback about harmful outputs that may occur under real-world, non-adversarial conditions, as well as feedback that helps us discover and understand new risks and possible mitigations.
You can choose to enter the ChatGPT Feedback Contest for a chance to win up to $500 in API credits.
Entries can be submitted via the feedback form linked in the ChatGPT interface. "
The current contest will end on December 31, 2022 at 11:59 PM PST.
Will language models replace Google search?
Google itself has created an AI chatbot called LaMDA. The Google chatbot's performance is so close to human conversation that one Google engineer claimed LaMDA is sentient.
Given how these large language models can answer so many questions, is it far-fetched that companies like OpenAI, Google, or Microsoft could one day replace traditional search with AI chatbots?
Some people on Twitter have already announced that ChatGPT will be the next Google.
The prospect that Q&A chatbots might one day replace Google terrifies those who make a living as search marketing professionals.
It sparked discussion in online search marketing communities, such as the popular SEOSignals Lab Facebook group, where someone asked whether searches might shift away from search engines and towards chatbots.
After testing ChatGPT, I have to admit that my fear of search being replaced by chatbots is not unfounded.
The technology still has a long way to go, but it’s possible to envision a future of search that hybridizes search and chatbots.
But the current implementation of ChatGPT seems to be a tool that at some point requires the purchase of credits to use.
How to use ChatGPT?
ChatGPT can write code, poetry, songs, and even short stories in the style of a specific author.
Expertise in giving it directions (prompts) elevates ChatGPT from a source of information to a tool that can be asked to accomplish a task.
This makes it useful for writing articles on almost any topic.
ChatGPT can be used as a tool to generate outlines for articles or even entire novels.
It can accomplish almost any task that can be answered with written text.
Conclusion
As mentioned earlier, ChatGPT was envisioned as a tool that the public would ultimately have to pay to use.
In the first five days after ChatGPT became available to the public, more than one million users signed up to use it.