AutoGPT explained: how the new groundbreaking AI technology actually works

Hrant Davtyan
4 min read · Apr 19, 2023

AutoGPT takes the capabilities of GPT models to a whole new level. It allows the latest GPT models to write their own prompts, execute Python scripts, read and write files, browse the web, follow hyperlinks, and solve tasks. The idea of AutoGPT is to chain GPT-4 thoughts together and produce actions that complete a task with minimal human intervention. It asks the user for inputs such as an assistant name, a description, and a list of tasks, and then builds a careful prompt around them. The result is a virtual assistant that can browse the web, scrape information, and complete tasks on its own.

Image 1. Generated using SDXL Beta model

One of the most exciting things about AutoGPT is its potential to revolutionize how we interact with computers. With AutoGPT, we could have an AI-powered virtual assistant that could handle a wide variety of tasks for us without the need to carefully engineer prompts for each step of the task. It could search the web for information, write reports, create presentations, and more. This has huge implications for businesses and individuals alike.

AutoGPT in its full power: How GPT-4 Can Do Basically Anything in a Digital World

With its advanced natural language processing capabilities, GPT-4 can do basically anything in a digital world, from automating customer service conversations to creating personalized content for individual users. The potential applications of AutoGPT are virtually limitless, and it is only a matter of time before we start seeing its full power in action. Let’s go over a more complete step-by-step example to see how AutoGPT works in action.

Step 1: Defining the Assistant

  • Give your assistant a name (e.g., “Bob”)
  • Provide a description of the assistant’s role (e.g., “Bob is a journalist doing market research for startups in Generative AI”)
  • Define the tasks you want your assistant to perform (e.g., “Task 1: browse the web to find hot startups working with Generative AI”, “Task 2: summarize your findings into a blog article”, “Task 3: write results to a file and shut down”)

Step 2: Engineering the Prompt

  • AutoGPT will engineer a prompt for “Bob” that targets the defined tasks and feed it to GPT-4, thus creating Bob
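
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not AutoGPT's actual code: the `AssistantConfig` class, the `build_prompt` function, and the prompt wording are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class AssistantConfig:
    """User-supplied inputs from Step 1 (hypothetical structure)."""
    name: str
    role: str
    tasks: list[str] = field(default_factory=list)


def build_prompt(config: AssistantConfig) -> str:
    """Assemble one prompt from the name, the role, and the numbered tasks."""
    task_lines = "\n".join(
        f"{i}. {task}" for i, task in enumerate(config.tasks, start=1)
    )
    return (
        f"You are {config.name}. {config.role}\n\n"
        f"TASKS:\n{task_lines}\n\n"
        "Decide on one action at a time and explain your reasoning."
    )


bob = AssistantConfig(
    name="Bob",
    role="Bob is a journalist doing market research for startups in Generative AI.",
    tasks=[
        "Browse the web to find hot startups working with Generative AI",
        "Summarize your findings into a blog article",
        "Write results to a file and shut down",
    ],
)
prompt = build_prompt(bob)
print(prompt)
```

The resulting string is what would be sent to the model as its initial instructions.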

Step 3: Performing the First Task

  • Bob (GPT-4) generates a search query for Generative AI startups and performs a search (powered by whichever search engine its creators have connected, say Google or Bing) to solve the first task.

Step 4: Generating the Next Action

  • Bob (GPT-4) generates thoughts for the next action, such as “Are the results of the search query enough to complete Task 1?” If not, it will generate a new action that could be something like “generate Python code to scrape information from the following websites: [WEBSITES]”. The [WEBSITES] are the search results from the previous step.

Step 5: Executing the Python Code

  • The Python code will be executed, websites will be scraped, and the content will be fed to Bob together with the original tasks.
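
One way to picture Steps 3 to 5 is as a dispatcher: the model replies with a thought plus a named command, and the program looks the command up and runs it. The sketch below assumes the reply is JSON with `thoughts` and `command` fields. The `web_search` and `scrape` tools are hypothetical stubs that return placeholder text instead of touching the network.

```python
import json
from typing import Callable


def web_search(query: str) -> str:
    # Stub: a real tool would call a search API here.
    return f"[search results for: {query}]"


def scrape(urls: list[str]) -> str:
    # Stub: a real tool would download and clean each page.
    return "\n".join(f"[content of {url}]" for url in urls)


# Map command names the model may emit to the functions that run them.
COMMANDS: dict[str, Callable[..., str]] = {
    "web_search": web_search,
    "scrape": scrape,
}


def run_action(model_reply: str) -> str:
    """Parse the model's JSON reply and execute the command it names."""
    reply = json.loads(model_reply)
    command = reply["command"]
    tool = COMMANDS.get(command["name"])
    if tool is None:
        return f"Unknown command: {command['name']}"
    return tool(**command["args"])


reply = json.dumps({
    "thoughts": "The search results are not enough; I should read the pages.",
    "command": {
        "name": "scrape",
        "args": {"urls": ["https://example.com/a", "https://example.com/b"]},
    },
})
observation = run_action(reply)
print(observation)
```

The returned observation is what gets fed back to Bob together with the original tasks.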

Step 6: Evaluating Completion of Task 1

  • Bob generates thoughts again: “Do I have enough information to consider Task 1 completed?” If not, it will conduct more searches. If yes, it will move on to Task 2 and generate a prompt for that.

Step 7: Evaluating Completion of Task 2

  • Bob will generate a thought that critiques whether the blog article solves Task 2. If not, it will explain why Task 2 is not solved and what can be improved. If yes, it will move on to Task 3.
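
The self-critique in Step 7 amounts to a bounded critique-and-revise loop. The sketch below is an illustration under stated assumptions: `critic` and `reviser` are hypothetical stand-ins for model calls, replaced here by simple rules so that the example runs without an API key.

```python
from typing import Callable, Optional


def critic(draft: str) -> Optional[str]:
    """Return a critique, or None when the draft is judged good enough."""
    if "startup" not in draft.lower():
        return "The article never mentions any startups."
    return None


def reviser(draft: str, feedback: str) -> str:
    # Stand-in for a model call that rewrites the draft using the feedback.
    return draft + " It covers Generative AI startups found in the search results."


def refine(
    draft: str,
    critic: Callable[[str], Optional[str]],
    reviser: Callable[[str, str], str],
    max_rounds: int = 3,
) -> tuple[str, int]:
    """Critique and revise until the critic is satisfied or rounds run out."""
    for round_number in range(max_rounds):
        feedback = critic(draft)
        if feedback is None:
            return draft, round_number
        draft = reviser(draft, feedback)
    return draft, max_rounds


article, rounds = refine("Generative AI is growing fast.", critic, reviser)
print(rounds, article)
```

The `max_rounds` cap matters in practice, since each round would cost at least two model requests.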

Step 8: Solving Task 3

  • Bob will generate a prompt for solving Task 3, which will probably be something like “generate Python code that will write the [CONTENT] into a file.”

Step 9: Writing Content to a File

  • The content will be written to a file. Bob will generate a thought to evaluate whether the task is done or not. Assume the answer is yes.
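
The code Bob generates for Steps 8 and 9 could be as small as the sketch below. The function name `write_and_verify` and the file name are hypothetical; the read-back check stands in for Bob evaluating whether the task is done.

```python
import tempfile
from pathlib import Path


def write_and_verify(content: str, path: Path) -> bool:
    """Write content to path, then read it back to check the task is done."""
    path.write_text(content, encoding="utf-8")
    return path.read_text(encoding="utf-8") == content


# Use a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "article.txt"
    task_done = write_and_verify("Generative AI startups to watch ...", target)

print(task_done)
```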

Step 10: Shutting Down

  • Bob will shut down.

Image 3. Example of AutoGPT in action [source]
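
Taken together, the ten steps form a loop: ask the model for the next action, execute it, store the observation, and repeat until the model asks to shut down. The following sketch shows that control flow only. `fake_model` is a scripted stand-in for GPT-4, the command names are hypothetical, and files are kept in an in-memory dictionary instead of on disk.

```python
from typing import Callable


def fake_model(memory: list[str]) -> dict:
    """Scripted stand-in for GPT-4: picks the next action from a fixed plan."""
    script = [
        {"thought": "I need data for Task 1.",
         "command": "search", "arg": "Generative AI startups"},
        {"thought": "Task 1 is complete; write up the findings.",
         "command": "write_file", "arg": "Article about Generative AI startups"},
        {"thought": "All tasks are done.",
         "command": "shutdown", "arg": ""},
    ]
    return script[min(len(memory), len(script) - 1)]


def run_agent(
    model: Callable[[list[str]], dict], max_steps: int = 10
) -> tuple[list[str], dict[str, str]]:
    """Run the think-act-observe loop until shutdown or the step limit."""
    memory: list[str] = []        # observations fed back to the model
    files: dict[str, str] = {}    # in-memory stand-in for the file system
    for _ in range(max_steps):
        action = model(memory)
        name, arg = action["command"], action["arg"]
        if name == "shutdown":
            break
        if name == "search":
            observation = f"[search results for: {arg}]"
        elif name == "write_file":
            files["output.txt"] = arg
            observation = "wrote output.txt"
        else:
            observation = f"unknown command: {name}"
        memory.append(f"{name} -> {observation}")
    return memory, files


memory, files = run_agent(fake_model)
print(memory)
print(files)
```

The `max_steps` limit is a design choice worth keeping in any real version, because every iteration is a paid model request.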

As you can see, this example is quite simplified, but it illustrates how AutoGPT can work. AutoGPT still needs a lot of work on prompt engineering, search engine extensions, code execution capabilities, and more. However, the potential of AutoGPT is enormous. It can be used to build virtual assistants for any digital task, from data analysis to content creation. The possibilities are endless, and it will be exciting to see what developers will create with GPT-4 and AutoGPT.

In conclusion, AutoGPT is an exciting technology that shows great potential in automating various tasks in the digital world. The example we have explored in this article demonstrates how AutoGPT can act as a virtual assistant, generating prompts, conducting searches, executing code, and providing feedback. However, it is important to note that achieving good results with AutoGPT requires a significant number of API requests to GPT-4, which can be costly. Despite this limitation, AutoGPT is a significant step towards AGI and could lead to further developments in open-source models such as LLaMA, Vicuna, and Flan-T5. With continued research and advancements, AutoGPT could prove to be a valuable tool in streamlining and automating tasks across various industries.

Hrant Davtyan

Founder @ Anania. I am writing about Data analytics, Generative AI, LLMs, Machine Learning.