使用大型語言模型(LLM)完成Text-to-SQL任務 - Langchain SQL Agent背後的ReAct Prompting

上一篇透過Langchain SQL Agent建立資料問答系統中使用了Langchain的create_sql_agent來建立SQL Agent。Langchain在實作這個SQL Agent背後的使用了create_react_agent，這是基於2022年的paper “ReAct: Synergizing Reasoning and Acting in Language Models”的實作，雖然這個版本是比較早期的實作現在並不適合拿來使用在production環境，但仍可以透過瞭解SQL Agent是怎麼基於ReAct prompting技巧運作的。

ReAct Prompting

ReAct Prompting包含了兩個部份，分別是Reasoning和Acting。主要是讓LLM能夠透過交錯的方式產生推理的過程(Reasoning traces)和執行特定任務的動作(task-specific actions)。在產生推理時可以讓LLM針對推理的結果，更新並產生新的執行任務的動作，同時允許LLM在執行任務時可以和外部的工具互動，以取得更具信賴且真實的資訊。

Hotspot QA

上面是論文中使用的問題「Aside from the Apple Remote, what other devices can control the program Apple Remote was originally designed to interact with?」讓LLM透過ReAct解決問題的推論與執行動作過程。可以看到在找到答案前，每次都會進行思考(Though)、行動(Action)和觀察(Observation)。所以模型會在每一次作完動作得到的觀察結果(例如搜尋結果)後，思考下一步該作什麼動作(例如搜尋某個資訊)。

AlfWorld

第二個例子是應用在決策任務中，在決策任務中會包含了更複雜的互動環境，因此更需要透過有效的推論過程和執行動作，讓模型最後能完成正確的任務。在解決AlfWorld的問題「You are in the middle of a room. Looking quickly around you, you see a cabinet 6, a cabinet 1, a coffee machine 1, a countertop 3, a stove burner 1, and a toaster 1. Your task is to: Put some pepper shaker on a drawer.」可以看到在中間加入思考推論後，能讓模型正確的把最後的目標拆解更小的目標來完成最終的任務。

ReAct Prompting in SQL Agent

接著我們來看一下在Langchain裡面SQL Agent用到的ReAct prompting預設是怎麼設計的。在prompt的template組成總共會有四個部份，分別為database prompt，tools，ReAct instructions和Begin and Scratchpad。

Database Prompt

第一個部份是整個prompt的prefix，和SQL Chain一樣由Database Prompt組成，會需要置換的參數為dialect、top_k，這裡和SQL Chain一樣。

Database Prompt

You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

If the question does not seem related to the database, just return "I don't know" as the answer.

Tools

第二個部份為所有提供給模型使用的tool與描述，這裡總共提供四種tool，分別為執行SQL(sql_db_query)、查詢Schema(sql_db_schema)、查詢資料表(sql_db_list_tables)和驗證SQL查詢(sql_db_query_checker)。模型會在推論中決定下一個要使用哪一個tool來執行動作。

Toolkit

sql_db_query - Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.
sql_db_schema - Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3
sql_db_list_tables - Input is an empty string, output is a comma-separated list of tables in the database.
sql_db_query_checker - Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!

ReAct Instruction

第三個部份為ReAct prompting主要的格式，會告訴模型須要進行Thought-Action-Action Input-Observation的循環，直到找到最終的答案。這裡的tool_names參數即為第二部份所有的tool名稱集合。

模型會根據推論(Though)決定執行的行動(Action)，並產生行動輸入(Action Input)。在這裡執行的行動即為使用其中一個tool，而行動輸入就是tool的輸入參數。在執行完行動後的結果就是會產生觀察(Observation)，而模型根據觀察再進行下一步思考推論(Though)，並反覆直到完成任務。

ReAct Format and Instructions

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin and Scratchpad

第四個部份為Begin與Scratchpad，在這裡會有任務起始的資訊，包含了要回答的問題、最一開始的起始推論，還有儲存推論和行動的暫存器用來紀錄縱跡歷程。這裡的參數共有兩個，第一個input為輸入的問題，第二個agent_scratchpad會紀錄每次的Question-Thought-Action-Action Input-Observation紀錄。

Begin/Input/Scratchpad

Begin!

Question: {input}
Thought: I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.
{agent_scratchpad}

完整的SQL Agent Prompting

以下是使用sqlite範例的完整prompt，其中input會在invoke的時候帶入，而scratchpad會在模型每一次的推論與動作執行紀錄下來。

You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct sqlite query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 10 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

If the question does not seem related to the database, just return "I don't know" as the answer.

You have access to the following tools:

sql_db_query - Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.
sql_db_schema - Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3
sql_db_list_tables - Input is an empty string, output is a comma-separated list of tables in the database.
sql_db_query_checker - Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [sql_db_query, sql_db_schema, sql_db_list_tables, sql_db_query_checker]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought: I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.
{agent_scratchpad}

Scratchpad的過程說明可以參考SQL Agent執行過程，完整的Scratchpad如下。

Thought: I should look at the tables in the database to see what I can query.  Then I should query the schema of the most relevant tables.
Action: sql_db_list_tables
Action Input: 
Observation: Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
Thought: Thought: The `Employee` table likely contains information about employees. I'll check its schema.
Action: sql_db_schema
Action Input: Employee
Observation: 
CREATE TABLE "Employee" (
	"EmployeeId" INTEGER NOT NULL, 
	"LastName" NVARCHAR(20) NOT NULL, 
	"FirstName" NVARCHAR(20) NOT NULL, 
	"Title" NVARCHAR(30), 
	"ReportsTo" INTEGER, 
	"BirthDate" DATETIME, 
	"HireDate" DATETIME, 
	"Address" NVARCHAR(70), 
	"City" NVARCHAR(40), 
	"State" NVARCHAR(40), 
	"Country" NVARCHAR(40), 
	"PostalCode" NVARCHAR(10), 
	"Phone" NVARCHAR(24), 
	"Fax" NVARCHAR(24), 
	"Email" NVARCHAR(60), 
	PRIMARY KEY ("EmployeeId"), 
	FOREIGN KEY("ReportsTo") REFERENCES "Employee" ("EmployeeId")
)

/*
3 rows from Employee table:
EmployeeId	LastName	FirstName	Title	ReportsTo	BirthDate	HireDate	Address	City	State	Country	PostalCode	Phone	Fax	Email
1	Adams	Andrew	General Manager	None	1962-02-18 00:00:00	2002-08-14 00:00:00	11120 Jasper Ave NW	Edmonton	AB	Canada	T5K 2N1	+1 (780) 428-9482	+1 (780) 428-3457	andrew@chinookcorp.com
2	Edwards	Nancy	Sales Manager	1	1958-12-08 00:00:00	2002-05-01 00:00:00	825 8 Ave SW	Calgary	AB	Canada	T2P 2T3	+1 (403) 262-3443	+1 (403) 262-3322	nancy@chinookcorp.com
3	Peacock	Jane	Sales Support Agent	2	1973-08-29 00:00:00	2002-04-01 00:00:00	1111 6 Ave SW	Calgary	AB	Canada	T2P 5M5	+1 (403) 262-3443	+1 (403) 262-6712	jane@chinookcorp.com
*/
Thought: Thought: I can count the number of rows in the Employee table to find the number of employees.
Action: sql_db_query
Action Input: SELECT COUNT(*) FROM Employee;
Observation: [(8,)]
Thought: I now know the final answer\nFinal Answer: There are 8 employees.

參考資料:
ReAct: Synergizing Reasoning and Acting in Language Models
ReAct Prompting
create_react_agent