Transforms
- pydantic model retrieval_qa_benchmark.transforms.AgentRouter
Agent Routing with LangChain MRKL Agent Prompts
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), format_instructions (str), llm_model (Dict[str, Any]), prefix (str), record_template (str), suffix (str), verbose (bool)
- field format_instructions: str = 'Use the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question'
Instructions that tell the LLM which output format to follow
- field llm_model: Dict[str, Any] [Required]
model configuration for transforms with LLM
- field prefix: str = 'Answer the following questions as best you can. You have access to the following tools:'
Template prefix for agent
- field record_template: str = '{question}\n{choices}'
Template to format records
- field suffix: str = 'Begin!\n\nQuestion: {input}\nThought:{agent_scratchpad}'
Template suffix for agent
- field verbose: bool = False
If true, then agent will print output to stdout
- build_agent_template() str
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- format_agent_template(q: str, stacked: List[str]) str
- get_next_state(generated: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
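The default `prefix`, `format_instructions`, and `suffix` follow the LangChain MRKL prompt layout. The sketch below shows one way such fields could be joined into a single prompt with plain string formatting. The `build_prompt` helper, the tool names, and the tool descriptions are hypothetical; the actual `build_agent_template` and `format_agent_template` implementations are not shown in this reference and may differ.

```python
from typing import Dict

# Default field values copied from AgentRouter.
PREFIX = (
    "Answer the following questions as best you can. "
    "You have access to the following tools:"
)
FORMAT_INSTRUCTIONS = (
    "Use the following format:\n\n"
    "Question: the input question you must answer\n"
    "Thought: you should always think about what to do\n"
    "Action: the action to take, should be one of [{tool_names}]\n"
    "Action Input: the input to the action\n"
    "Observation: the result of the action\n"
    "... (this Thought/Action/Action Input/Observation can repeat N times)\n"
    "Thought: I now know the final answer\n"
    "Final Answer: the final answer to the original input question"
)
SUFFIX = "Begin!\n\nQuestion: {input}\nThought:{agent_scratchpad}"


def build_prompt(tools: Dict[str, str], question: str, scratchpad: str = "") -> str:
    """Hypothetical helper: join prefix, tool list, instructions, and suffix."""
    # One "name: description" line per tool.
    tool_lines = "\n".join(f"{name}: {desc}" for name, desc in tools.items())
    # The tool names fill the {tool_names} placeholder of the instructions.
    instructions = FORMAT_INSTRUCTIONS.format(tool_names=", ".join(tools))
    suffix = SUFFIX.format(input=question, agent_scratchpad=scratchpad)
    return "\n\n".join([PREFIX, tool_lines, instructions, suffix])


# Hypothetical tools, for illustration only.
tools = {"search": "look up a passage", "calculator": "evaluate arithmetic"}
prompt = build_prompt(tools, "What is 2 + 2?")
print(prompt)
```

The scratchpad argument stands in for the accumulated Thought/Action/Observation history, which is appended directly after `Thought:` on each round.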
- pydantic model retrieval_qa_benchmark.transforms.ContextWithElasticBM25
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), context_template (str), dataset_name (Sequence[str]), el_auth (Tuple[str, str]), el_host (str), num_selected (int), sep_chr (str)
- field context_template: str = '{title} | {paragraph}'
- field dataset_name: Sequence[str] = ['Cohere/wikipedia-22-12-en-embeddings']
- field el_auth: Tuple[str, str] [Required]
- field el_host: str [Required]
- field num_selected: int = 5
- field sep_chr: str = '\n'
- chain(**kwargs: Any) Any
- preproc_question4query(data: Dict[str, Any]) str
- transform_context(data: Dict[str, Any], **params: Any) List[str]
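As a rough illustration of how `context_template`, `num_selected`, and `sep_chr` might interact, the sketch below formats a list of retrieved hits and joins them into one context string. The `format_context` helper and the sample hits are hypothetical, and the hit keys `title` and `paragraph` are inferred from the default template; the retrieval step itself (Elasticsearch BM25) is left out.

```python
from typing import Dict, List


def format_context(
    hits: List[Dict[str, str]],
    context_template: str = "{title} | {paragraph}",
    num_selected: int = 5,
) -> List[str]:
    """Hypothetical helper: keep the top hits and render each with the template."""
    # Hits are assumed to be sorted best-first by the retriever.
    return [context_template.format(**hit) for hit in hits[:num_selected]]


# Made-up hits, for illustration only.
hits = [
    {"title": "Paris", "paragraph": "Paris is the capital of France."},
    {"title": "Lyon", "paragraph": "Lyon is a city in France."},
    {"title": "Nice", "paragraph": "Nice lies on the Mediterranean coast."},
]
passages = format_context(hits, num_selected=2)
# sep_chr defaults to "\n" in the models above.
context = "\n".join(passages)
print(context)
```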
- pydantic model retrieval_qa_benchmark.transforms.ContextWithFaiss
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), context_template (str), dataset_name (Sequence[str]), embedding_name (str), index_path (str), nprobe (int), num_selected (int), sep_chr (str)
- field context_template: str = '{title} | {paragraph}'
- field dataset_name: Sequence[str] = ['Cohere/wikipedia-22-12-en-embeddings']
- field embedding_name: str = 'paraphrase-multilingual-mpnet-base-v2'
- field index_path: str = 'data/indexes/Cohere_mpnet/IVFSQ_L2.index'
- field nprobe: int = 128
- field num_selected: int = 5
- field sep_chr: str = '\n'
- chain(**kwargs: Any) Any
- preproc_question4query(data: Dict[str, Any]) str
- transform_context(data: Dict[str, Any], **params: Any) List[str]
- pydantic model retrieval_qa_benchmark.transforms.ContextWithFaissESHybrid
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), context_template (str), dataset_name (Sequence[str]), el_auth (Tuple[str, str]), el_host (str), embedding_name (str), index_path (str), is_raw_rank (bool), nprobe (int), num_filtered (int), num_selected (int), sep_chr (str)
- field context_template: str = '{title} | {paragraph}'
- field dataset_name: Sequence[str] = ['Cohere/wikipedia-22-12-en-embeddings']
- field el_auth: Tuple[str, str] [Required]
- field el_host: str [Required]
- field embedding_name: str = 'paraphrase-multilingual-mpnet-base-v2'
- field index_path: str = 'data/indexes/Cohere_mpnet/IVFSQ_L2.index'
- field is_raw_rank: bool = True
- field nprobe: int = 128
- field num_filtered: int = 100
- field num_selected: int = 5
- field sep_chr: str = '\n'
- chain(**kwargs: Any) Any
- preproc_question4query(data: Dict[str, Any]) str
- transform_context(data: Dict[str, Any], **params: Any) List[str]
- pydantic model retrieval_qa_benchmark.transforms.ContextWithRRFHybrid
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), context_template (str), num_selected (int), rank_dict (dict), sep_chr (str), with_title (int)
- field context_template: str = '{title} | {paragraph}'
- field num_selected: int = 5
- field rank_dict: dict = {'bm25': 40, 'mpnet': 30}
- field sep_chr: str = '\n'
- field with_title: int = True
- chain(**kwargs: Any) Any
- preproc_question4query(data: Dict[str, Any]) str
- transform_context(data: Dict[str, Any], **params: Any) List[str]
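The class name suggests reciprocal rank fusion (RRF), which scores each document as the sum of `1 / (k + rank)` over the rankers that returned it. The sketch below is a generic RRF implementation, not this library's code. It assumes, without confirmation from this reference, that the values in `rank_dict` act as the per-ranker constant `k`; the `rrf_fuse` helper and the document ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def rrf_fuse(
    rankings: Dict[str, List[str]],
    rank_dict: Dict[str, int],
    num_selected: int = 5,
) -> List[str]:
    """Generic reciprocal rank fusion over several best-first rankings."""
    scores: Dict[str, float] = defaultdict(float)
    for source, doc_ids in rankings.items():
        k = rank_dict[source]  # assumption: per-ranker smoothing constant
        for rank, doc_id in enumerate(doc_ids, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document id for determinism.
    ordered = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ordered[:num_selected]


# Made-up rankings, for illustration only.
rankings = {
    "bm25": ["a", "b", "c"],
    "mpnet": ["b", "d", "a"],
}
fused = rrf_fuse(rankings, {"bm25": 40, "mpnet": 30})
print(fused)
```

With these constants a smaller `k` gives a ranker more weight, so the `mpnet` ranking influences the fused order more than the `bm25` ranking.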
- pydantic model retrieval_qa_benchmark.transforms.LangChainInfoSQLDB
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), descrption (str), name (str), url (str), verbose (bool)
- field descrption: str = "Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: 'table1, table2, table3'"
prompt description for this tool
- field name: str = 'sql_db_schema'
name for this tool
- field url: str [Required]
URL string to create engines
- field verbose: bool = False
If true, then agent will print output to stdout
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- get_next_state(generate: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
- pydantic model retrieval_qa_benchmark.transforms.LangChainListSQLDB
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), descrption (str), name (str), url (str), verbose (bool)
- field descrption: str = 'Input is an empty string, output is a comma separated list of tables in the database.'
prompt description for this tool
- field name: str = 'sql_db_list_tables'
name for this tool
- field url: str [Required]
URL string to create engines
- field verbose: bool = False
If true, then agent will print output to stdout
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- get_next_state(generate: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
- pydantic model retrieval_qa_benchmark.transforms.LangChainQuerySQLDB
- Fields:
children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), descrption (str), name (str), url (str), verbose (bool)
- field descrption: str = "Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', using sql_db_schema to query the correct table fields."
prompt description for this tool
- field name: str = 'sql_db_query'
name for this tool
- field url: str [Required]
URL string to create engines
- field verbose: bool = False
If true, then agent will print output to stdout
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- get_next_state(generate: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
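The tool descriptions above outline a simple contract: `sql_db_list_tables` returns a comma-separated list of tables, and `sql_db_query` returns either the query result or an error message the agent can react to. The sketch below imitates that contract with the standard-library `sqlite3` module and an in-memory database. It is an illustration only: the helper functions and the `papers` table are hypothetical, and the real transforms build their engine from the `url` field rather than using `sqlite3` directly.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> str:
    """Return a comma-separated list of table names, as sql_db_list_tables describes."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return ", ".join(name for (name,) in rows)


def run_query(conn: sqlite3.Connection, query: str) -> str:
    """Return the result rows as text, or an error message if the query fails."""
    try:
        return str(conn.execute(query).fetchall())
    except sqlite3.Error as exc:
        # The message goes back to the agent so it can rewrite the query.
        return f"Error: {exc}"


# Made-up schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO papers (title) VALUES ('Attention')")

print(list_tables(conn))
print(run_query(conn, "SELECT title FROM papers"))
print(run_query(conn, "SELECT missing FROM papers"))
```

Returning the error as a string instead of raising keeps the agent loop running, which matches the "rewrite the query, check the query, and try again" instruction in the tool description.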
- pydantic model retrieval_qa_benchmark.transforms.LangChainSQLAgentRouter
Agent Decision with LangChain SQL Agent Prompts
- Fields:
children (List[BaseTransform | None]), format_instructions (str), llm_model (Dict[str, Any]), prefix (str), record_template (str), sql_dialect (str), sql_topk (int), suffix (str), verbose (bool)
- field children: List[BaseTransform | None] = [None, None]
list of next states
- field format_instructions: str = 'Use the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [{tool_names}]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question'
Instructions that tell the LLM which output format to follow
- field llm_model: Dict[str, Any] [Required]
model configuration for transforms with LLM
- field prefix: str = 'You are an agent designed to interact with a SQL database.\nGiven an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.\nUnless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.\nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for the relevant columns given the question.\nYou have access to tools for interacting with the database.\nOnly use the below tools. Only use the information returned by the below tools to construct your final answer.\nYou MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.\n\nDO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.\n\nIf the question does not seem related to the database, just return "I don\'t know" as the answer.\n'
Template prefix for agent
- field record_template: str = '{question}\n{choices}'
Template to format records
- field sql_dialect: str = 'SQL'
SQL dialect, which tells the LLM which flavor of SQL it is working with
- field sql_topk: int = 5
Maximum number of results to retrieve from the database per query
- field suffix: str = 'Begin!\n\nQuestion: {input}\nThought: I should look at the tables in the database to see what I can query. Then I should query the schema of the most relevant tables.\n{agent_scratchpad}'
Template suffix for agent
- field verbose: bool = False
If true, then agent will print output to stdout
- build_agent_template() str
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- format_agent_template(q: str, stacked: List[str]) str
- get_next_state(generated: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
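The `format_instructions` above ask the LLM to answer with `Action:` and `Action Input:` lines, or with `Final Answer:` when it is done. The sketch below shows one way such output could be parsed to decide the next step. It is a hedged illustration of the idea behind `get_next_state`, not its implementation: the `parse_step` helper is hypothetical and returns a tool name, whereas the documented method returns a `BaseTransform | None` together with a string.

```python
import re
from typing import Optional, Tuple

# Matches "Action: <tool>" followed by "Action Input: <input>".
ACTION_RE = re.compile(
    r"Action\s*:\s*(.*?)\s*\n\s*Action Input\s*:\s*(.*)", re.DOTALL
)


def parse_step(generated: str) -> Tuple[Optional[str], str]:
    """Return (tool name, tool input), or (None, final answer) when finished."""
    if "Final Answer:" in generated:
        return None, generated.split("Final Answer:", 1)[1].strip()
    match = ACTION_RE.search(generated)
    if match is None:
        raise ValueError(f"Could not parse agent output: {generated!r}")
    tool = match.group(1).strip()
    # Drop anything the model hallucinated after its own input.
    tool_input = match.group(2).split("\nObservation", 1)[0].strip()
    return tool, tool_input


step = parse_step(
    "I should list the tables first.\n"
    "Action: sql_db_list_tables\n"
    "Action Input: "
)
done = parse_step("Thought: I now know the final answer\nFinal Answer: 42")
print(step, done)
```

A `None` tool name plays the role of the terminal state here, loosely mirroring the `None` entries allowed in `children`.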
- pydantic model retrieval_qa_benchmark.transforms.LangChainSQLChecker
- Fields:
checker_prompt (str), children (List[retrieval_qa_benchmark.schema.transform.BaseTransform | None]), descrption (str), llm_model (Dict[str, Any]), name (str), sql_dialect (str), url (str), verbose (bool)
- field checker_prompt: str = '\n{query}\nDouble check the {dialect} query above for common mistakes, including:\n- Using NOT IN with NULL values\n- Using UNION when UNION ALL should have been used\n- Using BETWEEN for exclusive ranges\n- Data type mismatch in predicates\n- Properly quoting identifiers\n- Using the correct number of arguments for functions\n- Casting to the correct data type\n- Using the proper columns for joins\n\nIf there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.\n\nOutput the final SQL query only.\n\nSQL Query: '
- field descrption: str = 'Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!'
prompt description for this tool
- field llm_model: Dict[str, Any] [Required]
model configuration for transforms with LLM
- field name: str = 'sql_db_query_checker'
name for this tool
- field sql_dialect: str = 'SQL'
- field url: str [Required]
URL string to create engines
- field verbose: bool = False
If true, then agent will print output to stdout
- chain(**kwargs: Any) Any
- execute_action(record: QARecord) Tuple[str, int, int]
execute action for agent components
- Parameters:
record (QARecord) – data record to be processed
- Returns:
(generated file, number of prompt tokens, number of generated tokens)
- Return type:
Tuple[str, int, int]
- get_next_state(generate: str) Tuple[BaseTransform | None, str]
- parse_extra(generate: str) ToolHistory | None
- set_children(children: List[BaseTransform | None]) None
Set children for transform
- Parameters:
children (List[Union[BaseTransform, None]]) – next nodes to execute
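The default `checker_prompt` has two placeholders, `{query}` and `{dialect}`. The sketch below fills them with plain string formatting to show the text an LLM would receive. It assumes that `sql_dialect` supplies the `{dialect}` value, which this reference does not state explicitly; the example query and the `"SQLite"` dialect are made up, and the LLM call itself is omitted.

```python
# Default value copied from LangChainSQLChecker.checker_prompt.
CHECKER_PROMPT = (
    "\n{query}\n"
    "Double check the {dialect} query above for common mistakes, including:\n"
    "- Using NOT IN with NULL values\n"
    "- Using UNION when UNION ALL should have been used\n"
    "- Using BETWEEN for exclusive ranges\n"
    "- Data type mismatch in predicates\n"
    "- Properly quoting identifiers\n"
    "- Using the correct number of arguments for functions\n"
    "- Casting to the correct data type\n"
    "- Using the proper columns for joins\n\n"
    "If there are any of the above mistakes, rewrite the query. "
    "If there are no mistakes, just reproduce the original query.\n\n"
    "Output the final SQL query only.\n\n"
    "SQL Query: "
)

# Hypothetical query and dialect, for illustration only.
filled = CHECKER_PROMPT.format(
    query="SELECT title FROM papers WHERE id BETWEEN 1 AND 10",
    dialect="SQLite",
)
print(filled)
```

The prompt ends with `SQL Query: ` so that the model's completion is the checked query and nothing else.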