Goodbye, Text2SQL: Why Table-Augmented Generation (TAG) is the Future of AI-Driven Data Queries!

Pavan Emani
Artificial Intelligence in Plain English
6 min read · Sep 11, 2024

Exploring the Future of Natural Language Queries with Table-Augmented Generation.

Photo by Choong Deng Xiang on Unsplash

Imagine you’re a business analyst trying to understand why your company’s sales dropped last quarter. You query your database with a simple natural language question: “Why did sales drop last quarter?” Ideally, the AI system would instantly return a context-rich, insightful answer that ties together the relevant data points, trends, and market insights. The reality is far from ideal.

Current AI methods for querying databases, such as Text2SQL and Retrieval-Augmented Generation (RAG), fall significantly short. Both are limited by design: Text2SQL only translates natural language into SQL queries, while RAG relies on simple lookups that fail to capture the complexity of real-world questions.
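To make the Text2SQL limitation concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its rows, and the `generated_sql` string are all hypothetical, invented for illustration; no LLM is called, and the query is simply hard-coded as the kind of SQL such a system might plausibly produce.

```python
import sqlite3

# Hypothetical toy schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, region TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2024-Q1", "North", 120.0),
        ("2024-Q1", "South", 80.0),
        ("2024-Q2", "North", 90.0),
        ("2024-Q2", "South", 70.0),
    ],
)

# Assumption: the kind of query a Text2SQL system might emit for
# "Why did sales drop last quarter?" (hard-coded here, not generated).
generated_sql = """
    SELECT quarter, SUM(revenue) AS total_revenue
    FROM sales
    GROUP BY quarter
    ORDER BY quarter
"""
totals = conn.execute(generated_sql).fetchall()

# The query can quantify the drop between the two quarters...
drop = totals[0][1] - totals[1][1]
print(totals, drop)
# ...but nothing in the result set explains *why* revenue fell.
```

The result confirms that revenue fell and by how much, but the "why" part of the question would need reasoning or context that a plain SQL result does not carry.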

Why does this matter? Querying SQL databases in natural language has become the norm since LLMs took the limelight. Businesses today are drowning in data but starving for insights. The inability of existing methods to effectively leverage both AI’s semantic reasoning and databases’ computational power is a major bottleneck in making data…

Written by Pavan Emani

Principal Data & ML Engineer with 19+ years in data engineering and 6+ years in ML engineering. M.S. in Data Science, UC Berkeley.

Responses (30)

Easy to do this with a dataset of movies. Try doing this on a custom database. LLM won't know the details unless you plan to send every unique value from the database to the LLM which will be a strict no for most of the organizations unless you host…

The title is misleading. This is a framework for addressing Text2SQL when unstructured data is involved.

I doubt it really works in real life. A single raw table can easily have millions of rows, and converting it to a pandas table first is a nightmare, not to mention that a database can have hundreds of tables.