Tukun.ai: A Semantic-First Data Agent

Tukun.ai is a semantic-first data agent built around governed semantics, multiple data sources, multiple LLMs, runtime skills, and reusable analysis assets.

Natural language and structured data have always lived in different worlds.

Business questions usually arrive in natural language: Why did growth slow down? What caused a metric to move? How did different channels perform? But real analysis still has to return to the structured layer: data sources, tables, fields, metric definitions, time grains, filters, access boundaries, and reusable outputs.

Tukun.ai is designed to connect those two layers.

In one sentence:

Tukun.ai is a semantic-first data agent.

It is not a simple ChatBI interface. It is a Data Agent Harness built around semantics, data sources, models, skills, and reusable analysis assets.

Why semantic-first

The hard part of a data agent is not just translating a question into SQL.

In real business work, the same metric can have different definitions, the same dimension can come from different data sources, and the same question can imply different time grains or business boundaries. If semantics are not managed first, even a strong model can answer the wrong question fluently.

That is why Tukun.ai works in this order:

  1. Manage business semantics.
  2. Connect data sources.
  3. Let the agent orchestrate models and tools.
  4. Preserve analysis outputs as reusable assets.

Metrics, dimensions, entities, and relationships are not treated as temporary prompt text. They are manageable, publishable, and traceable product objects.

Product structure

Tukun.ai is organized into several layers:

| Layer | Role |
|---|---|
| Semantics | Manage metrics, dimensions, entities, relationships, and business definitions |
| Data | Connect PostgreSQL, files, and other business data sources, then synchronize metadata |
| Models | Support multiple LLMs and switch models by task instead of binding the product to one provider |
| Language | Support multilingual usage and carry language into runtime context |
| Skills | Extend repeatable analysis workflows and domain capabilities |
| Workbench | Handle questions, analysis, review, follow-up, and reuse |
| Assets | Preserve cards, charts, dashboards, skills, and semantic definitions |

The goal is simple: analysis should not stop at a single answer. It should be something teams can inspect, follow up on, reuse, and improve.

A complete Data Agent Harness

The core of Tukun.ai is not a single page. It is an analysis runtime.

When a user request enters the system, the runtime is responsible for:

  • resolving intent
  • assembling prompts and context
  • selecting available tools and Skills
  • dispatching the right model
  • executing semantic queries or analysis tools
  • shaping the result into a reusable product artifact

This keeps product logic from being scattered across pages. Workbench, semantics, skills, and downstream assets are all organized around the same runtime path.
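The runtime path listed above can be pictured as a small pipeline of stages that share one state object. This is a minimal sketch, not Tukun.ai's actual implementation: `TurnState`, `run_turn`, the stage functions, the intent labels, and the model and tool names are all hypothetical, and each stage is a stub.

```python
from dataclasses import dataclass, field


@dataclass
class TurnState:
    """Hypothetical state passed along one runtime path."""
    question: str
    intent: str = ""
    prompt: str = ""
    tools: list = field(default_factory=list)
    model: str = ""
    rows: list = field(default_factory=list)
    artifact: dict = field(default_factory=dict)


def resolve_intent(state):
    # Stub rule; a real runtime would classify the question properly.
    asks_why = state.question.lower().startswith("why")
    state.intent = "explain_change" if asks_why else "lookup"
    return state


def assemble_context(state):
    state.prompt = f"[intent={state.intent}] {state.question}"
    return state


def select_tools(state):
    state.tools = ["semantic_query"]
    if state.intent == "explain_change":
        state.tools.append("attribution_skill")
    return state


def dispatch_model(state):
    # Placeholder model names, not real provider identifiers.
    state.model = "reasoning-model" if state.intent == "explain_change" else "fast-model"
    return state


def execute(state):
    # Placeholder for a semantic query; no real data source is touched.
    state.rows = [{"tool": tool, "status": "ok"} for tool in state.tools]
    return state


def shape_artifact(state):
    state.artifact = {"type": "card", "title": state.question, "rows": state.rows}
    return state


STAGES = [resolve_intent, assemble_context, select_tools,
          dispatch_model, execute, shape_artifact]


def run_turn(question):
    """Run every stage in order and return the final state."""
    state = TurnState(question=question)
    for stage in STAGES:
        state = stage(state)
    return state
```

Because every entry point would call the same `run_turn`, a change to one stage applies to Workbench, Skills, and downstream assets alike.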

Semantic workflow

Tukun.ai uses synchronized metadata from data sources as the starting point, then uses LLMs to help generate semantic drafts that can align with MetricFlow structures.

The system does not automatically publish those definitions. The default path is AI-assisted draft generation followed by human review. The system improves the speed of initial modeling, while people keep control over business meaning.

This works well for data teams because:

  • metadata can enter the system automatically
  • semantic drafts can be generated with LLM assistance
  • metrics, dimensions, entities, and relationships remain editable
  • publish state and version history can be tracked

The point is not to make the model guess business definitions on every request. The point is to let the agent analyze on top of a more stable semantic layer.
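The draft, review, and publish flow can be sketched as a small state machine. This is an illustration under stated assumptions: `MetricDefinition`, `Status`, and the rule that an edit returns a definition to draft are hypothetical and are not taken from Tukun.ai or MetricFlow.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    PUBLISHED = "published"


@dataclass
class MetricDefinition:
    """Hypothetical semantic object with publish state and version history."""
    name: str
    expression: str
    status: Status = Status.DRAFT
    version: int = 0
    history: list = field(default_factory=list)

    def edit(self, expression):
        # Assumption: any edit sends the definition back to draft for review.
        self.expression = expression
        self.status = Status.DRAFT

    def publish(self, reviewer):
        # Publishing is never automatic; a named human reviewer is required.
        if not reviewer:
            raise ValueError("a human reviewer is required to publish")
        self.version += 1
        self.status = Status.PUBLISHED
        self.history.append((self.version, self.expression, reviewer))


# An LLM-generated draft starts unpublished and only changes state on review.
metric = MetricDefinition("revenue", "SUM(order_amount)")
metric.publish(reviewer="analyst@example.com")
metric.edit("SUM(order_amount) - SUM(refund_amount)")
metric.publish(reviewer="analyst@example.com")
```

After these calls the history holds two entries, one per published version, which is what makes a definition traceable.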

Multiple data sources

Enterprise data rarely lives in one place.

Some data is in databases. Some is in files. Some comes from business systems or APIs. Tukun.ai is designed around multiple data sources from the beginning, so different sources can enter one analysis workflow.

In the current architecture, semantic assets and analysis context are scoped by data_source_id. This prevents metric definitions from different data sources from being mixed accidentally, and it gives the product a clear foundation for source-level governance and reuse.
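One way to picture source-level scoping is a registry keyed by data source and metric name together. Only `data_source_id` comes from the text above; `SemanticRegistry`, its methods, and the example source IDs are hypothetical.

```python
class SemanticRegistry:
    """Hypothetical registry keyed by (data_source_id, metric name)."""

    def __init__(self):
        self._metrics = {}

    def register(self, data_source_id, name, expression):
        self._metrics[(data_source_id, name)] = expression

    def resolve(self, data_source_id, name):
        # A lookup never falls back to another source's definition.
        try:
            return self._metrics[(data_source_id, name)]
        except KeyError:
            raise LookupError(
                f"metric {name!r} is not defined for data source {data_source_id!r}"
            ) from None


registry = SemanticRegistry()
registry.register("pg_orders", "revenue", "SUM(order_amount)")
registry.register("csv_finance", "revenue", "SUM(net_revenue)")
```

Both sources define `revenue`, but each lookup names its source, so the two definitions cannot be mixed by accident.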

Multiple LLMs

Different models are good at different jobs.

Some are better at reasoning. Some are better at tool use. Some are better for cost control. Some are more stable in specific language scenarios.

Tukun.ai treats models as configurable, governable product capabilities:

  • multiple providers
  • multiple models
  • plan-based model availability
  • default model and model preference settings
  • runtime refresh after configuration changes

This matters for a commercial product. Multi-model support is not only an API integration problem. It also touches accounts, quota, and billing, including usage accounting for cached input, output, and reasoning output.
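Plan-based availability combined with preference and default settings can be sketched as a single selection function. The names here (`ModelOption`, `pick_model`, the plan names, and the provider and model identifiers) are hypothetical placeholders, and the precedence order (preference, then default, then first available) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    provider: str
    model: str
    plans: frozenset  # plans that are allowed to use this model


def pick_model(catalog, plan, preferred=None, default=None):
    """Return the first match among: user preference, default, any available model."""
    available = [option for option in catalog if plan in option.plans]
    if not available:
        raise LookupError(f"no model is available on plan {plan!r}")
    by_name = {option.model: option for option in available}
    for name in (preferred, default):
        if name in by_name:
            return by_name[name]
    return available[0]


catalog = [
    ModelOption("provider-x", "model-small", frozenset({"free", "pro"})),
    ModelOption("provider-y", "model-large", frozenset({"pro"})),
]
choice = pick_model(catalog, plan="free", preferred="model-large", default="model-small")
```

Here a free-plan user who prefers `model-large` falls back to the default, because the plan filter is applied before any preference is considered.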

Prompt Cache-friendly context design

To control long-term usage cost, Tukun.ai uses layered prompt assembly:

  1. Base System Prompt
  2. Core Runtime Rules
  3. Response Contract
  4. Evidence Rules
  5. Shared Memory
  6. Skill Prompts / Skill References
  7. Recent Turns
  8. Turn Context

These sections are grouped as stable, semistable, and volatile.

Stable content stays fixed as much as possible. Semistable content changes with capabilities and task shape. Volatile content stays close to the current turn. Provider prompt caches generally reuse a matching prompt prefix, so placing stable content first and volatile content last makes cache hits more likely and keeps frequent analysis workflows more cost-efficient.
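The layering can be sketched as an ordered list of sections, each tagged with a tier. The section order follows the list above, but the assignment of sections to tiers is an assumption made for illustration, as are the function names. The fingerprint only demonstrates that the stable prefix is byte-identical across turns.

```python
import hashlib

# (section name, tier) in assembly order. Tier assignments are assumed.
LAYERS = [
    ("base_system_prompt", "stable"),
    ("core_runtime_rules", "stable"),
    ("response_contract", "stable"),
    ("evidence_rules", "stable"),
    ("shared_memory", "semistable"),
    ("skill_prompts", "semistable"),
    ("recent_turns", "volatile"),
    ("turn_context", "volatile"),
]


def assemble_prompt(sections):
    """Join the non-empty sections in layer order."""
    parts = [sections[name] for name, _ in LAYERS if sections.get(name)]
    return "\n\n".join(parts)


def stable_prefix(sections):
    """Return only the stable sections, which always come first."""
    parts = [sections[name] for name, tier in LAYERS
             if tier == "stable" and sections.get(name)]
    return "\n\n".join(parts)


def prefix_fingerprint(sections):
    """Short hash of the stable prefix, useful for checking it did not drift."""
    digest = hashlib.sha256(stable_prefix(sections).encode("utf-8"))
    return digest.hexdigest()[:12]
```

Two turns that differ only in `recent_turns` and `turn_context` produce the same fingerprint, so a change in the fingerprint between deployments signals that the stable layers were edited.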

Multilingual runtime

Multilingual support is not just interface translation.

For a data agent, language affects user questions, tool output, errors, analysis conclusions, and follow-up suggestions. Tukun.ai carries requested_locale into runtime context so prompt assembly and tool output can follow the user’s language environment.

The current product has been shaped around Chinese, English, and Japanese. Future languages should mainly require localized copy and language configuration, not a rewrite of the business workflow.
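Carrying the locale through the runtime might look like the sketch below. Only `requested_locale` comes from the text; the message table, the fallback to English, and the function names are assumptions, and the translated strings are illustrative copy.

```python
# Localized copy lives in data, so a new language is a new table entry.
MESSAGES = {
    "en": {"no_rows": "The query returned no rows."},
    "zh": {"no_rows": "查询未返回任何数据。"},
    "ja": {"no_rows": "クエリは行を返しませんでした。"},
}
DEFAULT_LOCALE = "en"  # assumed fallback


def normalize_locale(requested_locale):
    """Reduce tags such as 'ja-JP' or 'zh_CN' to a supported base language."""
    base = (requested_locale or "").replace("_", "-").split("-")[0].lower()
    return base if base in MESSAGES else DEFAULT_LOCALE


def tool_message(key, requested_locale):
    """Look up tool output text in the user's language."""
    return MESSAGES[normalize_locale(requested_locale)][key]


def build_runtime_context(question, requested_locale):
    """Attach the locale so prompt assembly and tools can both read it."""
    locale = normalize_locale(requested_locale)
    return {
        "question": question,
        "requested_locale": locale,
        "language_instruction": f"Respond in the language for locale '{locale}'.",
    }
```

Adding another language in this sketch means adding one entry to `MESSAGES`; the functions that use the locale stay unchanged.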

Skills extension

Beyond built-in analysis capabilities, Tukun.ai supports repeatable workflows through Skills.

Examples include:

  • industry-specific analysis templates
  • report generation from fixed templates
  • PPT generation from data results
  • team-specific analysis methods
  • domain-specific data processing and explanation flows

Skills participate in runtime prompt and tool context as capability bundles. They are not just UI shortcuts.
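A capability bundle can be sketched as a prompt fragment plus the tools it needs, merged into the runtime context when the Skill is selected. `Skill`, `apply_skills`, and the tool names are hypothetical; this is not Tukun.ai's actual Skill format.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Hypothetical bundle: instructions plus the tools they rely on."""
    name: str
    prompt: str
    tools: list = field(default_factory=list)


def apply_skills(base_prompt, base_tools, skills):
    """Merge selected Skills into the prompt and the tool list."""
    prompt_parts = [base_prompt]
    tools = list(base_tools)  # copy so the caller's list is not mutated
    for skill in skills:
        prompt_parts.append(f"[skill:{skill.name}] {skill.prompt}")
        for tool in skill.tools:
            if tool not in tools:
                tools.append(tool)
    return "\n".join(prompt_parts), tools


report_skill = Skill(
    name="weekly_report",
    prompt="Fill the fixed weekly report template from query results.",
    tools=["semantic_query", "render_report"],
)
prompt, tools = apply_skills("You are a data agent.", ["semantic_query"], [report_skill])
```

The Skill changes both what the model is told and what it can call, which is the difference between a capability bundle and a UI shortcut.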

How it differs from traditional BI and generic chat assistants

| Comparison | Traditional BI | Generic chat assistant | Tukun.ai |
|---|---|---|---|
| Entry point | Dashboards / reports | Chat box | Semantics + Workbench |
| Semantic management | Often scattered | Mostly absent | Built in |
| Data access | Available, but configuration-heavy | Weak | Part of the analysis workflow |
| Analysis process | Fixed | Temporary | Follow-up, review, and reuse |
| Result preservation | Dashboard-first | Context-memory dependent | Cards, charts, dashboards, and semantic assets |

Tukun.ai is not trying to replace every BI product, and it is not trying to become a generic chat entry point.

It focuses on one path: start from a natural language question, constrain it with governed semantics, execute with tools, and preserve the result as a reusable analysis asset.

Current stage

Tukun.ai already has the core framework in place:

  • Data Agent Harness
  • semantic-first workflow
  • multiple data source support
  • multi-LLM configuration
  • Prompt Cache-friendly context layering
  • multilingual runtime
  • Skills extension
  • Workbench and reusable analysis assets

Next, the product will keep improving around real analysis workflows: making semantic modeling more stable, making the analysis path easier to inspect, and making result reuse more natural.

The point of a data agent is not to make a model sound more conversational. The point is to make analysis more reliable, controllable, and cumulative.

That is the direction of Tukun.ai.
