Collinear Data Catalog

Production-grade task environments for RL capability development. Each domain includes simulated enterprise applications, realistic seed data, NPC behavioral simulations, and automated verifiers.

Coding

Simulate coding tasks from simple bug fixes to complex multi-feature implementations, using the industry-standard tools engineers rely on every day.

Software & Product Dev, ITSM, Knowledge Work
GitHub
Python
Bash
MCP

Memory (Multi-Episode)

Tasks that span multiple sessions or episodes, requiring the agent to retain and apply context across interactions.

HR, Software & Product Dev, Knowledge Work, Healthcare
Workday
Slack
Jira
Google Calendar
MCP

Long-Horizon (100+ tool calls)

Multi-step workflows requiring 100+ sequential tool calls, where the agent must maintain coherent state and goal-tracking across an extended action chain.

Software & Product Dev, Knowledge Work, Healthcare
Slack
Asana
Microsoft Office
Google Calendar
MCP
CUA

Artifact Generation

Tasks requiring the agent to generate structured outputs: reports, spreadsheets, charts, and formatted documents ready for real-world use.

Finance, Sales & Procurement, Knowledge Work
Microsoft Office
Google Workspace
SEC EDGAR
Odoo ERP
MCP
CUA

Error Recovery / Self-Correction

Reconciliation, debugging, and self-correction under failure states — tasks where the agent must detect what went wrong and fix it without human guidance.

Finance, Customer Service, ITSM, Personal Assistant
SEC EDGAR
Twelve Data
Salesforce CRM
Zendesk
MCP

Ambiguity Resolution

Classification and triage with incomplete or contradictory inputs, where the agent must make confident decisions despite missing information.

HR, Customer Service, Software & Product Dev, Sales & Procurement, +1 more
Zendesk
Salesforce CRM
Workday
SAP SuccessFactors
MCP

Planning & Re-planning

Coordination, roadmaps, and scheduling with dynamic constraints — tasks where plans must adapt as new stakeholder inputs arrive.

HR, Software & Product Dev, Healthcare, Knowledge Work
Jira
Confluence
Slack
Google Calendar
MCP

Tool Creation

Tasks where the agent must build or extend its own tools to complete objectives that go beyond its default capabilities.

Software & Product Dev
GitHub
Python
Bash
MCP

Multi-turn

Tasks spanning extended conversation turns with multiple NPC personas, each persona requiring a different communication approach from the agent.

Finance, HR, Customer Service, Software & Product Dev, +4 more
Slack
Salesforce CRM
Workday
Jira
Zendesk
MCP
CUA