Coding & Development Agents
Coding agents represent the most technically demanding category. They need sandboxed execution environments, file system access, shell access, and the ability to run tests and iterate. The compute requirements per session are substantial.
1. Devin (Cognition AI)
The first AI software engineer marketed as fully autonomous. Devin takes natural language task descriptions (“add OAuth to our Express app”), decomposes them into steps, writes code, runs tests, debugs failures, and iterates. Nubank reported 12x efficiency improvements and 20x cost savings when using Devin for multi-million-line codebase migrations.
Architecture: LLM agent loop in sandboxed cloud environment with shell, editor, and browser. Infrastructure: Dedicated compute per session, persistent file system, network access for package installation. Price dropped from $500/month to $20/month in late 2025.
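The agent-loop architecture described above can be sketched in a few lines. This is a minimal illustration, not Devin's actual implementation; `call_llm` and `run_shell` are hypothetical stand-ins for a model API and a sandboxed executor.

```python
# Minimal sketch of an LLM coding-agent loop: plan, execute, observe, revise.
# `call_llm` and `run_shell` are hypothetical stand-ins for a model API and a
# sandboxed shell; a real system like Devin wraps an editor, browser, and
# persistent file system around this same loop.

def agent_loop(task, call_llm, run_shell, max_iters=10):
    history = [f"TASK: {task}"]
    for _ in range(max_iters):
        action = call_llm(history)               # model proposes next shell command
        if action == "DONE":
            return history
        result = run_shell(action)               # execute inside the sandbox
        history.append(f"$ {action}\n{result}")  # observation feeds the next step
    return history
```

Everything else — planning quality, test interpretation, recovery from bad edits — lives inside the model calls; the loop itself stays this simple.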
2. SWE-Agent (Princeton NLP)
An open-source framework that turns any LLM into a software engineering agent. SWE-Agent resolves real GitHub issues by reading code, making edits, and running tests. It pioneered the Agent-Computer Interface (ACI) design pattern — a structured way for LLMs to interact with development environments.
Architecture: LLM + ACI with Docker-based environment isolation. Infrastructure: Docker host with GPU access for local models, or API keys for cloud LLMs. Each agent run spawns a container. Self-hostable on any Linux server.
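The one-container-per-run isolation model can be approximated with a plain `docker run` invocation. The image name and resource limits below are illustrative, not SWE-Agent's actual defaults (its real harness manages containers programmatically).

```python
# Sketch: build an isolated `docker run` command for one agent run.
# Image name and limits are illustrative; SWE-Agent's actual harness drives
# Docker programmatically rather than shelling out like this.

def sandbox_command(image, repo_path, memory="4g", cpus="2"):
    return [
        "docker", "run", "--rm",          # container is discarded after the run
        "--memory", memory,               # cap RAM so one run can't exhaust the host
        "--cpus", cpus,
        "--network", "bridge",            # network access for package installation
        "-v", f"{repo_path}:/workspace",  # mount the target repository
        "-w", "/workspace",
        image,
    ]
```

Per-run containers are what make the framework safe to point at arbitrary repositories: a misbehaving agent can trash its workspace but not the host.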
3. Google Antigravity
Launched November 2025 with the highest verified SWE-bench scores at the time (76.2%). Antigravity coordinates up to 8 specialized sub-agents simultaneously for complex multi-component projects. Its Planning mode decomposes large tasks before any code is written.
Architecture: Multi-agent orchestration with specialized planners, coders, and reviewers. Infrastructure: Cloud-hosted by Google. Each session runs across multiple parallel compute units. Not self-hostable.
Customer Support Agents
The highest-volume category. Support agents integrate with ticketing platforms, query knowledge bases via RAG, and handle multi-turn conversations with context retention across sessions.
4. Zendesk AI Agents
Zendesk's native AI agents resolve tickets without human intervention using the company's knowledge base and historical ticket data. They handle intent detection, answer generation, and escalation routing. Enterprise deployments report 60-80% autonomous resolution rates on tier-1 tickets.
Architecture: RAG pipeline over knowledge base + intent classification + response generation. Infrastructure: Hosted by Zendesk. Custom deployments integrating external LLMs need persistent servers for the RAG pipeline and vector database.
5. Intercom Fin
Fin is Intercom's AI agent that answers customer questions using the company's existing help center content. It handles multi-turn conversations, provides cited sources for each answer, and seamlessly hands off to human agents when it cannot resolve an issue. Intercom reports that Fin resolves up to 50% of support volume immediately.
Architecture: LLM-based retrieval with source citation and confidence scoring. Infrastructure: Hosted SaaS. Teams building custom support agents with similar capabilities need a VPS for the LLM orchestration layer and a vector database for knowledge retrieval.
6. Custom RAG Support Agents
Many companies build custom support agents using LangChain or LlamaIndex for RAG over their specific documentation, integrated with Slack, Discord, or custom chat interfaces. These offer more control over the retrieval pipeline, response formatting, and escalation logic than SaaS solutions.
Architecture: Custom RAG pipeline + LLM + webhook integrations. Infrastructure: Dedicated server with 8-16 GB RAM for the vector DB and orchestration layer. Persistent process for webhook listeners. Needs 24/7 uptime — a down support bot means unanswered customers.
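The core of any custom RAG pipeline is the retrieval step: embed the query, score it against embedded documents, return the top matches. This sketch substitutes toy bag-of-words vectors for a real embedding model and vector database, purely to show the shape of the operation.

```python
import math

# Toy retrieval core of a RAG support agent: embed, score by cosine
# similarity, return top matches. A production pipeline would swap in an
# embedding model and a vector database (e.g. via LangChain or LlamaIndex).

def embed(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]   # bag-of-words stand-in for embeddings

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query, docs, vocab, k=1):
    q = embed(query, vocab)
    scored = sorted(docs, key=lambda d: cosine(q, embed(d, vocab)), reverse=True)
    return scored[:k]
```

The retrieved passages are then stuffed into the LLM prompt; the escalation logic and response formatting the article mentions sit downstream of this step.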
Research & Analysis Agents
7. Perplexity Deep Research
Perplexity's agent system queries the web, synthesizes information from multiple sources, and generates cited research reports. Their latest “Computer” agent (launched February 2026) integrates 19 AI models into a single environment that can generate sub-agents for specialized tasks like financial analysis, data collection, and visualization generation.
Architecture: Multi-model orchestration with web scraping, citation extraction, and report generation. Infrastructure: Cloud-hosted with GPU clusters for parallel model inference. The $200/month Max tier reflects the compute intensity.
8. Custom Competitive Intelligence Agents
Companies deploy agents that continuously monitor competitor pricing, product launches, job postings, and press releases. These agents scrape public data sources on a schedule, compare against historical baselines, and generate alerts when significant changes are detected. Unlike one-off research, these run 24/7.
Architecture: Scheduled scraping + LLM summarization + anomaly detection + notification pipeline. Infrastructure: Persistent server with cron-based scheduling, database for historical data, and enough RAM for the LLM inference or API budget for cloud LLM calls.
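The compare-against-baseline step reduces to diffing current observations against stored values and flagging drifts past a threshold. A minimal sketch, with illustrative field names and an assumed 5% threshold:

```python
# Sketch of the baseline-comparison step in a competitive-intelligence agent:
# flag any tracked item whose observed value drifts beyond a relative
# threshold. Item names and the 5% default are illustrative.

def detect_changes(baseline, current, threshold=0.05):
    alerts = []
    for item, old in baseline.items():
        new = current.get(item)
        if new is None:
            continue                      # item not observed in this scrape
        change = (new - old) / old
        if abs(change) >= threshold:
            alerts.append((item, old, new, round(change, 3)))
    return alerts
```

The LLM's role here is downstream: summarizing *why* the change matters, not detecting it.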
Trading & Finance Agents
9. Algorithmic Trading Agents
Modern trading agents go beyond simple algorithmic strategies. They ingest tick data, news feeds, and alternative data signals, use RL-trained policies for execution optimization, and operate within strict latency and compliance boundaries. JPMorgan's LOXM system uses deep RL trained on billions of historical trades to optimize large equity order execution.
Architecture: RL policy network + market data ingestion + execution engine + risk management layer. Infrastructure: Ultra-low-latency dedicated servers, co-located with exchanges. GPU access for policy inference. Persistent state for portfolio tracking. Failure recovery measured in milliseconds.
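The risk management layer sits between the policy network and the execution engine: every proposed order must pass hard limits before it can touch the market. A hedged sketch of one such gate, with entirely illustrative limits and field names:

```python
# Sketch of a pre-trade risk gate: every order the RL policy proposes must
# pass hard limits before reaching the execution engine. Limits, symbols,
# and field names are illustrative.

def risk_check(order, portfolio, max_position=10_000, max_order=1_000):
    if abs(order["qty"]) > max_order:
        return False, "order size limit"
    signed_qty = order["qty"] if order["side"] == "buy" else -order["qty"]
    projected = portfolio.get(order["symbol"], 0) + signed_qty
    if abs(projected) > max_position:
        return False, "position limit"
    return True, "ok"
```

In a real system this layer is the one component that must never be bypassed, which is why it is usually enforced outside the model rather than prompted into it.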
10. Financial Analysis Agents
LLM-based agents that analyze earnings reports, SEC filings, and financial news to generate investment summaries and risk assessments. They parse structured financial data (balance sheets, income statements) and unstructured text (management commentary, analyst calls) to produce comprehensive analysis that would take a human analyst hours.
Architecture: Document parsing + structured data extraction + LLM reasoning + report generation. Infrastructure: Server with 16-32 GB RAM for document processing, persistent storage for financial data, and scheduled execution for earnings season batch processing.
DevOps & Infrastructure Agents
11. Incident Response Agents
Agents that monitor infrastructure alerts, correlate symptoms across services, diagnose root causes, and execute runbooks automatically. Companies like Ciroos combine AI agents with SRE workflows using multi-agent architectures to automate incident management and reduce operational toil. When PagerDuty fires at 3 AM, the agent handles the first response.
Architecture: Alert ingestion + log correlation + runbook execution engine + escalation logic. Infrastructure: Must run independently of the infrastructure it monitors. Dedicated server with access to monitoring APIs, SSH keys for remediation, and its own monitoring (who watches the watcher).
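The runbook execution engine boils down to a dispatch table from alert type to remediation steps, with a mandatory escalation path for anything unrecognized. A minimal sketch with illustrative alert names and steps:

```python
# Sketch of runbook dispatch in an incident-response agent: match an alert to
# a runbook and return the remediation steps to execute. Alert types and
# step names are illustrative.

RUNBOOKS = {
    "disk_full": ["rotate_logs", "clear_tmp", "page_oncall_if_still_full"],
    "service_down": ["restart_service", "verify_health", "page_oncall_if_down"],
}

def first_response(alert):
    steps = RUNBOOKS.get(alert["type"])
    if steps is None:
        return ["escalate_to_human"]   # unknown alerts always go to a person
    return steps
```

The hard part in production is not the dispatch but the correlation upstream of it — deciding which of fifty simultaneous alerts is the root cause.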
12. CI/CD Pipeline Agents
Agents that review pull requests, identify potential bugs and security vulnerabilities, suggest fixes, and auto-merge when confidence is high. They integrate with GitHub Actions or GitLab CI and use LLMs to understand code context beyond what static analysis catches. Some teams deploy agents that automatically fix failing tests by analyzing the error output and generating patches.
Architecture: Webhook listener + code analysis + LLM review + PR integration. Infrastructure: Persistent server for webhook endpoints, sufficient RAM for code context loading (large repos need 8-16 GB), API access to LLM providers.
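The "auto-merge when confidence is high" decision is usually an explicit gate over the review output rather than something left to the model. A hedged sketch, with illustrative thresholds and field names:

```python
# Sketch of the auto-merge gate in a CI/CD review agent: merge only when the
# LLM review is confident, tests pass, and there are no security findings.
# Thresholds and field names are illustrative.

def merge_decision(review, min_confidence=0.9):
    if review["security_findings"]:
        return "request_changes"       # security issues always block the merge
    if not review["tests_passed"]:
        return "comment_only"          # don't merge over a red pipeline
    if review["confidence"] >= min_confidence:
        return "auto_merge"
    return "human_review"
```

Keeping this gate as plain code makes the merge policy auditable — a property teams tend to want before letting an agent merge to main.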
Specialized Domain Agents
13. Healthcare Triage Agents
Agents that perform initial patient symptom assessment, suggest possible conditions, and route to appropriate specialists. They maintain contextual awareness across patient records and support clinical decision-making. The healthcare AI agent market could save the US healthcare system up to $150 billion annually by reducing administrative burden.
Architecture: Symptom classifier + medical knowledge RAG + risk assessment + routing. Infrastructure: HIPAA-compliant hosting with encrypted storage, audit logging, and access controls. No shared infrastructure. Dedicated servers with tamper-proof logs are mandatory.
14. Legal Document Review Agents
Agents that review contracts, flag non-standard clauses, compare terms against company policies, and generate risk summaries. A task that takes a junior associate 4 hours can be completed in minutes. These agents need extremely high accuracy — a missed liability clause has real financial consequences.
Architecture: Document parsing + clause extraction + policy comparison + risk scoring. Infrastructure: On-premises or dedicated servers for data sovereignty. 32 GB+ RAM for loading large contract corpora. Audit logging for regulatory compliance.
15. Manufacturing Quality Agents
Agents that analyze data from thousands of IoT sensors to optimize machine settings and detect early signs of equipment failure. They predict maintenance needs before breakdowns occur, reducing unplanned downtime by 30-50% in deployed facilities. Retailers apply similar agents for inventory optimization across warehouse networks.
Architecture: Sensor data ingestion + anomaly detection + predictive models + alerting. Infrastructure: Edge servers near production lines for low-latency sensor data processing, with cloud aggregation for fleet-wide analysis. 24/7 operation is mandatory — a missed sensor anomaly means equipment damage.
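A common edge-side anomaly check is a rolling z-score: flag any reading that deviates too far from the recent window. This sketch shows the idea for a single sensor; the window size and threshold are illustrative.

```python
import statistics

# Sketch of edge-side anomaly detection for one sensor: flag a reading that
# deviates more than `z_max` standard deviations from the recent window.
# The threshold and window contents are illustrative; production systems
# typically layer learned predictive models on top of checks like this.

def is_anomaly(window, reading, z_max=3.0):
    mean = statistics.mean(window)
    stdev = statistics.stdev(window)
    if stdev == 0:
        return reading != mean         # flat window: any deviation is anomalous
    return abs(reading - mean) / stdev > z_max
```

Running checks like this on edge servers keeps detection latency in milliseconds, with the cloud tier reserved for fleet-wide trend analysis.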
Infrastructure Requirements Comparison
| Agent Category | RAM | GPU | Uptime | Special Needs |
|---|---|---|---|---|
| Coding | 8–32 GB | Optional | Per-session | Sandboxed shell |
| Support | 8–16 GB | Optional | 24/7 | Vector DB |
| Research | 16–64 GB | Recommended | Scheduled | Web access |
| Trading | 16–64 GB | Required | 24/7 | Low latency |
| DevOps | 8–16 GB | Optional | 24/7 | Isolated from monitored infra |
| Healthcare/Legal | 16–32 GB | Optional | 24/7 | Compliance, audit logs |
What Fails Without Proper Hosting
Every category above shares common failure patterns when deployed on inadequate infrastructure:
Silent crashes
Without process supervision, a crashed agent stays dead until someone notices. For a support agent, this means unanswered customers. For a trading agent, this means missed market opportunities or unmanaged positions.
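In production this role belongs to systemd, supervisord, or a container orchestrator, but the contract they implement is simple enough to sketch: restart on crash with backoff, and alert when the restart budget runs out.

```python
import time

# Minimal process-supervision sketch: restart the agent on crash with
# exponential backoff, and alert a human if it keeps dying. In production
# this job is done by systemd, supervisord, or a container orchestrator
# rather than hand-rolled code like this.

def supervise(run_agent, alert, max_restarts=5, base_delay=1.0):
    for attempt in range(max_restarts):
        try:
            run_agent()
            return "exited_cleanly"
        except Exception as exc:
            alert(f"agent crashed ({exc}); restart {attempt + 1}")
            time.sleep(base_delay * 2 ** attempt)   # exponential backoff
    alert("agent exceeded restart budget; paging a human")
    return "gave_up"
```

The alert on every restart is the important part: supervision without notification just converts a loud failure into a quiet one.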
Memory exhaustion
Agents that accumulate context (support conversations, research data, trading state) grow in memory over time. Without cgroup-level limits, a single agent can consume all server RAM, taking down co-located services.
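With systemd, a cgroup-level cap is a single directive in the agent's unit file. The unit name, paths, and limits below are illustrative:

```ini
# /etc/systemd/system/support-agent.service (illustrative unit name and paths)
[Service]
ExecStart=/opt/agent/bin/run
Restart=always        ; process supervision: restart on crash
MemoryHigh=3G         ; soft cap: throttle the agent before the hard limit
MemoryMax=4G          ; cgroup hard cap: the agent is killed, not the host
```

With `MemoryMax` set, a leaking agent gets OOM-killed inside its own cgroup and restarted by `Restart=always`, instead of taking co-located services down with it.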
No audit trail
Healthcare, legal, and financial agents operate in regulated environments. Without tamper-proof audit logging, every agent action is unreviewable. osModa's SHA-256 audit ledger solves this at the infrastructure layer.
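A tamper-evident audit trail can be built by hash-chaining entries: each record commits to the previous record's SHA-256 hash, so editing any entry breaks verification of everything after it. This sketch illustrates the general technique, not osModa's actual ledger format.

```python
import hashlib
import json

# Sketch of a hash-chained (tamper-evident) audit log: each entry commits to
# the previous entry's hash, so silently editing any record breaks the chain.
# Illustrates the general technique, not osModa's actual ledger format.

GENESIS = "0" * 64

def append_entry(ledger, action):
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS
    body = json.dumps({"action": action, "prev": prev_hash}, sort_keys=True)
    entry = {"action": action, "prev": prev_hash,
             "hash": hashlib.sha256(body.encode()).hexdigest()}
    ledger.append(entry)
    return entry

def verify(ledger):
    prev_hash = GENESIS
    for entry in ledger:
        body = json.dumps({"action": entry["action"], "prev": prev_hash},
                          sort_keys=True)
        if entry["prev"] != prev_hash or \
           entry["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True
```

Tamper-evidence only holds if the chain head is also anchored somewhere the agent cannot write, such as an append-only store or a periodic external checkpoint.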
Frequently Asked Questions
What is the most common type of AI agent in production?
Customer support agents are the most widely deployed category. They handle tier-1 ticket triage, answer FAQs from knowledge bases, and escalate complex issues to human agents. The reason is straightforward economics: support teams handle high volumes of repetitive queries, and an agent that resolves 60-80% of tickets autonomously delivers immediate, measurable ROI. Most are built on LLM frameworks like LangChain or custom RAG pipelines, integrated with platforms like Zendesk, Intercom, or Freshdesk.
What infrastructure do production AI agents need?
At minimum: a dedicated server or VPS with 4-32 GB RAM (depending on whether you run local models or use APIs), persistent storage for agent state and logs, process supervision for automatic restart on failure, and network access for API calls. Agents running local LLMs additionally need GPU VRAM proportional to model size. The non-obvious requirement is reliability infrastructure — health checking, log rotation, and external monitoring. A crashed agent that nobody notices is worse than no agent at all.
Can AI agents run on shared hosting or serverless?
Simple, stateless agents (single API call, return result) can run on serverless functions. But most production agents are stateful and long-running — they maintain conversation context, execute multi-step workflows, and need persistent connections to external services. Serverless cold starts (500ms-5s) break real-time interactions. Shared hosting lacks the resource isolation needed to prevent one agent from affecting others. Dedicated servers or managed agent platforms like osModa provide the persistent execution, resource isolation, and process supervision that production agents require.
What frameworks are most used for building AI agents?
The ecosystem is fragmented but consolidating. LangGraph (from LangChain) dominates for complex, stateful agent workflows. CrewAI leads for multi-agent orchestration. OpenAI Agents SDK is popular for teams already in the OpenAI ecosystem. AutoGen (Microsoft) excels at multi-agent conversation patterns. For coding agents specifically, the SWE-Agent framework and Claude Code's agent mode are widely used. Most production teams use multiple frameworks — one for orchestration, another for specific agent capabilities.
How much do production AI agents cost to run?
Costs split into two categories: compute and API calls. Compute ranges from $15-150/month for a VPS or managed server (depending on RAM and CPU requirements). API costs vary enormously — a customer support agent making 1,000 GPT-4o calls per day might spend $30-100/month on tokens, while a research agent doing deep analysis could spend $500+/month. The key cost optimization is model routing: use small, cheap models for simple tasks and reserve expensive frontier models for complex reasoning. osModa plans start at $14.99/month for the infrastructure layer.
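The model-routing optimization mentioned above is often just a cheap classifier in front of the model calls. A hedged sketch with an illustrative heuristic and placeholder model names:

```python
# Sketch of cost-driven model routing: send simple tasks to a cheap model and
# reserve the frontier model for complex reasoning. The heuristic and the
# model names are illustrative placeholders.

CHEAP, FRONTIER = "small-model", "frontier-model"

def route(task):
    complex_markers = ("analyze", "plan", "debug", "compare")
    is_complex = (len(task.split()) > 50
                  or any(m in task.lower() for m in complex_markers))
    return FRONTIER if is_complex else CHEAP
```

Real routers often use a small classifier model or confidence-based fallback (try cheap, escalate on low confidence) rather than keyword heuristics, but the cost structure is the same.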
What is the biggest failure mode for AI agents in production?
Silent degradation. The agent continues running but its output quality drops — it starts hallucinating more frequently, missing edge cases, or entering infinite reasoning loops that burn through API credits without producing useful results. Unlike a crash (which is obvious), quality degradation requires active monitoring of output metrics: task completion rate, user satisfaction scores, error rates, and token consumption per task. Around 32% of teams cite quality as their top barrier to scaling agent deployments.
How do coding AI agents like Devin actually work?
Coding agents operate in a sandboxed environment with a shell, code editor, and browser. They receive a task (like 'add authentication to this app'), decompose it into sub-steps, generate code, run tests, observe errors, and iterate until the tests pass. The architecture is an LLM-based agent loop: plan, execute, observe, revise. The infrastructure requirements are significant — each agent session needs its own isolated compute environment with shell access, file system access, and network access for package installation. Devin specifically runs each session in a dedicated cloud sandbox.
Are AI agents replacing human workers?
In practice, production AI agents augment rather than replace. Customer support agents handle tier-1 tickets and escalate complex issues to humans. Coding agents generate boilerplate and fix bugs while developers focus on architecture and design. Research agents gather and summarize information while analysts make strategic decisions. The pattern across all 15 examples in this article is the same: agents handle high-volume, repetitive tasks with well-defined success criteria, and humans handle novel, ambiguous, or high-stakes decisions. Gartner predicts that by 2026, 40% of enterprise applications will include task-specific AI agents — as tools, not replacements.