Agents for Software Development

Software’s Impact

  • Software is transforming industries, as predicted by Marc Andreessen (2011).
  • Enabling everyone to write software to achieve their goals could broaden that impact further.

Software Development Workflow

  • Developer time allocation by activity:
    • 17% Coding
    • 36% Bugfixing
    • 10% Testing
    • 8% Documentation/Reviews
    • 14% Communication
    • 15% Other tasks

Development Tools

  • Copilots:
    • Synchronous support for writing code (e.g., GitHub Copilot).
  • Development Agents:
    • Autonomous tools for coding (e.g., SWE-agent, Aider) and for broader tasks (e.g., Devin, OpenHands).

Challenges in Coding Agents

  • Defining the environment.
  • Designing observation/action spaces.
  • File localization and code generation.
  • Planning, error recovery, and ensuring safety.
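
To make "observation/action spaces" concrete, here is a minimal sketch of a tiny agent interface. The names `ReadFile`, `RunCommand`, `Observation`, and `step` are hypothetical and chosen for illustration; they are not the interface of any agent named above. The sketch assumes two actions (read a file, run a command without a shell) and truncates what the agent sees so that observations stay small.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple, Union
import subprocess
import sys


@dataclass(frozen=True)
class ReadFile:
    """Hypothetical action: read a text file from the workspace."""
    path: str


@dataclass(frozen=True)
class RunCommand:
    """Hypothetical action: run a command (argv form, no shell)."""
    argv: Tuple[str, ...]
    timeout_s: float = 10.0


Action = Union[ReadFile, RunCommand]


@dataclass(frozen=True)
class Observation:
    """What the agent sees after an action: a success flag plus text."""
    ok: bool
    output: str


def step(action: Action, max_chars: int = 2000) -> Observation:
    """Execute one action and return a truncated observation."""
    try:
        if isinstance(action, ReadFile):
            text = Path(action.path).read_text(encoding="utf-8", errors="replace")
            return Observation(True, text[:max_chars])
        if isinstance(action, RunCommand):
            proc = subprocess.run(
                list(action.argv),
                capture_output=True,
                text=True,
                timeout=action.timeout_s,
            )
            combined = proc.stdout + proc.stderr
            return Observation(proc.returncode == 0, combined[:max_chars])
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Failures become observations so the agent can react to them.
        return Observation(False, f"{type(exc).__name__}: {exc}")
    raise TypeError(f"unsupported action: {action!r}")


if __name__ == "__main__":
    print(step(RunCommand((sys.executable, "-c", "print('hi')"))))
```

Returning failures as observations, rather than raising, is one possible design choice: it keeps the error text available to whatever planning or recovery logic sits on top.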

Software Development Environments

  • Actual Environments:
    • Source repositories, task management software, office tools, communication tools.
  • Testing Environments:
    • Focused on coding, sometimes including browsing tasks.

Metrics and Datasets

  • Pass@K (Chen et al., 2021): The probability that at least one of K generated samples passes the unit tests.
  • Lexical and Semantic Overlap Metrics:
    • BLEU, CodeBLEU, CodeBERTScore.
  • Key Datasets:
    • HumanEval, ARCADE, SWE-bench, Design2Code.
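
As a worked example of Pass@K, the sketch below computes the unbiased estimator described by Chen et al. (2021): generate n samples for a problem, count the c samples that pass the unit tests, and estimate pass@k as 1 - C(n-c, k) / C(n, k). The function name `pass_at_k` and the input checks are illustrative choices; a benchmark score would average this value over all problems.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated
    c: number of those samples that passed the unit tests
    k: sample budget being evaluated (1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples failed, so the
    # estimate is exactly 1.0 whenever every size-k subset must
    # contain at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 samples, 3 correct: chance that one random draw passes.
    print(pass_at_k(n=10, c=3, k=1))
    # Same samples, but the budget is 5 draws.
    print(pass_at_k(n=10, c=3, k=5))
```

For k = 1 the estimator reduces to c / n, the fraction of samples that pass.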

Solutions for File Localization

  1. User Input: Relies on experienced users to specify files.
  2. Search Tools: Give the agent integrated search capabilities (e.g., SWE-agent).
  3. Repository Mapping: Prebuilt maps (e.g., Aider repomap).
  4. Retrieval-Augmented Generation: Retrieve relevant code and pass it to the LM as context.
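
The retrieval step behind options 2 and 4 can be sketched with a deliberately simple keyword scorer. This is an illustration under stated assumptions, not how SWE-agent or Aider localize files: the helper names `tokenize` and `rank_files` are hypothetical, the repository is passed in as an in-memory mapping from path to content, and relevance is just the number of times the issue's terms occur in each file.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Split text into lowercase identifier-like tokens."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text.lower())


def rank_files(issue_text: str, files: Dict[str, str], top_k: int = 3) -> List[str]:
    """Rank file paths by how often the issue's terms occur in them."""
    query_terms = set(tokenize(issue_text))
    scored = []
    for path, content in files.items():
        counts = Counter(tokenize(content))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((score, path))
    # Highest score first; break ties by path for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored[:top_k]]


if __name__ == "__main__":
    repo = {
        "auth/login.py": "def login(user, password): check_password(password)",
        "ui/button.py": "class Button: pass",
        "README.md": "login docs",
    }
    print(rank_files("login fails when password is empty", repo))
```

In a retrieval-augmented setup, the top-ranked files (or chunks of them) would then be placed in the LM's context; real systems typically replace the keyword count with a stronger ranker.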

Planning and Recovery

  • Hard-coded Processes: Predefined steps for file localization, patch generation, etc.
  • LLM-Generated Plans: Use LMs for planning and execution (e.g., CodeR).
  • Revisiting Errors: Automated fixes based on error messages (e.g., InterCode).
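
The "revisiting errors" idea can be sketched as a run, observe, repair loop. This is a minimal sketch, not the method of CodeR or InterCode: `run_python` and `repair_loop` are hypothetical names, and `propose_fix` is a stand-in callable for whatever LM call would turn the current code plus its error message into a revised version.

```python
import subprocess
import sys
from typing import Callable, Optional, Tuple


def run_python(code: str, timeout_s: float = 10.0) -> Tuple[bool, str]:
    """Run code in a fresh interpreter; return (succeeded, stderr text)."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "timeout"
    return proc.returncode == 0, proc.stderr


def repair_loop(
    code: str,
    propose_fix: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Run the code up to max_attempts times, repairing between failures.

    Returns the first version that exits cleanly, or None if none does.
    """
    for attempt in range(max_attempts):
        ok, error_text = run_python(code)
        if ok:
            return code
        if attempt + 1 < max_attempts:
            # Feed the error message back to the (stand-in) fixer.
            code = propose_fix(code, error_text)
    return None


if __name__ == "__main__":
    def toy_fix(code: str, error_text: str) -> str:
        # Stand-in for an LM: repairs one known typo when it sees a NameError.
        if "NameError" in error_text:
            return code.replace("pritn", "print")
        return code

    print(repair_loop("pritn('ok')", toy_fix))
```

Capping the number of attempts is one simple guard against an agent looping forever on an error it cannot fix.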

Safety Measures

  1. Sandboxing: Run agent actions in an isolated execution environment (e.g., a Docker container).
  2. Credentialing: Grant the agent only the access it needs (principle of least privilege).
  3. Post-hoc Auditing: Security analysis using LMs and other tools.
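
As an illustration of sandboxing combined with least privilege, the sketch below builds a restrictive `docker run` command line without executing it. The function name `sandboxed_docker_argv`, the default image, and the specific resource limits are assumptions made for this example; suitable values depend on the task, and this is not the configuration used by any agent named above.

```python
from pathlib import Path
from typing import List, Sequence


def sandboxed_docker_argv(
    workspace: str,
    command: Sequence[str],
    image: str = "python:3.12-slim",  # assumed image, swap in your own
) -> List[str]:
    """Build a `docker run` argv that limits what the command can do."""
    host_dir = Path(workspace).resolve()
    return [
        "docker", "run", "--rm",
        "--network", "none",                    # no network access
        "--read-only",                          # read-only root filesystem
        "--cap-drop", "ALL",                    # drop Linux capabilities
        "--security-opt", "no-new-privileges",  # block privilege escalation
        "--memory", "512m",                     # illustrative memory cap
        "--cpus", "1",                          # illustrative CPU cap
        "--pids-limit", "128",                  # illustrative process cap
        "--user", "1000:1000",                  # run as a non-root user
        "--volume", f"{host_dir}:/workspace:ro",  # mount code read-only
        "--workdir", "/workspace",
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = sandboxed_docker_argv(".", ["python", "-c", "print('hello')"])
    print(" ".join(argv))
    # To execute it (requires Docker): subprocess.run(argv, check=False)
```

An agent that must write files would need a writable mount or a `--tmpfs` path; the read-only defaults here err on the side of restriction.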

Future Directions

  • Enhance agentic training methods.
  • Expand human-in-the-loop approaches.
  • Address broader software tasks beyond coding.

Resources

  • OpenHands Repository: GitHub