Is it a wrap for Software Engineers? Devin autonomous AI software engineer...

Json

Superstar
Joined
Nov 21, 2017
Messages
12,475
Reputation
1,358
Daps
37,794
Reppin
Central VA
Not sure I understand the joy some of you get at the THOUGHT of jobs being eliminated.
I’m just playing

The situation is funny only in the sense that ironically a lot of times these tech innovations are targeted at unskilled labor.

Productivity monitors, drone delivery, robot kitchens, etc. So the fact that the innovation eliminates many of the people making it is crazy.
 

bnew

Veteran
Joined
Nov 1, 2015
Messages
55,103
Reputation
8,185
Daps
155,725
AI : Mere days after software agent Devin is released, an open-source alternative, SWE-agent, is almost as good.




About​

SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models


swe-agent.com

Website & Demo | Discord | Paper [coming April 10th]

👋 Overview​

SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories.

On the full SWE-bench test set, SWE-agent resolves 12.29% of issues, achieving state-of-the-art performance.




✨ Agent-Computer Interface (ACI)​

We accomplish these results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this an Agent-Computer Interface (ACI) and build the SWE-agent repository to make it easy to iterate on ACI design for repository-level coding agents.


Just as typical language models require good prompt engineering, good ACI design leads to much better results when using agents. As we show in our paper, a baseline agent without a well-tuned ACI does much worse than SWE-agent.


SWE-agent contains features that we discovered to be immensely helpful during the agent-computer interface design process:

  1. We add a linter that runs when an edit command is issued, and do not let the edit command go through if the code isn't syntactically correct.
  2. We supply the agent with a special-built file viewer, instead of having it just cat files. We found that this file viewer works best when displaying just 100 lines in each turn. The file editor that we built has commands for scrolling up and down and for performing a search within the file.
  3. We supply the agent with a special-built full-directory string searching command. We found that it was important for this tool to succinctly list the matches: we simply list each file that had at least one match. Showing the model more context about each match proved to be too confusing for the model.
  4. When commands have an empty output we return a message saying "Your command ran successfully and did not produce any output."
Read our paper for more details.
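The four features listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SWE-agent's actual code: the function names (`view_window`, `guarded_edit`, `search_dir`, `format_output`) are made up for this example, the "linter" is just Python's built-in `ast.parse` syntax check, and the directory search works on an in-memory dict instead of a real file tree.

```python
import ast

WINDOW = 100  # lines shown per turn, the size the README says worked best


def view_window(lines, start, window=WINDOW):
    """Return a numbered view of at most `window` lines from 0-based `start`."""
    start = max(0, min(start, max(len(lines) - 1, 0)))
    end = min(start + window, len(lines))
    body = "\n".join(f"{i + 1}: {lines[i]}" for i in range(start, end))
    return f"[lines {start + 1}-{end} of {len(lines)}]\n{body}"


def guarded_edit(lines, start, end, replacement):
    """Replace lines[start:end]; refuse the edit if the result is not valid Python."""
    candidate = lines[:start] + replacement + lines[end:]
    try:
        ast.parse("\n".join(candidate))  # stand-in for a real linter
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        return lines, f"Edit rejected: {err.msg} (line {err.lineno}). File unchanged."
    return candidate, "Edit applied."


def search_dir(files, needle):
    """List only the names of files containing `needle`, with no per-match context."""
    return "\n".join(sorted(name for name, text in files.items() if needle in text))


def format_output(output):
    """Never hand the model an empty observation."""
    if not output.strip():
        return "Your command ran successfully and did not produce any output."
    return output
```

The common thread is that every command returns a short, explicit message, so the model always gets feedback it can act on, including when an edit is refused.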

@misc{yang2024sweagent,
  title={SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models},
  author={John Yang and Carlos E. Jimenez and Alexander Wettig and Shunyu Yao and Karthik Narasimhan and Ofir Press},
  year={2024},
}







1/4
SWE-Agent is an open-source software engineering agent with a 12.3% resolve rate on SWE-Bench!

Check out SWE-agent in action at swe-agent.com
Repo: GitHub - princeton-nlp/SWE-agent

2/4
The SWE-agent open-source repository provides a framework for turning general LMs into software engineering agents. SWE-agent lets LMs like GPT-4 interact with their own Docker container using an Agent Computer Interface (ACI), allowing them to browse, search, edit, and run code.

3/4
It’s been amazing to work on this with such a great team: @jyangballin*, @_carlosejimenez*, @_awettig, @ShunyuYao12, @karthik_r_n, and @OfirPress

Keep an eye out for the paper coming out April 10th!

4/4
SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source!

We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code
GitHub - princeton-nlp/SWE-agent











1/8
SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source!

We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code

2/8
SWE-agent works by interacting with a specialized terminal, which allows it to:
Open, scroll and search through files
Edit specific lines w/ automatic syntax check
Write and execute tests

This custom-built interface is critical for good performance!


3/8
Simply connecting an LM to a vanilla bash terminal does not work well.

Our key insight is that LMs require carefully designed agent-computer interfaces (similar to how humans like good UI design)

E.g. When the LM messes up indentation, our editor prevents it and gives feedback

4/8
Another example is that we discovered that for viewing files, letting SWE-agent only view 100 lines at a time was better than letting it view 200 or 300 lines and much better than letting it view the entire file.

Good agent-computer design is important even when using GPT-4.

5/8
SWE-agent can be easily configured and extended to improve future research on software engineering agents. Since SWE-agent is open source, anyone can experiment with and contribute new ways for agents to interact with computers.


6/8
Check out some cool demonstrations of SWE-agent fixing real GitHub issues at swe-agent.com!


7/8
SWE-agent is a Princeton NLP collaboration by @jyangballin*, @_carlosejimenez*, @_awettig, @ShunyuYao12, @karthik_r_n, and @OfirPress

We’d love to hear your thoughts, comments and questions, here or on our Discord!


8/8
Preprint coming next week!

 

bnew

Veteran
Joined
Nov 1, 2015
Messages
55,103
Reputation
8,185
Daps
155,725
AI is too expensive for most companies to sustain. It will affect jobs for a while, but once AI companies get leverage and sophistication is needed to operate these systems, the costs will outweigh the benefits. You're seeing the same thing with cloud services.

the code is barely optimized, have you heard of bitnet or 1bit models? these large language models will get much cheaper to run in a few years. the tech has a long way to go but a lot of people are working on making large language models cheaper to run.





 

bnew

Veteran
Joined
Nov 1, 2015
Messages
55,103
Reputation
8,185
Daps
155,725
If you've ever played a game with procedurally generated worlds you'll understand why this stuff isn't a threat...yet.


How does this "AI" fix its own bugs?

How do you explain to the "AI" what you actually want in requirements?
Most devs don't even know how to track that properly, as most clients don't ask for what they need; they ask for what they want.

How does the "AI" receive UAT feedback?

Can it generate a feature that doesn't currently exist on any platform?



I ask "AI" to fix bugs in code it provides me on almost a daily basis. They'd just program in testing and automatically feed the errors back to it.
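A rough sketch of that "run the tests, feed the errors back" loop, in Python. Everything here is hypothetical: `propose_fix` is a stand-in for whatever model call you would actually use, `run_checks` and `repair_loop` are names made up for this example, and the "tests" are just assertions at the bottom of the candidate code. Running untrusted model output with `exec` is only acceptable in a sandbox.

```python
import traceback


def run_checks(code: str) -> str:
    """Execute `code` (including its assertions); return the traceback, or '' on success."""
    try:
        exec(compile(code, "<candidate>", "exec"), {})
    except Exception:
        return traceback.format_exc()
    return ""


def repair_loop(code, propose_fix, max_rounds=3):
    """Feed each error back to `propose_fix` until the checks pass or rounds run out."""
    for _ in range(max_rounds):
        error = run_checks(code)
        if not error:
            return code, True
        code = propose_fix(code, error)  # hypothetical model call: (code, error) -> new code
    return code, not run_checks(code)
```

For example, with a fake "model" that only knows one fix:

```python
buggy = "def add(a, b):\n    return a - b\nassert add(2, 3) == 5\n"
fixed, ok = repair_loop(buggy, lambda code, err: code.replace("a - b", "a + b"))
```

The loop only automates the part where errors get pasted back in; it says nothing about whether the tests capture what the client actually wanted.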
 

Serious

Veteran
Supporter
Joined
Apr 30, 2012
Messages
79,834
Reputation
14,192
Daps
189,870
Reppin
1st Round Playoff Exits
What's funny is that every time this topic comes up, it's full of people who either don't work in the fields mentioned or have no idea what they are talking about :russ: :russ: :russ:
Most companies are barely even using databases correctly, let alone the cloud.

Matter of fact, Excel is still the primary database for a lot of companies.

The higher-ups in a lot of companies actually prefer to manipulate the data and read reports from Excel. :francis:

If you know you know. @breakfuss
 

Serious

Veteran
Supporter
Joined
Apr 30, 2012
Messages
79,834
Reputation
14,192
Daps
189,870
Reppin
1st Round Playoff Exits
the code is barely optimized, have you heard of bitnet or 1bit models? these large language models will get much cheaper to run in a few years. the tech has a long way to go but a lot of people are working on making large language models cheaper to run.







There’s also an energy capacity limit to AI.

 

FishNGrits

Superstar
Joined
Dec 28, 2016
Messages
2,517
Reputation
610
Daps
15,092
They won't let that fly in Europe or anywhere else..reducing 90% of your workforce to replace them with AI will only go in the US, nowhere else.
A dollar no matter the cost :francis:
Nah, if countries that use AI are outcompeting those that don’t, you will just get left behind. Everyone will adopt it eventually.
 