Is it a wrap for Software Engineers? Devin autonomous AI software engineer...

Numpsay · Apr 8, 2024

Not sure I understand the joy some of you get at the THOUGHT of jobs being eliminated.

Json · Apr 8, 2024

Numpsay said:
Not sure I understand the joy some of you get at the THOUGHT of jobs being eliminated.

I’m just playing

The situation is funny only in the sense that ironically a lot of times these tech innovations are targeted at unskilled labor.

Productivity monitors, drone delivery, robot kitchens, etc. So the fact that the innovation eliminates many of the people making it is crazy.

bnew · Apr 8, 2024

AI : Mere days after software agent Devin is released, an open-source alternative, SWE-agent, is almost as good.

GitHub - SWE-agent/SWE-agent: SWE-agent takes a GitHub issue and tries to automatically fix it, using your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]

github.com

About

SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models

SWE-Agent

swe-agent.com

Website & Demo | Discord | Paper [coming April 10th]

Overview

SWE-agent turns LMs (e.g. GPT-4) into software engineering agents that can fix bugs and issues in real GitHub repositories.

On the full SWE-bench test set, SWE-agent resolves 12.29% of issues, achieving state-of-the-art performance on the full test set.

Agent-Computer Interface (ACI)

We accomplish these results by designing simple LM-centric commands and feedback formats to make it easier for the LM to browse the repository, view, edit and execute code files. We call this an Agent-Computer Interface (ACI) and build the SWE-agent repository to make it easy to iterate on ACI design for repository-level coding agents.

Just as typical language models require good prompt engineering, good ACI design leads to much better results when using agents. As we show in our paper, a baseline agent without a well-tuned ACI does much worse than SWE-agent.

SWE-agent contains features that we discovered to be immensely helpful during the agent-computer interface design process:

We add a linter that runs when an edit command is issued, and do not let the edit command go through if the code isn't syntactically correct.
We supply the agent with a special-built file viewer, instead of having it just cat files. We found that this file viewer works best when displaying just 100 lines in each turn. The file editor that we built has commands for scrolling up and down and for performing a search within the file.
We supply the agent with a special-built full-directory string searching command. We found that it was important for this tool to succinctly list the matches: we simply list each file that had at least one match. Showing the model more context about each match proved to be too confusing for the model.
When commands have an empty output we return a message saying "Your command ran successfully and did not produce any output."
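
As a rough illustration of the features listed above, here is a minimal Python sketch. It is not SWE-agent's actual code: the names `guarded_edit`, `search_dir`, and `format_observation` are hypothetical, and Python's built-in `ast.parse` stands in for whatever linter the project really uses.

```python
import ast
from pathlib import Path

EMPTY_OUTPUT_MSG = "Your command ran successfully and did not produce any output."


def guarded_edit(lines, start, end, replacement):
    """Replace lines[start:end] (0-based, end exclusive) only if the result
    still parses as Python. ast.parse is an assumed stand-in for a linter."""
    candidate = lines[:start] + replacement + lines[end:]
    try:
        ast.parse("\n".join(candidate))
    except SyntaxError as err:  # IndentationError is a subclass of SyntaxError
        # Reject the edit: hand back the untouched file plus feedback.
        return lines, f"Edit rejected: {err.msg} (line {err.lineno})"
    return candidate, "Edit applied."


def search_dir(root, needle):
    """List each file under root containing needle, once per file,
    with no surrounding context for the individual matches."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if needle in text:
            hits.append(str(path.relative_to(root)))
    return hits


def format_observation(raw_output):
    """Turn a command's raw output into what the agent sees; an empty
    output becomes an explicit message instead of a blank turn."""
    text = raw_output.strip()
    return text if text else EMPTY_OUTPUT_MSG
```

For example, `guarded_edit(["def f():", "    return 1"], 1, 2, ["return 2"])` leaves the file unchanged and returns a rejection message, because the replacement drops the indentation.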

Read our paper for more details.

@misc{yang2024sweagent,
  title={SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models},
  author={John Yang and Carlos E. Jimenez and Alexander Wettig and Shunyu Yao and Karthik Narasimhan and Ofir Press},
  year={2024},
}

1/4
SWE-Agent is an open-source software engineering agent with a 12.3% resolve rate on SWE-Bench!

Check out SWE-agent in action at SWE-Agent
Repo: GitHub - princeton-nlp/SWE-agent: SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models

2/4
The SWE-agent open-source repository provides a framework for turning general LMs into software engineering agents. SWE-agent lets LMs like GPT-4 interact with their own Docker container using an Agent Computer Interface (ACI) - allowing it to browse, search, edit, and run code.

3/4
It’s been amazing to work on this with such a great team: @jyangballin*, @_carlosejimenez*, @_awettig, @ShunyuYao12, @karthik_r_n, and @OfirPress

Keep an eye out for the paper coming out April 10th!

4/4
SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source!

We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code
GitHub - princeton-nlp/SWE-agent: SWE-agent: Agent Computer Interfaces Enable Software Engineering Language Models

https://twitter.com/jyangballin/status/1775114444370051582…

1/8
SWE-agent is our new system for autonomously solving issues in GitHub repos. It gets similar accuracy to Devin on SWE-bench, takes 93 seconds on avg + it's open source!

We designed a new agent-computer interface to make it easy for GPT-4 to edit+run code

2/8
SWE-agent works by interacting with a specialized terminal, which allows it to:
Open, scroll and search through files
Edit specific lines w/ automatic syntax check
Write and execute tests

This custom-built interface is critical for good performance!

3/8
Simply connecting an LM to a vanilla bash terminal does not work well.

Our key insight is that LMs require carefully designed agent-computer interfaces (similar to how humans like good UI design)

E.g. When the LM messes up indentation, our editor prevents it and gives feedback

4/8
Another example is that we discovered that for viewing files, letting SWE-agent only view 100 lines at a time was better than letting it view 200 or 300 lines and much better than letting it view the entire file.

Good agent-computer design is important even when using GPT-4.
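
To make the 100-line window idea concrete, here is a small Python sketch of a windowed file viewer with scrolling and in-file search. It is only an illustration under my own assumptions: the class name `FileWindow`, its methods, and the header format are hypothetical and are not taken from the SWE-agent repo.

```python
from dataclasses import dataclass


@dataclass
class FileWindow:
    """Shows a file one fixed-size window at a time (hypothetical sketch)."""
    lines: list
    window: int = 100  # the tweet reports 100 lines worked best
    top: int = 0       # 0-based index of the first visible line

    def _clamp(self):
        # Keep the window inside the file.
        max_top = max(0, len(self.lines) - self.window)
        self.top = min(max(self.top, 0), max_top)

    def view(self):
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, *body])

    def scroll_down(self):
        self.top += self.window
        self._clamp()
        return self.view()

    def scroll_up(self):
        self.top -= self.window
        self._clamp()
        return self.view()

    def search(self, needle):
        # 1-based line numbers of every line containing needle.
        return [i + 1 for i, line in enumerate(self.lines) if needle in line]
```

With a 250-line file, `view()` returns a header plus lines 1-100, and two calls to `scroll_down()` land on lines 151-250, since the window is clamped to the end of the file.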

5/8
SWE-agent can be easily configured and extended to improve future research on software engineering agents. Since SWE-agent is open source, anyone can experiment with and contribute new ways for agents to interact with computers.

6/8
Check out some cool demonstrations of SWE-agent fixing real GitHub issues at SWE-Agent!

7/8
SWE-agent is a Princeton NLP collaboration by @jyangballin*, @_carlosejimenez*, @_awettig, @ShunyuYao12, @karthik_r_n, and @OfirPress

We’d love to hear your thoughts, comments and questions! Here or on our Discord: Join the SWE-agent Discord Server!

8/8
Preprint coming next week!

discord.com
Join the SWE-agent Discord Server!
Check out the SWE-agent community on Discord - hang out with 4 other members and enjoy free voice and text chat.

xXMASHERXx · Apr 8, 2024

What's funny is that every time this topic comes up, it's full of people who either don't work in the fields mentioned or have no idea what they are talking about :russ:

BaggerofTea · Apr 8, 2024

these tools will be amazing for startups and organizations with small overhead/infrastructure

Paper Boi · Apr 8, 2024

nerds cannibalizing their own :mjlol:

bnew · Apr 8, 2024

AyBrehHam Linkin said:
AI is too expensive for most companies to sustain, it will affect jobs for awhile but once AI companies get leverage and sophistication is needed to operate these systems the costs will outweigh the benefits. You're seeing the same thing with Cloud services.

the code is barely optimized, have you heard of bitnet or 1bit models? these large language models will get much cheaper to run in a few years. the tech has a long way to go but a lot of people are working on making large language models cheaper to run.

Roger king · Apr 8, 2024

This is pure cap. There will always be a need for hands-on work and human experience, especially for engineering

bnew · Apr 8, 2024

Treblemaka said:
If you've ever played a game with procedurally generated worlds you'll understand why this stuff isn't a threat...yet.

How does this "AI" fix its own bugs?

How do you explain to the "AI" what you actually want in requirements?
Most devs don't even know how to track that properly as most clients don't ask for what they need they ask for what they want.

How does the "AI" receive UAT feedback?

Can it generate a feature that doesn't currently exist on any platform?

I ask "AI" to fix bugs in code it provides me on almost a daily basis. They'd just program testing and automatically feed errors back to it.

The Intergalactic Koala · Apr 8, 2024

DJ Paul's Arm said:
Welp…

Thank you for this reply as I must have been the only smooth brain to think the same thing :dead:

6 cert big dikk IT breh: "NooOooO"

Devin the AI- "OooOooO yeahhhhHhH"

:francis:

i feel for the IT brehs as I wanted to be in the industry but God was like :hubie:

"nah chief, you got to sit this one out"

Serious · Apr 8, 2024

xXMASHERXx said:
What's funny is that every time this topic comes up, it's full of people who either don't work in the fields mentioned or have no idea what they are talking about

Most companies are barely even using databases correctly, let alone the cloud.

Matter of fact, Excel is still the primary database for a lot of companies.

The higher-ups in a lot of companies actually prefer to manipulate the data and read reports from Excel. :francis:

If you know you know. @breakfuss

JT-Money · Apr 8, 2024

Tech companies are notorious pump fakers. They literally lie about everything.
:mjlol:

Serious · Apr 8, 2024

bnew said:
the code is barely optimized, have you heard of bitnet or 1bit models? these large language models will get much cheaper to run in a few years. the tech has a long way to go but a lot of people are working on making large language models cheaper to run.

There’s also an energy capacity limit to AI.

As Use of A.I. Soars, So Does the Energy and Water It Requires

Generative artificial intelligence uses massive amounts of energy for computation and data storage and millions of gallons of water to cool the equipment at data centers. Now, legislators and regulators — in the U.S. and the EU — are starting to demand accountability.

e360.yale.edu

bnew · Apr 8, 2024

Black Hans said:
More like recursive for Software Engineers. Someone has to develop the AI software that will be developing software

AI will do that since it's doing some of that already today. a lot of models are being trained on synthetic data(data generated by AI).

FishNGrits · Apr 8, 2024

3rdWorld said:
They won't let that fly in Europe or anywhere else..reducing 90% of your workforce to replace them with AI will only go in the US, nowhere else.
A dollar no matter the cost

Nah, if countries that use AI are outcompeting those that don’t, you will just get left behind. Everyone will adopt it eventually
