Software engineering is on the brink of a revolution with the emergence of large language models (LLMs). LLMs are AI systems that have been trained on large amounts of data, allowing them to generate natural language text and source code.
LLMs allow developers to specify intent through prompts instead of writing complex code to have a task executed. They can also take on the work of writing and debugging code, freeing developers to focus on higher-level tasks.
We can draw a parallel between this shift in software engineering and the “bitter lesson” in AI research described by Richard Sutton: simpler, general approaches that scale with more compute eventually outperform more complex, hand-engineered ones.
In this blog post, we will explore how LLMs will change the way we approach software engineering. We will discuss the potential implications of this shift, and the opportunities it presents for software engineers.
Additionally, I have released a proof-of-concept Python package called llm-strategy, based on langchain, which allows developers to use LLMs to implement functions and interfaces in a more visible and direct way. This package includes a decorator that connects to an LLM (such as OpenAI’s GPT-3) and uses the LLM to implement abstract methods in interface classes. It does this by forwarding requests to the LLM and converting the responses back to Python data using Python’s @dataclasses.
What has happened?
Large language models (LLMs) have recently made significant progress in their ability to understand and follow human instructions.
As a result, software engineering is facing a potential revolution with the emergence of LLMs. These models have the potential to change the way we approach software engineering in two main ways:
- LLMs allow developers to specify their intent using prompts, rather than writing complex code.
- LLMs can take on the task of writing and debugging code, enabling developers to focus on higher-level tasks.
The llm-strategy package can be useful for prototyping applications without writing a lot of backend code while still having the app react in meaningful ways. It uses the doc strings, type annotations, and method/function names as prompts for the LLM, and can automatically convert the results back into Python types (currently only supporting @dataclasses). It can also extract a data schema to send to the LLM for interpretation. While the llm-strategy package still relies on too much Python code for serialization, there is the potential to reduce the need for this code in the future by using additional, cheaper LLM calls to automate the parsing of structured data.
@llm_strategy(OpenAI)
def query_database(database: Database, query: str) -> Table:
    """Query the database using a natural language query `query` and return
    the resulting table.

    Example
    =======

    >>> query_database(database, "SELECT * FROM EMPLOYEES")
    Table(columns=("employee_id", "name", "address", ...),
          data=[["1123123", "John Miller", ...],[...]])

    Arguments
    =========
    ...
    """
    raise NotImplementedError()
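To make the mechanism more concrete, here is a minimal sketch of how such a decorator could work. It is not the actual llm-strategy implementation: the names llm_implemented, fake_llm, Capital, and capital_of are hypothetical, the prompt wording is made up, and the LLM is replaced by a stand-in function so that the sketch runs offline. It assumes the model replies with a JSON object whose keys match the fields of the return dataclass.

```python
import functools
import inspect
import json
from dataclasses import dataclass
from typing import Callable, get_type_hints


def llm_implemented(llm: Callable[[str], str]):
    """Hypothetical decorator: answer calls to a stub function via `llm`."""

    def decorator(func):
        signature = inspect.signature(func)
        return_type = get_type_hints(func)["return"]

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            # Build the prompt from the name, signature, doc string, and arguments.
            prompt = (
                f"Function: {func.__name__}{signature}\n"
                f"Description: {inspect.getdoc(func)}\n"
                f"Arguments: {json.dumps(dict(bound.arguments))}\n"
                "Reply with a JSON object matching the return type."
            )
            # Convert the JSON reply back into the annotated dataclass.
            return return_type(**json.loads(llm(prompt)))

        return wrapper

    return decorator


@dataclass
class Capital:
    country: str
    city: str


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model; always returns the same canned reply.
    return '{"country": "France", "city": "Paris"}'


@llm_implemented(fake_llm)
def capital_of(country: str) -> Capital:
    """Return the capital city of `country`."""
    raise NotImplementedError()


result = capital_of("France")
```

The body of capital_of never runs: the wrapper intercepts the call, and everything the model learns about the task comes from the function name, the type annotations, and the doc string. A real version would also need to handle replies that are not valid JSON.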
Example: Mock Customer Database
As an example, I used the llm-strategy package to create a mock customer database viewer with textual for the console UI. It generates mock customer data using GPT-3 and implements basic lookup functionality using GPT-3 as well. This was achieved by defining the relevant interfaces and dataclasses and providing the necessary doc strings. The rest was handled by calling into LLMs using the llm-strategy package.
Here are some screenshots of the mock customer database viewer in action:
Here is an example of the Python code used to create the mock customer database viewer:
from dataclasses import dataclass

from langchain.llms import OpenAI
from llm_strategy import llm_strategy


@llm_strategy(OpenAI(max_tokens=256))
@dataclass
class Customer:
    key: str
    first_name: str
    last_name: str
    birthdate: str
    address: str

    @property
    def age(self) -> int:
        """Return the current age of the customer.

        This is a computed property based on `birthdate` and the current year (2022).
        """
        raise NotImplementedError()


@dataclass
class CustomerDatabase:
    customers: list[Customer]

    def find_customer_key(self, query: str) -> list[str]:
        """Find the keys of the customers that match a natural language query best (sorted by closeness to the match).

        We support semantic queries instead of SQL, so we can search for things like
        "the customer that was born in 1990".

        Args:
            query: Natural language query

        Returns:
            The keys of the best matching customers in the database.
        """
        raise NotImplementedError()

    def load(self):
        """Load the customer database from a file."""
        raise NotImplementedError()

    def store(self):
        """Store the customer database to a file."""
        raise NotImplementedError()


@llm_strategy(OpenAI(max_tokens=1024))
@dataclass
class MockCustomerDatabase(CustomerDatabase):
    def load(self):
        # Ask the LLM for mock customers instead of reading a file.
        self.customers = self.create_mock_customers(10)

    def store(self):
        # Mock data is not persisted.
        pass

    @staticmethod
    def create_mock_customers(num_customers: int = 1) -> list[Customer]:
        """
        Create mock customers with believable data (our customers are world citizens).
        """
        raise NotImplementedError()
The full example is here.
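To illustrate the conversion between dataclasses and LLM text, here is a small sketch of the two directions: describing the fields of Customer as a schema that could be included in a prompt, and decoding a JSON reply into a list[Customer]. The helper names describe_schema and parse_customers are hypothetical and not part of llm-strategy, the JSON reply format is an assumption, and the sample reply is made-up stand-in data rather than real model output.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Customer:
    key: str
    first_name: str
    last_name: str
    birthdate: str
    address: str


def describe_schema(cls) -> dict[str, str]:
    """Map each dataclass field name to the name of its annotated type."""
    return {f.name: getattr(f.type, "__name__", str(f.type)) for f in fields(cls)}


def parse_customers(reply: str) -> list[Customer]:
    """Decode a JSON array of objects into Customer instances."""
    return [Customer(**item) for item in json.loads(reply)]


# Made-up reply standing in for what an LLM might return.
sample_reply = json.dumps(
    [
        {
            "key": "c-001",
            "first_name": "Ada",
            "last_name": "Example",
            "birthdate": "1990-06-15",
            "address": "1 Sample Street",
        }
    ]
)

schema = describe_schema(Customer)
customers = parse_customers(sample_reply)
```

Because Customer(**item) raises a TypeError when a key is missing or unexpected, a malformed reply fails loudly instead of silently producing incomplete records.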
Richard Sutton’s Bitter Lesson
The emergence of LLMs in software engineering can be seen as a manifestation of Richard Sutton’s “bitter lesson” in AI research, which states that simpler, general approaches that scale well with more compute will eventually outperform more complex approaches. In the context of software engineering, LLMs offer a potentially simpler and more scalable approach to implementing complex tasks, such as writing and debugging code.
As LLMs continue to improve and become more widely adopted, it is likely that they will eventually surpass more traditional, complex approaches to software development in terms of efficiency and effectiveness. This shift towards simpler, more scalable approaches is similar to the trend that Sutton observed in AI research, and highlights the importance of staying attuned to advancements in technology and continuously seeking out more efficient ways of solving problems.
LLMs offer a cost-effective and efficient way to realize intent in software engineering. For example, the llm-strategy package in Python allows developers to quickly prototype and experiment, without having to write complex code themselves. LLMs can also generate code quickly, enabling software engineers to focus their time and resources on other aspects of the software development process, such as debugging and fixing code that fails.
This shift towards using LLMs to encapsulate complexity and execute intent in software engineering is reminiscent of the deep learning revolution, which has unlocked new value and opportunities.
However, in the near term, developers still need to weigh the trade-offs between using an LLM and writing code by hand. For example, where reproducibility or performance matters, developers still need to write the code themselves. But in other (simpler) cases, using an LLM to execute intent and to debug and fix code will be the more efficient and effective approach.
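As a small illustration of the reproducibility side of this trade-off, here is what a hand-written, deterministic age computation for a customer record could look like. This is a sketch under assumptions: the method name age_on is hypothetical, the dataclass is trimmed to two fields, and birthdate is assumed to be an ISO-formatted string (YYYY-MM-DD), which an LLM-generated record would not necessarily guarantee.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Customer:
    first_name: str
    birthdate: str  # assumed to be ISO formatted: YYYY-MM-DD

    def age_on(self, today: Optional[date] = None) -> int:
        """Return the age in full years on `today` (defaults to the current date)."""
        today = today or date.today()
        born = date.fromisoformat(self.birthdate)
        # Subtract one year if this year's birthday has not happened yet.
        before_birthday = (today.month, today.day) < (born.month, born.day)
        return today.year - born.year - int(before_birthday)


customer = Customer(first_name="Ada", birthdate="1990-06-15")
day_before = customer.age_on(date(2022, 6, 14))  # → 31
on_birthday = customer.age_on(date(2022, 6, 15))  # → 32
```

The same inputs always give the same answer here, with no model call and no token cost, which is exactly what you want for logic that is this easy to state precisely.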
Language models are changing the way we write software, allowing us to focus more on intent and less on implementation. By adapting to these changes and leveraging the power of LLMs, we can avoid the “bitter lesson” of being left behind as simpler approaches outperform more complex ones.