LangChain vs. Substrate: Cognitive overhead

At Substrate, principled API design runs deep in our DNA: Ben was at Stripe for nearly a decade, crafting novel APIs like Stripe Terminal. We believe AI engineers deserve better abstractions.

We'll compare LangChain to Substrate by implementing a simple workflow. The Substrate implementation needs about a third as much code.

Substrate: 12 lines

from substrate import ComputeText, Substrate, sb
 
s = Substrate(api_key="SUBSTRATE_API_KEY")
 
topic1 = "a magical forest"
topic2 = "a futuristic city"
story1 = ComputeText(prompt=f"Tell me a story about {topic1}")
story2 = ComputeText(prompt=f"Tell me a story about {topic2}")
summary = ComputeText(prompt=sb.format(
    "Summarize these two stories:\nStory 1: {story1}\nStory 2: {story2}",
    story1=story1.future.text,
    story2=story2.future.text)
)
response = s.run(summary)
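
To print the final summary, you can look up the summary node's output on the response. This is a minimal follow-up sketch; it assumes this SDK version exposes a response.get(node) accessor and a .text field on ComputeText outputs (check the SDK reference if yours differs):

# Assumes response.get(node) and .text are available in this SDK version
print(response.get(summary).text)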

LangChain: 36 lines

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel
from langchain_openai import ChatOpenAI
 
model = ChatOpenAI(
    base_url="https://api.together.xyz/v1",
    api_key="TOGETHER_API_KEY",
    model="mistralai/Mistral-7B-Instruct-v0.3",
)
 
story_prompt1 = ChatPromptTemplate.from_template(
    "Tell me a story about {topic1}")
story_prompt2 = ChatPromptTemplate.from_template(
    "Tell me a story about {topic2}")
summarize_prompt = ChatPromptTemplate.from_template(
    "Summarize these two stories:\nStory 1: {story1}\nStory 2: {story2}"
)
 
output_parser = StrOutputParser()
 
story_chain1 = story_prompt1 | model | output_parser
story_chain2 = story_prompt2 | model | output_parser
 
parallel_chain = RunnableParallel(
    story1={"topic1": lambda x: x["topic1"]} | story_chain1,
    story2={"topic2": lambda x: x["topic2"]} | story_chain2,
)
summarize_chain = summarize_prompt | model | output_parser
 
topics = {"topic1": "a magical forest", "topic2": "a futuristic city"}
 
parallel_result = parallel_chain.invoke(topics)
summary = summarize_chain.invoke({
    "story1": parallel_result['story1'],
    "story2": parallel_result['story2']
})

Chaining LLM calls with LangChain is surprisingly complex. This simple task requires 3x more code than the Substrate implementation, along with several extra abstractions:

  • ChatPromptTemplate – with Substrate, you can simply use format strings, or even Jinja templates.
  • StrOutputParser – Substrate uses idiomatic tools (like format strings) to transform outputs, and follows the Unix philosophy: "Don't clutter output with extraneous information."
  • RunnableParallel – Substrate automatically parallelizes your workflow. Because story1 and story2 don't depend on anything, Substrate runs them in parallel (see the sketch just after this list).
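
To make the contrast concrete, here is a sketch of how the Substrate example above grows when you add a third, independent story. The third topic is invented for illustration; no new abstraction is needed, just another node:

# Sketch extending the example above; the third topic is invented for illustration
topic3 = "an underwater kingdom"
story3 = ComputeText(prompt=f"Tell me a story about {topic3}")
summary = ComputeText(prompt=sb.format(
    "Summarize these three stories:\nStory 1: {story1}\nStory 2: {story2}\nStory 3: {story3}",
    story1=story1.future.text,
    story2=story2.future.text,
    story3=story3.future.text)
)
# story1, story2, and story3 have no upstream dependencies, so Substrate runs them in parallel
response = s.run(summary)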

Substrate's unique approach lets you work at the conceptual layer, and trust Substrate to automatically optimize your workload when you call substrate.run:

  • First, we analyze your workload as a directed acyclic graph and rewrite it: for example, merging nodes that can be run in a batch.
  • Then, we schedule the graph with maximum parallelism. No more async programming: just connect nodes, and we handle the parallelism for you (a toy sketch of this kind of scheduling follows this list).
  • Because Substrate runs your entire workload in the same cluster (often running all the inference on the same machine), you don't waste time on unnecessary data roundtrips and cross-region HTTP transport.
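
As a rough illustration of what "schedule the graph with maximum parallelism" means, here is a toy level-by-level DAG runner. This is only a sketch of the general idea, not Substrate's actual scheduler:

# Toy illustration only: a level-by-level DAG runner, not Substrate's internals
from concurrent.futures import ThreadPoolExecutor

def run_dag(nodes, deps, execute):
    # nodes: node ids; deps: node -> set of upstream nodes; execute: node -> result
    done, results = set(), {}
    with ThreadPoolExecutor() as pool:
        while len(done) < len(nodes):
            # every node whose upstream dependencies have all finished is ready
            ready = [n for n in nodes if n not in done and deps.get(n, set()) <= done]
            futures = {n: pool.submit(execute, n) for n in ready}
            for n, f in futures.items():
                results[n] = f.result()
            done.update(ready)
    return results

# story1 and story2 have no dependencies, so they run in the same parallel level;
# summary only runs once both have finished.
deps = {"summary": {"story1", "story2"}}
print(run_dag(["story1", "story2", "summary"], deps, lambda n: f"ran {n}"))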

The beauty of Substrate is the lack of abstractions. Just computation, and ways to guide data.

You can try running this example yourself by forking it on Replit.
