Regular updates on the latest VC-backed AI startups. Follow along to stay informed!
Chroma, a San Francisco-based AI-native open-source embedding database, raised $18M in Seed funding. Quiet Capital led the round, which was joined by angels including founders and leaders from Notion, MongoDB, Scale, Hugging Face, and more.
Problem to be Solved
Out-of-the-box large language models (LLMs) can’t be used for searches that require accurate knowledge of specific data and facts, because they store no long-term memory after training (i.e. they don’t learn from experience). The Chroma founders tried to give LLMs pluggable knowledge about their data, facts, and tools to prevent hallucinations via vector databases, but existing products were built for web-scale semantic search rather than AI workloads.
How They Use AI
Embedding models work by mapping text to fixed-length vectors, essentially lists of hundreds or thousands of numbers. Depending on the task (e.g. question answering, similarity search), the similarity of text A’s vector to text B’s vector tells us something about their semantic relationship: “What time are you open?” and “What are your hours?” will have similar embedding vectors. Chroma provides an efficient off-the-shelf database for storing and querying embedding vectors. While Chroma does not make embedding models itself, it plugs right into OpenAI, LangChain, Cohere, etc.
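To make the idea concrete, here is a minimal sketch of vector similarity in plain Python. The 4-dimensional vectors below are made-up toy numbers standing in for real embeddings (which a model like OpenAI’s would produce, with far more dimensions); the point is just that semantically related texts end up with similar vectors, which is what a database like Chroma indexes and searches.

```python
from math import sqrt

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: near 1.0 for vectors pointing the same way, near 0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (illustrative values, not real model output).
open_time = [0.9, 0.1, 0.4, 0.2]   # "What time are you open?"
hours     = [0.8, 0.2, 0.5, 0.1]   # "What are your hours?"
weather   = [0.1, 0.9, 0.0, 0.7]   # "Will it rain tomorrow?"

print(cosine_similarity(open_time, hours))    # high: semantically close
print(cosine_similarity(open_time, weather))  # low: unrelated
```

A vector database answers “which stored vectors are most similar to this query vector?” at scale, using index structures rather than the brute-force comparison shown here.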
Business Model
Chroma is building free, open-source software. This is speeding up adoption: the company reported crossing 35k Python downloads in its first 5 weeks. Chroma will likely monetize with an open-source business model like those popularized by Red Hat, MongoDB, HashiCorp, and others.
LangChain, a San Francisco-based framework for developing LLM-powered applications, raised $10M in Seed funding. Benchmark was the lead investor in the deal.
Problem to be Solved
LLMs used in isolation are often not enough to create a truly powerful app. LangChain believes these apps can be more powerful and differentiated if they are combined with other things (like a private database or an API). The company aims to help with that by creating a collection of pieces developers might want to combine, a flexible interface for combining them into a “chain”, and a schema for easily saving and sharing those chains.
How They Use AI
LangChain does not train or host any of its own AI models; instead, it focuses on chaining various models, databases, and APIs together to make models more useful. It has abstracted the models from OpenAI, AI21, Cohere, and more so you can plug and play. Additionally, it ships built-in workflows and prompts for common tasks, such as knowledge-augmented queries (e.g. using Chroma to retrieve information related to a customer query) and LLM agents (think Fixie last week).
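The core idea of a “chain” can be sketched without LangChain itself: it is essentially function composition, where each step (retriever, prompt template, model call) passes its output to the next. This is a conceptual sketch, not LangChain’s actual API; the step names, the toy document store, and the stubbed model call are all hypothetical.

```python
from typing import Callable

# A "chain" is just composed steps: each step reads and updates a shared state dict.
Step = Callable[[dict], dict]

def chain(*steps: Step) -> Step:
    def run(state: dict) -> dict:
        for step in steps:
            state = step(state)
        return state
    return run

def retrieve(state: dict) -> dict:
    # Stand-in for querying a vector store such as Chroma.
    docs = {"hours": "We are open 9am-5pm, Monday to Friday."}
    state["context"] = docs["hours"]
    return state

def build_prompt(state: dict) -> dict:
    state["prompt"] = f"Context: {state['context']}\nQuestion: {state['question']}"
    return state

def call_llm(state: dict) -> dict:
    # Stub: a real chain would send the prompt to OpenAI, Cohere, AI21, etc.
    state["answer"] = f"(model answer grounded in: {state['context']})"
    return state

qa_chain = chain(retrieve, build_prompt, call_llm)
result = qa_chain({"question": "What time are you open?"})
print(result["answer"])
```

The value LangChain adds over this toy version is a large catalog of ready-made steps (model wrappers, retrievers, prompt templates) plus a schema for saving and sharing the resulting chains.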
Business Model
TBD. Today LangChain is an open-source library on GitHub. Founder Harrison Chase must have shared a plan worth $10M with Benchmark, but it hasn’t been announced publicly.
Fourthline, an Amsterdam-based compliance engine for the finance sector, raised $54M in Series B funding. Finch Capital led the round.
Problem to be Solved
Fintech companies and banks struggle to verify user identities to comply with know-your-customer (KYC) and anti-money-laundering rules. Malicious hackers and fraudsters pose a significant risk and make compliance even more difficult.
How They Use AI
Fourthline builds its own AI models to automate and enhance customer data collection to meet regulatory standards. As far as we can tell, this is mostly based on computer vision models for document parsing and verification. Additionally, Fourthline uses AI models to detect and prevent fraud. These are probably not deep learning approaches, but rather more traditional ML models for anomaly detection.
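To illustrate what “traditional ML for anomaly detection” can look like in its simplest form, here is a z-score sketch that flags transactions far from the mean. This is purely illustrative, assumed by us rather than anything Fourthline has described; the transaction amounts are made up, and production fraud systems use far richer features and models.

```python
from statistics import mean, stdev

def zscore_anomalies(values: list[float], threshold: float = 2.5) -> list[int]:
    """Return indices of values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), stdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical transaction amounts: one is wildly out of line with the rest.
amounts = [42.0, 35.5, 50.2, 38.9, 44.1, 41.7, 9800.0, 47.3, 36.4, 43.8]
print(zscore_anomalies(amounts))  # flags the 9800.0 transaction
```

Real systems typically score many signals at once (device, geography, velocity, document consistency) and feed the flagged cases to human reviewers rather than blocking automatically.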
Business Model
SaaS. Fourthline makes money by charging fees for its services to banks and fintech companies. The company reported it has grown more than 80% yearly since its 2018 launch.