Businesses today are at a crossroads where they must make a pivotal decision that will shape their future: build a custom generative AI solution or adopt an existing one.
This decision matters across sectors. Tech giants like Meta and Google invest heavily in internal and external AI research to develop innovative solutions and build foundation models, but for the majority of enterprises the realistic options are building (fine-tuning) or buying (prompt tuning).
Each option deserves serious consideration, because the cost, data requirements and technical challenges differ enough to have a significant impact on your business. Businesses must evaluate whether the use case is so niche that a general model is inappropriate, and whether the convenience of buying a solution outweighs the security and precision of a custom build.
What’s the main deciding factor in building vs. buying AI?
The main deciding factor is the alignment of the AI solution with the enterprise’s core business. A media company needing personalized content may prefer a tailor-made solution, whereas a retail business facing tasks that require less customization might want an off-the-shelf product.
How Do You Build an AI Model?
Way back in 2022, building a model meant assembling a large dataset tailored to your business needs, designing a large model and tuning its hyperparameters, and training for extended periods on a large pool of GPUs.
This is a costly process in storage, computation and time. BLOOM, for example, was trained on 350 billion tokens for three and a half months on 384 GPUs. Today, advances in training techniques allow for state-of-the-art, parameter-efficient fine-tuning of an existing model.
These techniques combine reducing the parameter space, tuning only the embedding layers to add task-specific inputs and saving just the small delta of weights from the original large model. The result is a smaller dataset, quicker training and reduced memory and processing requirements, which puts fine-tuning within reach even for personal use. Still, assembling a training dataset for fine-tuning is a complex and resource-intensive endeavor.
For instance, Databricks gamified the data creation process for Dolly 2.0 and crowd-sourced it to 5,000 employees. So, if it’s possible to fit the use case to an existing model, starting with zero-shot learning and prompt tuning is an advisable first step.
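To make the "small delta of weights" idea concrete, here is a minimal sketch that counts how many values must be trained and stored when a single weight matrix is adapted through a low-rank update rather than retrained in full. The low-rank form is an assumption chosen for illustration (the article does not name a specific technique), and the function names and layer dimensions are hypothetical.

```python
# Sketch: size of a full weight matrix vs. a low-rank delta for the same layer.
# Assumption: the delta is stored as two thin matrices, A (d_in x rank) and
# B (rank x d_out), so the adapted layer uses W + A @ B while W stays frozen.

def full_matrix_params(d_in: int, d_out: int) -> int:
    """Number of values in the original (frozen) weight matrix W."""
    return d_in * d_out


def low_rank_delta_params(d_in: int, d_out: int, rank: int) -> int:
    """Number of values in the trainable delta, stored as A and B."""
    return rank * (d_in + d_out)


def delta_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Share of the full matrix size taken up by the delta."""
    return low_rank_delta_params(d_in, d_out, rank) / full_matrix_params(d_in, d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in, d_out, rank = 4096, 4096, 8
    print("full matrix values:", full_matrix_params(d_in, d_out))
    print("delta values:      ", low_rank_delta_params(d_in, d_out, rank))
    print(f"delta share:        {delta_fraction(d_in, d_out, rank):.2%}")
```

With these hypothetical dimensions, the delta holds 65,536 values against 16,777,216 in the full matrix, or about 0.4 percent. Real savings depend on the model architecture and on which layers are adapted.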
Assess Your Core Business Needs and Expertise
The primary determinant in the build versus buy dilemma is the alignment of the AI solution with the enterprise’s core business. A media company needing personalized content may prefer a tailor-made solution for nuanced control. In contrast, a retail business might opt for an off-the-shelf product to generate product descriptions, as this task requires less customization.
The critical question is whether the generative AI output is part of the core product offering or just an ancillary business process. When the output requires high specificity, the decision leans towards building. Either way, people will be required to judge the quality of the output, often with frameworks such as human-in-the-loop, reinforcement learning from human feedback, or Direct Preference Optimization.
In addition to human testers, the availability of in-house technical expertise influences the decision process. Building a solution demands a team skilled in AI development, including data science, data engineering, MLOps and domain expertise, although this is becoming less technically demanding every day. Buying a solution can largely be done with domain experts. Custom solutions offer the flexibility to meet evolving business needs, whereas commercial solutions often lack such adaptability.
What Are the Cost Implications?
Cost is a pivotal factor in this analysis. Developing generative AI in-house is resource-intensive throughout, requiring spending on talent, compute, data management and ongoing maintenance.
Buying a vendor solution can be initially cost-effective with predictable operating expenses. Determining and correctly labeling the data for fine-tuning is human-intensive, requiring people to oversee data creation as well as judge outputs, whereas prompt tuning relies on people only for output judgment.
Time to market is another critical consideration: building a solution can take weeks or months, versus days to weeks to deploy an off-the-shelf option. Microsoft made the ultimate buy decision by investing in OpenAI. It was expensive in dollars, but it was extremely fast, with ChatGPT integrated into Bing within days and Copilot infused throughout the product line within months.
A final cost consideration is the operating expense of the model. Here, adoption and throughput of the model are weighed against compute and storage. Buy solutions, or Model as a Service, often charge per inference; however, assessing the long-term total cost of ownership is essential, including fees for customization, upgrades and support.
The build solution costs include hosting a model, compute and storage. If the enterprise requires multiple models, a build solution could be cheaper than a buy, as it’s possible to only store the delta weights per model while sharing infrastructure.
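The operating-cost trade-off described above can be sketched as simple arithmetic. The snippet below is a rough illustration, not a pricing model: every figure and function name is hypothetical, and it leaves out fees for customization, upgrades and support as well as staffing costs. It estimates the monthly request volume at which a fixed-cost hosted build becomes cheaper than paying per inference, and the storage needed when several fine-tuned models share one base model and keep only their delta weights.

```python
from typing import Optional

# Sketch: buy (pay per inference) vs. build (fixed hosting plus compute).
# All prices and sizes used with these functions are hypothetical inputs.

def buy_monthly_cost(requests: int, fee_per_request: float) -> float:
    """Model as a Service: cost scales with the number of inferences."""
    return requests * fee_per_request


def build_monthly_cost(requests: int, fixed_hosting: float,
                       compute_per_request: float) -> float:
    """Self-hosted model: fixed hosting cost plus per-request compute."""
    return fixed_hosting + requests * compute_per_request


def break_even_requests(fixed_hosting: float, fee_per_request: float,
                        compute_per_request: float) -> Optional[float]:
    """Monthly volume above which building is cheaper, or None if never."""
    saving_per_request = fee_per_request - compute_per_request
    if saving_per_request <= 0:
        return None  # per-request compute is not cheaper than the vendor fee
    return fixed_hosting / saving_per_request


def shared_base_storage_gb(num_models: int, base_gb: float,
                           delta_gb: float) -> float:
    """Storage when one base model is shared and each variant stores a delta."""
    return base_gb + num_models * delta_gb


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    volume = break_even_requests(fixed_hosting=5000.0,
                                 fee_per_request=0.002,
                                 compute_per_request=0.0005)
    print("break-even requests per month:", volume)
    print("shared storage (GB):", shared_base_storage_gb(10, 30.0, 0.1))
    print("full copies (GB):   ", 10 * 30.0)
```

Below the break-even volume, per-inference pricing is the cheaper option in this simplified model. Above it, the fixed hosting cost is spread across enough requests for the build to cost less.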
Which Option Is More Secure?
Data privacy and security are paramount for sensitive industries like healthcare and finance. In-house development can enable tighter control over data and model governance, while third-party solutions may have ambiguous security policies and often rely on public cloud infrastructure.
Fine-tuning a commercial model requires sending sensitive data to the model host, so licensing and fair-notice policies come into play. Model as a Service providers are responding to concerns by offering compliance guarantees, including indemnifying users against legal issues arising from generative models. For example, Microsoft has agreed to defend Copilot customers against potential copyright infringement claims.
Take a Case-by-Case Approach
Your decision to build or buy will depend on multiple factors: core business alignment, cost, timelines, expertise, data sensitivity and strategic goals around innovation. Ultimately, enterprises must evaluate these trade-offs to select the approach that best supports their objectives while managing risk.
If speed is important and the use cases are more general, then buying (prompt tuning) is likely your best bet; however, if a more precise output is needed and the model must be fit to the use case, then building (fine-tuning) could be the right choice. Because prompt tuning has low upfront costs, businesses should focus on use case evaluation and attempt prompt tuning first.