Last year, we saw the explosion of LLMs into the mainstream, with users engaging with AI directly for the first time through software like ChatGPT. With discussions of LLM adoption within enterprises already underway, this gear shift only made it more of a priority for them to access the technology and its benefits.
A McKinsey report found that many enterprises are already using LLMs in production; however, the variety of LLM-based solutions in production is fairly low, with GitHub Copilot, ChatGPT Enterprise and Claude for Enterprise covering most of that usage.
When it comes to using and extracting true value from LLMs, however, these use cases are extremely limited and barely scratch the surface of what’s possible. A Cledera report found that less than half of companies using AI are seeing tangible value, showing that very few enterprises are currently taking full advantage of LLMs.
Adoption and use slow down when it comes to additional tools and a broader range of LLM use cases. In many cases, this is due to a combination of valid concerns and hurdles rooted in how early-stage LLMs still are.
The factors at play for slow enterprise LLM adoption
Limitations of enterprise-grade LLMs
When ChatGPT was first introduced, it looked (and still looks) like magic. It can answer complex questions by cross-referencing multiple data sources to uncover insights that were previously hidden. But when large organizations attempt to replicate this with their own private data, they soon realize that using LLMs for such use cases comes with its own unique challenges. The most notable ones are:
- Many use cases require a higher level of confidence that the answer is based on actual company data, more so than for tools like ChatGPT.
- In many cases, especially when sensitive data is involved or the company belongs to a regulated industry (such as healthcare and financial services), every response provided by an LLM needs to include a reference to the source of that response.
- The percentage of “hallucinating” responses (responses that appear to be detailed and well phrased, but in reality are completely made up by the LLM) is often too high to trust the tool.
- Company data is constantly being updated, and LLM responses need to be based on the up-to-date state of that data, not an obsolete version.
- Feeding the LLM proprietary data is not trivial: there is often too much of it to pass entirely in the context (concatenated to the question), and fine-tuning a model on proprietary data is usually very hard and requires a specific skill set that many companies lack.
To mitigate this, new methods were devised, most notably RAG (Retrieval-Augmented Generation). However, tools like RAG, while solving many of the challenges above, introduce unique challenges of their own.
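To make the pattern concrete before getting into its trade-offs, here is a minimal sketch of RAG in Python. The retriever is a deliberately naive keyword-overlap ranker, and `llm_complete()` is a hypothetical placeholder for whichever provider API a team actually uses; real systems add chunking, embeddings, re-ranking, citations and access control on top.

```python
# Minimal RAG sketch. The retriever is a toy keyword-overlap ranker for illustration;
# llm_complete() is a hypothetical stand-in for a real model API call.

def retrieve(question: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Rank documents by how many question words they share (toy retriever)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]

def answer_with_rag(question: str, documents: list[str]) -> str:
    # Concatenate the retrieved chunks into the prompt as grounding context,
    # so the answer can be tied back to specific company documents.
    context = "\n\n".join(retrieve(question, documents))
    prompt = (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm_complete(prompt)  # hypothetical LLM call; swap in your provider's client
```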
When RAG is used in an LLM-based product, the retrieval mechanism is usually responsible for a large part of the overall functionality. This can result in much more simplistic responses than letting the LLM generate an answer from scratch, drawing on the complex ties between different pieces of data it absorbed during training that humans might miss. This is the missing magic.
Additionally, software vendors building new B2B LLM-based products, and even well-established B2B software vendors adding an LLM-based capability to their existing offerings, are concerned about the risk they could be introducing for their customers. To reduce it, they limit the abilities of the LLM component in both depth and scope.
Privacy and security are still unknowns
Enterprises’ hyperfocus on data privacy, and the significant consequences of their data being used to train third-party AI models, make LLMs and LLM-based solutions appear risky and harder to fully adopt and integrate. Since most are provided as SaaS, enterprise data is usually sent to the vendor’s environment and processed there, which limits an enterprise’s ability to ensure it isn’t being used for model training, regardless of any assurances from the vendor.
Moreover, many LLM vendors either explicitly state that customer data is used to train their models, or go to the other extreme and remain particularly vague about how the data is and isn’t used.
Either way, in a world that demands privacy and a clear understanding of how data is used and processed, organizations have yet to find a solution that meets their data privacy needs, particularly when it comes to regulations, like GDPR, with which they must comply.
On the security side, new technologies like LLMs create new attack vectors that many security teams aren’t yet sure how to manage. Their processes and tools haven’t had time to adjust to modern LLM-related threats and vulnerabilities, such as prompt injection. And for enterprises, these are only the known ones. With LLMs still in their infancy, new vulnerabilities are being discovered almost daily, increasing concerns around the unknown and making the business risk potentially too high to justify.
Cost structures are unpredictable
While LLM costs have dropped significantly over the past year, they can still account for a large share of a business’s costs at the end of the year. This is because most LLM pricing models are pay-per-token, with an unpredictable number of tokens consumed each time: token usage depends in complex ways on the number of user queries, their content, and the content of the LLM-generated responses, which makes forecasting nearly impossible.
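As a rough illustration of why forecasting is hard, the sketch below estimates a monthly bill from assumed averages. Every number in it is an illustrative placeholder rather than a real price, and in practice both prompt and response lengths vary widely per request, which is precisely what makes the real figure hard to pin down.

```python
# Back-of-the-envelope token cost estimate. All prices and usage figures are
# illustrative assumptions; real usage fluctuates per request.

def monthly_llm_cost(
    requests_per_month: int,
    avg_input_tokens: int,       # prompt plus any RAG context
    avg_output_tokens: int,      # generated response
    input_price_per_1k: float,   # assumed USD per 1,000 input tokens
    output_price_per_1k: float,  # assumed USD per 1,000 output tokens
) -> float:
    input_cost = requests_per_month * avg_input_tokens / 1000 * input_price_per_1k
    output_cost = requests_per_month * avg_output_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost

# Example: 200k requests/month, ~1,500 input tokens and ~400 output tokens per request.
print(monthly_llm_cost(200_000, 1_500, 400, 0.0005, 0.0015))  # -> 270.0 (illustrative)
```

Shift the average response length or query volume by a factor of a few, which routinely happens once real users arrive, and the estimate moves by the same factor.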
Even with the newer outcome-based pricing model used in some software applications, whereby only successful results are charged for, enterprises are still left in the dark as the LLM continues to learn. In fact, the factors that influence the actual cost of using an LLM could fluctuate by as much as 1000% over time.
On top of that, many vendors that build LLM-based software add their own margins to the basic cost of LLM tokens, further increasing the overall cost for their enterprise customers. Together, these factors leave enterprises in the difficult position of wanting to implement AI while its business case and costs aren’t quite clear-cut enough.
What we still don’t know
Like with any new technology, there are plenty of unknowns to contend with in the early stages. In the case of LLMs, they have their own areas of concern.
Dealing with unpredictability
Unlike traditional software, it’s almost impossible to thoroughly pre-test LLM-based solutions in a sandbox environment before moving to production, because of the non-deterministic nature of LLMs and the almost infinite variety of real-life user requests.
For example, the same user query run in both staging and production could result in different responses. If, in production, the query is phrased slightly differently or the data ingested via RAG has changed, a brand-new outcome is possible.
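A small sketch of why a single sandbox pass proves so little: sampling the same prompt several times with a non-zero temperature can return different answers. The `llm_complete()` call and its `temperature` parameter are hypothetical placeholders for whichever provider SDK is in use.

```python
# Illustration of non-determinism: the same prompt, sampled repeatedly, can yield
# different responses. llm_complete() is a hypothetical stand-in for a provider SDK call.

def distinct_responses(prompt: str, runs: int = 5, temperature: float = 0.7) -> set[str]:
    """Collect the distinct answers the model gives to an identical prompt."""
    return {llm_complete(prompt, temperature=temperature) for _ in range(runs)}

# Example (once llm_complete is wired to a real client):
#   answers = distinct_responses("Summarize our refund policy in one sentence.")
#   print(len(answers), "distinct responses out of 5 runs")  # often more than 1
```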
Another example is when the LLM powering a solution is moved to a different model vendor for any reason. The behavior of the overall solution can no longer be predicted in this new ‘home’ and could change almost entirely, meaning that everything the customer learned about the LLM’s behavior before becomes irrelevant in the new setting.
This can happen even without explicitly switching to a different model provider, simply through a minor version upgrade by the existing provider, something vendors often don’t publicly mention or notify users about.
Currently, neither the technology itself nor the available tooling instills confidence that LLMs will consistently behave as needed. In essence, their ability to be “generative”, the very reason for their popularity, is at this moment what’s holding them back.
Skills needed to get past POC
It’s no secret that while building a POC on top of an LLM is relatively quick and easy, productizing it to a standard acceptable for an enterprise is the extremely difficult part. One of the main reasons is that companies rarely employ experts in LLMs and their surrounding ecosystem, such as RAG and prompt engineering, who can build and maintain the system properly.
A report on enterprise AI adoption found that almost 60% of participants felt they were understaffed and didn’t have enough funding to drive true transformation.
Additionally, the landscape for such engineers is constantly changing. Best practices for building LLM-based products are ever-evolving, and the tech stack and tooling needed to achieve a repeatable process for releasing production-grade solutions haven’t been created yet. As a result, many projects get stuck at the POC stage because the world hasn’t quite caught up yet.
Understanding the value
The beauty of LLM-based products is that they can be used in many different ways. However, this also means it’s often impossible to predict how users will choose to use them and therefore how valuable they are.
For example, if a user applies it to replace a long and costly process, such as quickly generating code an engineer would otherwise write, the value is high. But if it’s used to replace a simple process like search functionality (akin to Google), the value may be negligible.
Even with these two extreme use cases in mind, some value is indirect and therefore harder to estimate. Take a company that has introduced Copilot: is it improving the productivity of its engineers, or is it eroding in-house skills because the problem-solving abilities of its software engineers are no longer in demand? Who’s to say.
The need to understand and quantify the value of an LLM solution is of even greater importance when it comes to ROI. Without it, ROI calculations become more of a stab in the dark, which many CFOs won’t easily approve.
How to approach LLM adoption for enterprise organizations
It’s clear there are plenty of challenges to LLM adoption, but that only means a refreshed and thoughtful approach is needed to bring it into your business. The same goes for business expectations as the technology moves through the development, testing and production phases.
Before even getting into the weeds of development though, successful adoption requires strong preparation. Here are the elements we consider to be the priority.
- Choose one use case
It’s tempting to implement LLMs everywhere from the start, but it’s unwise. Instead, choose a single use case and stick with it until you, other stakeholders and end users are confident in the end product. Then, armed with the lessons learned, you can expand.
- Opt for accessible data use cases
To reduce data concerns and the risk of unauthorized data access, choose a first use case that involves (relatively) non-sensitive data that is already accessible to 100% of the users. The well-known and well-adhered-to best practice of least privilege will be your guiding principle here.
- Find the low-risk and obvious value
The goal at this stage is to build a strong business case that will act as the foundation for future LLM projects, so identify the projects with less risk but clear value first, for example a chatbot that sits on the company’s knowledge management system or an internal IT support chatbot. A meeting summarization tool may be too low value, while an external customer support chatbot may be too high risk.
- Buy, don’t build
LLMs give the illusion that they’re easy to use and integrate, but as we’ve made clear, getting to an enterprise-grade state and maintaining it requires specialized expertise and skills that often either don’t exist within the organization or are already allocated to other projects. Instead, find a vendor specializing in the development of LLM products (or at least in a single LLM product that matches your exact use case).
- Choose smaller and younger software vendors
These companies are often LLM-native, and their core focus generally makes for more impressive and valuable solutions for your business. For your needs, this is often better than add-on LLM capabilities from traditional software companies, who sometimes jump on the Gen-AI bandwagon, claiming the label without delivering the value they advertise.
- Look out for self-hosted options
To overcome privacy concerns, use a product that can be hosted in your own private cloud account.
Our final thoughts
LLM solutions and products are understandably of significant interest to enterprises, which are continuously striving to stay competitive in an increasingly demanding market. While adoption has started off well, it’s clear there is a gap between adopting LLMs and making full use of that investment - and potentially a need to reassess what’s been done so far.
The reality is that LLMs are complex, and therefore more preparation is required; these products simply aren’t like other innovative solutions businesses have used before. For enterprises, too, far more is at stake.
Understanding exactly what your use case is, why it makes sense and who your vendor will be (and why) will help alleviate concerns around security and overcome the most significant barriers to entry, like expertise and value.
We’ll discuss priorities for LLM adoption once in development and production in one of our next blog articles, so your business is in the best position to get the most out of LLMs now and in the future.
Verax was created to build a technology and Control Center to give organizations full control and visibility over their LLMs, remove barriers to AI adoption, and make verified, responsible, ethical AI the universal standard.