Safeguarding privacy in the age of AI: Exploring the potential of LLMs in data-sensitive workloads
12 March 2024

Our digital landscape is evolving rapidly, and protecting our personal data has never been more important. In this blog, Dr Will Webberley, Director and Chief Technology Officer, and Dr John Barker, Associate Director of Innovation, at SimplyDo, discuss their work in this area and describe how input from experts based at Cardiff University’s Hartree Centre Cardiff Hub has enriched their business.
Based at sbarc|spark, our SimplyDo team is motivated to grow our capabilities in generative artificial intelligence (AI) through large language models (LLMs). Our company’s core product, a platform that helps large enterprise organisations drive change through challenge-based innovation, has been leveraging AI for key time-saving and data-processing tasks for several years.
We recently had the opportunity to take part in a highly successful “Assist” with experts at the Hartree Centre Cardiff Hub, based at Cardiff University. This enabled us to further develop the capabilities and market applicability of these technologies, and broadened our practical understanding of the contemporary AI landscape.
The importance of data security
With customers from highly sensitive industries, such as defence, policing, and healthcare, data security and sovereignty are vital factors for consideration in all we do at SimplyDo. In relation to security, we ensure that data flows are fully managed and predictable, with customer data being appropriately protected at each processing stage and in every component. This guarantees that the data is only made available when and where appropriate.
Sovereignty requirements not only mandate that customers remain in complete control of their own data through our services, but also that the data is processed and stored fully within pre-agreed geographic and virtualised locations. For example, many of our customers require that their data is fully managed solely within the UK – including all components from servers and databases through to email systems.
The changing landscape of AI
As a result, we must always carefully consider the technologies, processes, and providers we build on and work with to ensure these needs can be met. In recent years, the world has observed explosive growth in generative AI across a range of media, including text, images, voice, and even video. Many organisations have been rushing to leverage these exciting technologies to empower their own services, or even to build entirely new and previously impossible categories of standalone products.
So far, most of this type of capability seen in the market has been driven by just a few services, with many businesses opting to make use of the conveniences available through OpenAI and hyperscalers such as Amazon Web Services and Google Cloud. Given the expertise, time, expense, and computational complexity involved in training and utilising the kinds of models required for modern generative AI solutions, it isn’t hard to see why this is preferable to the DIY approach.
Striking a balance between convenience and security
At SimplyDo, we have been integrating a number of proof-of-concept features into our products, built on a foundation of technologies provided by OpenAI’s GPT and Google’s AI services. These features were backed by workloads highly characteristic of LLM capability – such as text generation, semantic understanding, and summarisation – and were built in areas of our product where they clearly provided real, actionable benefits and saved significant time in data management and in comprehending information signals at scale. However, fully productionising these developments as-is would pose several challenges in relation to our responsibility to customer data.
Despite the convenience of off-the-shelf AI services, there is uncertainty in the opacity around how data inputs are handled behind the scenes. For example, in typical text summarisation, SimplyDo would need to expose customer data for it to be processed by the service, and it is not always clear how such data will then be handled. Today, some services re-use input data for retraining (potentially exposing it in subsequent models), log inputs (for audit or other purposes), or otherwise handle the data in ways that make it more widely available.
For many companies, particularly those with consumer-facing products, this isn’t an issue, since the intention to use such services can be declared ahead of time or in privacy notices. However, for us, such uncertainties would reduce our ability to confidently safeguard our customers’ data with respect to both security and sovereignty.
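As a purely illustrative sketch (not SimplyDo’s production approach, and far simpler than an LLM), the idea of keeping a summarisation workload fully in-house can be shown with a tiny frequency-based extractive summariser in plain Python; because everything runs locally, no customer data ever leaves the process:

```python
import re
from collections import Counter

# A tiny, illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "it", "that", "for"}

def summarise(text: str, max_sentences: int = 2) -> str:
    """Score each sentence by the average corpus frequency of its
    non-stopword terms, then return the top scorers in original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence: str) -> float:
        # Average (not sum) so long sentences don't automatically win.
        terms = [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]
        return sum(freq[t] for t in terms) / len(terms) if terms else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in chosen)
```

A self-hosted LLM fills the same role with far better quality; the point is architectural rather than algorithmic: the data path stays entirely within infrastructure the operator controls.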
Our ‘Assist’ Experience
Armed with our knowledge of the advantages AI could bring our customers, along with the challenges we had identified in bringing such capabilities to market, the opportunity to work on an “Assist” with the Hartree Centre Cardiff Hub was perfectly placed to help drive the realisation of our AI strategy. Our aim was to gain a richer, team-wide understanding of the domain and of how we might safely and ethically deploy such technologies for the benefit of our customers.
An “Assist” is a 12-hour piece of work led by the Hartree Centre Cardiff Hub. Assists can take many forms; in our case, we took part in a number of short, in-person, workshop-style sessions in which we described our challenges (and our opportunity) within the context of specific use cases. The data scientists at the Hub were clearly highly experienced, with strong expertise in a broad range of AI applications. Beyond this, their deep knowledge of the technology itself, and of its potential and pitfalls, was evident.
For us, the primary outcome was significant knowledge transfer; by the end of the “Assist”, our own team’s knowledge had grown enormously – to the extent that we would have the know-how and ability to deploy and run our own models, while understanding the applications of different model types and the trade-offs between model complexity (e.g. parameter count) and computational cost. The Hub also wrote and provided us with sample code to help demonstrate some of the more low-level discussions we had during the “Assist”.
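One trade-off the sessions made concrete, between model size and computational cost, can be approximated with simple arithmetic: holding a model’s weights in memory takes roughly parameter count multiplied by bytes per parameter. The helper below is a hypothetical illustration (the sizes and precisions shown are common open-model configurations, not ones the Hub specifically recommended):

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory needed just to hold model weights, in decimal GB.
    Ignores activations, KV cache, and framework overhead."""
    bytes_total = params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9

# Illustrative sizes (billions of parameters) at common precisions.
for params in (7, 13, 70):
    for bits in (16, 8, 4):
        print(f"{params}B @ {bits}-bit: ~{weight_memory_gb(params, bits):.1f} GB")
```

For example, a 7-billion-parameter model at 16-bit precision needs about 14 GB for its weights alone, but roughly 3.5 GB when quantised to 4 bits; activations, KV cache, and runtime overhead add more on top.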
We are extremely grateful to have had the opportunity to work with the Hartree Centre Cardiff Hub. Throughout the experience, the team were highly positive, listened closely to and understood our challenges, and helped us with relevant and practical insights that we have already been able to use in progressing AI deployments in our products. We hope to continue working with the Hub in the future as part of a longer-term project enabling further detailed and practical exploration.
About the Hartree Centre Cardiff Hub
The Hartree Centre Cardiff Hub is a regional Hub of the Hartree National Centre for Digital Innovation and is funded by the Science and Technology Facilities Council.
The Hub is a partnership between colleagues at Cardiff University, including those from the university’s flagship innovation institutes the Digital Transformation Innovation Institute and the Security, Crime, and Intelligence Innovation Institute.
The Hub matches academic expertise with small and medium-sized businesses in Cardiff and the surrounding area that want to better understand how digital innovation can transform SME productivity, strengthen resilience, and enhance growth.
It builds on a vibrant AI, data science and high-performance computing research and innovation community centred at Cardiff University and enjoys immediate access to Cardiff Capital Region and Western Gateway SME clusters.
The hub team has deep, first-hand experience of running innovative impact-led research and development in collaborative environments, with strong industry relationships and tried-and-tested SME engagement approaches.
The Cardiff University hub is one of three across the UK to be awarded a share of £4.5 million as part of the Hartree National Centre for Digital Innovation programme. The sites, which also include Newcastle University and Ulster University, will be funded for three years to establish a network of digital adoption support that is easily accessible and available locally to SMEs across the UK.