Kalv

Private AI

First off, I didn't make it to Argentina on the cheap. Had too much work and research to do.

I wanted to share more on what I referenced in my last post about AI: I think private data models are something that we are all going to want, whether as a company, agency, family or individual.

I believe that once one LLM is trained on a corpus of data, other models will eventually be trained on that same data. The methods of building the model, the queries or the way results are presented will change, but over time the results might end up being the same. Speed of result, memory footprint and other optimizations will of course improve over time, but the output will all start to look the same. Today, if everyone uses or builds on top of GPT-x, won’t it eventually lead to similar results? Lots of apps that will spit out the same things. We will lose the uniqueness of a creation.

Let’s take an illustration agency that is hired because of the art that they can create. Their content is unique and takes time to create. AI generators will be fast and cheap, so how does that agency retain its specialization? Perhaps they’d want to train their own model on their own content, allowing them to scale up how many creations they produce while still retaining their creative magic. A private GPT.

I like to think this is going to be the same with a household. An assistant that is tuned to their preferences: for example, which recipes to suggest, cups vs grams, and which sites to draw from (Sally’s baking, AllRecipes, Epicurious, vegan recipe sites, etc.), or whether to reference Wikipedia or other unique sources. The same applies to political viewpoints, and to which religious references to use, if any at all. To do this, would a household expose all their preference data to a cloud service?

What about speech? I'd be happy to provide better data for supervised learning, but that’s my voice. I’d like to keep it. So can I not train a local version of that? I have the first world problem of sounding like a blend of British and Canadian. People say Australian. The same goes for recreating my voice. I’m not sure I’d like something to speak on my behalf, but if we go there, I’d like to own the model of my voice.

We'd need the ability to train and build models for ourselves. This might not be possible today with the hardware a household owns, but perhaps we'd need to invest and work on that.

Not sure about others, but we’re on the path to having 4 laptops (some have neural cores and GPUs) and 4 mobile phones. Heck, our TV is running Google's Android OS. It's probably not enough compute, but some of these devices have chips optimized for 4K upscaling, so perhaps there is some compute available there. What if all this compute could be clustered and used locally to build models slowly over days or weeks? Anyone remember SETI@home? I think I ran it across all my school computers until I was told off because of power usage! Maybe we could look at building private models in this way.

This would ensure that the uniqueness of different minds can overcome the centralization of creativity.

Private LLMs are possible today and there are a number of services out there to train your own model, so we’re getting started. But they’re in the cloud and not optimized for the data that most people have. They're also quite technical to create, so naturally everyone will adopt what is already available.

It’s an exciting time. I just hope that we can consider private data and how it’s used in this new world.