Hi there, If I’m looking to use LLM AI in a similar way like Stable Diffusion, i.e. running it on my own PC using pre-trained models (checkpoints?) - where would I start?
If I would want to have access to it on my mobile devices - is this a possibility?
If I would then later want to create workflows using these AI tools - say use the LLM to generate prompts and automatically run them on Stable Diffusion - is this a possibility?
I’m consistently frustrated with ChatGPT seemingly not beeing able to remember a chat history past a certain point. Would a self-run model be better in that regard (i.e. will I be able to reference somethin in a chat thread that happened 2 weeks ago?)
Are there tools that would allow cross-thread referencing?
I have no expert knowledge whatsoever, but I don’t shy away from spending hours learning new staff. Will I be able to take steps working towards my own personal AI assistant? Or would this be way out of scope for a hobbyist?
I have a decent CPU and GPU with 12GB VRam - this should let me run the 7B at least, from what I have seen in the sticky post.
Beside downloading the model, what kind of UI should I start with? Are there good tutorials around, that you are aware of?
If you’re using llama.cpp it can split the work between GPU and CPU, which allows you to run larger models if you sacrifice a little bit of speed. I also have 12 GB vram and I’m mostly playing around with llama-2-13b-chat. llama.cpp more of a library than a program, but it does come with a simple terminal program to test things out. However many GUI/web programs use llama.cpp so I expect them to be able to do the same.
As for GUI programs I’ve seen gpt4all, kobold and silly tavern, but I never got any of them to run in docker with GPU acceleration.