Oh so that wasn’t a joke from their booth.
This seems really out of place, but locally run auto-subtitles from an ethically sourced AI would be great.
It’s just that there are two very big conditions in that sentence.
Which AI is the ethically sourced one?
There are a number of open-weight, open-source models out there with all their data sourced from the public domain. Look up BLOOM and Falcon. There are others.
JetBrains’ AI code suggestions were trained only on code whose authors gave explicit permission for it, but that’s the only one I know of off the top of my head. Most chat-oriented LLMs (ChatGPT, Claude, Gemini…) were almost certainly trained using corporate piracy.