mods_mum@lemmy.today to

AI@lemmy.ml · 5 months ago

How reliable are modern LLMs?

I wanted to extract some crime statistics broken down by type of crime and by population group, all normalized by population size, of course. I got a nice set of tables summarizing the data for each year I requested.

When I shared these summaries, I was told they are entirely unreliable because of hallucinations. So my question to you is: how common a problem is this?

I compared the results from ChatGPT (GPT-4), Copilot and Grok, and they are the same. Gemini says the data is unavailable, btw :)

So are LLMs reliable for research like that?
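One way to check tables like these is to recompute the per-capita rates from the raw counts in the original source and compare them with what the LLM reported. The snippet below is a minimal sketch of that check, assuming rates are expressed per 100,000 people; the function names, the tolerance, and all figures are hypothetical and for illustration only, not real statistics.

```python
import math


def rate_per_100k(incidents: int, population: int) -> float:
    """Return incidents per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return incidents * 100_000 / population


def matches_source(reported: float, incidents: int, population: int,
                   rel_tol: float = 0.01) -> bool:
    """True if an LLM-reported rate agrees with the recomputed rate.

    rel_tol is an assumed tolerance (1%) to allow for rounding.
    """
    recomputed = rate_per_100k(incidents, population)
    return math.isclose(reported, recomputed, rel_tol=rel_tol)


# Hypothetical raw counts taken from a primary source: (incidents, population).
source = {"group_a": (120, 400_000), "group_b": (45, 150_000)}

# Hypothetical rates copied from an LLM-generated table.
llm_table = {"group_a": 30.0, "group_b": 41.5}

for group, (incidents, population) in source.items():
    ok = matches_source(llm_table[group], incidents, population)
    print(group, "matches source" if ok else "does NOT match source")
```

Agreement between several models does not replace this step: a figure only counts as verified once it has been traced back to the underlying data.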


ViaFedi@lemmy.ml
5 months ago
Solutions exist where you give the LLM a set of files (e.g., PDFs) and it then bases its answers solely on them.
- jet@hackertalks.com
  5 months ago
  It’s still a probabilistic token generator; you’re just training it on your local data. Hallucinations will absolutely happen.
  - slacktoid@lemmy.ml
    5 months ago
    This isn’t training; it’s called a RAG (retrieval-augmented generation) workflow, and there is no training step per se.
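
To make the distinction in the last reply concrete, here is a minimal sketch of the retrieval half of a RAG workflow. It is a toy under stated assumptions: chunks are ranked by simple word overlap (real systems typically use embedding similarity), the function names are hypothetical, and no LLM is called. The point is that the model's weights are never updated; the retrieved text is only pasted into the prompt.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, chunk: str) -> int:
    """Count how many query tokens also appear in the chunk (toy scoring)."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: list[str], k: int = 2) -> str:
    """Assemble a prompt that asks the model to answer from retrieved text only."""
    context = "\n---\n".join(retrieve(query, chunks, k))
    return (
        "Answer using only the context below. "
        "If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


# Hypothetical document chunks, e.g. text extracted from PDFs.
chunks = [
    "Burglary counts for 2020 are listed in table 3.",
    "The weather was mild in spring.",
    "Population estimates by region for 2020.",
]
print(build_prompt("burglary counts 2020", chunks, k=1))
```

The resulting prompt is what gets sent to the model. Retrieval narrows what the model sees, but the answer is still generated text, so it should still be checked against the retrieved passages.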
