Questions tagged [large-language-models]

Large language model (LLM) is a collective term for natural language models trained on large quantities of unlabelled text using self-supervised learning. Notable examples include BERT, the GPT family (GPT-2, 3, 3.5, 4), LaMDA, Chinchilla, PaLM, and LLaMA. There is no formal definition of the term large language model.

144 questions
6
votes
3 answers

Why can't Lucene search be used to power LLM applications?

With respect to LLM applications using the RAG (retrieval-augmented generation) architecture, people have started taking it for granted that they will be powered by a vector database. E.g., see this: The most important piece of the preprocessing pipeline, from…
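The core operation a vector database provides is nearest-neighbour search over embeddings, which classic Lucene/BM25 indexes were not originally designed for (though recent Lucene versions do add vector search). A toy sketch of similarity ranking, with made-up 2-dimensional embeddings standing in for a real embedding model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy document embeddings; a real RAG system would use a learned model.
docs = {"doc1": [0.9, 0.1], "doc2": [0.1, 0.9]}
query = [0.8, 0.2]

# Retrieve the document whose embedding is closest to the query.
best = max(docs, key=lambda d: cosine(query, docs[d]))  # -> "doc1"
```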
morpheus
  • 274
  • 6
2
votes
0 answers

Why are text models instructed in second person perspective?

Many text model system prompts begin with "You are an AI assistant. You are helpful. You …. [user input] [model output]", but looking at what might have been in the training material, I would expect other perspectives to be more useful, such as "I…
allo
  • 310
  • 1
  • 9
2
votes
0 answers

RAM Capacity of Mac Studio with M2 Ultra for inference of 65B LLM

How much RAM would be needed on a Mac Studio M2 Ultra for inference with a 65B LLM? There are three options: 64GB, 128GB and 192GB. If using an Apple M2 Ultra with 24‑core CPU, 76‑core GPU, 32‑core Neural Engine with 192GB Unified Memory, how would…
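A rough back-of-the-envelope sketch for weight memory alone (ignoring KV cache, activations, and runtime overhead, so treat the numbers as lower bounds):

```python
def model_memory_gb(n_params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights."""
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# Common precisions: fp16 = 2 bytes, int8 = 1 byte, 4-bit quantized = 0.5 bytes.
for label, bpp in [("fp16", 2.0), ("int8", 1.0), ("4-bit", 0.5)]:
    print(f"65B @ {label}: ~{model_memory_gb(65, bpp):.0f} GB")
```

At fp16 a 65B model needs roughly 121 GB for weights alone, which is why the 128GB or 192GB options (or aggressive quantization) come into play.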
2
votes
2 answers

What is function calling in OpenAI's ChatGPT models?

OpenAI has recently added something it calls function calling to its ChatGPT models/API. What is meant by function calling in the context of a large language model? Does it allow us to invoke functions and feed the results back immediately? What…
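As a sketch of the request shape introduced in mid-2023 (field names may have changed in later API versions, and `get_weather` here is a hypothetical function): you describe callable functions as JSON schemas, the model replies with a function name plus JSON arguments, and your own code executes the call and feeds the result back.

```python
# Request payload for the Chat Completions API with function calling.
# The model does NOT execute get_weather; it only returns arguments for it.
payload = {
    "model": "gpt-3.5-turbo-0613",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "functions": [
        {
            "name": "get_weather",  # hypothetical function we expose
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
    "function_call": "auto",  # let the model decide whether to call it
}
```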
Bruce Adams
  • 223
  • 1
  • 9
0
votes
1 answer

What is better: training a model from scratch on your own data vs. fine-tuning a pretrained model?

Problem: I am interested in building a Q&A engine on top of my private data. I am only interested in asking questions related to my data. Options: (1) train a model from scratch on my own data; (2) pick a pretrained large language model and fine-tune it…
morpheus
  • 274
  • 6
0
votes
1 answer

Using an LLM to query specific databases - where can I find implementation examples?

I've been doing some research on how to leverage an LLM to "translate" English into a SQL query that's capable of returning the desired results (like a little Q/A bot). For instance, one would ask "show me the top growth account in the last…
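A minimal sketch of the usual prompting approach, assuming a toy schema invented for illustration: include the schema in the prompt so the model knows the table and column names, then send the prompt to whichever LLM API you use.

```python
def text_to_sql_prompt(schema: str, question: str) -> str:
    """Build a prompt asking the model to translate a question into SQL.

    The schema is included so the model can reference real table and
    column names instead of hallucinating them.
    """
    return (
        "Given the following database schema:\n"
        f"{schema}\n"
        "Write a single SQL query that answers the question below. "
        "Return only SQL, with no explanation.\n"
        f"Question: {question}\nSQL:"
    )

# Hypothetical schema and question matching the example above.
prompt = text_to_sql_prompt(
    "CREATE TABLE accounts (name TEXT, growth REAL, month TEXT);",
    "show me the top growth account in the last month",
)
```

In practice the returned SQL should be validated (e.g. run against a read-only connection) before execution.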
adjfac
  • 103
  • 4
0
votes
0 answers

How does vocabulary size affect quality?

I think the vocabulary size in LLMs involves two trade-offs: the bigger the tokens, the less frequent they will be; and the more tokens you have, the more parameters you dedicate to input and output embeddings. I'm looking for a chart of the effect of the…
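The second trade-off is easy to quantify: the input embedding and output projection each cost vocab_size × d_model parameters (halved if the two matrices are tied). A small sketch, using LLaMA-like sizes purely for illustration:

```python
def embedding_params(vocab_size: int, d_model: int, tied: bool = False) -> int:
    """Parameters spent on the token embedding table and output projection.

    If the input and output embeddings are tied, a single matrix is shared.
    """
    return vocab_size * d_model * (1 if tied else 2)

# e.g. a 32k vocabulary at d_model = 4096:
untied = embedding_params(32_000, 4096)        # 262,144,000 params
tied = embedding_params(32_000, 4096, True)    # 131,072,000 params
```

Doubling the vocabulary to 64k would double these counts, which is a small fraction of a 65B model but a large one of a 125M model.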
0
votes
1 answer

Why do LLaMa and its variants have non-“round” numbers of parameters?

LLaMa was released in several sizes, with 7B, 13B, 33B, and 65B parameters. These values look a little weird, because they are very close to powers of two (8, 16, 32, 64) that would be more conventionally considered “round numbers” in software. Why…
amh
  • 1
0
votes
1 answer

In which process do filtering and watermarking take place?

In LLMs, in order to avoid discrimination and abuse, developers add filtering and watermarking functionality. Could you tell me in which process these functionalities take place? Is it in the weights (pretraining), the output layer (transformer), or…
zzzgoo
  • 161
  • 4