this post was submitted on 07 Dec 2024
17 points (90.5% liked)

LocalLLaMA


Community to discuss LLaMA, the large language model created by Meta AI.

This is intended to be a replacement for r/LocalLLaMA on Reddit.

founded 2 years ago

People are talking about the new Llama 3.3 70B release, which generally outperforms Llama 3.1 70B and approaches the performance of Llama 3.1 405B: https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_3

However, something to note:

Llama 3.3 70B is provided only as an instruction-tuned model; a pretrained version is not available.

Is this the end of open-weight pretrained models from Meta, or is Llama 3.3 70B Instruct just a better instruction-tuned version of the 3.1 pretrained model?

Comparing the model cards:

3.1: https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md

3.3: https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md

The same knowledge cutoff, the same amount of training data, and the same training time give me hope that it's just a better finetune, maybe of Llama 3.1 405B.

[–] hendrik@palaver.p3x.de 4 points 2 weeks ago (1 children)

On Huggingface, someone said it's still the same base model: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct/discussions/10

And I remember watching an interview with Zuckerberg this year in which he said that releasing the models to the public, including base models, is what he wants and is part of Meta's strategy.

[–] hok@lemmy.dbzer0.com 2 points 2 weeks ago

Thank you so much! That answers my question exactly, with an official response (that person works at Meta) confirming it's the same base model.

I was concerned mainly because the release notes strangely don't mention this anywhere, even though it seems important enough to state.