Llama-3.1-8B-Stheno-v3.4-GGUF-IQ-Imatrix:可用于角色扮演和创意写作等场景，是Llama-3.1-8B-Stheno-v3.4的量化版本，采用Imatrix技术，支持多轮对话，兼容SillyTavern预设，推荐配合KoboldCpp使用。【此简介由AI生成】

Sao10K/Llama-3.1-8B-Stheno-v3.4 的量化版本。

建议访问他们的页面获取反馈和支持。

[!IMPORTANT] 量化过程：
Imatrix 数据是从 FP16-GGUF 生成的，并直接从 BF16-GGUF 转换而来。
希望这样可以避免转换过程中的损失。
要运行此模型，请使用 最新版本的 KoboldCpp。
如果发现任何问题，请在讨论区告诉我。

[!NOTE] 预设：
一些兼容的 SillyTavern 预设可以在 此处找到（Virt's Roleplay Presets - v1.9）。
查看 此类讨论 和 此讨论 以获取其他预设和采样器建议。
作者建议使用较低的 temperature 值，因此请务必进行尝试。

在 KoboldCpp 中的一般使用：
对于 8GB VRAM 的 GPU，我推荐使用 Q4_K_M-imat（4.89 BPW）量化版本，在不使用 --quantkv 的情况下，上下文大小可达 12288。
使用 --quantkv 1（≈Q8）甚至 --quantkv 2（≈Q4）可以将上下文大小提升至 32K，但缺点是与 Context Shifting 不兼容，这仅在您能够填满那么多上下文时才相关。
在此处的发布说明中了解更多信息。

image/png

点击此处查看原始模型卡片信息。

感谢 Backyard.ai 提供训练所需的计算资源。 :)

Llama-3.1-8B-Stheno-v3.4

该模型经历了多阶段的微调过程。

- 1st, over a multi-turn Conversational-Instruct
- 2nd, over a Creative Writing / Roleplay along with some Creative-based Instruct Datasets.
- - Dataset consists of a mixture of Human and Claude Data.

提示词格式：

- Use the L3 Instruct Formatting - Euryale 2.1 Preset Works Well
- Temperature + min_p as per usual, I recommend 1.4 Temp + 0.2 min_p.
- Has a different vibe to previous versions. Tinker around.

自上一版 Stheno 数据集以来的变更：

- Included Multi-turn Conversation-based Instruct Datasets to boost multi-turn coherency. # This is a seperate set, not the ones made by Kalomaze and Nopm, that are used in Magnum. They're completely different data.
- Replaced Single-Turn Instruct with Better Prompts and Answers by Claude 3.5 Sonnet and Claude 3 Opus.
- Removed c2 Samples -> Underway of re-filtering and masking to use with custom prefills. TBD
- Included 55% more Roleplaying Examples based of [Gryphe's](https://huggingface.co/datasets/Gryphe/Sonnet3.5-Charcard-Roleplay) Charcard RP Sets. Further filtered and cleaned on.
- Included 40% More Creative Writing Examples.
- Included Datasets Targeting System Prompt Adherence.
- Included Datasets targeting Reasoning / Spatial Awareness.
- Filtered for the usual errors, slop and stuff at the end. Some may have slipped through, but I removed nearly all of it.

个人观点：

- Llama3.1 was more disappointing, in the Instruct Tune? It felt overbaked, atleast. Likely due to the DPO being done after their SFT Stage.
- Tuning on L3.1 base did not give good results, unlike when I tested with Nemo base. unfortunate.
- Still though, I think I did an okay job. It does feel a bit more distinctive.
- It took a lot of tinkering, like a LOT to wrangle this.

以下是一些图表，供您查看。

对话轮次分布 # 1 轮次定义为 ShareGPT 格式中 1 组 Human/GPT 对话对。4 轮次意味着总共包含 1 行 System 内容 + 8 行 Human/GPT 内容。

Turn

token 数量直方图 # 基于 Llama 3 分词器

Turn

祝您一切顺利。

Source Image: https://www.pixiv.net/en/artworks/91689070

</details>

- 1st, over a multi-turn Conversational-Instruct - 2nd, over a Creative Writing / Roleplay along with some Creative-based Instruct Datasets. - - Dataset consists of a mixture of Human and Claude Data.

- Included Multi-turn Conversation-based Instruct Datasets to boost multi-turn coherency. # This is a seperate set, not the ones made by Kalomaze and Nopm, that are used in Magnum. They're completely different data. - Replaced Single-Turn Instruct with Better Prompts and Answers by Claude 3.5 Sonnet and Claude 3 Opus. - Removed c2 Samples -> Underway of re-filtering and masking to use with custom prefills. TBD - Included 55% more Roleplaying Examples based of [Gryphe's](https://huggingface.co/datasets/Gryphe/Sonnet3.5-Charcard-Roleplay) Charcard RP Sets. Further filtered and cleaned on. - Included 40% More Creative Writing Examples. - Included Datasets Targeting System Prompt Adherence. - Included Datasets targeting Reasoning / Spatial Awareness. - Filtered for the usual errors, slop and stuff at the end. Some may have slipped through, but I removed nearly all of it.

- Llama3.1 was more disappointing, in the Instruct Tune? It felt overbaked, atleast. Likely due to the DPO being done after their SFT Stage. - Tuning on L3.1 base did not give good results, unlike when I tested with Nemo base. unfortunate. - Still though, I think I did an okay job. It does feel a bit more distinctive. - It took a lot of tinkering, like a LOT to wrangle this.