WFGY/tools/wfgy_lambda_diverse_colab.ipynb
2025-08-07 13:02:25 +08:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 🧪 WFGY λ_diverse Demo (v0.1)\n",
"Estimate how diverse a set of answers to the same prompt is, using cosine similarity between sentence embeddings."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Formula\n",
"Let **A = {a₁ … aₙ}** be *n* answers. \n",
"λ_diverse = 1 - mean_pairwise_cosine(aᵢ, aⱼ), averaged over all distinct pairs of answers. \n",
"Lower average similarity ⇒ higher diversity."
]
},
{
"cell_type": "code",
"metadata": { "id": "install" },
"source": [
"!pip -q install sentence-transformers --upgrade"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": { "id": "imports" },
"source": [
"import itertools\n",
"\n",
"import numpy as np\n",
"from sentence_transformers import SentenceTransformer, util"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": { "id": "model" },
"source": [
"model = SentenceTransformer('all-MiniLM-L6-v2')"
],
"execution_count": null,
"outputs": []
},
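{
"cell_type": "code",
"metadata": { "id": "helper" },
"source": [
"def lambda_diverse(sent_list):\n",
"    # λ_diverse = 1 - mean pairwise cosine similarity of the answer embeddings.\n",
"    if len(sent_list) < 2:\n",
"        raise ValueError(\"lambda_diverse needs at least two answers\")\n",
"    vecs = model.encode(sent_list, convert_to_tensor=True)\n",
"    sims = []\n",
"    # Visit every unordered pair of answers exactly once.\n",
"    for i, j in itertools.combinations(range(len(vecs)), 2):\n",
"        sims.append(util.cos_sim(vecs[i], vecs[j]).item())\n",
"    return 1 - np.mean(sims)"
],
"execution_count": null,
"outputs": []
},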
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## ✏️ Edit & run\n",
"Replace the prompt and the answers list, then press ▶️ to run the cell."
]
},
{
"cell_type": "code",
"metadata": { "id": "user" },
"source": [
"prompt = \"Give me a one-sentence summary of photosynthesis.\"\n",
"answers = [\n",
"    \"Plants convert sunlight into chemical energy stored as sugar.\",\n",
"    \"Using light, plants turn water and CO₂ into glucose and oxygen.\",\n",
"    \"Through photosynthesis, green leaves make food from sunlight.\",\n",
"    \"Plants harness solar energy to synthesize sugars from carbon dioxide.\",\n",
"    \"Light drives the production of glucose in plants, releasing oxygen.\"\n",
"]\n",
"\n",
"ld = lambda_diverse(answers)\n",
"\n",
"print(f\"λ_diverse : {ld:.3f}\\n\")\n",
"if ld > 0.70:\n",
"    label = \"High diversity ✅\"\n",
"elif ld > 0.40:\n",
"    label = \"Medium diversity ⚠️\"\n",
"else:\n",
"    label = \"Low diversity 🚨\"\n",
"print(label)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"### Next Steps\n",
"* Compare diversity of *top-k* sampling vs nucleus sampling.\n",
"* Combine with **e_resonance** to pick “diverse *and* on-topic” answers.\n"
]
}
],
"metadata": {
"kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" },
"language_info": { "name": "python" }
},
"nbformat": 4,
"nbformat_minor": 5
}
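
The λ_diverse formula in the notebook can be checked by hand without downloading the embedding model. The sketch below is a minimal illustration, assuming each answer has already been turned into a plain list of floats with at least one nonzero entry. The names `cosine` and `lambda_diverse_from_vectors` are hypothetical helpers introduced here, not part of the notebook, and the toy vectors are made up rather than real sentence embeddings.

```python
import itertools
import math


def cosine(u, v):
    # Cosine similarity of two equal-length, nonzero vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def lambda_diverse_from_vectors(vecs):
    # Hypothetical helper: 1 - mean cosine similarity over all unordered pairs.
    if len(vecs) < 2:
        raise ValueError("need at least two vectors")
    sims = [cosine(u, v) for u, v in itertools.combinations(vecs, 2)]
    return 1.0 - sum(sims) / len(sims)


# Toy vectors, made up for illustration only.
same_direction = [[1.0, 0.0], [1.0, 0.0], [2.0, 0.0]]  # every pairwise cosine is 1
orthogonal = [[1.0, 0.0], [0.0, 1.0]]                  # the single cosine is 0
opposite = [[1.0, 0.0], [-1.0, 0.0]]                   # the single cosine is -1

for name, vecs in [("same", same_direction), ("orthogonal", orthogonal), ("opposite", opposite)]:
    print(name, lambda_diverse_from_vectors(vecs))
```

Because cosine similarity ranges from -1 to 1, the score defined this way ranges from 0 to 2 in principle. Sentence embeddings of answers to the same prompt usually have positive similarity, so scores in practice tend to fall between 0 and 1, which is the range the notebook's 0.40 and 0.70 thresholds assume.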