talemate/docs/cleanup.py
veguAI 113553c306
0.29.0 (#167)
* set 0.29.0

* tweaks for dig layered history (wip)

* move director agent to directory

* relock

* remove "none" from dig_layered_history response

* determine character development

* update character sheet from character development (wip)

* org imports

* alert outdated template overrides during startup

* editor controls normalization of exposition

* dialogue formatting refactor

* fix narrator.clean_result forcing * regardless of editor fix exposition setting

* move more of the dialogue cleanup logic into the editor fix exposition handlers

* remove cruft

* change ot normal selects and add some margin

* move formatting option up

* always strip partial sentences

* separates exposition fixes from other dialogue cleanup operations, since we still want those

* add novel formatting style

* honor formatting config when no markers are supplied

* fix issue where sometimes character message formatting would miss character name

* director can now guide actors through scene analysis

* style fixes

* typo

* select correct system message on direction type

* prompt tweaks

* disable by default

* add support for dynamic instruction injection and include missing guide for internal note usage

* change favicon and also indicate business through favicon

* img

* support xtc, dry and smoothing in text gen webui

* prompt tweaks

* support xtc, dry, smoothing in koboldcpp client

* reorder

* dry, xtc and smoothing factor exposed to tabby api client

* urls to third party API documentation

* remove bos token

* add missing preset

* focal

* focal progress

* focal progress and generated suggestions progress

* fix issue with discard all suggestions

* apply suggestions

* move suggestion ux into the world state manager

* support generation options for suggestion generation

* unused import

* refactor focal to json based approach

* focal and character suggestion tweaks

* rmeove cruft

* remove cruft

* relock

* prompt tweaks

* layout spacing updates

* ux elements for removal of scenes from quick load menu

* context investigation refactor WIP

* context investigation refactor

* context investigation refactor

* context investigation refactor

* cleanup

* move scene analysis to summarizer agent

* remove deprecated context investigation logic

* context investigation refactor continued - split into separate file for easier maint

* allow direct specification of response context length

* context investigation and scene analyzation progress

* change analysis length config to number

* remove old dig-layered-history templates

* summarizer - deep analysis is only available if there is layered history

* move world_state agent to dedicated directory

* remove unused imports

* automatic character progression WIP

* character suggestions progress

* app busy flag based on agent business

* indicate suggestions in world state overview

* fix issue with user input cleanup

* move conversation agent to a dedicated submodule

* Response in action analyze_text_and_extract_context is too short #162

* move narrator agent to its own submodule

* narrator improvements WIP

* narration improvements WIP

* fix issue with regen of character exit narration

* narration improvements WIP

* prompt tweaks

* last_message_of_type can set max iterations

* fix multiline parsing

* prompt tweaks

* director guide actors based of scene analysis

* director guidance for actors

* prompt tweaks

* prompt tweaks

* prompt tweaks

* fix automatic character proposals not propagating to the ux

* fix analysis length

* support director guidance in legacy chat format

* typo

* prompt tweaks

* prompt tweaks

* error handling

* length config

* prompt tweaks

* typo

* remove cruft

* prompt tweak

* prompt tweak

* time passage style changes

* remove cruft

* deep analysis context investigations honor call limit

* refactor conversation agent long term memory to use new memory rag mixin - also streamline prompts

* tweaks to RAG mixin agent config

* fix narration highlighting

* context investgiation fixes
director narration guidance
summarization tweaks

* direactor guide narration progress
context investigation fixes that would cause looping of investigations and failure to dig into the correct layers

* prompt tweaks

* summarization improvements

* separate deep analysis chapter selection from analysis into its own prompt

* character entry and exit

* cache analysis per subtype and some narrator prompt tweaks

* separate layered history logic into its own summarizer mixin and expose some additional options

* scene can now set an overral writing style using writing style templates
narrator option to enable writing style

* narrate query writing style support

* scene tools - narrator actions refactor to handler and own component

* narrator query / look at narrations emitted as context investigation messages
refactor context investigation messaage display
scene message meta data object

* include narrative direction

* improve context investigation message prompt insert

* reorg supported parameters

* fix bug when no message history exists

* WIP make regenerate work nicely with director guidance

* WIP make regenerate work nicely with director guidance

* regenerate conversation fixes

* help text

* ux tweaks

* relock

* turn off deep analysis and context investigations by default

* long term memory options for director and summarizer

* long term memory caching

* fix summarization cache toggle not showing up in ux

* ux tweaks

* layered history summarization includes character information for mentioned characters

* deepseek client added

* Add fork button to narrator message

* analyze and guidance support for time passage narration

* cache based on message fingerprint instead of id

* configurable system prompts WIP

* configurable system prompts WIP

* client overrides for system prompts wired to ux

* system prompt overhaul

* fix issue with unknown system prompt kind

* add button to manually request dynamic choices from the director
move the generate choices logic of the director agent to its own submodule

* remove cruft

* 30 may be too long and is causing the client to disappear temporarly

* suppoert dynamic choice generate for non player characters

* enable `actor` tab for player characters

* creator agent now has access to rag tools
improve acting instruction generation

* client timeout fixes

* fix issue where scene removal menu stayed open after remove

* expose scene restore functionality to ux

* create initial restore point

* fix creator extra-context template

* didn't mean to remove this

* intro scene should be edited through world editor

* fix alert

* fix partial quotes regardless of editor setting
director guidance for conversation reminds to put speech in quotes

* fix @ instructions not being passed through to director guidance prompt

* anthropic mode list updated

* default off

* cohere model list updated

* reset actAs on next scene load

* prompt tweaks

* prompt tweaks

* prompt tweaks

* prompt tweaks

* prompt tweaks

* remove debug cruft

* relock

* docs on changing host / port

* fix issue with narrator / director actiosn not available on fresh install

* fix issue with long content classification determination result

* take this reminder to put speech into quotes out for now, it seems to do more harm than good

* fix some remaining issues with auto expositon fixes

* prompt tweaks

* prompt tweaks

* fix issue during reload

* expensive and warning ux passthrough for agent config

* layered sumamry analysation defaults to on

* what's new info block added

* docs

* what's new updated

* remove old images

* old img cleanup script

* prompt tweaks

* improve auto prompt template detection via huggingface

* add gpt-4o-realtime-preview
add gpt-4o-mini-realtime-preview

* add o1 and o3-mini

* fix o1 and o3

* fix o1 and o3

* more o1 / o3 fixes

* o3 fixes
2025-02-01 17:44:06 +02:00

166 lines
No EOL
6 KiB
Python

import os
import re
import subprocess
from pathlib import Path
import argparse
def find_image_references(md_file):
"""Find all image references in a markdown file."""
with open(md_file, 'r', encoding='utf-8') as f:
content = f.read()
pattern = r'!\[.*?\]\((.*?)\)'
matches = re.findall(pattern, content)
cleaned_paths = []
for match in matches:
path = match.lstrip('/')
if 'img/' in path:
path = path[path.index('img/') + 4:]
# Only keep references to versioned images
parts = os.path.normpath(path).split(os.sep)
if len(parts) >= 2 and parts[0].replace('.', '').isdigit():
cleaned_paths.append(path)
return cleaned_paths
def scan_markdown_files(docs_dir):
"""Recursively scan all markdown files in the docs directory."""
md_files = []
for root, _, files in os.walk(docs_dir):
for file in files:
if file.endswith('.md'):
md_files.append(os.path.join(root, file))
return md_files
def find_all_images(img_dir):
"""Find all image files in version subdirectories."""
image_files = []
for root, _, files in os.walk(img_dir):
# Get the relative path from img_dir to current directory
rel_dir = os.path.relpath(root, img_dir)
# Skip if we're in the root img directory
if rel_dir == '.':
continue
# Check if the immediate parent directory is a version number
parent_dir = rel_dir.split(os.sep)[0]
if not parent_dir.replace('.', '').isdigit():
continue
for file in files:
if file.lower().endswith(('.png', '.jpg', '.jpeg', '.gif', '.svg')):
rel_path = os.path.relpath(os.path.join(root, file), img_dir)
image_files.append(rel_path)
return image_files
def grep_check_image(docs_dir, image_path):
"""
Check if versioned image is referenced anywhere using grep.
Returns True if any reference is found, False otherwise.
"""
try:
# Split the image path to get version and filename
parts = os.path.normpath(image_path).split(os.sep)
version = parts[0] # e.g., "0.29.0"
filename = parts[-1] # e.g., "world-state-suggestions-2.png"
# For versioned images, require both version and filename to match
version_pattern = f"{version}.*{filename}"
try:
result = subprocess.run(
['grep', '-r', '-l', version_pattern, docs_dir],
capture_output=True,
text=True
)
if result.stdout.strip():
print(f"Found reference to {image_path} with version pattern: {version_pattern}")
return True
except subprocess.CalledProcessError:
pass
except Exception as e:
print(f"Error during grep check for {image_path}: {e}")
return False
def main():
parser = argparse.ArgumentParser(description='Find and optionally delete unused versioned images in MkDocs project')
parser.add_argument('--docs-dir', type=str, required=True, help='Path to the docs directory')
parser.add_argument('--img-dir', type=str, required=True, help='Path to the images directory')
parser.add_argument('--delete', action='store_true', help='Delete unused images')
parser.add_argument('--verbose', action='store_true', help='Show all found references and files')
parser.add_argument('--skip-grep', action='store_true', help='Skip the additional grep validation')
args = parser.parse_args()
# Convert paths to absolute paths
docs_dir = os.path.abspath(args.docs_dir)
img_dir = os.path.abspath(args.img_dir)
print(f"Scanning markdown files in: {docs_dir}")
print(f"Looking for versioned images in: {img_dir}")
# Get all markdown files
md_files = scan_markdown_files(docs_dir)
print(f"Found {len(md_files)} markdown files")
# Collect all image references
used_images = set()
for md_file in md_files:
refs = find_image_references(md_file)
used_images.update(refs)
# Get all actual images (only from version directories)
all_images = set(find_all_images(img_dir))
if args.verbose:
print("\nAll versioned image references found in markdown:")
for img in sorted(used_images):
print(f"- {img}")
print("\nAll versioned images in directory:")
for img in sorted(all_images):
print(f"- {img}")
# Find potentially unused images
unused_images = all_images - used_images
# Additional grep validation if not skipped
if not args.skip_grep and unused_images:
print("\nPerforming additional grep validation...")
actually_unused = set()
for img in unused_images:
if not grep_check_image(docs_dir, img):
actually_unused.add(img)
if len(actually_unused) != len(unused_images):
print(f"\nGrep validation found {len(unused_images) - len(actually_unused)} additional image references!")
unused_images = actually_unused
# Report findings
print("\nResults:")
print(f"Total versioned images found: {len(all_images)}")
print(f"Versioned images referenced in markdown: {len(used_images)}")
print(f"Unused versioned images: {len(unused_images)}")
if unused_images:
print("\nUnused versioned images:")
for img in sorted(unused_images):
print(f"- {img}")
if args.delete:
print("\nDeleting unused versioned images...")
for img in unused_images:
full_path = os.path.join(img_dir, img)
try:
os.remove(full_path)
print(f"Deleted: {img}")
except Exception as e:
print(f"Error deleting {img}: {e}")
print("\nDeletion complete")
else:
print("\nNo unused versioned images found!")
if __name__ == "__main__":
main()