chore: switch to ruff + pydoclint, deprecate .gitingest, and perform a repo-wide quality sweep (#329)
Some checks failed
CI / test (macos-latest, 3.10) (push) Has been cancelled
CI / test (macos-latest, 3.11) (push) Has been cancelled
CI / test (macos-latest, 3.12) (push) Has been cancelled
CI / test (macos-latest, 3.13) (push) Has been cancelled
CI / test (macos-latest, 3.8) (push) Has been cancelled
CI / test (macos-latest, 3.9) (push) Has been cancelled
CI / test (ubuntu-latest, 3.10) (push) Has been cancelled
CI / test (ubuntu-latest, 3.11) (push) Has been cancelled
CI / test (ubuntu-latest, 3.12) (push) Has been cancelled
CI / test (ubuntu-latest, 3.13) (push) Has been cancelled
CI / test (ubuntu-latest, 3.8) (push) Has been cancelled
CI / test (ubuntu-latest, 3.9) (push) Has been cancelled
CI / test (windows-latest, 3.10) (push) Has been cancelled
CI / test (windows-latest, 3.11) (push) Has been cancelled
CI / test (windows-latest, 3.12) (push) Has been cancelled
CI / test (windows-latest, 3.13) (push) Has been cancelled
CI / test (windows-latest, 3.8) (push) Has been cancelled
CI / test (windows-latest, 3.9) (push) Has been cancelled
OSSF Scorecard / Scorecard analysis (push) Has been cancelled

* **Pre-commit**: replace `black` & `darglint` with `ruff-check` / `ruff-format`;
  add `pydoclint` for docstring quality
* **Deps**: drop `tomli`; tighten `typing_extensions`; add `eval-type-backport`;
  remove `black`, `djlint`, `pylint` from `requirements-dev`
* **Ignore files**: deprecate TOML-based `.gitingest`; introduce
  `.gitingestignore` (git-wildmatch, parsed via `_parse_ignore_file`)
* **Config**: new unified `[tool.ruff]` (lint + format + isort); delete
  `[tool.black]`, keep minimal `[tool.isort]` for now
* **Refactor/style**: adopt `from __future__ import annotations`, kw-only args,
  richer types; reorder params & `__all__`; move type-only imports under
  `if TYPE_CHECKING`; extract `_CLIArgs` `TypedDict`, migrate form data to
  `pydantic.QueryForm`; deduplicate `cli.main` / `_async_main`; use `pathlib`,
  avoid file-IO in async; replace magic numbers with constants; delete
  `is_text_file` (logic now lives in `FileSystemNode.content`)
* **Bug fix**: remove silent error in `notebook_utils._process_cell`
* **Docs**: refresh README badges
* **Tests**: update fixtures & assertions

**BREAKING**: new `.gitingestignore` file replaces (now-deprecated) `.gitingest`.

No functional API or CLI changes.
This commit is contained in:
Filip Christiansen 2025-06-28 18:49:37 +02:00 committed by GitHub
parent b39ef5416c
commit 2f447ae632
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
54 changed files with 1767 additions and 1518 deletions

View file

@ -1,19 +1,18 @@
"""
Tests to verify that the query parser is Git host agnostic.
"""Tests to verify that the query parser is Git host agnostic.
These tests confirm that `parse_query` correctly identifies user/repo pairs and canonical URLs for GitHub, GitLab,
These tests confirm that ``parse_query`` correctly identifies user/repo pairs and canonical URLs for GitHub, GitLab,
Bitbucket, Gitea, and Codeberg, even if the host is omitted.
"""
from typing import List, Tuple
from __future__ import annotations
import pytest
from gitingest.query_parsing import parse_query
from gitingest.query_parser import parse_query
from gitingest.utils.query_parser_utils import KNOWN_GIT_HOSTS
# Repository matrix: (host, user, repo)
_REPOS: List[Tuple[str, str, str]] = [
_REPOS: list[tuple[str, str, str]] = [
("github.com", "tiangolo", "fastapi"),
("gitlab.com", "gitlab-org", "gitlab-runner"),
("bitbucket.org", "na-dna", "llm-knowledge-share"),
@ -25,7 +24,7 @@ _REPOS: List[Tuple[str, str, str]] = [
# Generate cartesian product of repository tuples with URL variants.
@pytest.mark.parametrize("host, user, repo", _REPOS, ids=[f"{h}:{u}/{r}" for h, u, r in _REPOS])
@pytest.mark.parametrize(("host", "user", "repo"), _REPOS, ids=[f"{h}:{u}/{r}" for h, u, r in _REPOS])
@pytest.mark.parametrize("variant", ["full", "noscheme", "slug"])
@pytest.mark.asyncio
async def test_parse_query_without_host(
@ -34,8 +33,7 @@ async def test_parse_query_without_host(
repo: str,
variant: str,
) -> None:
"""Verify that `parse_query` handles URLs, host-omitted URLs and raw slugs."""
"""Verify that ``parse_query`` handles URLs, host-omitted URLs and raw slugs."""
# Build the input URL based on the selected variant
if variant == "full":
url = f"https://{host}/{user}/{repo}"
@ -49,7 +47,7 @@ async def test_parse_query_without_host(
# For slug form with a custom host (not in KNOWN_GIT_HOSTS) we expect a failure,
# because the parser cannot guess which domain to use.
if variant == "slug" and host not in KNOWN_GIT_HOSTS:
with pytest.raises(ValueError):
with pytest.raises(ValueError, match="Could not find a valid repository host"):
await parse_query(url, max_file_size=50, from_web=True)
return