navidrome/persistence/sql_restful_test.go
Deluan Quintão 54de0dbc52
feat(server): implement FTS5-based full-text search (#5079)
* build: add sqlite_fts5 build tag to enable FTS5 support

* feat: add SearchBackend config option (default: fts)

* feat: add buildFTS5Query for safe FTS5 query preprocessing

* feat: add FTS5 search backend with config toggle, refactor legacy search

- Add searchExprFunc type and getSearchExpr() for backend selection
- Rename fullTextExpr to legacySearchExpr
- Add ftsSearchExpr using FTS5 MATCH subquery
- Update fullTextFilter in sql_restful.go to use configured backend

* feat: add FTS5 migration with virtual tables, triggers, and search_participants

Creates FTS5 virtual tables for media_file, album, and artist with
unicode61 tokenizer and diacritic folding. Adds search_participants
column, populates from JSON, and sets up INSERT/UPDATE/DELETE triggers.

* feat: populate search_participants in PostMapArgs for FTS5 indexing

* test: add FTS5 search integration tests

* fix: exclude FTS5 virtual tables from e2e DB restore

The restoreDB function iterates all tables in sqlite_master and
runs DELETE + INSERT to reset state. FTS5 contentless virtual tables
cannot be directly deleted from. Since triggers handle FTS5 sync
automatically, simply skip tables matching *_fts and *_fts_* patterns.
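
The name check described above can be sketched as follows. This is a minimal illustration only: `isFTSTable` is a hypothetical helper name, and the real restoreDB code may match the patterns differently.

```go
package main

import (
	"fmt"
	"strings"
)

// isFTSTable is a hypothetical helper (not taken from the source). It reports
// whether a table name looks like an FTS5 virtual table (*_fts) or one of its
// shadow tables (*_fts_*), which a restore loop would skip.
func isFTSTable(name string) bool {
	return strings.HasSuffix(name, "_fts") || strings.Contains(name, "_fts_")
}

func main() {
	tables := []string{"media_file", "media_file_fts", "media_file_fts_data", "album"}
	for _, name := range tables {
		fmt.Printf("%-20s skip=%v\n", name, isFTSTable(name))
	}
}
```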

* build: add compile-time guard for sqlite_fts5 build tag

Same pattern as netgo: compilation fails with a clear error if
the sqlite_fts5 build tag is missing.

* build: add sqlite_fts5 tag to reflex dev server config

* build: extract GO_BUILD_TAGS variable in Makefile to avoid duplication

* fix: strip leading * from FTS5 queries to prevent "unknown special query" error

* feat: auto-append prefix wildcard to FTS5 search tokens for broader matching

Every plain search token now gets a trailing * appended (e.g., "love" becomes
"love*"), so searching for "love" also matches "lovelace", "lovely", etc.
Quoted phrases are preserved as exact matches without wildcards. Results are
ordered alphabetically by name/title, so shorter exact matches naturally
appear first.
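
A rough sketch of the wildcard behavior described above, assuming a hypothetical `addPrefixWildcards` function; the actual buildFTS5Query does more preprocessing than this and may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// addPrefixWildcards is a hypothetical illustration: plain tokens get a
// trailing "*", while double-quoted phrases are copied through unchanged.
func addPrefixWildcards(q string) string {
	var out []string
	for {
		q = strings.TrimLeft(q, " ")
		if q == "" {
			break
		}
		if q[0] == '"' {
			// Quoted phrase: copy up to and including the closing quote.
			end := strings.IndexByte(q[1:], '"')
			if end < 0 {
				// Unbalanced quote: close the phrase ourselves and stop.
				out = append(out, q+`"`)
				break
			}
			out = append(out, q[:end+2])
			q = q[end+2:]
			continue
		}
		// Plain token: runs until the next space or quote.
		end := strings.IndexAny(q, ` "`)
		if end < 0 {
			end = len(q)
		}
		tok := q[:end]
		if !strings.HasSuffix(tok, "*") {
			tok += "*"
		}
		out = append(out, tok)
		q = q[end:]
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(addPrefixWildcards(`love "abbey road" lane`))
}
```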

* fix: clarify comments about FTS5 operator neutralization

The comments said "strip" but the code lowercases operators to
neutralize them (FTS5 operators are case-sensitive). Updated comments
to accurately describe the behavior.
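
As an illustration of neutralizing by lowercasing, here is a small sketch. The function name `neutralizeOperators` and the exact keyword list are assumptions, not taken from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// neutralizeOperators is a hypothetical sketch: FTS5 only treats the
// uppercase keywords as operators, so lowercasing them turns them back
// into ordinary search terms.
func neutralizeOperators(q string) string {
	words := strings.Fields(q)
	for i, w := range words {
		switch w {
		case "AND", "OR", "NOT", "NEAR":
			words[i] = strings.ToLower(w)
		}
	}
	return strings.Join(words, " ")
}

func main() {
	fmt.Println(neutralizeOperators("rock AND roll"))
}
```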

* fix: use fmt.Sprintf for FTS5 phrase placeholders

The previous encoding used rune('0'+index) which silently breaks with
10+ quoted phrases. Use fmt.Sprintf for arbitrary index support.
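
To show why the single-rune encoding breaks, here is a small comparison. The `PHRASE` prefix and both function names are made up for illustration; only the `rune('0'+index)` versus `fmt.Sprintf` contrast comes from the description above.

```go
package main

import "fmt"

// runePlaceholder mimics the old encoding: it only yields a digit for
// indexes 0 through 9. Index 10 maps to ':' (the character after '9').
func runePlaceholder(index int) string {
	return "PHRASE" + string(rune('0'+index))
}

// sprintfPlaceholder formats the full decimal index, so any count works.
func sprintfPlaceholder(index int) string {
	return fmt.Sprintf("PHRASE%d", index)
}

func main() {
	fmt.Println(runePlaceholder(9), sprintfPlaceholder(9))
	fmt.Println(runePlaceholder(10), sprintfPlaceholder(10))
}
```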

* fix: validate and normalize SearchBackend config option

Normalize the value to lowercase and fall back to "fts" with a log
warning for unrecognized values. This prevents silent misconfiguration
from typos like "FTS", "Legacy", or "fts5".
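
A minimal sketch of this normalization, assuming the only valid values are "fts" and "legacy". The function name and log message are hypothetical.

```go
package main

import (
	"fmt"
	"log"
	"strings"
)

// normalizeSearchBackend is a hypothetical sketch: it lowercases the value
// and falls back to "fts" (with a warning) when the value is not recognized.
func normalizeSearchBackend(value string) string {
	v := strings.ToLower(value)
	switch v {
	case "fts", "legacy":
		return v
	default:
		log.Printf("unrecognized search backend %q, falling back to \"fts\"", value)
		return "fts"
	}
}

func main() {
	for _, in := range []string{"FTS", "Legacy", "fts5"} {
		fmt.Printf("%q -> %q\n", in, normalizeSearchBackend(in))
	}
}
```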

* refactor: improve documentation for build tags and FTS5 requirements

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: convert FTS5 query and search backend normalization tests to DescribeTable format

Signed-off-by: Deluan <deluan@navidrome.org>

* fix: add sqlite_fts5 build tag to golangci configuration

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: add UISearchDebounceMs configuration option and update related components

Signed-off-by: Deluan <deluan@navidrome.org>

* fix: fall back to legacy search when SearchFullString is enabled

FTS5 is token-based and cannot match substrings within words, so
getSearchExpr now returns legacySearchExpr when SearchFullString
is true, regardless of SearchBackend setting.
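
The selection rule can be sketched like this. The `searchOptions` type and `chooseSearchBackend` function are illustrative stand-ins, not the real getSearchExpr, which returns an expression function rather than a string.

```go
package main

import "fmt"

// searchOptions is a hypothetical stand-in for the search configuration.
type searchOptions struct {
	Backend    string // "fts" or "legacy"
	FullString bool   // substring matching inside words
}

// chooseSearchBackend sketches the rule above: substring matching needs the
// legacy LIKE-based search, so FullString overrides the Backend setting.
func chooseSearchBackend(o searchOptions) string {
	if o.FullString || o.Backend == "legacy" {
		return "legacy"
	}
	return "fts"
}

func main() {
	fmt.Println(chooseSearchBackend(searchOptions{Backend: "fts", FullString: true}))
	fmt.Println(chooseSearchBackend(searchOptions{Backend: "fts"}))
}
```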

* fix: add sqlite_fts5 build tag to CI pipeline and Dockerfile

* fix: add WHEN clauses to FTS5 AFTER UPDATE triggers

Added WHEN clauses to the media_file_fts_au, album_fts_au, and
artist_fts_au triggers so they only fire when FTS-indexed columns
actually change. Previously, every row update (e.g., play count, rating,
starred status) triggered an unnecessary delete+insert cycle in the FTS
shadow tables. The WHEN clauses use IS NOT for NULL-safe comparison of
each indexed column, avoiding FTS index churn for non-indexed updates.

* feat: add SearchBackend configuration option to data and insights components

Signed-off-by: Deluan <deluan@navidrome.org>

* fix: enhance input sanitization for FTS5 by stripping additional punctuation and special characters

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: add search_normalized column for punctuated name search (R.E.M., AC/DC)

Add index-time normalization and query-time single-letter collapsing to
fix FTS5 search for punctuated names. A new search_normalized column
stores concatenated forms of punctuated words (e.g., "R.E.M." → "REM",
"AC/DC" → "ACDC") and is indexed in FTS5 tables. At query time, runs of
consecutive single letters (from dot-stripping) are collapsed into OR
expressions like ("R E M" OR REM*) to match both the original tokens and
the normalized form. This enables searching by "R.E.M.", "REM", "AC/DC",
"ACDC", "A-ha", or "Aha" and finding the correct results.

* refactor: simplify isSingleUnicodeLetter to avoid []rune allocation

Use utf8.DecodeRuneInString to check for a single Unicode letter
instead of converting the entire string to a []rune slice.
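
One way to write such a check is sketched below. The function body is an assumption about how the approach could look, not a copy of the source.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isSingleUnicodeLetter reports whether s consists of exactly one rune and
// that rune is a letter. It decodes only the first rune, so no []rune slice
// is allocated.
func isSingleUnicodeLetter(s string) bool {
	r, size := utf8.DecodeRuneInString(s)
	// size == 0 means s is empty; size == len(s) means the first rune
	// spans the whole string. Invalid bytes decode to utf8.RuneError,
	// which is not a letter.
	return size > 0 && size == len(s) && unicode.IsLetter(r)
}

func main() {
	for _, s := range []string{"a", "\u00e9", "ab", "1", ""} {
		fmt.Printf("%q -> %v\n", s, isSingleUnicodeLetter(s))
	}
}
```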

* feat: define ftsSearchColumns for flexible FTS5 search column inclusion

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: update collapseSingleLetterRuns to return quoted phrases for abbreviations

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: implement extractPunctuatedWords to handle artist/album names with embedded punctuation

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: rework punctuated word handling to improve processing of artist/album names

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: add CJK support for search queries with LIKE filters

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: enhance FTS5 search by adding album version support and CJK handling

Signed-off-by: Deluan <deluan@navidrome.org>

* refactor: change search configuration to use structured options

Signed-off-by: Deluan <deluan@navidrome.org>

* feat: enhance search functionality to support punctuation-only queries and update related tests

Signed-off-by: Deluan <deluan@navidrome.org>

---------

Signed-off-by: Deluan <deluan@navidrome.org>
2026-02-21 17:52:42 -05:00

238 lines
7.8 KiB
Go

package persistence
import (
"context"
"strings"
"github.com/Masterminds/squirrel"
"github.com/deluan/rest"
"github.com/navidrome/navidrome/conf"
"github.com/navidrome/navidrome/conf/configtest"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
var _ = Describe("sqlRestful", func() {
Describe("parseRestFilters", func() {
var r sqlRepository
var options rest.QueryOptions
BeforeEach(func() {
r = sqlRepository{}
})
It("returns nil if filters is empty", func() {
options.Filters = nil
Expect(r.parseRestFilters(context.Background(), options)).To(BeNil())
})
It(`returns an empty result when the filter uses legacySearchExpr("'")`, func() {
DeferCleanup(configtest.SetupConfig())
conf.Server.Search.Backend = "legacy"
r.filterMappings = map[string]filterFunc{
"name": fullTextFilter("table"),
}
options.Filters = map[string]any{"name": "'"}
Expect(r.parseRestFilters(context.Background(), options)).To(BeEmpty())
})
It("does not add nil filters", func() {
r.filterMappings = map[string]filterFunc{
"name": func(string, any) squirrel.Sqlizer {
return nil
},
}
options.Filters = map[string]any{"name": "joe"}
Expect(r.parseRestFilters(context.Background(), options)).To(BeEmpty())
})
It("returns a '=' condition for 'id' filter", func() {
options.Filters = map[string]any{"id": "123"}
Expect(r.parseRestFilters(context.Background(), options)).To(Equal(squirrel.And{squirrel.Eq{"id": "123"}}))
})
It("returns an 'in' condition for multiple 'id' values", func() {
options.Filters = map[string]any{"id": []string{"123", "456"}}
Expect(r.parseRestFilters(context.Background(), options)).To(Equal(squirrel.And{squirrel.Eq{"id": []string{"123", "456"}}}))
})
It("returns a 'like' condition for other filters", func() {
options.Filters = map[string]any{"name": "joe"}
Expect(r.parseRestFilters(context.Background(), options)).To(Equal(squirrel.And{squirrel.Like{"name": "joe%"}}))
})
It("uses the custom filter", func() {
r.filterMappings = map[string]filterFunc{
"test": func(field string, value any) squirrel.Sqlizer {
return squirrel.Gt{field: value}
},
}
options.Filters = map[string]any{"test": 100}
Expect(r.parseRestFilters(context.Background(), options)).To(Equal(squirrel.And{squirrel.Gt{"test": 100}}))
})
})
Describe("fullTextFilter function", func() {
var filter filterFunc
var tableName string
var mbidFields []string
BeforeEach(func() {
DeferCleanup(configtest.SetupConfig())
conf.Server.Search.Backend = "legacy"
tableName = "test_table"
mbidFields = []string{"mbid", "artist_mbid"}
filter = fullTextFilter(tableName, mbidFields...)
})
Context("when value is a valid UUID", func() {
It("returns only the mbid filter (precedence over full text)", func() {
uuid := "550e8400-e29b-41d4-a716-446655440000"
result := filter("search", uuid)
expected := squirrel.Or{
squirrel.Eq{"test_table.mbid": uuid},
squirrel.Eq{"test_table.artist_mbid": uuid},
}
Expect(result).To(Equal(expected))
})
It("falls back to full text when no mbid fields are provided", func() {
noMbidFilter := fullTextFilter(tableName)
uuid := "550e8400-e29b-41d4-a716-446655440000"
result := noMbidFilter("search", uuid)
// mbidExpr with no fields returns nil, so cmp.Or falls back to the legacy search expression (legacySearchExpr)
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% 550e8400-e29b-41d4-a716-446655440000%"},
}
Expect(result).To(Equal(expected))
})
})
Context("when value is not a valid UUID", func() {
It("returns full text search condition only", func() {
result := filter("search", "beatles")
// mbidExpr returns nil for non-UUIDs, so the legacySearchExpr result is returned directly
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% beatles%"},
}
Expect(result).To(Equal(expected))
})
It("handles multi-word search terms", func() {
result := filter("search", "the beatles abbey road")
// Should return And condition directly
andCondition, ok := result.(squirrel.And)
Expect(ok).To(BeTrue())
Expect(andCondition).To(HaveLen(4))
// Check that all words are present (order may vary)
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% the%"}))
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% beatles%"}))
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% abbey%"}))
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% road%"}))
})
})
Context("when SearchFullString config changes behavior", func() {
It("uses a leading-space separator with SearchFullString=false", func() {
conf.Server.Search.FullString = false
result := filter("search", "test query")
andCondition, ok := result.(squirrel.And)
Expect(ok).To(BeTrue())
Expect(andCondition).To(HaveLen(2))
// Check that all words are present with leading space (order may vary)
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% test%"}))
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "% query%"}))
})
It("uses no separator with SearchFullString=true", func() {
conf.Server.Search.FullString = true
result := filter("search", "test query")
andCondition, ok := result.(squirrel.And)
Expect(ok).To(BeTrue())
Expect(andCondition).To(HaveLen(2))
// Check that all words are present without leading space (order may vary)
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "%test%"}))
Expect(andCondition).To(ContainElement(squirrel.Like{"test_table.full_text": "%query%"}))
})
})
Context("edge cases", func() {
It("returns nil for empty string", func() {
result := filter("search", "")
Expect(result).To(BeNil())
})
It("returns nil for string with only whitespace", func() {
result := filter("search", " ")
Expect(result).To(BeNil())
})
It("handles special characters that are sanitized", func() {
result := filter("search", "don't")
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% dont%"}, // str.SanitizeStrings removes quotes
}
Expect(result).To(Equal(expected))
})
It("returns nil for single quote (SQL injection protection)", func() {
result := filter("search", "'")
Expect(result).To(BeNil())
})
It("handles mixed case UUIDs", func() {
uuid := "550E8400-E29B-41D4-A716-446655440000"
result := filter("search", uuid)
// Should return only mbid filter (uppercase UUID should work)
expected := squirrel.Or{
squirrel.Eq{"test_table.mbid": strings.ToLower(uuid)},
squirrel.Eq{"test_table.artist_mbid": strings.ToLower(uuid)},
}
Expect(result).To(Equal(expected))
})
It("handles invalid UUID format gracefully", func() {
result := filter("search", "550e8400-invalid-uuid")
// Should return full text filter since UUID is invalid
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% 550e8400-invalid-uuid%"},
}
Expect(result).To(Equal(expected))
})
It("handles empty mbid fields array", func() {
emptyMbidFilter := fullTextFilter(tableName, []string{}...)
result := emptyMbidFilter("search", "test")
// mbidExpr with empty fields returns nil, so cmp.Or falls back to the legacy search expression (legacySearchExpr)
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% test%"},
}
Expect(result).To(Equal(expected))
})
It("converts value to lowercase before processing", func() {
result := filter("search", "TEST")
// The function converts to lowercase internally
expected := squirrel.And{
squirrel.Like{"test_table.full_text": "% test%"},
}
Expect(result).To(Equal(expected))
})
})
})
})