refactor: sitemap throttle + SEF URLs (#100) and batch cursor pagination (#106) #116

Merged
jmiller merged 1 commit from feat/sitemap-batch-redesign into dev 2026-06-29 17:18:12 +00:00
Owner

Implements the two deferred redesigns. 4 files, all php -l clean.

#106 — Batch cursor pagination

BatchController::process() previously always queried offset 0 and relied on IS NULL to skip rows that were already done. A row that failed to insert therefore stayed eligible and was re-fetched on every chunk, indefinitely; the created > 0 stop-guard masked the problem by halting the batch early.

Now it paginates by id cursor (WHERE c.id > :lastId ORDER BY c.id):

  • Failed rows fall behind the cursor and are never revisited → the loop is provably terminating
  • process() returns examined + last_id; the editor JS passes the cursor each chunk and stops when a chunk examines 0 rows (reaches 100% even with failures)
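
The cursor behaviour described above can be sketched in plain PHP. This is a minimal illustration, not the actual BatchController code: processChunk() and runBatch() are hypothetical names, an in-memory array stands in for the database table, and the IS NULL eligibility filter is left out so the example stays self-contained. Only the examined and last_id return keys come from the PR description.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for one process() call. $rows simulates the table
 * (id => payload); the real code is described as running
 * WHERE c.id > :lastId ORDER BY c.id against the database.
 */
function processChunk(array $rows, int $lastId, int $limit, callable $insert): array
{
    ksort($rows); // ORDER BY c.id
    $examined = 0;
    $created  = 0;

    foreach ($rows as $id => $payload) {
        if ($id <= $lastId) {
            continue; // WHERE c.id > :lastId
        }
        if ($examined === $limit) {
            break; // chunk size reached
        }
        $examined++;
        $lastId = $id; // the cursor advances even when the insert fails
        if ($insert($id, $payload)) {
            $created++;
        }
    }

    return ['examined' => $examined, 'created' => $created, 'last_id' => $lastId];
}

/** Driver loop, mirroring what the editor JS is described as doing. */
function runBatch(array $rows, int $limit, callable $insert): array
{
    $lastId  = 0;
    $chunks  = 0;
    $created = 0;

    do {
        $result   = processChunk($rows, $lastId, $limit, $insert);
        $lastId   = $result['last_id'];
        $created += $result['created'];
        $chunks++;
    } while ($result['examined'] > 0); // stop when a chunk examines 0 rows

    return ['chunks' => $chunks, 'created' => $created, 'last_id' => $lastId];
}

// Row 3 always fails to insert; the batch still walks past it and finishes.
$rows    = [1 => 'a', 2 => 'b', 3 => 'c', 4 => 'd', 5 => 'e'];
$summary = runBatch($rows, 2, fn (int $id, string $p): bool => $id !== 3);
```

Because the cursor only moves forward and the set of ids is finite, each chunk either examines at least one new row or examines none and ends the loop. That is the termination argument in miniature.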

#100 — Sitemap throttle + SEF URLs

(The security part — public-access filtering + atomic write — shipped earlier this cycle. These are the remaining two items.)

  • Throttle: onContentAfterSaveRebuildSitemap regenerates at most once per 60s (SITEMAP_MIN_INTERVAL), so bulk edits/imports don't rebuild the whole sitemap on every save. The sitemap is eventually consistent within that window.
  • SEF URLs: each article is routed via Route::link('site', …) so the sitemap matches the canonical URLs the plugin emits, with a try/catch fallback to the non-SEF index.php URL if routing fails. Worst case is today's behavior; a broken URL is never emitted.
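
As a rough illustration of the throttle rule, the check reduces to comparing the current time with the time of the last rebuild. The sketch below is an assumption about its shape, not the plugin's code: shouldRebuildSitemap() is a hypothetical helper, and how the plugin actually records the last rebuild time is not stated in this PR. Only the 60s value and the SITEMAP_MIN_INTERVAL name come from the description.

```php
<?php
declare(strict_types=1);

const SITEMAP_MIN_INTERVAL = 60; // seconds, per the PR description

/**
 * Hypothetical helper: decide whether a save event should trigger a rebuild.
 * $lastBuiltAt is null when no sitemap has been generated yet.
 */
function shouldRebuildSitemap(?int $lastBuiltAt, int $now, int $minInterval = SITEMAP_MIN_INTERVAL): bool
{
    if ($lastBuiltAt === null) {
        return true; // nothing to throttle against
    }

    return ($now - $lastBuiltAt) >= $minInterval;
}

// Simulate a bulk import: five saves spread over 90 seconds.
$lastBuiltAt = null;
$rebuilds    = 0;

foreach ([0, 10, 59, 60, 90] as $offset) {
    $now = 1000000 + $offset; // arbitrary base timestamp for the simulation
    if (shouldRebuildSitemap($lastBuiltAt, $now)) {
        $lastBuiltAt = $now;
        $rebuilds++;
    }
}
```

In this simulation only the saves at offsets 0 and 60 rebuild the sitemap; the other three fall inside the window and are skipped.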

⚠️ Verification note

Verification was static only (php -l). Two things warrant a runtime check on Joomla 6: that Route::link('site', …) produces correct SEF URLs from the admin/save context (otherwise the fallback kicks in), and that the cursor batch reaches 100% on a site with eligible articles. Both are designed to fail safe.
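
The routing fail-safe mentioned here could look roughly like the following. This is a sketch under stated assumptions rather than the merged code: articleUrl() is a hypothetical name, the router is injected as a callable so the example runs without Joomla (in the plugin that role is played by Route::link('site', …)), the non-SEF query string is the conventional com_content form and is assumed, and treating an empty routed result as a failure is an extra assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: return the routed (SEF) URL for an article, or the
 * non-SEF index.php URL if routing throws or yields nothing usable.
 * $router is a stand-in for the real Joomla routing call.
 */
function articleUrl(int $id, int $catid, callable $router): string
{
    // Assumed non-SEF form; the PR only says "non-SEF index.php URL".
    $nonSef = 'index.php?option=com_content&view=article&id=' . $id . '&catid=' . $catid;

    try {
        $sef = $router($nonSef);

        // Assumption: an empty or non-string result also counts as a failure.
        return (is_string($sef) && $sef !== '') ? $sef : $nonSef;
    } catch (\Throwable $e) {
        return $nonSef; // worst case: the pre-change behaviour
    }
}

// Routing succeeds: the SEF URL is used.
$routed = articleUrl(42, 8, fn (string $url): string => '/blog/42-hello');

// Routing throws: the non-SEF URL is used instead.
$fallback = articleUrl(42, 8, function (string $url): string {
    throw new \RuntimeException('router unavailable in this context');
});
```

Injecting the router also gives the runtime check a seam: the fallback path can be exercised without having to break routing on a live site.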

Remaining minor follow-up: per-language sitemaps (optional).

Closes #100, #106.

🤖 Generated with Claude Code

jmiller added 1 commit 2026-06-29 17:17:59 +00:00
refactor: sitemap throttle + SEF URLs (#100) and batch cursor pagination (#106)
Generic: Project CI / Lint & Validate (pull_request) Successful in 13s
Universal: PR Check / Branch Policy (pull_request) Successful in 1s
Universal: PR Check / Secret Scan (pull_request) Successful in 6s
Universal: PR Check / Validate PR (pull_request) Failing after 5s
Branch Cleanup / Delete merged branch (pull_request) Failing after 1s
RC Revert / Rename rc/ back to dev/ (pull_request) Has been skipped
Joomla: Metadata Validation / Validate Joomla Metadata (pull_request) Failing after 10s
Generic: Project CI / Tests (pull_request) Has been cancelled
Universal: PR Check / Build RC Package (pull_request) Has been cancelled
Universal: PR Check / Report Issues (pull_request) Has been cancelled
7532446e46
#106 — BatchController now paginates by id cursor (WHERE c.id > :lastId)
instead of always querying offset 0. A row that fails to insert falls behind
the cursor and is not re-fetched, so the batch always terminates and reaches
100% even with persistent failures. process() returns examined + last_id; the
editor JS drives the cursor and stops when a chunk examines 0 rows.

#100 — Sitemap:
- Throttle: regenerate at most once per 60s on content save (SITEMAP_MIN_INTERVAL)
  so bulk edits/imports don't rebuild the whole file every save
- SEF URLs: route each article via Route::link('site', ...) with a fallback to
  the non-SEF index.php URL if routing fails (worst case = prior behavior)
(access-level filtering + atomic write were done earlier in the cycle)
jmiller merged commit cb0e5596ea into dev 2026-06-29 17:18:12 +00:00
jmiller deleted branch feat/sitemap-batch-redesign 2026-06-29 17:18:12 +00:00
Reference: MokoConsulting/MokoSuiteOpenGraph#116