path: root/src
Commit message (Author, Age, Files, Lines -/+)
* add children_headings to document abstraction (Ralph Amissah, 5 days, 3 files, -12/+22)
| Add int[] children_headings field to DocObj_MetaInfo_ and compute it in
| the post-processing pass of metadoc_from_src.d, right after
| last_descendant_ocn. Single O(n) pass builds a parent_ocn -> child
| heading OCNs map, then assigns to each heading object. Useful for
| tree-structured output.
|
| The .ssp serializer now reads directly from the abstraction field
| instead of pre-computing its own map.
|
| metadoc_object_setter.d:   +1 line (field declaration)
| metadoc_from_src.d:        +17 lines (computation)
| create_abstraction_txt.d:  -10 lines (simplified)
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
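The single-pass map described in this commit can be sketched roughly as follows. This is an illustrative Python sketch, not the D code in metadoc_from_src.d; the dict keys (`ocn`, `is_a`, `parent_ocn`) are stand-ins for the abstraction fields, and the sample OCNs are borrowed from the `.children: 14 24 42 57` example elsewhere in this log.

```python
from collections import defaultdict

def children_headings(objects):
    """Map each heading OCN to the OCNs of its direct sub-headings.

    `objects` is a list of dicts in document order. The keys used here
    are hypothetical stand-ins for the abstraction's per-object fields.
    """
    by_parent = defaultdict(list)
    # One pass: group heading OCNs under their parent OCN.
    for obj in objects:
        if obj["is_a"] == "heading":
            by_parent[obj["parent_ocn"]].append(obj["ocn"])
    # Assignment step: every heading gets its (possibly empty) child list.
    return {
        obj["ocn"]: by_parent.get(obj["ocn"], [])
        for obj in objects
        if obj["is_a"] == "heading"
    }

# Hypothetical sample: one chapter heading with four sub-headings.
sample = [
    {"ocn": 10, "is_a": "heading", "parent_ocn": 0},
    {"ocn": 11, "is_a": "para", "parent_ocn": 10},
    {"ocn": 14, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 24, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 42, "is_a": "heading", "parent_ocn": 10},
    {"ocn": 57, "is_a": "heading", "parent_ocn": 10},
]
tree = children_headings(sample)
```

Because objects are visited in document order, each child list comes out in document order without any sorting.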
* add --pod2 flag, decouple --show-abstraction from --pod (Ralph Amissah, 5 days, 2 files, -31/+45)
| Finer-grained control over when .ssp files are produced:
|
|   --show-abstraction  writes .ssp to OUTPUT/lang/abstraction/
|                       independently of any pod flag
|   --pod               builds pod without .ssp bundled
|   --pod2              builds pod with .ssp in media/abstraction/
|
| Changes to spine.d:
| - show_abstraction() now only responds to its own flag and pod2,
|   no longer triggered by source_or_pod
| - Add pod2 to opts init, getopt, OptActions
| - pod() returns true for both --pod and --pod2
| - source_or_pod() includes pod2
|
| Changes to source_pod.d:
| - Remove per-document pod directory (rmdirRecurse) before
|   regeneration, ensuring clean slate on every run. This prevents
|   stale content from previous runs (e.g. a --pod2 run followed by
|   --pod would otherwise leave an outdated media/abstraction/
|   directory)
| - Gate abstraction directory creation and .ssp bundling on pod2
|   flag specifically
|
| Tested: --pod (no .ssp), --pod2 (.ssp in pod + zip),
| --show-abstraction (standalone .ssp), --pod after --pod2 (stale
| abstraction cleaned up). All 35 sample documents pass.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
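The flag relationships listed in this commit can be restated as a small truth-table model. The Python below is only a sketch of the described behaviour, not the D code in spine.d; the class and attribute names are hypothetical, and `bundle_ssp()` is an invented helper standing in for the pod2 gate in source_pod.d.

```python
from dataclasses import dataclass

@dataclass
class Opts:
    # Hypothetical attribute names mirroring the command-line flags.
    show_abstraction_flag: bool = False  # --show-abstraction
    source: bool = False                 # --source
    pod_flag: bool = False               # --pod
    pod2: bool = False                   # --pod2

    def show_abstraction(self) -> bool:
        # Responds only to its own flag and to pod2.
        return self.show_abstraction_flag or self.pod2

    def pod(self) -> bool:
        # True for both --pod and --pod2.
        return self.pod_flag or self.pod2

    def source_or_pod(self) -> bool:
        # Includes pod2.
        return self.source or self.pod_flag or self.pod2

    def bundle_ssp(self) -> bool:
        # Invented helper: .ssp is bundled into the pod only for --pod2.
        return self.pod2

plain_pod = Opts(pod_flag=True)
pod_with_ssp = Opts(pod2=True)
standalone = Opts(show_abstraction_flag=True)
```

With this model, `--pod` alone builds a pod and produces no .ssp, `--pod2` does both, and `--show-abstraction` alone writes the .ssp without building a pod.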
* .ssp: omit empty-value array property entries (Ralph Amissah, 5 days, 1 file, -3/+6)
| Add empty-string guards to array property loops (.stow_link,
| .lev4_subtoc, .anchor_tag) so entries with zero-length values are
| not emitted.
|
| Empty properties have no value for PEG parsing - absent lines are
| faster to skip than matching a property name to find an empty value.
|
| Removes 1488 empty .anchor_tag: lines from Wealth of Networks .ssp
| alone.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
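The guard amounts to skipping zero-length values inside the emit loop. A minimal Python sketch, assuming one `.name: value` line per array entry (as the `.anchor_tag:` example suggests); the function name is hypothetical and the real serializer is written in D:

```python
def emit_array_property(name, values):
    """Return one '.name: value' line per non-empty value."""
    return [f".{name}: {value}" for value in values if value != ""]

lines = emit_array_property("anchor_tag", ["", "intro", ""])
```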
* .ssp: add .children property for heading tree navigation (Ralph Amissah, 5 days, 1 file, -0/+19)
| - Add explicit child heading OCN lists to heading objects,
|   pre-computed in a single O(n) pass over the body section before
|   serialization. This makes the document tree directly navigable
|   without scanning - each heading lists its direct sub-heading OCNs.
|
| - Example output for a chapter heading:
|     [10] heading :1
|     .last_descendant: 65
|     .children: 14 24 42 57
|
| - Implementation: builds an int[][int] map (parent_ocn -> child
|   heading OCNs) from one pass over the body objects, then emits
|   .children: during serialization for headings that have entries in
|   the map.
|
| - The tree was already reconstructable from parent_ocn +
|   last_descendant_ocn, but .children makes it immediate - no
|   scanning required to find a heading's sub-structure.
|
| - Tested against all 35 sample documents - zero failures.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* .ssp serializer: include all ObjGenericComposite fields (Ralph Amissah, 5 days, 1 file, -9/+120)
| - Make the .ssp format a complete representation of the document
|   abstraction by serializing all remaining fields from
|   ObjGenericComposite (only omitting ptr.* runtime indices which are
|   meaningless outside the in-memory context).
|
| - New fields added:
|     .ancestors_collapsed:      - collapsed level ancestor chain
|     .dom_status:               - DOM structure markedup tags status[8]
|     .dom_status_collapsed:     - DOM structure collapsed status[8]
|     .heading_lev_collapsed:    - collapsed heading level
|     .parent_lev:               - parent heading level (markup)
|     .o_n_type:                 - object numbering type (0=ocn, 1=non, 2=bkidx)
|     .is_of_type:               - para/block type classification
|     .attrib:                   - general attributes string
|     .meta_lang:                - block language (group/block/quote)
|     .meta_syntax:              - codeblock syntax from metainfo
|     .sha256:                   - hex-encoded SHA-256 digest of object content
|     .has: images_no_dim        - image without dimensions flag
|     .table_aligns:             - column alignment array
|     .table_walls:              - table walls/borders flag
|     .stow_link:                - extracted URLs (one per line)
|     .heading_lev_anchor:       - heading level anchor tag
|     .segment_epub:             - EPUB segment anchor tag
|     .heading_ancestors_text:   - pipe-separated ancestor headings
|     .lev4_subtoc:              - sub-table-of-contents entries (one per line)
|     .anchor_tag:               - additional anchor tags (one per line)
|
| - Tested against all 35 sample documents - zero failures.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* .ssp serializer: omit identifier when it equals OCN (Ralph Amissah, 5 days, 1 file, -3/+6)
| - For heading objects, the identifier was always emitted on the
|   declaration line (e.g. "[10] heading :1 10") even when it was just
|   the OCN repeated. Now only emits the identifier when it differs
|   from the OCN (i.e. when there is a named segment like
|   "acknowledgments" or "a1"), reducing redundancy.
|
|   Before: [10] heading :1 10
|   After:  [10] heading :1
|
|   Named segments still appear: [0] heading :1 a1
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
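The before/after behaviour can be reproduced with a small formatter. This Python sketch is an assumption-laden illustration rather than the D serializer: the function name is hypothetical, and reading the `:1` token as a heading level is a guess from the examples above.

```python
def heading_declaration(ocn, level, identifier):
    """Build a heading declaration line, adding the identifier only
    when it is not simply the OCN repeated."""
    line = f"[{ocn}] heading :{level}"
    if identifier and identifier != str(ocn):
        line += f" {identifier}"
    return line

numbered = heading_declaration(10, 1, "10")  # identifier repeats the OCN
named = heading_declaration(0, 1, "a1")      # named segment
```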
* include .ssp document abstraction in source pod (Ralph Amissah, 5 days, 3 files, -1/+51)
| - When --source/--pod is used, automatically generate the .ssp
|   document abstraction and bundle it into the pod at
|   media/abstraction/{doc_uid}.{lang}.ssp
|
| - This makes show_abstraction implicitly true when source_or_pod is
|   active, so the .ssp file is generated before the pod assembler
|   runs (abstraction runs before outputHub, and source_or_pod is the
|   first task in outputHub).
|
| - Changes:
|
|   paths_source.d:
|     Add abstraction_root() path helper to _PodPaths struct,
|     following the same pattern as image_root(). Produces paths like
|     pod/media/abstraction/ for both zpod (inside zip) and
|     filesystem_open_zpod (open directory).
|
|   source_pod.d:
|     - Create media/abstraction/ directory in
|       podArchive_directory_tree
|     - Bundle .ssp file in pod_zipMakeReady: reads from the
|       abstraction output directory, copies to open pod directory,
|       adds to zip archive, computes SHA-256 digest
|     - Write .ssp digest in zipArchiveDigest alongside sstm and ssi
|       digests
|
|   spine.d:
|     Make show_abstraction() return true when source_or_pod is
|     active (previously only returned true for explicit
|     --show-abstraction flag).
|
| - The .ssp is always included when building pods - no exclusion flag
|   for this experimental feature to keep things simple. Not generated
|   for non-pod outputs (--text, --html, etc.) unless
|   --show-abstraction is explicitly passed.
|
| - Tested against all 35 sample documents - zero failures.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
* document abstraction as per document sqlite db (Ralph Amissah, 5 days, 2 files, -0/+373)
| --show-abstraction-db flag to write per-document SQLite database of
| document abstraction (Claude-Code primary assist)
|
| - Add a new output mode that serializes the in-memory document
|   abstraction to a per-document SQLite database. This complements
|   the .ssp text format (--show-abstraction) with a queryable
|   database representation of the same data.
|
| - Schema:
|     metadata table - key/value pairs for document metadata
|       (title, creator, dates, rights, classify, identifiers,
|       language, notes, make settings, doc_has counts)
|     objects table - one row per document object with columns:
|       section, seq (position within section), ocn, is_a,
|       is_of_part, is_of_type, heading_level, identifier,
|       parent_ocn, last_descendant_ocn, ancestors,
|       indent/bullet/lang, has_* flags, segment/anchor tags,
|       table/code properties, text content
|     Indexed on: section, ocn, parent_ocn, is_a, heading_level
|
| - Uses prepared statements via d2sqlite3 (existing dependency) for
|   safe and efficient insertion. Each document produces a standalone
|   .abstraction.db file in the abstraction/ output directory.
|
| - New files:
|   src/sisudoc/io_out/create_abstraction_db.d
|     Follows the same pattern as create_abstraction_txt.d. Creates
|     schema, populates metadata via key/value inserts, then iterates
|     all sections writing objects with prepared statements within a
|     single transaction.
|
| - Changes to spine.d:
|   - Add "show-abstraction-db" to opts init, getopt, OptActions
|   - Add to abstraction(), require_processing_files(), and
|     meta_processing_general() gates
|   - Insert call at both spineAbstraction sites
|
| - Tested against all 35 sample documents (including 9-language
|   live-manual) - zero failures. Works standalone or combined with
|   --show-abstraction and other output flags.
|
| - Example queries the database supports:
|     SELECT ocn, heading_level, text FROM objects
|       WHERE is_a = 'heading' AND section = 'body';
|     SELECT * FROM objects WHERE parent_ocn = 10;
|     SELECT key, value FROM metadata WHERE key LIKE 'title.%';
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
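To try the example queries without building spine, the schema can be approximated in an in-memory database. The Python sketch below uses the standard sqlite3 module rather than d2sqlite3 and keeps only a subset of the columns named in the commit; the column types, index names, metadata keys (`title.main`, `creator.author`) and all row values are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced schema: only a few of the columns listed in the commit message.
conn.executescript("""
CREATE TABLE metadata (key TEXT, value TEXT);
CREATE TABLE objects (
    section TEXT, seq INTEGER, ocn INTEGER, is_a TEXT,
    heading_level INTEGER, parent_ocn INTEGER, text TEXT
);
CREATE INDEX idx_objects_ocn ON objects(ocn);
CREATE INDEX idx_objects_parent_ocn ON objects(parent_ocn);
""")

# Made-up rows; parameterized inserts inside a single transaction.
meta_rows = [("title.main", "Example Title"), ("creator.author", "Example Author")]
object_rows = [
    ("body", 0, 10, "heading", 1, 0, "Chapter One"),
    ("body", 1, 11, "para", None, 10, "First paragraph."),
    ("body", 2, 14, "heading", 2, 10, "Section A"),
]
with conn:
    conn.executemany("INSERT INTO metadata VALUES (?, ?)", meta_rows)
    conn.executemany("INSERT INTO objects VALUES (?, ?, ?, ?, ?, ?, ?)", object_rows)

# The three example queries from the commit message.
headings = conn.execute(
    "SELECT ocn, heading_level, text FROM objects "
    "WHERE is_a = 'heading' AND section = 'body'"
).fetchall()
children_of_10 = conn.execute(
    "SELECT * FROM objects WHERE parent_ocn = 10"
).fetchall()
titles = conn.execute(
    "SELECT key, value FROM metadata WHERE key LIKE 'title.%'"
).fetchall()
```

None of the queries specify an ORDER BY, so callers that need document order would presumably sort on `seq` or `ocn`.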
* .ssp document abstraction as PEG parsable text (Ralph Amissah, 5 days, 2 files, -0/+322)
| --show-abstraction flag to write .ssp document abstraction files
|
| - Add a new output mode that serializes the in-memory document
|   abstraction (produced by spineAbstraction) to a human-readable,
|   line-oriented text format (.ssp). This captures the full object
|   model after parsing and abstraction but before output generation.
|
| - The .ssp format uses unambiguous line prefixes:
|     @section { }   - section boundaries (head/toc/body/endnotes/...)
|     [N] type       - object declaration with OCN
|     .name: value   - object properties (only non-defaults)
|     | content      - text content lines
|     % comment      - comments
|
| - New files:
|   src/sisudoc/io_out/create_abstraction_txt.d
|     Serializer module following the same template pattern as
|     metadoc_show_summary.d. Walks doc.abstraction() section by
|     section, writing metadata preamble (@meta, @make, @doc_has) then
|     each object with its properties and text content. Output goes to
|     {output_path}/{lang}/abstraction/{doc}.ssp
|
| - Changes to spine.d:
|   - Add "show-abstraction" to opts initialization, getopt, and
|     OptActions struct
|   - Add show_abstraction to abstraction(),
|     require_processing_files(), and meta_processing_general() so the
|     flag triggers full document processing
|   - Insert call at both spineAbstraction sites (parallel and serial
|     branches), gated by show_abstraction flag, following the same
|     pattern as show_config/show_summary/show_make
|
| - Tested against all 35 sample documents (including multilingual
|   live-manual in 9 languages) - zero failures. Works standalone
|   (--show-abstraction) or combined with other output flags
|   (--show-abstraction --html --text). No effect on existing code
|   paths when the flag is not used.
|
| Co-Authored-By: Anthropic Claude Opus 4.6 (1M context)
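Since every line kind has its own first character, a reader for the format can dispatch on the prefix alone. The Python sketch below classifies single lines under assumptions the commit does not spell out: no leading whitespace, a lone `}` closing a section, and property names made of letters, digits and underscores. It is not the PEG grammar and not part of spine.

```python
import re

# Assumed shapes, inferred from the prefix list in the commit message.
OBJ_DECL = re.compile(r"^\[(\d+)\]\s+(\S+)(?:\s+(.*))?$")
PROP = re.compile(r"^\.([A-Za-z0-9_]+):\s*(.*)$")

def classify(line):
    """Return a (kind, payload) pair for one .ssp line."""
    if line.startswith("@"):
        return ("section", line[1:].strip())
    if line.startswith("}"):
        return ("section_end", "")
    m = OBJ_DECL.match(line)
    if m:
        return ("object", (int(m.group(1)), m.group(2), m.group(3) or ""))
    m = PROP.match(line)
    if m:
        return ("property", (m.group(1), m.group(2)))
    if line.startswith("|"):
        # Drop the marker and one following space, if present.
        return ("content", line[2:] if line.startswith("| ") else line[1:])
    if line.startswith("%"):
        return ("comment", line[1:].strip())
    return ("other", line)

decl = classify("[10] heading :1")
prop = classify(".children: 14 24 42 57")
```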
* upkeep, update a few paths [tag: sisudoc-spine_v0.18.0] (Ralph Amissah, 5 days, 1 file, -10/+10)
|
* spine may be run against a zipped spine-pod url (Ralph Amissah, 14 days, 2 files, -2/+147)
| - claude contributed src
| - processes zip from url using (system installed) curl for download
* spine may be run against a document-markup zip pod (Ralph Amissah, 14 days, 2 files, -2/+457)
| - claude contributed src
| - Opens the zip with std.zip.ZipArchive (reads the whole file into
|   memory)
| - Locates pod.manifest inside the archive to discover document paths
|   and languages
| - Extracts markup files (.sst/.ssm/.ssi) as in-memory strings
| - Extracts images as in-memory byte arrays
| - Extracts conf/dr_document_make if present
| - Presents these to the existing pipeline as if they were read from
|   the filesystem
| - Some security mitigations:
|   - Zip Slip / Path Traversal: Reject entries containing `..` or
|     starting with `/`; canonicalize resolved paths and verify they
|     fall within extraction root
|   - Zip Bomb: Check `ArchiveMember.size` before extracting; enforce
|     per-file (50MB) and total size limits (500MB)
|   - Entry Count: Limit number of entries (a pod should have at most
|     ~100 files)
|   - Path depth: limit (Maximum 10 path components).
|   - Symlinks: Verify no symlinks in extracted content before
|     processing (post-extraction recursive scan)
|   - Filename Validation: Only allow expected characters; reject null
|     bytes
|   - Malformed Zips: Catch `ZipException` from `std.zip.ZipArchive`
|     constructor
|   - Cleanup on error
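Several of the listed mitigations can be expressed as checks on the archive's directory before anything is read out. The Python sketch below uses the standard zipfile module, not D's std.zip, so it is an analogue and not the contributed code. The limits are the ones quoted in the commit; the function names are hypothetical, and the symlink test reads the Unix mode stored in the zip entry instead of doing the post-extraction scan the commit describes.

```python
import io
import stat
import zipfile

MAX_ENTRIES = 100                    # "at most ~100 files"
MAX_FILE_BYTES = 50 * 1024 * 1024    # per-file limit (50MB)
MAX_TOTAL_BYTES = 500 * 1024 * 1024  # total limit (500MB)
MAX_DEPTH = 10                       # maximum path components

def check_entry_name(name):
    """Raise ValueError for names that could escape the extraction root."""
    if "\x00" in name:
        raise ValueError("null byte in entry name")
    if name.startswith("/"):
        raise ValueError("absolute path in entry name")
    parts = [p for p in name.split("/") if p]
    if ".." in parts:
        raise ValueError("parent-directory component in entry name")
    if len(parts) > MAX_DEPTH:
        raise ValueError("entry path too deep")

def validate_pod_zip(data):
    """Check a zip held in memory; return its entry names if acceptable."""
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        raise ValueError("malformed zip") from exc
    with archive:
        infos = archive.infolist()
        if len(infos) > MAX_ENTRIES:
            raise ValueError("too many entries")
        total = 0
        for info in infos:
            check_entry_name(info.filename)
            # The upper 16 bits of external_attr carry the Unix mode, if any.
            if stat.S_ISLNK(info.external_attr >> 16):
                raise ValueError("symlink entry")
            # Declared uncompressed size, checked before any extraction.
            if info.file_size > MAX_FILE_BYTES:
                raise ValueError("entry too large")
            total += info.file_size
            if total > MAX_TOTAL_BYTES:
                raise ValueError("archive too large")
        return [info.filename for info in infos]

# Build a tiny, made-up pod-like archive in memory and validate it.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("pod.manifest", "doc: example\n")
    zf.writestr("media/text/en/example.sst", "# example\n")
names = validate_pod_zip(buf.getvalue())
```

Declared sizes come from the archive's own headers, so a stricter reader would also cap the number of bytes actually decompressed per entry.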
* latex minor improvements and fixes, require testing (Ralph Amissah, 2026-04-06, 1 file, -9/+12)
| - FIXES issue with .tex files and xetex finding image paths when run
|   within latex/ output directory
* 2026 (Ralph Amissah, 2026-01-09, 48 files, -49/+49)
|
* text output, improve various (including no-ocn) (Ralph Amissah, 2025-10-14, 4 files, -36/+35)
| - revisit links (fix later)
* abstraction metainfo, provide endnote parent ocn (Ralph Amissah, 2025-10-13, 4 files, -12/+8)
| - preferable, endnote parent object number available for use (as
|   here in text output, compare "endnotes, add caller ocn" commit)
* latex quote object, quick fix (Ralph Amissah, 2025-10-08, 1 file, -2/+22)
|
* text output, endnotes, add caller ocn (& some cleaning) (Ralph Amissah, 2025-10-08, 4 files, -25/+45)
|
* a text output (and skel an outline) (Ralph Amissah, 2025-10-03, 10 files, -41/+885)
| - spine --text [--output=output path] [markup source]
* terminal output verbosity levels, minor rework (Ralph Amissah, 2025-09-25, 14 files, -107/+130)
|
* spine.d tidy (Ralph Amissah, 2025-09-23, 1 file, -109/+88)
|
* dub_describe.json + other minor misc (Ralph Amissah, 2025-09-14, 2 files, -49/+49)
|
* src/ext_deplends d-yaml updated (v0.10.0) (Ralph Amissah, 2025-08-28, 687 files, -5103/+7987)
|
* imports, make line searchable (Ralph Amissah, 2025-07-15, 29 files, -432/+359)
|
* source & pod (fix build from non-pod source) (Ralph Amissah, 2025-06-12, 2 files, -23/+26)
| - appears to work, but needs review
* org ready ldc-1.41.0-beta1; flake using ldc-1.40.1 (Ralph Amissah, 2025-04-18, 1 file, -2/+2)
| - plus minor housekeeping/tidy
* minor (Ralph Amissah, 2025-04-02, 1 file, -1/+0)
|
* sisudoc-spine upkeep, minor, a file renamed (Ralph Amissah, 2025-03-22, 2 files, -0/+1)
|
* triple single-quote marks block identifier added (Ralph Amissah, 2025-02-21, 4 files, -4/+229)
| - tics a bit cumbersome where single quotes work just as well
| - testing required (special cases not covered)
| - diverges from sisu markup which will need an update sometime
* doc (metadata & abstraction) struct follow through (Ralph Amissah, 2025-02-19, 5 files, -275/+236)
|
* document (metadata & abstraction) struct (Ralph Amissah, 2025-02-19, 8 files, -107/+100)
| - struct replaces tuple
| - some direct naming of structs returned (instead of use of auto)
| - minor
* 2025 (Ralph Amissah, 2025-01-01, 46 files, -47/+47)
|
* refactor yaml extraction code file (Ralph Amissah, 2024-12-21, 2 files, -671/+354)
|
* nix build flake.nix fix (Ralph Amissah, 2024-12-03, 2 files, -0/+2)
|
* pod zip fixes (Ralph Amissah, 2024-07-10, 4 files, -131/+120)
| - serial processing (need to be built serially)
| - multilingual pods, copy all languages before zip
* [fn].digest.txt, sha256 of pod source files & pod (Ralph Amissah, 2024-07-04, 7 files, -339/+379)
|
* markup source digest to metadata.html (Ralph Amissah, 2024-07-01, 1 file, -3/+18)
|
* markup source digests (write to terminal) (Ralph Amissah, 2024-07-01, 6 files, -9/+30)
|
* digest tuple rearrange (Ralph Amissah, 2024-06-29, 3 files, -21/+27)
|
* reduction in use of tuples (Ralph Amissah, 2024-06-29, 1 file, -67/+70)
|
* document digests and reduction in use of tuples (Ralph Amissah, 2024-06-29, 7 files, -46/+83)
|
* latex footers from document header make, a fix (Ralph Amissah, 2024-05-29, 3 files, -14/+17)
|
* update fix default home text link info (Ralph Amissah, 2024-05-17, 1 file, -6/+6)
| - used e.g. in html text home button
* dub (dlang) prefer dub run to dub build (Ralph Amissah, 2024-05-07, 2 files, -28/+0)
|
* 0.16.0 sisudoc (src/sisudoc sisudoc spine) [tag: sisudoc-spine_v0.16.0-dev] (Ralph Amissah, 2024-04-10, 47 files, -308/+308)
| - src/sisudoc (replaces src/doc_reform)
| - sisudoc spine (used more)
* 0.15.0 [tag: doc-reform_v0.15.0] (Ralph Amissah, 2024-04-10, 37 files, -74/+74)
|
* mark modules as @safe: (& identify what is not) (Ralph Amissah, 2024-03-12, 40 files, -341/+380)
|
* org, ocda (ongoing) split file, separate functions (Ralph Amissah, 2024-01-25, 4 files, -5408/+5526)
|
* org, ocda (ongoing) (Ralph Amissah, 2024-01-21, 1 file, -1641/+1558)
|
* org, ocda (ongoing) (Ralph Amissah, 2024-01-02, 1 file, -26/+13)
|