# Learnings — ProPresenter Parser

## Conventions & Patterns

(Agents will append findings here)

## Task 1: Project Scaffolding — Composer + PHPUnit + Directory Structure

### Completed

- ✅ Created PHP 8.4 project with Composer
- ✅ Configured PSR-4 autoloading for both namespaces:
  - `ProPresenter\Parser\` → `src/`
  - `Rv\Data\` → `generated/Rv/Data/`
- ✅ Installed PHPUnit 11.5.55 with google/protobuf 4.33.5
- ✅ Created `phpunit.xml` with strict settings
- ✅ Created `SmokeTest.php` that passes
- ✅ All 5 required directories created: `src/`, `tests/`, `bin/`, `proto/`, `generated/`

### Key Findings

- PHP 8.4.7 is available on the system
- Composer resolves dependencies cleanly (28 packages installed)
- PHPUnit 11 runs with strict mode enabled (`beStrictAboutOutputDuringTests`, `failOnRisky`, `failOnWarning`)
- Autoloading works correctly with both namespaces configured

### Verification Results

- Composer install: ✅ Success (28 packages)
- PHPUnit smoke test: ✅ 1 test passed
- Autoload verification: ✅ Works correctly
- Directory structure: ✅ All 5 directories present

## Task 3: RTF Plain Text Extractor (TDD)

### Completed

- ✅ `RtfExtractor::toPlainText()` static method: standalone, no external deps
- ✅ 11 PHPUnit tests all passing (TDD: RED → GREEN)
- ✅ Handles real ProPresenter CocoaRTF 2761 format

### Key RTF Patterns in ProPresenter

- **Format**: Always `{\rtf1\ansi\ansicpg1252\cocoartf2761 ...}`
- **Encoding**: Windows-1252 (`\ansicpg1252`), hex escapes `\'xx` for non-ASCII
- **Soft returns**: Single backslash `\` followed by newline = line break in text
- **Text location**: After last formatting command (often `\CocoaLigature0 `), before final `}`
- **Nested groups**: `{\fonttbl ...}`, `{\colortbl ...}`, `{\*\expandedcolortbl ...}` (must be stripped)
- **German and accented chars**: `\'fc`=ü, `\'f6`=ö, `\'e4`=ä, `\'df`=ß, `\'e9`=é, `\'e8`=è
- **Unicode**: `\uNNNN?` where NNNN is decimal codepoint, `?` is ANSI fallback (skipped)
- **Stroke formatting**: Some songs have `\outl0\strokewidth-40 \strokec3` before text
- **Translation boxes**: Same RTF structure, different font size (e.g., `\fs80` vs `\fs84`)

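
The two escape forms listed above can be decoded roughly as follows. This is a minimal sketch, not the project's `RtfExtractor`: the function name `decodeRtfEscapes` is hypothetical, it uses regular expressions for brevity rather than a character-by-character loop, it assumes `ext-mbstring` is available, and it assumes the `\uN` fallback is always a literal `?` as in the note above.

```php
<?php
declare(strict_types=1);

/**
 * Decode RTF `\'xx` hex escapes (Windows-1252 bytes) and `\uN?` escapes to UTF-8.
 * Hypothetical helper for illustration only.
 */
function decodeRtfEscapes(string $rtf): string
{
    // \'xx: one Windows-1252 byte written as two hex digits.
    $rtf = preg_replace_callback(
        "/\\\\'([0-9a-fA-F]{2})/",
        static fn (array $m): string =>
            (string) mb_convert_encoding(chr((int) hexdec($m[1])), 'UTF-8', 'Windows-1252'),
        $rtf
    ) ?? $rtf;

    // \uN with an optional "?" fallback, which is dropped.
    return preg_replace_callback(
        '/\\\\u(-?\d+)\??/',
        static function (array $m): string {
            $code = (int) $m[1];
            if ($code < 0) {
                $code += 65536; // RTF stores N as a signed 16-bit value
            }
            return (string) mb_chr($code, 'UTF-8');
        },
        $rtf
    ) ?? $rtf;
}
```

Feeding it the nowdoc string `Gr\'fc\'dfe` should yield `Grüße`, which matches the hex mappings in the list.
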
### Implementation Approach

- Character-by-character parser (not regex), which handles nested braces correctly
- Strip all `{...}` nested groups first, then process flat content
- Control words: `\word[N]` pattern, space delimiter consumed
- Non-RTF input passes through unchanged (graceful fallback)

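
A compact way to picture this approach is the sketch below. It is an illustration under assumptions, not the actual `RtfExtractor`: the name `rtfToPlainTextSketch` is hypothetical, nested groups are skipped via a depth counter in a single pass instead of a separate stripping phase, and `\'xx` and `\uN` escapes are skipped rather than decoded.

```php
<?php
declare(strict_types=1);

/** Hypothetical single-pass sketch of a character-by-character RTF text extractor. */
function rtfToPlainTextSketch(string $rtf): string
{
    // Graceful fallback: input that does not look like RTF passes through.
    if (!str_starts_with($rtf, '{\rtf')) {
        return $rtf;
    }

    $out = '';
    $depth = 0;
    $len = strlen($rtf);

    for ($i = 0; $i < $len; $i++) {
        $ch = $rtf[$i];

        if ($ch === '\\') {
            $next = $rtf[$i + 1] ?? '';
            if (ctype_alpha($next)) {
                // Control word: letters, optional signed number, optional one-space delimiter.
                $i++;
                while ($i < $len && ctype_alpha($rtf[$i])) { $i++; }
                if ($i + 1 < $len && $rtf[$i] === '-' && ctype_digit($rtf[$i + 1])) { $i++; }
                while ($i < $len && ctype_digit($rtf[$i])) { $i++; }
                if ($i >= $len || $rtf[$i] !== ' ') { $i--; } // no delimiter: re-read this char
                continue;
            }
            $i++; // control symbol: consume the character after the backslash
            if ($next === "'") { $i += 2; continue; } // \'xx: decoding omitted in this sketch
            if ($depth !== 1) { continue; }
            if ($next === "\n" || $next === "\r") {
                $out .= "\n"; // soft return
            } elseif ($next === '\\' || $next === '{' || $next === '}') {
                $out .= $next; // escaped literal
            }
            continue;
        }

        if ($ch === '{') { $depth++; continue; }
        if ($ch === '}') { $depth--; continue; }

        // Only top-level text counts; raw newlines in the RTF source are not content.
        if ($depth === 1 && $ch !== "\n" && $ch !== "\r") {
            $out .= $ch;
        }
    }

    return $out;
}
```

Handling the backslash before the braces keeps escaped `\{` and `\}` from disturbing the depth counter.
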
### Testing Gotcha

- PHP single-quoted strings: `\'` = escaped quote, NOT literal backslash-quote
- Use **nowdoc** (`<<<'RTF'`) for RTF test data with hex escapes (`\'xx`)
- Regular concatenated strings work for RTF without hex escapes (soft returns `\\` are fine)

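
The difference is easy to reproduce in isolation. The variable names and sample text below are made up for illustration:

```php
<?php
declare(strict_types=1);

// Single-quoted: \' collapses to a bare quote, so the RTF hex escape is lost.
$singleQuoted = 'Gr\'fc\'dfe';

// Nowdoc: no escape processing at all, so \'fc reaches the parser intact.
$nowdoc = <<<'RTF'
Gr\'fc\'dfe
RTF;

// Soft return without hex escapes: \\ in a single-quoted string is one backslash.
$softReturn = 'Line 1\\' . "\n" . 'Line 2';
```

Here `$singleQuoted` holds 9 bytes (`Gr'fc'dfe`), while `$nowdoc` holds 11 because both backslashes survive.
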
- 2026-03-01 task-2 proto import resolution: copied full `Proto7.16.2/` tree (including `google/protobuf/*.proto`) into `php/proto/`; imports already resolve with `--proto_path=./php/proto`, no path rewrites required.
- 2026-03-01 task-2 version extraction: `application_info.platform_version` from Test.pro = macOS 14.8.3; `application_info.application_version` = major 20, build 335544354.
- 2026-03-01 task-6 binary fidelity baseline: decode->encode byte round-trip currently yields `0/169` identical files (`168` non-empty from `all-songs` + `Test.pro`); first mismatches typically occur early (~byte offsets 700-3000), indicating systematic re-serialization differences rather than isolated corruption.

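
For the round-trip baseline, the first mismatching offset between the original bytes and the re-encoded bytes can be located with a helper along these lines. The function name `firstMismatchOffset` is hypothetical, and this is only one plausible way to measure it, not necessarily how the task-6 numbers were produced.

```php
<?php
declare(strict_types=1);

/**
 * Return the offset of the first differing byte, or null if both strings are identical.
 * If one string is a strict prefix of the other, the shorter length is returned.
 */
function firstMismatchOffset(string $original, string $reencoded): ?int
{
    if ($original === $reencoded) {
        return null;
    }

    $limit = min(strlen($original), strlen($reencoded));
    for ($i = 0; $i < $limit; $i++) {
        if ($original[$i] !== $reencoded[$i]) {
            return $i;
        }
    }

    return $limit;
}
```

In a round-trip check, the second argument would presumably be the result of `serializeToString()` on a message populated with `mergeFromString()` from the original file bytes.
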
## Task 5: Group + Arrangement Wrapper Classes (TDD)

### Completed

- ✅ `Group.php` wrapping `Rv\Data\Presentation\CueGroup`: `getUuid()`, `getName()`, `getColor()`, `getSlideUuids()`, `setName()`, `getProto()`
- ✅ `Arrangement.php` wrapping `Rv\Data\Presentation\Arrangement`: `getUuid()`, `getName()`, `getGroupUuids()`, `setName()`, `setGroupUuids()`, `getProto()`
- ✅ 30 tests (16 Group + 14 Arrangement), 74 assertions, all pass
- ✅ TDD: RED confirmed (class not found errors) → GREEN (all pass)

### Protobuf Structure Findings

- CueGroup (field 12) has TWO parts: `group` (`Rv\Data\Group` with uuid/name/color) and `cue_identifiers` (repeated UUID = slide refs)
- Arrangement (field 11) has: uuid, name, `group_identifiers` (repeated UUID = group refs, can repeat same group)
- `UUID.getString()` returns the string value; `UUID.setString()` sets it
- Color has `getRed()`/`getGreen()`/`getBlue()`/`getAlpha()` returning floats
- Group also has `hotKey`, `application_group_identifier`, `application_group_name` (not exposed in wrapper; not needed for song parsing)

### Test.pro Verified Structure

- 4 groups: Verse 1 (2 slides), Verse 2 (1 slide), Chorus (1 slide), Ending (1 slide)
- 2 arrangements: 'normal' (5 group refs), 'test2' (4 group refs)
- All groups have non-empty UUIDs
- Arrangement group UUIDs reference valid group UUIDs (cross-validated in test)

## Task 4: TextElement + Slide Wrapper Classes (TDD)

### Completed

- `TextElement.php` wraps Graphics Element: `getName()`, `hasText()`, `getRtfData()`, `setRtfData()`, `getPlainText()`
- `Slide.php` wraps Cue: `getUuid()`, `getTextElements()`, `getAllElements()`, `getPlainText()`, `hasTranslation()`, `getTranslation()`, `getCue()`
- 24 tests (10 TextElement + 14 Slide), 47 assertions, all pass
- TDD: RED confirmed then GREEN (all pass)
- Integration tests verify real Test.pro data

### Protobuf Navigation Path (Confirmed)

- Cue -> `getActions()[0]` -> `getSlide()` (oneof) -> `getPresentation()` (oneof) -> `getBaseSlide()` -> `getElements()[]`
- Slide Element -> `getElement()` -> Graphics Element
- Graphics Element -> `getName()` (user-defined label), `hasText()`, `getText()` -> Graphics Text -> `getRtfData()`
- Elements WITHOUT text (shapes, media) have `hasText() === false`, must be filtered

### Key Design Decisions

- TextElement wraps Graphics Element (not Slide Element) for clean text-focused API
- Slide wraps Cue (not PresentationSlide) because UUID is on the Cue
- Translation = second text element (index 1); no label detection needed
- Lazy caching: textElements/allElements computed once per instance
- Test.pro path from tests: `dirname(__DIR__, 2) . '/ref/Test.pro'` (2 levels up from `php/tests/`)

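
The filtering, lazy-caching, and translation rules fit together roughly as in the sketch below. `ElementStub` and `SlideSketch` are hypothetical stand-ins written for this note; the real classes wrap generated protobuf messages, and the `filterRuns` counter exists only to make the caching visible.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in for a wrapped graphics element. */
final class ElementStub
{
    public function __construct(private ?string $text) {}

    public function hasText(): bool { return $this->text !== null; }

    public function getPlainText(): string { return $this->text ?? ''; }
}

/** Sketch of the caching and translation rules; not the project's Slide class. */
final class SlideSketch
{
    /** @var list<ElementStub>|null */
    private ?array $textElements = null;

    /** Counts how often the filter actually ran (illustration only). */
    public int $filterRuns = 0;

    /** @param list<ElementStub> $elements */
    public function __construct(private array $elements) {}

    /** @return list<ElementStub> */
    public function getTextElements(): array
    {
        if ($this->textElements === null) {
            $this->filterRuns++;
            // array_values() reindexes, so index 1 is always the second TEXT element.
            $this->textElements = array_values(array_filter(
                $this->elements,
                static fn (ElementStub $e): bool => $e->hasText()
            ));
        }

        return $this->textElements;
    }

    public function hasTranslation(): bool
    {
        return count($this->getTextElements()) > 1;
    }

    public function getTranslation(): ?ElementStub
    {
        return $this->getTextElements()[1] ?? null;
    }
}
```

Reindexing after the filter matters: without it, a shape at position 0 would leave the text elements at keys 1 and 2, and the index-1 rule would pick the wrong element.
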
## Task 7: Song + ProFileReader Integration (TDD)

### Completed

- ✅ Added `Song` aggregate wrapper (Presentation-level integration over Group/Slide/Arrangement)
- ✅ Added `ProFileReader::read(string): Song` with file existence and empty-file validation
- ✅ Added integration-heavy tests: `SongTest` + `ProFileReaderTest` (12 tests, 44 assertions)

### Key Implementation Findings

- Song constructor can eager-load all wrappers safely: `cue_groups` -> Group, `cues` -> Slide, `arrangements` -> Arrangement
- UUID cross-reference resolution works best with normalized uppercase lookup maps (`groupsByUuid`, `slidesByUuid`) because UUIDs are string-based
- Group/arrangement references can repeat the same UUID; resolution must preserve order and duplicates (important for repeated chorus)
- `ProFileReader` using `is_file` + `filesize` correctly handles UTF-8 paths and catches known 0-byte fixture before protobuf parsing

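
The lookup-map idea can be sketched with plain arrays. The function names `buildUuidLookup` and `resolveInOrder` are hypothetical, group names stand in for `Group` objects, and silently skipping unknown references is an assumption rather than confirmed behavior of `Song`.

```php
<?php
declare(strict_types=1);

/**
 * Build a lookup keyed by uppercase UUID.
 *
 * @param array<string, string> $namesByUuid UUID => group name (stand-in for Group objects)
 * @return array<string, string>
 */
function buildUuidLookup(array $namesByUuid): array
{
    $lookup = [];
    foreach ($namesByUuid as $uuid => $name) {
        $lookup[strtoupper((string) $uuid)] = $name;
    }

    return $lookup;
}

/**
 * Resolve references in their original order, keeping duplicates.
 *
 * @param array<string, string> $lookup result of buildUuidLookup()
 * @param list<string> $uuids references as stored in the arrangement
 * @return list<string>
 */
function resolveInOrder(array $lookup, array $uuids): array
{
    $resolved = [];
    foreach ($uuids as $uuid) {
        $key = strtoupper($uuid);
        if (isset($lookup[$key])) {
            $resolved[] = $lookup[$key]; // appended once per reference, so repeats survive
        }
    }

    return $resolved;
}
```

Iterating over the reference list, rather than over the map, is what keeps a chorus that appears three times in an arrangement appearing three times in the result.
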
### Verified Against Fixtures

- Test.pro: name `Test`, 4 groups, 5 slides, 2 arrangements
- `getSlidesForGroup(Verse 1)` resolves to slide UUIDs `[5A6AF946..., A18EF896...]` with texts `Vers1.1/Vers1.2` and `Vers1.3/Vers1.4`
- `getGroupsForArrangement(normal)` resolves ordered names `[Chorus, Verse 1, Chorus, Verse 2, Chorus]`
- Diverse reads validated through ProFileReader on 6 files, including `[TRANS]` and UTF-8/non-song file names