Learnings — ProPresenter Parser
Conventions & Patterns
(Agents will append findings here)
Task 1: Project Scaffolding — Composer + PHPUnit + Directory Structure
Completed
- ✅ Created PHP 8.4 project with Composer
- ✅ Configured PSR-4 autoloading for both namespaces:
  - `ProPresenter\Parser\` → `src/`
  - `Rv\Data\` → `generated/Rv/Data/`
- ✅ Installed PHPUnit 11.5.55 with google/protobuf 4.33.5
- ✅ Created phpunit.xml with strict settings
- ✅ Created SmokeTest.php that passes
- ✅ All 5 required directories created: src/, tests/, bin/, proto/, generated/
Key Findings
- PHP 8.4.7 is available on the system
- Composer resolves dependencies cleanly (28 packages installed)
- PHPUnit 11 runs with strict mode enabled (beStrictAboutOutputDuringTests, failOnRisky, failOnWarning)
- Autoloading works correctly with both namespaces configured
Verification Results
- Composer install: ✅ Success (28 packages)
- PHPUnit smoke test: ✅ 1 test passed
- Autoload verification: ✅ Works correctly
- Directory structure: ✅ All 5 directories present
Task 3: RTF Plain Text Extractor (TDD)
Completed
- ✅ RtfExtractor::toPlainText() static method — standalone, no external deps
- ✅ 11 PHPUnit tests all passing (TDD: RED → GREEN)
- ✅ Handles real ProPresenter CocoaRTF 2761 format
Key RTF Patterns in ProPresenter
- Format: Always `{\rtf1\ansi\ansicpg1252\cocoartf2761 ...}`
- Encoding: Windows-1252 (`ansicpg1252`), hex escapes `\'xx` for non-ASCII
- Soft returns: Single backslash `\` followed by newline = line break in text
- Text location: After last formatting command (often `\CocoaLigature0`), before final `}`
- Nested groups: `{\fonttbl ...}`, `{\colortbl ...}`, `{\*\expandedcolortbl ...}` (must be stripped)
- German chars: `\'fc`=ü, `\'f6`=ö, `\'e4`=ä, `\'df`=ß, `\'e9`=é, `\'e8`=è
- Unicode: `\uNNNN?` where NNNN is decimal codepoint, `?` is ANSI fallback (skipped)
- Stroke formatting: Some songs have `\outl0\strokewidth-40 \strokec3` before text
- Translation boxes: Same RTF structure, different font size (e.g., fs80 vs fs84)
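The two escape forms listed above can be decoded roughly as follows. This is a minimal sketch, not the project's `RtfExtractor`: the function name `decodeRtfEscapes` is hypothetical, it assumes the `mbstring` extension is loaded, and it assumes the `\uNNNN` fallback is always a single literal `?`.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration; not part of the project's RtfExtractor.
function decodeRtfEscapes(string $text): string
{
    // \uNNNN? : decimal codepoint, then a literal "?" fallback that is dropped.
    // Assumption: the fallback is always a single "?".
    $text = preg_replace_callback(
        '/\\\\u(-?\d+)\??/',
        static function (array $m): string {
            $codepoint = (int) $m[1];
            if ($codepoint < 0) {
                $codepoint += 65536; // RTF writes \u values as signed 16-bit
            }
            $char = mb_chr($codepoint, 'UTF-8');
            return $char === false ? '' : $char;
        },
        $text
    ) ?? $text;

    // \'xx : one Windows-1252 byte written as two hex digits.
    return preg_replace_callback(
        "/\\\\'([0-9a-fA-F]{2})/",
        static fn (array $m): string =>
            (string) mb_convert_encoding(chr((int) hexdec($m[1])), 'UTF-8', 'Windows-1252'),
        $text
    ) ?? $text;
}
```

Unicode escapes are decoded first so that the hex pass only ever sees the remaining `\'xx` sequences.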
Implementation Approach
- Character-by-character parser (not regex) — handles nested braces correctly
- Strip all `{...}` nested groups first, then process flat content
- Control words: `\word[N]` pattern, space delimiter consumed
- Non-RTF input passes through unchanged (graceful fallback)
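The approach above might look roughly like this. It is a simplified sketch under stated assumptions, not the actual `RtfExtractor::toPlainText()`: the name `rtfToPlainTextSketch` is hypothetical, nested groups are skipped in the same pass instead of a separate first pass, `\uNNNN?` handling is left out, and `mbstring` is assumed for the Windows-1252 conversion.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch; the real extractor may differ in structure and coverage.
function rtfToPlainTextSketch(string $input): string
{
    // Graceful fallback: non-RTF input passes through unchanged.
    if (!str_starts_with($input, '{\rtf')) {
        return $input;
    }

    $out = '';
    $depth = 0;
    $len = strlen($input);

    for ($i = 0; $i < $len; $i++) {
        $ch = $input[$i];
        $next = $input[$i + 1] ?? '';

        if ($ch === '{') { $depth++; continue; }
        if ($ch === '}') { $depth--; continue; }

        // Depth 1 is the document body; deeper groups (\fonttbl, \colortbl, ...) are dropped.
        if ($depth > 1) {
            if ($ch === '\\') { $i++; } // skip the escaped char so \{ and \} are not counted
            continue;
        }

        if ($ch !== '\\') {
            // Raw line breaks carry no meaning in RTF; only "\" + newline does.
            if ($ch !== "\n" && $ch !== "\r") { $out .= $ch; }
            continue;
        }

        if ($next === "\n" || $next === "\r") { // soft return
            $out .= "\n";
            $i++;
            continue;
        }

        if ($next === "'") { // hex escape \'xx (Windows-1252 byte)
            $hex = substr($input, $i + 2, 2);
            if (strlen($hex) === 2 && ctype_xdigit($hex)) {
                $out .= mb_convert_encoding(chr((int) hexdec($hex)), 'UTF-8', 'Windows-1252');
            }
            $i += 3;
            continue;
        }

        if (ctype_alpha($next)) { // control word: letters, optional signed number, one space
            $j = $i + 1;
            while ($j < $len && ctype_alpha($input[$j])) { $j++; }
            if ($j + 1 < $len && $input[$j] === '-' && ctype_digit($input[$j + 1])) { $j++; }
            while ($j < $len && ctype_digit($input[$j])) { $j++; }
            if ($j < $len && $input[$j] === ' ') { $j++; } // delimiter space is consumed
            $i = $j - 1;
            continue;
        }

        // Control symbol: keep escaped literals (\\, \{, \}), drop anything else.
        if (in_array($next, ['\\', '{', '}'], true)) { $out .= $next; }
        $i++;
    }

    return trim($out);
}
```

Tracking brace depth with a counter is what lets this handle groups nested to any level, which a single regex cannot do reliably.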
Testing Gotcha
- PHP single-quoted strings: `\'` = escaped quote, NOT literal backslash-quote
- Use nowdoc (`<<<'RTF'`) for RTF test data with hex escapes (`\'xx`)
- Regular concatenated strings work for RTF without hex escapes (soft returns `\\` are fine)
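A short illustration of the gotcha, using made-up fixture text (the variable names are only for this example):

```php
<?php
declare(strict_types=1);

// Single-quoted: \' collapses to a bare quote, so the RTF backslash is lost.
$singleQuoted = 'Gr\'fc\'dfe';

// Nowdoc: no escape processing at all, so \'xx survives byte for byte.
$nowdoc = <<<'RTF'
Gr\'fc\'dfe
RTF;

// Soft return (backslash + newline) written with ordinary string concatenation.
$softReturn = "Line one\\\n" . 'Line two';
```

Here `$singleQuoted` ends up without any backslashes, while `$nowdoc` keeps both hex escapes intact.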
- 2026-03-01 task-2 proto import resolution: copied full `Proto7.16.2/` tree (including `google/protobuf/*.proto`) into `php/proto/`; imports already resolve with `--proto_path=./php/proto`, no path rewrites required.
- 2026-03-01 task-2 version extraction: `application_info.platform_version` from Test.pro = macOS 14.8.3; `application_info.application_version` = major 20, build 335544354.
- 2026-03-01 task-6 binary fidelity baseline: decode->encode byte round-trip currently yields `0/169` identical files (`168` non-empty from `all-songs` + `Test.pro`); first mismatches typically occur early (~byte offsets 700-3000), indicating systematic re-serialization differences rather than isolated corruption.
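To locate where a round-tripped file first diverges, a helper along these lines could be used. It is a hypothetical sketch: `firstMismatchOffset` is not an existing project function, and it only compares two byte strings that the caller has already produced.

```php
<?php
declare(strict_types=1);

/**
 * Returns the offset of the first differing byte, or null when both
 * strings are identical. If one string is a prefix of the other, the
 * offset is the length of the shorter one.
 */
function firstMismatchOffset(string $original, string $reencoded): ?int
{
    if ($original === $reencoded) {
        return null;
    }

    $limit = min(strlen($original), strlen($reencoded));
    for ($i = 0; $i < $limit; $i++) {
        if ($original[$i] !== $reencoded[$i]) {
            return $i;
        }
    }

    return $limit;
}
```

Collecting this offset per file is one way to check whether mismatches cluster in the same region across the fixture set.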
Task 5: Group + Arrangement Wrapper Classes (TDD)
Completed
- ✅ Group.php wrapping Rv\Data\Presentation\CueGroup — getUuid(), getName(), getColor(), getSlideUuids(), setName(), getProto()
- ✅ Arrangement.php wrapping Rv\Data\Presentation\Arrangement — getUuid(), getName(), getGroupUuids(), setName(), setGroupUuids(), getProto()
- ✅ 30 tests (16 Group + 14 Arrangement), 74 assertions — all pass
- ✅ TDD: RED confirmed (class not found errors) → GREEN (all pass)
Protobuf Structure Findings
- CueGroup (field 12) has TWO parts: `group` (Rv\Data\Group with uuid/name/color) and `cue_identifiers` (repeated UUID = slide refs)
- Arrangement (field 11) has: uuid, name, `group_identifiers` (repeated UUID = group refs, can repeat same group)
- UUID.getString() returns the string value; UUID.setString() sets it
- Color has getRed()/getGreen()/getBlue()/getAlpha() returning floats
- Group also has hotKey, application_group_identifier, application_group_name (not exposed in wrapper — not needed for song parsing)
Test.pro Verified Structure
- 4 groups: Verse 1 (2 slides), Verse 2 (1 slide), Chorus (1 slide), Ending (1 slide)
- 2 arrangements: 'normal' (5 group refs), 'test2' (4 group refs)
- All groups have non-empty UUIDs
- Arrangement group UUIDs reference valid group UUIDs (cross-validated in test)
Task 4: TextElement + Slide Wrapper Classes (TDD)
Completed
- TextElement.php wraps Graphics Element: getName(), hasText(), getRtfData(), setRtfData(), getPlainText()
- Slide.php wraps Cue: getUuid(), getTextElements(), getAllElements(), getPlainText(), hasTranslation(), getTranslation(), getCue()
- 24 tests (10 TextElement + 14 Slide), 47 assertions, all pass
- TDD: RED confirmed then GREEN (all pass)
- Integration tests verify real Test.pro data
Protobuf Navigation Path (Confirmed)
- Cue -> getActions()[0] -> getSlide() (oneof) -> getPresentation() (oneof) -> getBaseSlide() -> getElements()[]
- Slide Element -> getElement() -> Graphics Element
- Graphics Element -> getName() (user-defined label), hasText(), getText() -> Graphics Text -> getRtfData()
- Elements WITHOUT text (shapes, media) have hasText() === false, must be filtered
Key Design Decisions
- TextElement wraps Graphics Element (not Slide Element) for clean text-focused API
- Slide wraps Cue (not PresentationSlide) because UUID is on the Cue
- Translation = second text element (index 1); no label detection needed
- Lazy caching: textElements/allElements computed once per instance
- Test.pro path from tests: `dirname(__DIR__, 2) . '/ref/Test.pro'` (2 levels up from php/tests/)
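The filtering, lazy-caching, and "translation = index 1" decisions can be sketched with stand-in classes. `FakeElement` and `SlideSketch` are hypothetical and exist only for this example; the real wrappers operate on the generated protobuf classes.

```php
<?php
declare(strict_types=1);

/** Stand-in for a graphics element; the real one is a generated protobuf class. */
final class FakeElement
{
    public function __construct(
        public readonly string $name,
        public readonly ?string $rtf = null,
    ) {
    }

    public function hasText(): bool
    {
        return $this->rtf !== null;
    }
}

/** Hypothetical sketch of the Slide wrapper's text-element handling. */
final class SlideSketch
{
    /** @var list<FakeElement>|null */
    private ?array $textElements = null;

    /** @param list<FakeElement> $elements */
    public function __construct(private readonly array $elements)
    {
    }

    /** @return list<FakeElement> */
    public function getTextElements(): array
    {
        // Computed once per instance, then served from the cached property.
        return $this->textElements ??= array_values(
            array_filter($this->elements, static fn (FakeElement $e): bool => $e->hasText())
        );
    }

    public function hasTranslation(): bool
    {
        return count($this->getTextElements()) > 1;
    }

    public function getTranslation(): ?FakeElement
    {
        // Translation is simply the second text element; labels are not inspected.
        return $this->getTextElements()[1] ?? null;
    }
}
```

`array_values()` re-indexes after filtering, so index 1 always means the second element that actually carries text, even when shapes or media come first.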
Task 7: Song + ProFileReader Integration (TDD)
Completed
- ✅ Added `Song` aggregate wrapper (Presentation-level integration over Group/Slide/Arrangement)
- ✅ Added `ProFileReader::read(string): Song` with file existence and empty-file validation
- ✅ Added integration-heavy tests: `SongTest` + `ProFileReaderTest` (12 tests, 44 assertions)
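The pre-parse validation in the reader could look roughly like this. The function name `readProFileBytes` and the exception types are assumptions for illustration; the notes only confirm that `is_file` and `filesize` are checked before protobuf parsing.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch of the checks done before handing bytes to protobuf.
 *
 * @throws InvalidArgumentException when the path is not a regular file
 * @throws RuntimeException         when the file is empty or unreadable
 */
function readProFileBytes(string $path): string
{
    if (!is_file($path)) {
        throw new InvalidArgumentException("File not found: {$path}");
    }

    // Reject 0-byte files up front instead of passing empty input to the parser.
    if (filesize($path) === 0) {
        throw new RuntimeException("File is empty: {$path}");
    }

    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Could not read file: {$path}");
    }

    return $bytes;
}
```

The returned bytes would then be passed to the generated message's parser; that step is omitted because it depends on the generated classes.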
Key Implementation Findings
- Song constructor can eager-load all wrappers safely: `cue_groups` -> Group, `cues` -> Slide, `arrangements` -> Arrangement
- UUID cross-reference resolution works best with normalized uppercase lookup maps (`groupsByUuid`, `slidesByUuid`) because UUIDs are string-based
- Group/arrangement references can repeat the same UUID; resolution must preserve order and duplicates (important for repeated chorus)
- `ProFileReader` using `is_file` + `filesize` correctly handles UTF-8 paths and catches known 0-byte fixture before protobuf parsing
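The lookup-map idea can be sketched with plain arrays standing in for the wrapper objects. `resolveGroupNames` is a hypothetical name, the UUIDs in the usage example are made up, and silently skipping unresolved references is an assumption of this sketch.

```php
<?php
declare(strict_types=1);

/**
 * Resolve ordered UUID references to group names, case-insensitively.
 *
 * @param array<string, string> $groupNamesByUuid UUID => name (stand-in for Group wrappers)
 * @param list<string>          $groupRefs        ordered references from an arrangement
 * @return list<string>
 */
function resolveGroupNames(array $groupNamesByUuid, array $groupRefs): array
{
    // Normalize keys once so that lookups ignore UUID casing.
    $groupsByUuid = [];
    foreach ($groupNamesByUuid as $uuid => $name) {
        $groupsByUuid[strtoupper($uuid)] = $name;
    }

    // Walk the references in order; duplicates stay because nothing is de-duplicated.
    $resolved = [];
    foreach ($groupRefs as $ref) {
        $key = strtoupper($ref);
        if (isset($groupsByUuid[$key])) {
            $resolved[] = $groupsByUuid[$key];
        }
    }

    return $resolved;
}

// Made-up UUIDs, mixed casing on purpose.
$names = resolveGroupNames(
    ['aaaa-0001' => 'Chorus', 'BBBB-0002' => 'Verse 1'],
    ['AAAA-0001', 'bbbb-0002', 'aaaa-0001'],
);
```

Iterating over the reference list rather than over the map is what keeps a repeated chorus in the output.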
Verified Against Fixtures
- Test.pro: name `Test`, 4 groups, 5 slides, 2 arrangements
- `getSlidesForGroup(Verse 1)` resolves to slide UUIDs `[5A6AF946..., A18EF896...]` with texts `Vers1.1/Vers1.2` and `Vers1.3/Vers1.4`
- `getGroupsForArrangement(normal)` resolves ordered names `[Chorus, Verse 1, Chorus, Verse 2, Chorus]`
- Diverse reads validated through ProFileReader on 6 files, including `[TRANS]` and UTF-8/non-song file names