Ledger Balance History & Aggregation Design

Core Principles

1. Transactional Consistency

Atomic Updates: Balance changes and history records are updated in single transactions
Before/After States: Store both previous and new balances for auditability
Context Capture:
- Initiating user
- Source transaction
- Operation type (credit/debit/adjustment)
- Timestamps (processing time + value date)

2. Pre-Aggregated Snapshots

Materialized Views: Pre-computed aggregates for common intervals
Refresh Strategy:
- Real-time updates for current period
- Nightly refresh for historical data

Storage Optimization:

type BalanceSnapshot {
    Interval    string // hour/day/week/month/year
    OpenBalance int64
    CloseBalance int64
    MinBalance  int64
    MaxBalance  int64
    SampleCount int
}

3. Time-Bucket Caching

Rolling Window:
- 1-second resolution for last hour
- 1-minute resolution for last 7 days
- 1-hour resolution for last 90 days

Efficient Storage:

CREATE TABLE balance_buckets (
    ledger_id UUID,
    bucket_start TIMESTAMPTZ,
    resolution INTERVAL,
    aggregates JSONB
) PARTITION BY RANGE (bucket_start);

4. Statistical Sampling

Retention Policy:
- Raw data: 30 days
- Detailed aggregates: 2 years
- Statistical summaries: 7 years

Compression:

type BalanceStatistics {
    PeriodStart  time.Time
    MeanBalance  float64
    StdDev       float64
    Percentile5  int64
    Percentile95 int64
}

5. Flexible Ledger Identification

Multiple Access Patterns:
- UUID-based primary keys for system integrity
- Code-based lookup for human readability and compatibility with accounting practices
- Numeric ID lookup for legacy system integration

Resolution Strategy:

// Pseudocode for ledger resolution
func ResolveLedger(ledgerID uuid.UUID, ledgerCode string, ledgerNumID int64) (*Ledger, error) {
    // Try UUID first if valid
    if ledgerID != uuid.Nil {
        return FindByID(ledgerID)
    }
    
    // Try code lookup if provided
    if ledgerCode != "" {
        return FindByCode(ledgerCode)
    }
    
    // Try numeric ID as both exact and partial match
    if ledgerNumID > 0 {
        ledger, err := FindByExactCode(strconv.FormatInt(ledgerNumID, 10))
        if err == nil {
            return ledger, nil
        }
        
        // Try partial match within codes
        return FindByCodeContaining(strconv.FormatInt(ledgerNumID, 10))
    }
    
    return nil, errors.New("no valid ledger identifier provided")
}

Benefits:
- Supports diverse client integration scenarios
- Facilitates migration from legacy accounting systems
- Enables natural use of accounting codes in journal entries
- Simplifies API usage in environments where UUID handling is difficult

Aggregation Strategies

1. Materialized View Pattern

-- Daily balances example
CREATE MATERIALIZED VIEW balance_daily AS
SELECT
  ledger_id,
  date_trunc('day', effective_date) AS day,
  FIRST(balance) AS open,
  LAST(balance) AS close,
  MIN(balance) AS low,
  MAX(balance) AS high
FROM balance_history
GROUP BY 1, 2;

2. Continuous Aggregates

// Golang implementation
func MaintainAggregations() {
    ticker := time.NewTicker(15 * time.Minute)
    go func() {
        for range ticker.C {
            RefreshHourlyAggregates()
            if time.Now().Hour() == 3 {
                RefreshDailyAggregates() 
            }
        }
    }()
}

3. Hierarchical Rollup


### Best Practices for Ledger Balance History & Aggregation

1. **Transactional Consistency**
   - Ensure that balance updates and history records are part of the same database transaction. This guarantees atomicity and consistency, preventing partial updates that could lead to data discrepancies.
   - Capture both the previous and new balance states for each transaction. This allows for detailed audit trails and facilitates debugging and reconciliation processes.
   - Include contextual metadata such as the initiating user, source transaction ID, operation type (credit, debit, adjustment), and timestamps (both processing time and value date) to provide a comprehensive view of each balance change.

2. **Pre-Aggregated Snapshots**
   - Utilize materialized views to store pre-computed aggregates for common time intervals (e.g., hourly, daily, weekly, monthly, annually). This reduces the computational load on the database when querying historical data.
   - Implement a refresh strategy that updates materialized views in real-time for the current period and performs nightly refreshes for historical data. This ensures that the data remains up-to-date without impacting performance.
   - Optimize storage by using a `BalanceSnapshot` structure that includes fields for interval type, open and close balances, min and max balances, and sample count.

3. **Time-Bucket Caching**
   - Implement a rolling window cache with different resolutions for various time frames (e.g., 1-second resolution for the last hour, 1-minute resolution for the last 7 days, 1-hour resolution for the last 90 days). This allows for efficient storage and quick access to recent data.
   - Use partitioned tables to store time-bucketed data, which can improve query performance and manageability.

4. **Statistical Sampling and Retention Policies**
   - Define a retention policy that balances the need for detailed historical data with storage constraints. For example, keep raw data for 30 days, detailed aggregates for 2 years, and statistical summaries for 7 years.
   - Implement data compression techniques to reduce storage requirements while maintaining the ability to perform meaningful analysis. This can include storing statistical summaries such as mean balance, standard deviation, and percentiles.

5. **Flexible Ledger Identification**
   - Implement multiple lookup methods to accommodate different client needs and system integration scenarios.
   - Maintain a clear precedence order for resolving ledger identifiers: UUID > Code > Numeric ID.
   - For numeric ID lookups, provide both exact and partial matching capabilities to handle cases where the numeric ID might be embedded within a more complex ledger code.
   - Cache frequently accessed ledgers by all identification types to optimize performance, especially during high-volume transaction processing.
   - Ensure that all ledger lookup methods are properly indexed in the database to maintain performance.

6. **Hierarchical Rollup and Continuous Aggregates**
   - Use hierarchical rollup techniques to aggregate data at different levels (e.g., account, department, organization) for comprehensive reporting.
   - Implement continuous aggregates that automatically refresh at regular intervals (e.g., every 15 minutes) to ensure that the data remains current and accurate.

By following these best practices, you can ensure that your ledger balance history and aggregation system is reliable, efficient, and capable of supporting advanced financial analysis and reporting.

Ledger Balance History & Aggregation Design

On this page