*Each team has its own needs and specifications, so we can tailor the training outline accordingly.
Module 1: Finance data foundations and ingestion patterns
- Source landscape: core banking, trading, policy, ERP, CRM, market data
- Ingestion choices: batch, micro-batch, streaming, and when each fits
- Files, APIs, and message queues: practical connectors and formats (CSV, Parquet, JSON)
- Landing zones, naming standards, and basic metadata
Module 2: ELT or ETL for analytics
- Staging, core, and presentation layers in a lakehouse or warehouse
- Dimensional and wide table patterns for financial analytics
- Slowly changing dimensions for products, customers, accounts
- Keys, deduplication, and handling of late-arriving data
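
As an illustration of the deduplication and late-arriving data topic above, here is a minimal sketch, assuming each record carries a business key and a version timestamp. The function name and the field names `account_id` and `updated_at` are hypothetical and not taken from any particular system.

```python
from datetime import datetime


def deduplicate_latest(records, key_field="account_id", version_field="updated_at"):
    """Keep one record per business key: the one with the newest version timestamp.

    Field names are illustrative assumptions. A record that arrives late but
    carries an older timestamp does not overwrite a newer version.
    """
    latest = {}
    for rec in records:
        key = rec[key_field]
        current = latest.get(key)
        # Replace only when this record is strictly newer than the one held.
        if current is None or rec[version_field] > current[version_field]:
            latest[key] = rec
    return list(latest.values())
```

Comparing on a source timestamp rather than on arrival order is one possible design choice; it makes the result independent of the order in which files or messages are processed.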
Module 3: Data quality and validation
- Business rules: completeness, validity, consistency, timeliness
- Automated checks in SQL or Python with thresholds and alerts
- Reconciliation patterns: totals, balances, and control accounts
- Capture and remediate data quality incidents with lineage context
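
To give a flavor of the automated checks and reconciliation patterns listed in this module, the following is a minimal sketch using only the Python standard library. The function names, field names, and thresholds are hypothetical; a real check would read from the team's own tables and route failures to its own alerting.

```python
from decimal import Decimal


def completeness_ratio(rows, field):
    """Share of rows where `field` is populated (not None and not empty)."""
    if not rows:
        return 1.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def check_threshold(name, value, minimum):
    """Return a check result; the caller decides how to alert on failures."""
    return {"check": name, "value": value, "minimum": minimum,
            "passed": value >= minimum}


def reconcile_total(rows, amount_field, control_total, tolerance=Decimal("0.00")):
    """Compare the summed amounts against a control total within a tolerance.

    Amounts are converted through str() to Decimal to avoid float rounding.
    """
    total = sum((Decimal(str(r[amount_field])) for r in rows), Decimal("0"))
    difference = total - control_total
    return {"total": total, "control_total": control_total,
            "difference": difference, "passed": abs(difference) <= tolerance}
```

Returning a result dictionary instead of raising an exception is an assumption made here so that several checks can run in one pass and be reported together.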
Module 4: Governance, security, and lineage basics
- Access control models: roles, row-level security, column masking
- PII and sensitive fields: tokenization and selective encryption
- Lineage capture from jobs and queries for traceability
- Documentation that auditors can follow: facts, rules, owners
Module 5: Orchestrating reliable workflows
- Scheduling concepts: dependencies, retries, SLAs, backfill
- Patterns with Airflow, Prefect, or cloud schedulers
- Idempotency and exactly-once behavior for repeatable runs
- Parameterization for environments and date partitions
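
The idempotency and date-partition topics above can be sketched as a delete-then-insert load scoped to one business date. This is a minimal illustration using the standard-library `sqlite3` module; the table `daily_balances`, its columns, and both function names are hypothetical, and a warehouse would typically use its own partition overwrite or merge mechanism instead.

```python
import sqlite3


def create_table(conn):
    """Create the illustrative target table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_balances ("
        "business_date TEXT, account_id TEXT, balance_cents INTEGER)"
    )


def load_partition(conn, business_date, rows):
    """Replace all rows for one business date inside a single transaction.

    Re-running with the same date and rows leaves the table in the same
    state, so retries and backfills do not create duplicates.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "DELETE FROM daily_balances WHERE business_date = ?",
            (business_date,),
        )
        conn.executemany(
            "INSERT INTO daily_balances (business_date, account_id, balance_cents) "
            "VALUES (?, ?, ?)",
            [(business_date, account_id, cents) for account_id, cents in rows],
        )
```

Passing `business_date` as a parameter lets a scheduler such as Airflow or Prefect supply the run date, so the same job can serve daily runs and backfills. Note that this pattern gives repeatable results on rerun; it does not by itself guarantee exactly-once delivery from upstream sources.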
Module 6: Performance and cost awareness
- Partitioning, clustering, and pruning for large fact tables
- Caching, incremental models, and materialization choices
- Efficient joins and window functions for finance metrics
- Cost signals: storage, compute, egress, and simple guardrails
Module 7: Automation with Python and SQL
- Reusable utilities for file handling, APIs, and schema drift
- Templating queries and macros for consistent calculations
- Automated tests for transformations and metrics
- Packaging and version control for repeatable deployments
Module 8: Monitoring and alerting
- Health indicators: freshness, volume, failures, and anomalies
- Centralized logging and run history for investigations
- Alert routing and on-call basics for data teams
- Post-incident review and playbooks
Module 9: Core finance datasets and metrics
- Revenue and fee events, positions and trades, premiums and claims
- Balances, PnL, accruals, and FX effects
- Reference data: products, hierarchies, calendars, and holidays
- KPI definitions with owner-approved logic
Module 10: BI and forecasting integration
- Serving layers: views, extracts, and semantic models
- Connecting to Power BI or Tableau and maintaining refresh
- Time series features and simple model-ready datasets
- Self-service patterns and guardrails for business teams
Module 11: Controls, compliance, and change management
- Regulatory expectations: audit trail, reproducibility, access reviews
- Change control: versioning, approvals, and promotion flows
- Data retention and deletion policies with exceptions
- Vendor and third-party data controls: contracts and SLAs
Module 12: Roadmap and handover
- Inventory of quick wins and high-value gaps
- Standards checklist: naming, coding, tests, documentation
- Operating rhythms: daily checks, weekly reviews, monthly closing support
- Ninety-day action plan with owners and milestones