Data Engineer

TowneBank
United States, Virginia, Norfolk
3 Commercial Place
Mar 11, 2026
Description

The Data Engineer will design, build, and maintain batch ETL pipelines on a modern Databricks Lakehouse platform, delivering high-quality data solutions that support critical banking functions. You will take ownership of pipelines end-to-end, from ingestion through transformation, quality assurance, and delivery to downstream consumers, including Power BI dashboards and analytical models.

The ideal candidate combines strong technical depth in Spark and Delta Lake with a natural orientation toward documentation, process improvement, and clear communication. This role requires the ability to work autonomously, prioritize effectively across competing demands, and contribute to the ongoing maturation of the team's engineering practices.

Essential Responsibilities:



  • Data Pipeline Development: Design, build, and maintain batch ETL pipelines that ingest data from diverse source systems into the Databricks environment. Own the full pipeline lifecycle including ingestion, transformation, serving, monitoring, and incident resolution.
  • Data Quality and Integrity: Implement automated validation, reconciliation checks, and data quality gates across pipelines (a minimal sketch of such a gate appears after this list). Ensure data meets standards for accuracy, completeness, timeliness, and consistency. Maintain historical data for auditability and compliance.
  • Performance Optimization: Optimize data processing performance on Databricks through efficient Spark SQL, partitioning strategies, and Delta Lake table maintenance.
  • Data Modeling for Analytics: Design dimensional models (star schema, aggregation tables) that serve Power BI dashboards and self-service analytics effectively. Prepare the semantic layer for AI-powered analytics capabilities, including Databricks Genie Rooms, through clean business logic, well-documented table relationships, and intuitive naming conventions.
  • Process and Documentation: Establish and maintain runbooks, deployment procedures, coding standards, and operational documentation. Contribute to code review practices, automated quality checks, and repeatable processes that enable the team to scale.
  • Governance and Compliance: Adhere to enterprise data governance policies and implement security best practices for sensitive financial data. Enforce access controls, encryption, and data lineage tracking across pipelines in accordance with banking regulations.
  • Cross-Team Collaboration: Work with data architects, analysts, and business stakeholders to gather requirements and translate business needs into scalable data solutions. Communicate technical constraints and timelines clearly to non-technical partners.
  • Mentoring and Knowledge Sharing: Support the development of junior and mid-level engineers through code review, pairing, and in-context coaching. Contribute to a culture of continuous learning and shared technical ownership.
  • Continuous Improvement: Identify and implement improvements to enhance pipeline stability, efficiency, and scalability. Evaluate and adopt emerging Databricks features and industry best practices as appropriate.
  • Adhere to applicable federal laws, rules, and regulations including those related to Anti-Money Laundering (AML) and the Bank Secrecy Act (BSA).
  • Other duties as assigned.
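
To make the data quality responsibility above concrete, the following is a minimal PySpark sketch of an automated quality gate run between pipeline stages. It is an illustration only: the table and column names (bronze.transactions, account_id, txn_amount, posted_date) and the thresholds are hypothetical placeholders, not actual TowneBank objects or standards.

    # Minimal sketch of an automated data quality gate, assuming a Databricks
    # PySpark environment. All table and column names are hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    def quality_gate(table, required_cols, min_rows):
        """Fail the pipeline run if basic completeness checks do not pass."""
        df = spark.table(table)
        total = df.count()
        if total < min_rows:
            raise ValueError(f"{table}: expected at least {min_rows} rows, found {total}")
        for col in required_cols:
            nulls = df.filter(F.col(col).isNull()).count()
            if nulls > 0:
                raise ValueError(f"{table}: {nulls} nulls in required column '{col}'")

    # Example: validate a bronze-layer load before promoting it to silver.
    quality_gate("bronze.transactions", ["account_id", "txn_amount", "posted_date"], min_rows=1)

In practice a check like this would run as a task in the orchestration layer (for example a Databricks Job), so a failed gate stops downstream serving rather than propagating bad data to dashboards.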


Minimum Required Skills and Competencies



  • Experience: Bachelor's degree in Computer Science or related field (or equivalent practical experience). 5+ years of experience as a data engineer in complex, large-scale data environments, preferably in the cloud.
  • Databricks and Spark Proficiency: Strong hands-on expertise with Databricks and the Apache Spark ecosystem (PySpark, Spark SQL) for building and optimizing large-scale data pipelines. Production experience with Delta Lake tables and Lakehouse architectural patterns.
  • Delta Lake Operations: Working experience with OPTIMIZE, VACUUM, Z-ordering, MERGE INTO for upserts, and time travel for debugging and auditing (see the first sketch after this list). Ability to articulate practical differences between Delta Lake and raw Parquet.
  • Programming and SQL: Proficient in Python (including PySpark) for data processing. Strong SQL skills for complex querying and transformation. Emphasis on writing clean, well-structured, maintainable code.
  • Data Modeling: Strong understanding of dimensional modeling and data warehousing concepts, including star schemas, slowly changing dimensions, and designing tables optimized for BI tool consumption (import vs. DirectQuery, pre-aggregation strategies); see the second sketch after this list.
  • Pipeline Architecture: Experience designing end-to-end data pipeline architectures including orchestration, monitoring, alerting, and error handling. Familiarity with Databricks Jobs, Apache Airflow, or equivalent workflow tools.
  • Data Quality and Testing: Hands-on experience implementing automated data validation, reconciliation checks, and quality gates in production ETL pipelines.
  • Data Governance and Security: Knowledge of data governance standards and security best practices for managing sensitive data. Understanding of compliance requirements in banking (encryption, PII handling, auditing) and ability to enforce access controls and data lineage documentation.
  • Version Control and CI/CD: Experience with Git-based workflows and CI/CD pipelines for deploying data pipeline code in a controlled, repeatable manner.
  • Communication and Collaboration: Ability to translate technical concepts for non-technical audiences, document processes clearly, and collaborate effectively across engineering and business teams.
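
As a concrete reference for the Delta Lake Operations bullet above, here is a minimal sketch of the maintenance and upsert commands it names, issued as Spark SQL from Python. The table names (silver.accounts, staging.accounts_updates) and the version number are illustrative assumptions, not actual TowneBank objects.

    # Minimal sketch of routine Delta Lake operations on a Databricks cluster.
    # All table names and the version number below are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Upsert a daily batch into a Delta table with MERGE INTO.
    spark.sql("""
        MERGE INTO silver.accounts AS t
        USING staging.accounts_updates AS s
        ON t.account_id = s.account_id
        WHEN MATCHED THEN UPDATE SET *
        WHEN NOT MATCHED THEN INSERT *
    """)

    # Compact small files and co-locate rows that are commonly filtered together.
    spark.sql("OPTIMIZE silver.accounts ZORDER BY (account_id)")

    # Remove data files no longer referenced by the table (default 7-day retention).
    spark.sql("VACUUM silver.accounts")

    # Time travel: query an earlier table version for debugging or audit review.
    previous = spark.sql("SELECT * FROM silver.accounts VERSION AS OF 42")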
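A second sketch, for the Data Modeling bullet, defines a hypothetical fact/dimension pair of the kind a Power BI import model might consume. The schema, table, and column names (gold.dim_account, gold.fact_transactions) are assumptions for illustration only.

    # Minimal star-schema sketch: one dimension and one fact table as Delta tables.
    # All schema, table, and column names are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    spark.sql("""
        CREATE TABLE IF NOT EXISTS gold.dim_account (
            account_key BIGINT,
            account_number STRING,
            branch_name STRING,
            effective_date DATE,   -- Type 2 slowly changing dimension tracking
            end_date DATE,
            is_current BOOLEAN
        ) USING DELTA
    """)

    spark.sql("""
        CREATE TABLE IF NOT EXISTS gold.fact_transactions (
            account_key BIGINT,    -- foreign key to gold.dim_account
            posted_date DATE,
            txn_amount DECIMAL(18, 2),
            txn_count INT
        ) USING DELTA
        PARTITIONED BY (posted_date)
    """)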


Desired Skills and Competencies



  • Banking and Financial Domain Knowledge: Experience in banking, financial services, or other regulated industries. Familiarity with credit risk data, financial reconciliation, or regulatory reporting requirements.
  • Regulatory Data Pipelines: Experience building pipelines for compliance use cases with audit trail, data lineage, and accuracy requirements (e.g., CCAR, Basel, AML reporting).
  • Unity Catalog: Experience with Databricks Unity Catalog for data governance, access control, and catalog management.
  • Power BI and Semantic Modeling: Familiarity with Power BI development, DAX, or semantic modeling for analytics consumption.
  • AI/BI Analytics: Exposure to Databricks AI/BI features, including Genie Rooms for natural-language analytics, or experience preparing data layers for AI-driven consumption.
  • Streaming Data: Exposure to real-time data streaming and event-driven architectures. Knowledge of Spark Structured Streaming or Kafka alongside batch workflows.
  • DataOps Practices: Understanding of DataOps techniques including automated testing, monitoring, and CI/CD for data pipelines.
  • Certifications: Relevant certifications such as Databricks Certified Data Engineer or cloud platform data engineering certifications.


Physical Requirements



  • Express or exchange ideas by means of the spoken word and in writing (e.g., via email).
  • Exert up to 10 pounds of force occasionally, use your arms and legs, and sit most of the time.
  • Have close visual acuity to perform activities such as analyzing data, viewing a computer terminal, reading, and preparing documentation.
  • Not substantially exposed to adverse environmental conditions.

Equal Opportunity Employer/Protected Veterans/Individuals with Disabilities

This employer is required to notify all applicants of their rights pursuant to federal employment laws.
For further information, please review the Know Your Rights notice from the Department of Labor.