Senior Data Engineer (AI)

Summary: Design, build, and operate robust data pipelines that power AI/ML initiatives.

Responsibilities:

  • Develop ETL/ELT processes for structured and unstructured data (see the PySpark sketch below).
  • Manage data lakes and warehouses (e.g., Snowflake, Databricks).
  • Ensure data quality and accessibility for model training.

Skills:

  • Expertise in Spark, Kafka, SQL, and dbt.
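
The ETL/ELT and skills items above center on Spark. Below is a minimal PySpark sketch of an extract-transform-load job; the bucket paths and column names (user_id, event_ts) are hypothetical placeholders, not a prescribed layout.

    # Minimal PySpark ETL sketch: extract raw JSON events, apply a light
    # transform, and load the result as partitioned Parquet. Paths and
    # column names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: read semi-structured input (schema inferred here for brevity;
    # production jobs should declare an explicit schema).
    raw = spark.read.json("s3://example-bucket/raw/events/")

    # Transform: drop malformed rows, normalize the timestamp, deduplicate.
    clean = (
        raw.dropna(subset=["user_id"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("event_date", F.to_date("event_ts"))
           .dropDuplicates(["user_id", "event_ts"])
    )

    # Load: write partitioned Parquet for downstream consumers.
    clean.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/curated/events/"
    )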

Key Process: Data Pipeline Engineering for AI

  • Inputs: Raw data (structured/unstructured), storage requirements.
  • Activities:
    • Build scalable ETL pipelines.
    • Clean and preprocess data for model training (see the sketch after this list).
    • Manage data versioning and lineage.
  • Outputs: Processed datasets, data catalogs, pipeline logs.
  • Stakeholders: Data scientists, analysts, AI engineers.
  • Tools: Apache Spark, Snowflake, dbt.
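
To make the activities concrete, here is a sketch of the cleaning and versioning steps in PySpark, assuming the curated Parquet output from the previous example. The column names, version tag, and versioned-path convention are illustrative assumptions, not a specific lineage tool's API.

    # Preprocess a curated dataset for model training and stamp it with
    # coarse-grained version/lineage metadata. All names are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("prep-sketch").getOrCreate()

    SOURCE = "s3://example-bucket/curated/events/"  # hypothetical source path
    VERSION = "v1"                                  # hypothetical version tag

    df = spark.read.parquet(SOURCE)

    # Preprocess: fill nulls, cast types, drop implausible outliers.
    features = (
        df.fillna({"session_length": 0})
          .withColumn("session_length", F.col("session_length").cast("double"))
          .filter(F.col("session_length") < 86400)
    )

    # Stamp each row with its dataset version and source for lineage, and
    # write under a versioned prefix so earlier snapshots stay reproducible.
    (features
        .withColumn("dataset_version", F.lit(VERSION))
        .withColumn("source_path", F.lit(SOURCE))
        .write.mode("overwrite")
        .parquet(f"s3://example-bucket/training/events/{VERSION}/"))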
