**Company:** Lighthouse
**Job Title:** Data Engineer
**Role Summary:**
Design, build, deploy, and maintain large‑scale data pipelines ingesting ~100 TB/day from diverse sources (APIs, webhooks, SFTP, cloud storage). Architect modern, scalable, and reliable data architecture for the hospitality domain, ensuring high‑fidelity data products and operational excellence.
**Expectations:**
- Deliver end‑to‑end pipeline solutions for a 3‑trillion‑record hotel dataset.
- Transition legacy processes to modern, scalable architectures.
- Apply AI, automation, and observability to improve pipeline reliability and speed.
- Collaborate cross‑functionally with product, engineering, support, business, and external partners.
**Key Responsibilities:**
1. **Pipeline Engineering:**
- Design and implement scalable ingestion, transformation, and storage pipelines.
- Integrate polyglot data sources (APIs, webhooks, SFTP, cloud storage).
2. **Domain Modeling & Transformation:**
- Build complex domain models to unify heterogeneous data into high‑quality data products.
3. **Architectural Improvements:**
- Modernize legacy processes, adopt cloud-native data services, and optimize cost/throughput.
4. **Operational Excellence:**
   - Develop advanced observability, automated testing, and self‑healing mechanisms.
- Conduct deep‑dive root‑cause analysis of production incidents.
5. **Team Collaboration & Leadership:**
- Partner with product, engineering, support, business, and external stakeholders.
- Mentor junior engineers; influence engineering velocity and quality.
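As a minimal sketch of the "self‑healing" and root‑cause‑analysis duties above (all names here are hypothetical, not part of the role description): retry with exponential backoff and logged attempts is one common pattern for absorbing transient source failures before escalating an incident.

```python
import logging
import random
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")

def with_retries(fn, max_attempts=5, base_delay=0.1):
    """Call fn(), retrying on failure with exponential backoff plus jitter.

    Transient source errors (flaky APIs, SFTP timeouts) are retried and
    logged; only a persistent failure is raised for deeper analysis.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts:
                log.error("step failed after %d attempts: %s", attempt, exc)
                raise
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
            log.warning("attempt %d failed (%s); retrying in %.2fs", attempt, exc, delay)
            time.sleep(delay)

# Example: a flaky ingestion step that succeeds on the third call.
calls = {"n": 0}

def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient source error")
    return ["record-1", "record-2"]

records = with_retries(flaky_fetch, base_delay=0.01)
```

In production this pattern is usually paired with metrics on retry counts, so rising retry rates surface in observability dashboards before they become outages.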
**Required Skills:**
- **Programming:** Python (large‑scale data processing).
- **Data Streaming & Messaging:** Kafka, Google Cloud Pub/Sub.
- **Data Warehousing & Cloud Databases:** BigQuery, Snowflake, Databricks, Spanner.
- **Cloud Platforms:** Proficient in GCP, AWS, or Azure; experience with Kubernetes.
- **Data Architecture:** Design, deployment, testing, monitoring of complex pipelines.
- **Observability & Automation:** Logging, metrics, automated testing, self‑healing.
- **AI & Automation:** Use AI‑assisted tools for code quality and debugging.
- **Communication:** Fluent English, stakeholder management, technical storytelling.
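To illustrate the large‑scale Python processing emphasis above (a hypothetical example, not an artifact of the role): generator‑based streaming keeps memory flat when aggregating over files too large to load whole.

```python
import csv
import io

def stream_rows(fileobj):
    """Yield parsed CSV rows one at a time instead of loading the file."""
    for row in csv.DictReader(fileobj):
        yield row

def revenue_per_hotel(rows):
    """Aggregate booking revenue per hotel from a stream of rows."""
    totals = {}
    for row in rows:
        totals[row["hotel_id"]] = totals.get(row["hotel_id"], 0.0) + float(row["amount"])
    return totals

# In-memory stand-in for a large bookings extract.
data = io.StringIO("hotel_id,amount\nH1,100.0\nH2,50.5\nH1,25.0\n")
totals = revenue_per_hotel(stream_rows(data))
```

The same shape scales from a local file to a cloud‑storage stream: only the source of `fileobj` changes, while per‑row memory use stays constant.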
**Required Education & Certifications:**
- Bachelor’s or Master’s degree in Computer Science, Engineering, or related technical field.
- Certifications in cloud platforms (e.g., GCP Professional Data Engineer, AWS Big Data Specialty) preferred but not mandatory.