**Company Name:** Scribd, Inc.
**Job Title:** Software Engineer II (Backend + Data pipelines)
**Job Description:**
**Role Summary:**
Design, develop, and optimize distributed systems that extract, enrich, and process metadata at scale. Integrate LLM-powered capabilities into production pipelines, collaborate with ML engineers and product stakeholders, and ensure high performance, reliability, and data quality across global content streams.
**Expectations:**
- Deliver scalable, production-ready backend services that process millions of documents, images, and audio files.
- Integrate and maintain ML/LLM services within metadata workflows.
- Continuously improve system performance, reliability, and maintainability.
- Actively participate in code reviews, testing, and automation of data validation and monitoring.
- Manage infrastructure and data pipeline security in a cloud environment.
**Key Responsibilities:**
1. Design and build scalable metadata extraction and enrichment pipelines.
2. Leverage LLMs for summarization, classification, extraction, and enrichment tasks (see the sketch after this list).
3. Collaborate with cross‑functional teams (ML engineers, product managers) to deliver efficient, reliable metadata solutions.
4. Optimize and refactor existing systems for performance, scalability, and reliability.
5. Ensure data accuracy, integrity, and quality via automated validation and monitoring.
6. Conduct code reviews and enforce best coding practices.
7. Maintain data pipelines, infrastructure, security, and compliance.
8. Deploy and manage services on AWS (Lambda, ECS, EKS) using IaC tools.
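As a rough illustration of responsibilities 2 and 5, the Python sketch below shows a single enrichment step that asks an LLM for a summary and runs an automated validation check before the record moves downstream. The `call_llm` helper, the `DocumentMetadata` fields, and the prompt are hypothetical placeholders, not Scribd's actual stack or schema.

```python
"""Illustrative sketch only: one LLM-backed enrichment-and-validation step.

`call_llm` stands in for whatever LLM service the real pipeline integrates;
the metadata fields are hypothetical, not Scribd's production schema.
"""
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentMetadata:
    doc_id: str
    title: str
    raw_text: str
    summary: Optional[str] = None


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM client call; stubbed so the sketch runs."""
    return prompt[:200]


def enrich(record: DocumentMetadata) -> DocumentMetadata:
    """Attach an LLM-generated summary to the record."""
    record.summary = call_llm(f"Summarize this document:\n{record.raw_text}")
    return record


def validate(record: DocumentMetadata) -> bool:
    """Automated quality gate before the record is written downstream."""
    return bool(record.doc_id and record.title and record.summary)


if __name__ == "__main__":
    doc = DocumentMetadata(doc_id="42", title="Example", raw_text="Lorem ipsum " * 50)
    print("valid:", validate(enrich(doc)))
```

In a real pipeline a step like this would sit behind batching, retries, and monitoring; the sketch only shows the enrich-then-validate shape the responsibilities describe.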
**Required Skills:**
- 4+ years of professional software engineering experience.
- Proficiency in Python, Scala, Ruby, or a comparable language.
- Experience designing and building distributed systems at scale.
- Hands‑on experience with AWS services (Lambda, ECS, EKS, SQS, ElastiCache, SageMaker).
- Familiarity with infrastructure‑as‑code (Terraform or similar).
- Experience with big-data processing frameworks and platforms such as Spark and Databricks (see the example after this list).
- Strong testing, profiling, and optimization skills.
- Knowledge of HTTP APIs and CI/CD pipelines.
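For the big-data item above, here is a minimal PySpark sketch of the kind of batch metadata job the role involves. The S3 paths, column names, and the enrichment rule are illustrative assumptions, not an actual Scribd dataset or schema.

```python
# Illustrative sketch only: a small batch job over document metadata.
# Paths and column names are made up for the example.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("metadata-enrichment-example").getOrCreate()

# Read raw metadata records (hypothetical location).
raw = spark.read.json("s3://example-bucket/raw-metadata/")

# Data-quality filter: keep only records with the required identifying fields.
clean = raw.filter(F.col("doc_id").isNotNull() & F.col("title").isNotNull())

# Simple enrichment: normalize the language code, defaulting to "und" (undetermined).
enriched = clean.withColumn(
    "language", F.lower(F.coalesce(F.col("language"), F.lit("und")))
)

# Write the enriched set for downstream consumers (hypothetical location).
enriched.write.mode("overwrite").parquet("s3://example-bucket/enriched-metadata/")
```

A job like this would typically run on Databricks or an EMR/EKS cluster rather than locally, with the surrounding infrastructure managed through IaC tooling such as Terraform, as the posting describes.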
**Required Education & Certifications:**
- Bachelor’s degree in Computer Science or equivalent professional experience.
- Optional: cloud certifications (e.g., AWS Certified Solutions Architect) or certifications in ML model deployment.