Job Specifications
Role: Data Scientist, ML Engineer, or Gen AI Engineer
Location: Austin, TX (3 days per week in the office)
For the first 4 weeks of the discovery stage, work is from the office 5 days per week.
Job Description:
We are looking for an AI Engineer to modernize and enhance our existing regex/keyword-based ElasticSearch system by integrating state-of-the-art semantic search, dense retrieval, and LLM-powered ranking techniques.
This role will drive the transformation of traditional search into an intelligent, context-aware, personalized, and high-precision search experience.
The ideal candidate has hands-on experience with ElasticSearch internals, information retrieval (IR), embedding-based search, BM25, re-ranking, LLM-based retrieval pipelines, and AWS cloud deployment.
Roles & Responsibilities
1. Modernizing the Search Platform
Analyze the limitations of the current regex and keyword-only search implementation on ElasticSearch.
Enhance search relevance using:
BM25 tuning
Synonyms, analyzers, custom tokenizers
Boosting strategies and scoring optimization
Introduce semantic / vector-based search using dense embeddings.
2. LLM-Driven Search & RAG Integration
Implement LLM-powered search workflows including:
Query rewriting and expansion
Embedding generation (OpenAI, Cohere, Sentence Transformers, etc.)
Hybrid retrieval (BM25 + vector search)
Re-ranking using cross-encoders or LLM evaluators
Build RAG (Retrieval-Augmented Generation) flows using ElasticSearch vectors, OpenSearch, or AWS-native tools.
3. Search Infrastructure Engineering
Build and optimize search APIs for latency, relevance, and throughput.
Design scalable pipelines for:
Indexing structured and unstructured text
Maintaining embedding stores
Real-time incremental updates
Implement caching, failover, and search monitoring dashboards.
4. AWS Cloud Delivery
Deploy and operate solutions on AWS, leveraging:
OpenSearch Service or self-managed ElasticSearch on EC2
Lambda, ECS/EKS, API Gateway, SQS/SNS
SageMaker for embedding generation or re-ranking models
Implement CI/CD for search models and pipelines.
5. Evaluation & Continuous Improvement
Develop search evaluation metrics (nDCG, MRR, precision@k, recall).
Conduct A/B experiments to measure improvements.
Tune ranking functions and hybrid search scoring.
Partner with product teams to refine search behaviors with real usage patterns.
Required Skills & Qualifications
5–10 years of experience in AI/ML, NLP, or IR systems, with hands-on search engineering.
Strong expertise in ElasticSearch/OpenSearch: analyzers, mappings, scoring, BM25, aggregations, vectors.
Experience with semantic search:
Embeddings (BERT, SBERT, Llama, GPT-based, Cohere)
Vector databases or ES vector fields
Approximate nearest neighbor (ANN) techniques
Working knowledge of LLM-based retrieval and RAG architectures.
Proficient in Python; familiarity with Java/Scala is a plus.
Hands-on AWS experience (OpenSearch, SageMaker, Lambda, ECS/EKS, EC2, S3, IAM).
Experience building and deploying APIs using FastAPI/Flask and containerizing with Docker.
Familiarity with standard IR metrics and search evaluation frameworks.
Preferred Skills
Knowledge of cross-encoder and bi-encoder architectures for re-ranking.
Experience with query understanding, spell correction, autocorrect, and autocomplete features.
Exposure to LLMOps / MLOps in search use cases.
Understanding of multi-modal search (text + images) is a plus.
Experience with knowledge graphs or metadata-aware search.
About the Company
Triunity is a Product Development, Staff Augmentation, and Consulting Services company providing solutions and services in North America. We provide IT services and technology solutions to various business verticals like Healthcare, Pharma, Banking, Finance, etc. Our goal is to develop a long-term partnership with businesses and help them get a competitive advantage by providing IT infrastructure and software platforms.
Led by experts in the IT industry with a proven record of delivering software solutions, consulting, and...