Hiring - AI/ML Engineer - 100% Remote, United States
Job Role: AI/ML Engineer.
Job Type/Type of Hire: W2 Contract.
Job Location: 100% Remote, United States.
- Summary of the project/initiatives describing what’s being done:
• Build, modernize, and maintain the AI/ML Platform and related frameworks/solutions.
• Participate in and contribute to architecture & design reviews.
• Build/Deploy AI/ML platform in Azure with open-source applications (Argo, Jupyter Hub/Kubeflow) and/or cloud/SaaS solutions (Azure ML, Databricks).
• Design, develop, test, deploy, and maintain distributed, GPU-enabled machine learning pipelines using K8s/AKS-based Argo workflow orchestration solutions, collaborating with Data Scientists.
• Enable and support distributed data processing on the platform using Apache Spark and other distributed/scale-out technologies.
• Build ETL pipelines and data ingress/egress methodologies for AI/ML use cases.
• Build highly scalable backend REST APIs for metadata management and other misc. business needs.
• Deploy applications to Azure Kubernetes Service using GitLab, Jenkins, Docker, kubectl, Helm, and Kubernetes manifests.
• Manage branching, tagging, and version maintenance across different environments in GitLab.
• Review code developed by other developers and provide feedback to ensure best practices (e.g., design patterns, accuracy, testability, efficiency, etc.).
• Work with relevant engineering, operations, business lines, and infrastructure groups to ensure effective architectures and designs, and communicate findings clearly to technical and non-technical partners.
• Perform functional, benchmark, and performance testing and tuning to achieve performant AI/ML workflows, interactive notebook user experiences, and pipelines.
• Assess, design, and optimize resource capacity for GPU-intensive ML workloads.
• Communicate application processes and results with all parties on the product team, such as engineers, the product owner, the scrum master, and third-party vendors.
- Top 5-10 responsibilities for this position:
• Experience developing AI/ML platforms & frameworks (including core offerings such as model training, inferencing, and distributed/parallel programming), preferably on Kubernetes and cloud-native infrastructure.
• Highly skilled with the Python or Java programming languages.
• Highly skilled with SQL and NoSQL databases.
• Experience designing, developing, and deploying highly maintainable, extensible, and testable distributed applications using Python and other languages.
• Experience developing ETL pipelines and REST APIs in Python using Flask or Django.
• Experienced with technologies/frameworks including Kubernetes, Helm Charts, Notebooks, Workflow orchestration tools, and CI/CD & monitoring frameworks.
- Basic Qualifications:
• Bachelor’s or Master’s degree in computer science or data science.
• 6–8 years of experience in software development and data structures/algorithms.
- Required & Preferred Technical Qualifications/Skills:
• Experience with open-source AI/ML projects on large datasets using Jupyter, Argo, Spark, PyTorch, and TensorFlow.
• Experience creating unit and functional test cases using pytest and unittest.
• Experience with training and tuning machine learning models.
• Experience working with Jupyter Hub.
• Experience with database management systems such as PostgreSQL.
• Experience in searching, monitoring, and analyzing logs using Splunk/Kibana.
• GraphQL/Swagger implementation knowledge.
• Strong understanding and experience with Kubernetes for availability and scalability of applications in Azure Kubernetes Service.
• Experience building CI/CD pipelines using CloudBees Jenkins, Docker, Artifactory, Kubernetes, Helm charts, and GitLab.
• Experience with tools like Jupyter Hub, Kubeflow, MLflow, TensorFlow, scikit-learn, Apache Spark, and Kafka.
• Experience with workflow orchestration tools such as Apache Airflow and Argo Workflows.
• Familiarity with Conda, PyPI, and Node.js package builds.
Email: spandan@tigerbells.com