As the Senior Machine Learning Platform Engineer, you will be a critical part of Netography’s ML team, building and operating the Machine Learning platform and systems that provide advanced detection capabilities for Netography Fusion. The data includes network flows from large enterprises, including on-prem, cloud, and OT network sources, enriched with context from third-party data sources, providing a real-time and well-structured data set to work with. You will be the lead engineer responsible for how AWS SageMaker is architected, ensuring it is scalable, reliable, and efficient as part of Netography’s SaaS platform. You will collaborate with a backend software engineer to integrate ML models into production systems and instrument the data flows with other SaaS components. You will also work closely with a data scientist, who will own building the models, to understand their requirements and design solutions that meet their needs.
Solutions You’ll Bring
Your background needs to include experience building a production ML environment at a tech company that was solving problems in a similar or analogous area. Direct experience with AWS SageMaker is ideal, but strong, relevant experience with a different ML tech stack, together with a desire to become a SageMaker expert, is also a fit. Specific domain expertise working with ML in the networking or security space is a plus, but not required.
- Serve as principal owner of Netography’s ML platform using AWS SageMaker, with end-to-end responsibility for ML Ops
- Own the architecture, implementation, and operation of the ML platform
- Design, build, and maintain the services and frameworks for ML model training and serving in a scalable manner
- Implement new ML models and update existing ones
- Develop and implement testing strategies to ensure the quality and accuracy of ML models
- Participate in code reviews and contribute to the development of coding standards and best practices
- Communicate technical concepts and project updates to stakeholders, both technical and non-technical, in a clear and concise manner
- Shape the future of ML at Netography
- Participate in the entire software development life cycle, from design to deployment and maintenance
- Participate in the on-call rotation
- Translate loosely defined requirements into solutions
Who You Are
- 5+ years of experience as a hands-on engineer building data-oriented products and/or ML systems/products
- 3+ years of experience in building and managing production ML platforms in an ML Ops, ML Platform, or similar team
- 3+ years of experience building large-scale data-oriented products in AWS
- Experience with AWS SageMaker or an ML tech stack that you have end-to-end expertise in building and operating
- Experience with model training, validation, and deployment, including optimizing models for performance and scalability
- Familiarity with deep learning frameworks such as TensorFlow, PyTorch, or Keras
- Strong problem-solving and analytical skills, including the ability to debug complex issues in production environments
- Experience with real-time, online, and/or high-throughput & low-latency distributed systems
- Exceptional coding and design skills. The primary languages being used will be Go and Python, but experience across other languages and a desire to learn are sufficient.
- Experience building data pipelines and working with Docker, messaging technologies (e.g., Kafka, Kinesis, SQS), NoSQL databases (ArangoDB, EKS, DynamoDB), and distributed analytics engines (ES)
Who We Are
When the Netography Founders established the first DDoS defense company twenty years ago, they envisioned a day when networks could operate efficiently without dedicated security devices or complex security support. They dreamed of networks that could secure themselves without human intervention or reliance on expensive hardware.
Today, Netography has made that vision a reality with the cloud’s power & flexibility. We’re helping companies gain visibility into on-premises, cloud & hybrid network environments to eliminate blind spots and add a security layer that does not rely on signatures to remediate threats. By assessing threats in the cloud, we can detect, route, or block bad traffic with a scale and ease never before possible.
$160,000 - $190,000
What We Offer
- Employer paid medical benefits
- Unlimited PTO
- Stock Options
- Inclusive environment that fosters creative problem solving
- The capability to work remotely