Together AI's Posts (85)

QA Engineer

As a Senior QA Engineer at Together, you will play a pivotal role in ensuring the smooth deployment of our products. You will be responsible for developing and implementing best practices for QA processes, and for creating and documenting use cases and scenarios. Your expertise will help shape our release process and ensure the reliability and quality of our software.

Responsibilities:

Release Management:
- Develop and maintain release management processes, ensuring that all releases are thoroughly tested and meet quality standards before deployment.
- Identify potential risks in the release process and implement strategies to mitigate them.

Best Practices and Documentation:
- Establish and enforce QA best practices and standards for testing, automation, and documentation.
- Create detailed and comprehensive test scenarios, use cases, and documentation to guide testing efforts and ensure thorough coverage.

Automated Testing:
- Continuously evaluate and refine testing processes and methodologies to improve efficiency and effectiveness.
- Analyze test results, identify trends, and provide actionable insights to improve software quality.

Collaboration and Communication:
- Work closely with development teams to understand new features, changes, and potential impacts on the release process.
- Provide regular updates on testing progress, release readiness, and quality metrics to stakeholders.
- Participate in Agile/Scrum ceremonies and contribute to sprint planning and retrospectives.

Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent work experience).
- 7+ years of experience in QA engineering, with a strong background in release management and automated testing.
- Proven experience with release management processes and best practices.
- Strong communication skills and the ability to work effectively with cross-functional teams.
- Experience working with SDETs and Engineering teams to increase coverage of high-priority features.
- Ability to manage competing priorities while keeping high standards of quality.
- Excellent problem-solving skills with a keen attention to detail.

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Location: San Francisco

Salary range: $160,000 - $220,000

Machine Learning Operations (MLOps) Engineer

Together AI is looking for an MLOps engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models, from simple models up to the largest LLMs.

Requirements
- 5+ years of experience working on a production-level ML training or inference system.
- Bachelor’s degree in computer science or equivalent industry experience.
- Strong understanding of the state of the art in machine learning, especially LLMs.
- Experience with DevOps practices like CI/CD, automation, containerization (Docker), and orchestration (Kubernetes).
- Proficiency in cloud platforms like AWS, Google Cloud, or Azure.
- Expertise in programming (Python, Go, etc.) and frameworks for ML (TensorFlow, PyTorch, Scikit-learn).

Responsibilities
- Work closely with engineering, research, and sales on deploying, evaluating, and operating inference systems for both customers and internal use.
- Build and maintain tools, services, and documentation for automation and testing.
- Analyze and improve efficiency, scalability, and stability of various system resources.
- Conduct design and code reviews.
- Participate in an on-call rotation to respond to critical incidents as needed.

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $240,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Location: San Francisco

Salary range: $160,000 - $240,000

Machine Learning Engineer

About the Role

Together AI is looking for an ML Engineer who will develop systems and APIs that enable our customers to perform inference and fine-tune LLMs. Relevant experience includes implementing runtime systems that perform inference at scale using AI/ML models, from simple models up to the largest LLMs.

Requirements
- 5+ years of experience writing high-performance, well-tested, production-quality code
- Bachelor’s degree in computer science or equivalent industry experience
- Familiarity with the LLM inference ecosystem, including frameworks and engines (e.g. vLLM, SGLang, TRT, ...)
- Demonstrated experience in building large-scale, fault-tolerant, distributed systems like storage, search, and computation
- Expert-level programmer in one or more of Python, Go, Rust, or C/C++
- Experience implementing runtime inference services at scale or similar

Responsibilities
- Design and build the production systems that power the Together Cloud inference and fine-tuning APIs, enabling reliability and performance at scale
- Partner with researchers, engineers, product managers, and designers to bring new features and research capabilities to the world
- Analyze and improve efficiency, scalability, and stability of various system resources
- Conduct design and code reviews
- Create services, tools & developer documentation
- Create testing frameworks for robustness and fault-tolerance
- Participate in an on-call rotation to respond to critical incidents as needed

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Location: San Francisco

Salary range: $160,000 - $220,000

Lead Cloud Infrastructure Engineer

Together AI is hiring a Lead Cloud Infrastructure Engineer to own and operate the cloud foundation that powers our rapidly scaling data platforms. In this role, you will be the primary engineer responsible for defining, building, and maintaining the AWS infrastructure that underpins data engineering systems across the company, from internal analytics pipelines to customer-facing metering and billing systems.

You will work closely with multiple data engineering teams, enabling them to move faster by building reliable, secure-by-default infrastructure they can depend on. You’ll partner with our dedicated security engineering team to ensure best practices around IAM, network design, and data lake access, while focusing on platform reliability, scalability, and developer experience.

This is a high-ownership, hands-on engineering role. You’ll manage everything from Terraform modules and CI/CD pipelines to Lake Formation permissions and observability tools, with a mandate to build infrastructure that just works, and keeps working.

Responsibilities
- Own the design, implementation, and operation of all AWS-based infrastructure supporting data systems
- Build and maintain reproducible, well-documented Terraform infrastructure modules
- Collaborate with multiple data engineering teams to understand and support their infrastructure needs
- Implement and maintain IAM policies and Lake Formation permissions in partnership with the security engineering team
- Build internal tooling and automation to support self-service infra provisioning
- Monitor infrastructure health and cost, and drive continuous improvements in reliability and efficiency
- Serve as the point person for infrastructure questions, issues, and requests across the company’s data platforms

Requirements
- 5+ years of experience in infrastructure, SRE, or DevOps roles
- Deep experience with Terraform and AWS infrastructure design, including VPCs, IAM, S3, EC2, etc.
- Experience with access control and permissions in AWS, including Lake Formation or similar
- Experience supporting data infrastructure or working alongside data engineering teams (Kafka/Kinesis/ClickHouse a plus)
- Familiarity with CI/CD for infrastructure (GitHub Actions, TeamCity, ArgoCD, etc.)
- Proficiency in scripting or programming (Python, Go, or Bash preferred)
- Comfortable working autonomously and supporting multiple stakeholders
- Bonus: experience with Kubernetes, hybrid cloud/on-prem infrastructure, or cost optimization at scale

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $240,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Location: San Francisco

Salary range: $160,000 - $240,000

Infrastructure Security Engineer

We are seeking a skilled Infrastructure Security Engineer to join our team and contribute to building open, transparent, and secure AI systems. As a crucial member of our security team, you will play a vital role in safeguarding our globally distributed systems and infrastructure.

Preferred Qualifications

As our ideal candidate, you will be part engineer, part hacker, with a passion for cybersecurity and a drive to protect our systems from evolving threats.

This role is based at our offices in Amsterdam.

Responsibilities
- Collaborate with engineering and infrastructure teams to enhance the security of our global systems
- Develop and implement internal security tools to address security challenges at scale
- Work closely with the security team to gather system telemetry and improve detection of potential security events
- Protect our global infrastructure from a wide range of threats, from amateur hackers to sophisticated nation-state actors
- Identify and remediate vulnerabilities in our engineering systems, adhering to best practices for data security
- Use AI-driven models to develop systems for improved security detection and response, data classification, and other security-related tasks

Qualifications
- 5+ years working within the public cloud or data center infrastructure space
- Proven experience in systems and cloud-native software engineering
- Proficiency in Linux configuration, security, and administration
- Experience deploying and utilizing security telemetry services (e.g., OSQuery, Falco)
- Expertise in automation tools such as Ansible, Terraform, and/or Salt
- Experience with authenticated vulnerability scanning of distributed systems
- Golang or Rust development experience
- Experience building and managing cloud-native services (e.g., AWS, Google Cloud Platform)
- Strong understanding of Linux system and networking fundamentals, including TCP/IP, kernel operations, memory, and file system management
- Familiarity with Open Compute Project (OCP) technologies (OpenBMC, Coreboot, Linuxboot, etc.)

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy

Location: Amsterdam

Salary range: Not specified
