Together AI's Posts (85)

Senior Backend Engineer - Commerce

Together AI is seeking a Senior Backend Engineer to shape, build, and scale the commerce platform that drives Together’s Cloud products. As a member of the Commerce Engineering team, you will develop and work on mission-critical commerce capabilities including usage-based billing, payment processing, customer-facing analytics, and product entitlements. This role is for someone who thrives on solving complex challenges in distributed systems and has expertise in backend API services, relational databases, and event-driven architectures for a rapidly scaling and commerce-intensive company. You will work across cloud-native services and globally distributed data centers to deliver high-performance, reliable solutions.

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build next-generation AI infrastructure.

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Requirements:
- 5+ years of demonstrated experience in building large-scale, fault-tolerant, distributed systems and API microservices
- Experience designing, analyzing and improving efficiency, scalability, and stability of various system resources
- Excellent understanding of low-level operating systems concepts including multi-threading, memory management, networking and storage, performance and scale
- Expert-level programmer in one or more of Golang, Rust, Python, Java, or TypeScript
- Proficiency in writing and maintaining Infrastructure as Code (IaC) using tools like Terraform, AWS CDK, or Pulumi
- Proficiency in version control practices and integrating IaC with CI/CD pipelines
- Experience with payment processors (e.g., Stripe) and billing systems a plus
- Experience with Kubernetes or containers a plus
- Experience building and operating data infrastructure (Kinesis, Airflow, Kafka, etc.) a plus
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience

Responsibilities:
- Identify, design, and develop foundational backend services that power Together’s commerce platform
- Analyze and improve the robustness and scalability of existing distributed systems, APIs, databases, and infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable software and IaC for both new and existing systems
- Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents when necessary

Location: San Francisco

Salary range: $160,000 - $230,000

Senior Data Engineer

Together AI is looking for a Senior Data Engineer to help define, build, and operate the data infrastructure that handles millions of events every day to power Together’s mission-critical systems. As a Senior Data Engineer, you will work with our Data and Commerce engineering team to scale the data processing components of Together’s usage-based billing system, real-time customer-facing analytics product, and internal business intelligence tools. You will work across both cloud-native services and globally distributed data centers. If you thrive in fast-paced environments and have a passion for defining and building early-stage data platforms for a rapidly scaling and data-intensive company, this is for you.

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build next-generation AI infrastructure.

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $240,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Requirements:
- 5+ years of demonstrated experience in building large-scale, fault-tolerant, distributed data platforms, stream processing pipelines, ETLs, etc.
- Expert-level skills in designing, building, and operating stream processing pipelines with services like AWS Kinesis, Apache Kafka, or Redpanda
- Expert-level knowledge of building real-time customer-facing analytics systems using services like AWS Timestream or ClickHouse
- Proficiency in writing and maintaining Infrastructure as Code (IaC) using tools like Terraform, AWS CDK, or Pulumi
- Proficiency in version control practices and integrating IaC with CI/CD pipelines
- Proficiency in implementing and managing GitOps workflows with tools such as ArgoCD, GitHub Actions, TeamCity, or similar
- Proficiency in one or more of Golang, Rust, Python, Java, or TypeScript
- Bachelor’s or Master’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience
- Experience with Kubernetes or containers a plus

Responsibilities:
- Identify, design, and develop foundational data infrastructure components capable of handling millions or billions of events daily
- Analyze and improve the robustness and scalability of existing data processing infrastructure
- Partner with product teams to understand functional requirements and deliver solutions that meet business needs
- Write clear, well-tested, and maintainable infra-as-code and software for both new and existing systems
- Conduct design and code reviews, create developer documentation, and develop testing strategies for robustness and fault tolerance
- Participate in an on-call rotation to address critical incidents when necessary

Location: San Francisco

Salary range: $160,000 - $240,000

Senior DevOps Engineer

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build next-generation AI infrastructure.

We are hiring a talented Senior DevOps Engineer to develop the software and processes for orchestration of AI workloads over large fleets of distributed GPU hardware. In this role, you'll be part of a cloud engineering organization that aims to automate everything and build failure-resistant and horizontally scalable cloud infrastructure for GPU-resident applications. As a Senior DevOps Engineer, you'll build a deep understanding of Together AI’s services and use that knowledge to optimize and evolve our infrastructure's reliability, availability, serviceability, and profitability.

The best applicants for this role are deeply technical, enthusiastic, great collaborators, and intrinsically motivated to deliver high-quality infrastructure. You have experience practicing infrastructure-as-code, including the use of tools like Terraform and Ansible. You also have strong software development fundamentals, systems knowledge, troubleshooting abilities, and a deep sense of responsibility.

Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Requirements
- Minimum of 5 years of prior relevant experience in DevOps, cloud computing, data center operations and Linux systems administration
- Experience in programming in at least one of the following languages: Go, Python, Java, and C++
- Experience designing and building advanced CI/CD pipeline frameworks
- Experience with cloud computing toolsets like Terraform, Vault, and Packer
- Experience with configuration management tools like Ansible, Pulumi, Chef and Puppet
- Experience with Kubernetes and containerization
- Strong sense of ownership and desire to build great tools for others

Responsibilities
- Introduce tools to facilitate greater automation and operability of services
- Design, build, and maintain CI/CD infrastructure
- Architect, deploy, and scale observability infrastructure
- Create runtime tools/processes that optimize cloud triaging and limit downtime
- Define best practices to make our systems and services measurable
- Work closely with internal teams to ensure best practices are appropriately applied
- Build tools to help engineering and research teams measure and improve their velocity
- Analyze and decompose complex software systems
- Collaborate with and influence others to improve the overall design

Location: San Francisco

Salary range: $160,000 - $230,000

Senior Software Development Engineer in Test

As a core member of the SDET Team at Together AI, you will be a key player in setting a high quality bar for our users and customers. Your primary focus will be on designing and implementing automated testing processes using TypeScript, Golang, and Python. We’re looking for an independent and self-sufficient team member who works closely with stakeholders and engineers, ensuring the overall quality of the products we have at Together AI.

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build next-generation AI infrastructure.

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $140,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Requirements:
- Bachelor's degree in Computer Science, Software Engineering, or a related field or 5+ years of industry experience
- Proficiency in Golang, Python, or TypeScript
- Proven experience as an SDET, or a similar role in a software development environment
- Strong knowledge of automation testing methodologies, tools, and best practices
- Experience in Cypress, REST API testing, or K6
- Passionately committed to ensuring the highest standards of software quality and dedicated to delivering top-notch products to our users
- Excellent problem-solving skills and attention to detail
- Strong communication and collaboration skills
- Self-motivated and adaptable in a fast-paced startup environment

Preferred Qualifications:
- Familiarity with AI and machine learning concepts
- Experience with CI/CD, Argo CD, or GitHub Actions
- Experience testing sites running on AWS and EKS

Responsibilities:
- With the SDET Team, develop a sustainable test automation strategy and drive accountability and ownership across relevant teams to maintain these practices
- Identify project needs and establish QA best practices and processes that take into account the team's resources, roadmap, and quality standards
- Uphold high quality standards using user impact as a factor in decisions
- Work closely with engineering and product teams to understand project requirements and align on testing goals, defining strategies and test plans for the project
- Create and maintain robust test automation frameworks using Cypress to increase test efficiency and coverage
- Write, maintain, and execute automated test scripts for functionality, performance, and reliability testing
- Conduct automated regression testing to validate software changes and updates
- Document test automation processes, findings, and results for reference and reporting purposes
- Stay current on emerging testing tools, best practices, and quality assurance trends

Location: San Francisco

Salary range: $140,000 - $220,000

Senior Systems Administrator

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build next-generation AI infrastructure.

As the Research Systems Engineer, you will partner with research professionals to design, implement, and maintain high-performance computing (HPC) clusters and cloud environments to support research and development activities. You will collaborate with research professionals to ensure seamless operation of research environments, including job scheduling, resource allocation, and data management.

Compensation
We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy

Responsibilities:
- Lead the installation and upgrades of system hardware and software, including computational systems, clusters, standalone machines, storage systems and a variety of network fabrics including Ethernet, InfiniBand, and Fibre Channel
- Provide expertise and guidance in HPC infrastructure, design, implementation, and optimization
- Serve as the primary technical point of contact for our Research team. Troubleshoot and resolve any system-related problems to ensure the Research team’s success in using the environments
- Coordinate across multi-vendor resources, manage escalations effectively, handle complex issues, and ensure timely and satisfactory resolutions
- Maintain detailed documentation of system configurations, procedures, and troubleshooting guides to facilitate knowledge sharing within the Research team
- Contribute to the creation of training materials to enable the Research team’s success and platform adoption
- Research new and emerging technologies, evaluate workflows and plans, and make recommendations for future improvements to the HPC environment

Qualifications:
- 5+ years of Linux system administration experience
- Strong understanding of HPC architectures with GPU management
- Experience with job schedulers and resource managers (e.g., Slurm)
- Knowledge of Linux operating systems (e.g., Ubuntu, Red Hat, CentOS)
- Working experience with programming languages (e.g., Go, Python, Bash)
- Experience with network protocols (e.g., TCP/IP, InfiniBand)
- Experience with containerization and virtualization technologies (e.g., Docker, Kubernetes)
- Knowledge of cloud computing platforms (e.g., AWS, Azure, Google Cloud)
- Familiarity with machine learning and artificial intelligence frameworks (e.g., TensorFlow, PyTorch)
- Experience with data analytics, visualization and observability tools (e.g., Grafana, Tableau, Power BI)
- Strong problem-solving and analytical skills
- Excellent communication and collaboration skills

Location: San Francisco

Salary range: $160,000 - $230,000
