Together AI's Posts (85)

Senior DevOps Engineer

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey to build the next generation of AI infrastructure.

About the Role
We are hiring a talented Senior DevOps Engineer to develop the software and processes for orchestrating AI workloads over large fleets of distributed GPU hardware. In this role, you'll be part of a cloud engineering organization that aims to automate everything and build failure-resistant, horizontally scalable cloud infrastructure for GPU-resident applications. As a Senior DevOps Engineer, you'll build a deep understanding of Together AI's services and use that knowledge to optimize and evolve our infrastructure's reliability, availability, serviceability, and profitability. The best applicants for this role are deeply technical, enthusiastic, great collaborators, and intrinsically motivated to deliver high-quality infrastructure. You have experience practicing infrastructure-as-code, including the use of tools like Terraform and Ansible. You also have strong software development fundamentals, systems knowledge, troubleshooting abilities, and a deep sense of responsibility.

Requirements
- Minimum of 5 years of relevant experience in DevOps, cloud computing, data center operations, and Linux systems administration
- Experience programming in at least one of the following languages: Go, Python, Java, or C++
- Experience designing and building advanced CI/CD pipeline frameworks
- Experience with cloud computing toolsets like Terraform, Vault, and Packer
- Experience with configuration management tools like Ansible, Pulumi, Chef, and Puppet
- Experience with Kubernetes and containerization
- Strong sense of ownership and desire to build great tools for others

Responsibilities
- Introduce tools to facilitate greater automation and operability of services (see the illustrative sketch below)
- Design, build, and maintain CI/CD infrastructure
- Architect, deploy, and scale observability infrastructure
- Create runtime tools and processes that optimize cloud triaging and limit downtime
- Define best practices to make our systems and services measurable
- Work closely with internal teams to ensure best practices are appropriately applied
- Build tools to help engineering and research teams measure and improve their velocity
- Analyze and decompose complex software systems
- Collaborate with and influence others to improve the overall design

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy.
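As context for the automation and triage work described above, here is a minimal, purely illustrative sketch in Python. It assumes a standard Kubernetes cluster, a reachable kubeconfig, and the official kubernetes Python client; the "role=gpu-worker" label selector and the script itself are hypothetical examples, not Together AI's tooling.

```python
# Illustrative sketch only: sweep a cluster for pods that are not Running/Succeeded,
# the kind of small triage tool a DevOps engineer might build around Kubernetes.
# Assumes the official "kubernetes" Python client and a reachable kubeconfig;
# the "role=gpu-worker" label selector is a hypothetical placeholder.
from kubernetes import client, config

def unhealthy_pods(label_selector: str = "role=gpu-worker") -> list[str]:
    config.load_kube_config()  # use config.load_incluster_config() when running inside the cluster
    v1 = client.CoreV1Api()
    pods = v1.list_pod_for_all_namespaces(label_selector=label_selector)
    problems = []
    for pod in pods.items:
        phase = pod.status.phase
        if phase not in ("Running", "Succeeded"):
            problems.append(f"{pod.metadata.namespace}/{pod.metadata.name}: {phase}")
    return problems

if __name__ == "__main__":
    for line in unhealthy_pods():
        print(line)
```

In practice a report like this would typically feed an alerting or auto-remediation pipeline rather than be run by hand.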

Location: San Francisco

Salary range: $160,000 - $230,000

Senior Software Development Engineer in Test

About the Role
As a core member of the SDET Team at Together AI, you will be a key player in setting a high quality bar for our users and customers. Your primary focus will be on designing and implementing automated testing processes using TypeScript, Golang, and Python. We're looking for an independent and self-sufficient team member who works closely with stakeholders and engineers to ensure the overall quality of the products we build at Together AI.

Responsibilities
- With the SDET Team, develop a sustainable test automation strategy and drive accountability and ownership across relevant teams to maintain these practices
- Identify project needs and establish QA best practices and processes that take into account the team's resources, roadmap, and quality standards
- Uphold high quality standards, using user impact as a factor in decisions
- Work closely with engineering and product teams to understand project requirements and align on testing goals, defining strategies and test plans for the project
- Create and maintain robust test automation frameworks using Cypress to increase test efficiency and coverage
- Write, maintain, and execute automated test scripts for functionality, performance, and reliability testing (see the illustrative sketch below)
- Conduct automated regression testing to validate software changes and updates
- Document test automation processes, findings, and results for reference and reporting purposes
- Stay current on emerging testing tools, best practices, and quality assurance trends

Qualifications
- Bachelor's degree in Computer Science, Software Engineering, or a related field, or 5+ years of industry experience
- Proficiency in Golang, Python, or TypeScript
- Proven experience as an SDET or in a similar role in a software development environment
- Strong knowledge of automation testing methodologies, tools, and best practices
- Experience with Cypress, REST API testing, or K6
- Passionately committed to ensuring the highest standards of software quality and dedicated to delivering top-notch products to our users
- Excellent problem-solving skills and attention to detail
- Strong communication and collaboration skills
- Self-motivated and adaptable in a fast-paced startup environment

Preferred Qualifications
- Familiarity with AI and machine learning concepts
- Experience with CI/CD, Argo CD, or GitHub Actions
- Experience testing sites running on AWS and EKS

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers in our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $140,000 - $220,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy.
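To give a flavor of the automated REST API testing mentioned above, here is a minimal pytest-style sketch in Python using requests. The endpoint, model identifier, and environment variable are assumptions for illustration only (confirm against the current API documentation); this is not a prescribed test plan.

```python
# Illustrative sketch only: a minimal API smoke test of the kind an SDET might
# automate. The endpoint and model id below are assumptions, not a test plan.
import os
import pytest
import requests

API_URL = "https://api.together.xyz/v1/chat/completions"  # assumed endpoint; verify against docs

@pytest.mark.skipif("TOGETHER_API_KEY" not in os.environ, reason="needs an API key")
def test_chat_completion_returns_choices():
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}"},
        json={
            "model": "meta-llama/Llama-3.3-70B-Instruct-Turbo",  # placeholder model id
            "messages": [{"role": "user", "content": "Say hello"}],
            "max_tokens": 16,
        },
        timeout=30,
    )
    assert resp.status_code == 200
    body = resp.json()
    assert body.get("choices"), "response should contain at least one choice"
```

A check like this would normally sit alongside Cypress UI suites and k6 load scripts in CI, gating merges on a quick end-to-end smoke pass.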

Location: San Francisco

Salary range: $140,000 - $220,000

AI Native Account Executive

About the Role
As a Startup Account Executive at Together, you'll drive AI innovation by securing strategic deals with the fastest-growing startups in the world. You'll develop deep relationships with your clients to help them achieve their ambitious goals, accelerating both innovation & their impact on the world. You'll work cross-functionally with product, engineering, and research to help deliver the best products for your customers. The ideal candidate will have a passion for entrepreneurship & AI, enjoy relationship-driven selling, and thrive in a fast-paced environment.

Responsibilities
- Generate pipeline & win new business in the startup ecosystem
- Design & execute creative, strategic & customer-centric sales strategies to meet & exceed revenue quotas
- Find creative ways to integrate into the startup ecosystem & become a trusted partner of founders & their teams
- Collaborate on product roadmaps & features by bringing the voice of the customer into Together
- Work closely with the SDR team to help refine the outbound approach and inform product-market fit, messaging & value prop for Together products

Qualifications
- 5-10 years of experience in sales, with a track record of exceeding targets
- Technical aptitude, passion for technology & a desire to work with highly technical teams and products
- An excellent communicator with both clients and internal teams
- Adaptability, coachability, high drive and sense of urgency; enjoys working in a fast-paced environment wearing multiple hats
- Enjoys experimenting with the sales pitch/process to achieve company goals
- Experience and success with pipeline generation
- A passion for & experience with AI systems and/or infrastructure / API products highly preferred

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $180K - $250K OTE + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge. This is a hybrid role based in the Bay Area.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy.

Location: San Francisco

Salary range: $180K - $250K OTE

Customer Support Engineer

Location: San Francisco, CA (Hybrid)

About the Role
As a Customer Support Engineer at a pioneering AI company, you'll be the first line of defense supporting customers as they build out training, fine-tuning, and inference solutions with Together AI. You'll dive deep into complex technical challenges, providing swift and effective solutions while serving as a product expert. As part of the Customer Experience organization, you will collaborate closely with product and sales, driving continuous improvement of our offerings. This is an exciting opportunity for a deeply technical professional passionate about AI and customer success to make a significant impact in a fast-paced, innovative environment.

Responsibilities
- Engage directly with customers to tackle and resolve complex technical challenges involving our cutting-edge GPU clusters and our inference and fine-tuning services; ensure swift and effective solutions every time.
- Become a product expert in all of our Gen AI solutions, serving as the last line of technical defense before issues are escalated to Engineering and Product teams.
- Collaborate seamlessly across Engineering, Research, and Product teams to address customer concerns; collaborate with senior leaders both internally and externally to ensure the highest levels of customer satisfaction.
- Transform customer insights into action by identifying patterns in support cases and working with Engineering and Go-To-Market teams to drive Together's roadmap (e.g., future models to support).
- Maintain detailed documentation of system configurations, procedures, troubleshooting guides, and FAQs to facilitate knowledge sharing with the team and customers.
- Be flexible in providing support coverage during holidays, nights, and weekends as required by business needs to ensure consistent and reliable service for our customers.

Qualifications
- 5+ years of experience in a customer-facing technical role, with at least 1 year in a support function in AI
- Strong technical background, with knowledge of AI, ML, and GPU technologies and their integration into high-performance computing (HPC) environments
- Familiarity with infrastructure services (e.g., Kubernetes, SLURM), infrastructure-as-code solutions (e.g., Ansible), high-performance network fabrics, NFS-based storage management, container infrastructure, and scripting and programming languages
- Familiarity with operating storage systems in HPC environments, such as Vast and Weka
- Familiarity with inspecting and resolving network-related errors (see the illustrative sketch below)
- Strong knowledge of Python, TypeScript, and/or JavaScript, with testing and debugging experience using curl and Postman-like tools
- Foundational understanding of the installation, configuration, administration, troubleshooting, and securing of compute clusters
- Complex technical problem solving and troubleshooting, with a proactive approach to issue resolution
- Ability to work cross-functionally with teams such as Sales, Engineering, Support, Product, and Research to drive customer success
- Strong sense of ownership and willingness to learn new skills to ensure both team and customer success
- Excellent communication and interpersonal skills, with the ability to explain complex technical concepts to non-technical stakeholders
- Ability to operate in dynamic environments, adept at managing multiple projects, and comfortable with frequent context switching and prioritization

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is $180K - $260K + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
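To illustrate the curl-style debugging and network triage skills listed above, here is a small, hypothetical Python sketch using requests that classifies common failure modes when reproducing a customer-reported API error. The URL is a placeholder, and this is not a Together support runbook.

```python
# Illustrative sketch only: classify common network failure modes (DNS/connection,
# timeout, HTTP status) when triaging a customer report against an HTTP API.
# The URL below is a placeholder, not a Together endpoint.
import requests

def diagnose(url: str, timeout: float = 10.0) -> str:
    try:
        resp = requests.get(url, timeout=timeout)
    except requests.exceptions.ConnectTimeout:
        return "connect timeout: check reachability, firewall, or proxy settings"
    except requests.exceptions.ReadTimeout:
        return "read timeout: the server accepted the connection but was slow to respond"
    except requests.exceptions.ConnectionError as exc:
        return f"connection error (DNS, TLS, or refused): {exc}"
    if resp.status_code >= 500:
        return f"server-side error {resp.status_code}: escalate with any request id present"
    if resp.status_code >= 400:
        return f"client-side error {resp.status_code}: check auth header and payload"
    return f"ok ({resp.status_code}) in {resp.elapsed.total_seconds():.2f}s"

if __name__ == "__main__":
    print(diagnose("https://example.com/health"))  # placeholder URL
```

Separating these failure classes quickly is what lets a support engineer decide whether an issue belongs with the customer's environment, the network path, or an internal engineering escalation.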

Location: San Francisco

Salary range: $180K - $260K

LLM Inference Frameworks and Optimization Engineer

About the Role
At Together.ai, we are building state-of-the-art infrastructure to enable efficient and scalable inference for large language models (LLMs). Our mission is to optimize inference frameworks, algorithms, and infrastructure, pushing the boundaries of performance, scalability, and cost-efficiency. We are seeking an Inference Frameworks and Optimization Engineer to design, develop, and optimize distributed inference engines that support multimodal and language models at scale. This role will focus on low-latency, high-throughput inference, GPU/accelerator optimizations, and software-hardware co-design, ensuring efficient large-scale deployment of LLMs and vision models. This role offers a unique opportunity to shape the future of LLM inference infrastructure, ensuring scalable, high-performance AI deployment across a diverse range of applications. If you're passionate about pushing the boundaries of AI inference, we'd love to hear from you!

Responsibilities
- Design and develop a fault-tolerant, high-concurrency distributed inference engine for text, image, and multimodal generation models.
- Implement and optimize distributed inference strategies, including Mixture of Experts (MoE) parallelism, tensor parallelism, and pipeline parallelism for high-performance serving.
- Apply CUDA graph optimizations, TensorRT/TRT-LLM graph optimizations, PyTorch-based compilation (torch.compile), and speculative decoding to enhance efficiency and scalability (see the illustrative sketch below).
- Collaborate with hardware teams on performance bottleneck analysis and co-optimize inference performance for GPUs, TPUs, or custom accelerators.
- Work closely with AI researchers and infrastructure engineers to develop efficient model execution plans and optimize end-to-end model serving pipelines.

Must-Have
- Experience: 3+ years of experience in deep learning inference frameworks, distributed systems, or high-performance computing.
- Technical Skills:
  - Familiarity with at least one LLM inference framework (e.g., TensorRT-LLM, vLLM, SGLang, TGI (Text Generation Inference)).
  - Background knowledge and experience in at least one of the following: GPU programming (CUDA/Triton/TensorRT), compilers, model quantization, or GPU cluster scheduling.
  - Deep understanding of KV cache systems such as Mooncake, PagedAttention, or custom in-house variants.
- Programming: Proficient in Python and C++/CUDA for high-performance deep learning inference.
- Optimization Techniques:
  - Deep understanding of Transformer architectures and LLM/VLM/diffusion model optimization.
  - Knowledge of inference optimization techniques such as workload scheduling, CUDA graphs, compilation, and efficient kernels.
- Soft Skills:
  - Strong analytical problem-solving skills with a performance-driven mindset.
  - Excellent collaboration and communication skills across teams.

Nice-to-Have
- Experience developing software systems for large-scale data center networks with RDMA/RoCE
- Familiarity with distributed filesystems (e.g., 3FS, HDFS, Ceph)
- Familiarity with open-source distributed scheduling/orchestration frameworks, such as Kubernetes (K8s)
- Contributions to open-source deep learning inference projects

About Together AI
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers in our journey to build the next generation of AI infrastructure.

Compensation
We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. Please see our privacy policy at https://www.together.ai/privacy.
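Since the posting names CUDA graph optimizations among the techniques used, here is a minimal sketch of the standard PyTorch CUDA-graph capture/replay pattern for a fixed-shape forward pass. It assumes PyTorch with a CUDA device; the toy MLP, shapes, and function names are illustrative placeholders, not Together AI's serving code.

```python
# Illustrative sketch only: capture a fixed-shape forward pass into a CUDA graph
# and replay it, one of the inference optimizations named in the posting.
# Assumes PyTorch with a CUDA device; the toy MLP and shapes are placeholders.
import torch

@torch.no_grad()
def make_graphed_forward(model: torch.nn.Module, example: torch.Tensor):
    static_input = example.clone()
    # Warm up on a side stream so allocator/workspace setup happens before capture.
    side = torch.cuda.Stream()
    side.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side):
        for _ in range(3):
            model(static_input)
    torch.cuda.current_stream().wait_stream(side)

    graph = torch.cuda.CUDAGraph()
    with torch.cuda.graph(graph):
        static_output = model(static_input)

    def replay(new_input: torch.Tensor) -> torch.Tensor:
        static_input.copy_(new_input)  # shape/dtype must match the captured tensors
        graph.replay()                 # re-launches the recorded kernels
        return static_output           # reused buffer; clone() if you need to keep it

    return replay

if __name__ == "__main__" and torch.cuda.is_available():
    mlp = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 1024)
    ).cuda().eval()
    forward = make_graphed_forward(mlp, torch.randn(8, 1024, device="cuda"))
    out = forward(torch.randn(8, 1024, device="cuda"))
    print(out.shape)
```

Replaying a captured graph removes per-kernel launch overhead for shapes that repeat every step, which is why serving stacks combine it with batching schemes that keep shapes static.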

Location: San Francisco, Singapore, Amsterdam

Salary range: $160,000 - $230,000
