
Trainy
ActiveInfrastructure for managing GPU clusters for training/serving.
About
Goodbye Slurm, Hello Konduktor. Trainy Konduktor is a software platform for AI teams to schedule workloads with priority, control resource allocation, and improve GPU reliability. With Konduktor, teams submit jobs to a healthy pool of GPUs, assign job priority with a simple user interface, and never worry about hardware faults again.
Founders
Roanak Baviskar· FounderStudied CS & Mathematics at UC Santa Cruz. Led Audio team at Hive AI, where we trained and deployed 500M parameter-scale models to production.
Andrew Aikawa· FounderCo-founder and CTO at Trainy building a training platform to make deep learning go faster. Previously a lead Machine Learning Engineer for Hive AI's object detection products. I completed my Physics Ph.D. UC Berkeley '22 where my thesis focused on applying computer vision and deep learning on nanoscience. Physics & Computer Science B.A. UC Berkeley '17.
Product launches · 2 launches
Neptune-compatible experiment tracking, built for teams running AI workloads
Dashboards to help ML engineers training large models isolate performance bottlenecks and boost training speed.