Software Engineer Data Infrastructure at DatologyAI

United States

180k - 250k USD

On-site

Full Time

#Data Engineering

#Machine Learning

#Data Platform

#Spark

#Flink

#Snowflake

#Hive

#SQL

#HDFS

#S3

#Airflow

#Dagster

#Python

About the Company

Companies want to train their own large models on their own data. The current industry standard is to train on a random sample of your data, which is inefficient at best and actively harmful to model quality at worst. There is compelling research showing that smarter data selection can train better models faster; we know because we did much of this research. Given the high costs of training, this presents a huge market opportunity. We founded DatologyAI to translate this research into tools that enable enterprise customers to identify the right data on which to train, resulting in better models at lower cost. Our team has pioneered deep learning data research, built startups, and created tools for enterprise ML. For more details, check out our recent blog posts sharing our high-level results for text models and image-text models.

We've raised over $57M in funding from top investors like Radical Ventures, Amplify Partners, Felicis, Microsoft, Amazon, and notable angels like Jeff Dean, Geoff Hinton, Yann LeCun and Elad Gil. We're rapidly scaling our team and computing resources to revolutionize data curation across modalities.

This role is based in Redwood City, CA. We are in office 4 days a week.

About the Role

We're looking for an experienced Data Platform Engineer to join our core DatologyAI team. As one of our early senior hires, you will partner closely with our founders on the direction of our product and drive business-critical technical decisions. You will lead the development of our core product and data platform. These are key components of our stack that allow us to process customer data and apply state-of-the-art research to identify the most informative data points in large-scale datasets. You will have a broad impact on our technology, product, and company culture. We provide visa sponsorship for candidates selected for this role.

What You'll Work On

  • Design, build, and maintain highly scalable data processing solutions while ensuring reliability and security

  • Architect, build, and deploy the back-end systems and services that power our data curation platform

  • Partner with researchers and engineers to bring new features and research capabilities to our customers

  • Ensure that our systems are reliable, secure, and worthy of our customers' trust

About You

  • Have meaningful experience with leading and building production data systems to deliver on major product initiatives.

    • You have built and managed highly scalable data processing solutions (e.g., Spark, Flink), data lakes or warehouses (e.g., Snowflake, Hive), and distributed storage systems (e.g., HDFS, S3); authored SQL queries; used workflow management tools (e.g., Airflow, Dagster); and maintained the infrastructure that supports them.

  • Proficiency in at least one programming language commonly used within Data Engineering, such as Python, Scala, or Java.

  • Expertise with an ETL scheduler such as Airflow, Dagster, or a similar framework.

  • Experience maintaining a high quality bar for design, correctness, and testing.

  • Take pride in building and operating scalable, reliable, secure systems

  • Have a humble attitude, an eagerness to help your colleagues, and a desire to do whatever it takes to make the team succeed

  • Own problems end-to-end, and are willing to pick up whatever knowledge you're missing to get the job done

  • You have experience being the technical lead of a Data Engineering / Platform / Infrastructure Team.

  • Experience building ML/DL systems and/or data infrastructure that feeds into training large ML models

Don’t meet every single requirement? We still encourage you to apply. If you’re excited about our mission and eager to learn, we want to hear from you!

Compensation

At DatologyAI, we are dedicated to rewarding talent with a highly competitive salary and significant equity. The salary for this position ranges from $180,000 to $250,000.

  • The candidate's starting pay will be determined based on job-related skills, experience, qualifications, and interview performance.

We offer a comprehensive benefits package to support our employees' well-being and professional growth:

  • 100% covered health benefits (medical, vision, and dental).

  • 401(k) plan with a generous 4% company match.

  • Unlimited PTO policy.

  • Annual $2,000 wellness stipend.

  • Annual $1,000 learning and development stipend.

  • Daily lunches and snacks are provided in our office!

  • Relocation assistance for employees moving to the Bay Area.

Markets

Artificial Intelligence
Information Technology