Software Engineer, Data Infrastructure
180k - 250k USD
On-site
Full Time
#Data Engineering
#Machine Learning
#Data Platform
#Spark
#Flink
#Snowflake
#Hive
#SQL
#HDFS
#S3
#Airflow
#Dagster
#Python
At DatologyAI, we are transforming how companies train large models by moving away from inefficient, random data sampling. Our research-backed approach helps enterprises identify the most valuable data, leading to higher-quality models at a lower cost. With over $57 million in funding from prominent investors and industry pioneers, we are rapidly expanding our team in Redwood City to redefine data curation. We are currently looking for a dedicated professional to join us on-site four days a week as we push the boundaries of machine learning infrastructure.
What is this role?
We are seeking a Senior Software Engineer for Data Infrastructure to join our core team on a full-time basis. As one of our early senior hires, you will work directly with our founders to shape our product roadmap and make critical technical decisions. You will be responsible for leading the development of the platform that allows us to process customer data and apply cutting-edge research to large-scale datasets. This is a high-impact role where you will influence our technology, our product, and our company culture.
What will you do?
- Design, build, and maintain highly scalable data processing solutions while ensuring the reliability, security, and performance of our systems.
- Architect and deploy the back-end services that serve as the foundation for our data curation platform.
- Collaborate closely with our research and engineering teams to integrate new features and advanced capabilities for our customers.
What makes you a great fit?
You are a seasoned engineer with a track record of leading and building production-grade data systems. We are looking for someone who brings the following expertise:
- Strong experience with scalable data processing tools like Spark or Flink, and familiarity with data lakes or warehouses such as Snowflake or Hive.
- Proficiency in SQL and experience managing distributed storage systems like HDFS or S3.
- Expertise in workflow management tools such as Airflow or Dagster.
- Fluency in Python, along with a commitment to high-quality design, testing, and system correctness.
- Experience building infrastructure that supports machine learning and deep learning training pipelines.
- A humble, collaborative mindset with the drive to own problems from start to finish.
We value unique backgrounds. If you are passionate about our mission, we encourage you to apply even if you do not meet every requirement.
What's in it for you?
The salary for this position ranges from $180,000 to $250,000, alongside significant equity. Our comprehensive benefits package includes:
- Medical, vision, and dental insurance with 100% of premiums covered.
- A 401(k) plan featuring a 4% company match.
- Unlimited vacation time and paid time off.
- Annual stipends for wellness programs and professional learning.
- Relocation assistance for those moving to the area.