Data Engineer
Remote
Full Time
#Engineering
#Python
#Java
#Databricks
#SQL
#NoSQL
#AWS
#Azure
#Airflow
#Hadoop
#Spark
Factored was founded in Palo Alto, California, by Andrew Ng and a group of distinguished AI researchers and engineers. Our mission is to bridge the global gap in AI and machine learning talent by identifying and nurturing exceptional engineers regardless of their location. We pride ourselves on being a transparent, high-growth startup where every team member has a meaningful voice. We are currently looking for a talented Data Engineer to join our remote team and help us build the infrastructure that powers our advanced AI solutions.
Key outcomes
- Migrate existing data pipelines to Databricks Unity Catalog and update storage practices to ensure full compatibility.
- Design, build, and maintain robust data pipelines that move information from distributed sources into our central data lakehouse.
- Assemble complex datasets and develop automated processing techniques for data validation, transformation, and augmentation.
- Partner with AI and machine learning engineers to architect new features, identify data dependencies, and build necessary API integrations.
- Monitor system performance and resolve bugs to ensure high data integrity and a seamless user experience.
- Manage analytics tools to provide actionable insights regarding operational efficiency and business performance.
- Ensure all data infrastructure remains secure and compliant with international data handling regulations.
Requirements
- Three to five years of professional experience delivering high-quality, production-ready code.
- Strong foundational knowledge in computer science, including data structures, algorithms, networking, and object-oriented programming.
- Proven expertise in Python and Java.
- Hands-on experience with Databricks and orchestration tools like Airflow.
- Experience managing data pipelines using both relational (SQL) and NoSQL databases.
- Proficiency with cloud-based data infrastructure, specifically AWS or Azure.
- Experience working with big data technologies such as Spark, Hadoop, and Kafka.
- Ability to work effectively with unstructured datasets, and proficiency with version control systems such as Git.
- Excellent verbal and written communication skills in English.
Preferred qualifications
- A Bachelor’s degree in Computer Science or Mathematics, with a Master’s or PhD considered a plus.
- Experience building scalable RESTful APIs.
- Familiarity with data privacy regulations and best practices.
- Background in developing for real-time, low-latency, or data-intensive environments.
Compensation
We offer a fully remote work environment that allows you to contribute from anywhere in the world. We are committed to investing in your professional growth and supporting your career development within our collaborative team.
How to apply
If you are a self-starter who is passionate about data engineering and thrives in an early-stage startup environment, we invite you to apply. We look forward to reviewing your background and discussing how your skills can help us continue to innovate.