Senior Data Engineer
Reejig
Data Science
New York, NY, USA
About the Role
You will be our first data hire in the US. Our data team is based in Sydney; our fastest-growing market is the US. We need a strong, hands-on data engineer working US hours who can own the data layer for our US product team, support US customers when data issues arise, and build the data foundation for the US function. The data pipelines you build and maintain power core product features, AI capabilities, and customer analytics.
You will work alongside our US-based Head of Product, a Senior Product Designer, and a Senior Fullstack Engineer as part of the core US product team. You report to the Head of Data in Sydney.
This role is primarily data engineering. If you also bring data science capability — analytics, model evaluation, experimentation — there is scope to take on that work as well. We are planning for a separate Data Scientist hire, but for the right candidate the two roles could be combined.
What We Expect
Build and Operate Data Pipelines
- Own the data pipelines that power product features, AI systems, and analytics. Build pipelines to ingest and process structured datasets — CSVs, spreadsheets, APIs, event streams — with strong validation and error handling.
- Develop and maintain pipelines supporting embeddings, RAG, and other LLM-driven data workflows. This is core to how the product works.
- Investigate data inconsistencies, debug pipeline failures, and improve pipeline robustness. When something breaks that affects a US customer, you are the first responder in-timezone.
- Work across existing codebases and data systems. Understand what’s there, improve it, extend it safely.
- Operate in production: deploy, monitor, and maintain data systems using Terraform and CI/CD pipelines.
Support US Customers
- Be the US-hours data responder for customer data issues. Investigate, assess severity, and either resolve directly or hand off to the Sydney data team with complete context.
- Build playbooks and observability tooling for data triage that compress response times and reduce back-and-forth with Sydney.
- Work with the US Fullstack Engineer to cover cross-discipline customer issues. You own the data domain; they handle application and infrastructure.
Partner with the US Product Team
- Be the data voice alongside the Head of Product, designer, and engineer. Surface what the data can and can’t support, identify opportunities, inform product direction.
- Deliver data-driven features in collaboration with application engineers and the product team.
- Contribute to improving internal tooling for data validation, troubleshooting, and operational visibility.
AI-First Engineering
- Use AI coding tools (Cursor, Copilot, Claude Code, or equivalent) as a core part of how you build. This is a baseline expectation, not a bonus.
- Work with the wider engineering team and Head of Data to consolidate AI-assisted development practices across the data stack.
What You Need
- 5+ years in data engineering or backend systems with a strong data focus.
- Strong Python skills for pipelines, data processing, and scripting.
- Strong SQL and experience with relational databases (PostgreSQL, MySQL).
- Experience building ETL/ELT pipelines and working with event-driven processing (queues, async workers, pub/sub).
- Experience with ML or LLM pipelines, embeddings, or RAG systems. Our product is AI-powered; you need to have worked in this space.
- Infrastructure awareness: comfortable deploying and operating data systems using Terraform, CI/CD, and AWS services (S3, RDS, EKS).
- You can navigate existing systems and codebases, understand how they work, and safely extend or improve them.
- AI-first mindset: you use AI tools to build software; you don't just build AI features. Proficiency with AI coding tools is expected.
- Clear communicator. You can explain data issues and trade-offs to engineers, product partners, and customers.
- Comfortable with ambiguity and autonomy. You’re joining a small US team with real ownership and no safety net.
Strong Signals
- Data science capability alongside data engineering: analytics, model evaluation, experimentation, statistical analysis.
- Experience with structured ontology data or knowledge graphs.
- Prior startup or scale-up experience in a fast-moving environment.
- Customer-facing data work: investigating data issues for enterprise customers, explaining data quality problems, translating customer needs into pipeline improvements.
- Experience working across timezones in a distributed team.
Tech Stack
Python, MySQL, Redis, AWS (S3, RDS, EKS)
LLM pipelines, embeddings, RAG, structured ontology data
Why This Matters
Reejig is at an inflection point. We are building a company that will reshape how organisations operate in the AI era. The people who join now are not stepping into a role; they are stepping into a chance to build something with scale and long-term impact. If you want work that changes your career and creates meaningful equity upside, this is the moment.