Job Description: Data Engineer (Databricks & PySpark)
Role Type: 6-Month Contract (Subcontract)
Location: Bracknell, Hybrid (2 days/week in-office)
Role Overview: We are seeking a mid-to-senior Data Engineer to build, optimize, and maintain scalable data pipelines in a cloud-based distributed computing environment. You will act as an expert data wrangler, ensuring an optimal data delivery architecture for software developers, data analysts, and data scientists.
Core Responsibilities:
- Develop and operationalize reliable ETL/ELT pipelines for large, complex datasets.
- Assemble and transform semi-structured data (JSON) into actionable insights.
- Perform logical data modeling, physical database optimization, and security implementation.
- Improve data integrity by implementing automated Data Quality checks.
- Provide on-call support to unblock users and resolve high-severity pipeline issues.
- Collaborate with Data Science teams to streamline data delivery for advanced analytics.
Technical Qualifications:
- Experience: 5+ years in Data Engineering and Pipeline Operationalization.
- Databricks/Spark: 3+ years of hands-on experience with Databricks and PySpark.
- Cloud: 3+ years of experience within AWS ecosystems.
- Unstructured Data: 3+ years of experience processing JSON and complex semi-structured data.
- SQL: Expert-level SQL skills, including performance tuning and window functions.
- DevOps: Strong experience with Git and CI/CD workflows.
- Visualization: Familiarity with Power BI or similar BI tools.
Mandatory Skill Tag: PySpark / Data Engineering for Data Science.
Randstad Technologies is acting as an Employment Business in relation to this vacancy.
...