Company name
Undisclosed
Position
Data Scientist
Job description
...
Responsibilities
Work directly with Data Analysts and the Platform Engineering Team to create reusable experimental and production data pipelines
Understand, tune, and master the processing engines (e.g., Spark, Hive, Cascading) used day-to-day
Keep the data whole, safe, and flowing, with expertise in high-volume data ingest and streaming platforms (e.g., Spark Streaming, Kafka)
Shepherd and shape the data by developing efficient structures and schemas for data in storage and in transit
Explore new technology options for data processing and storage, and share them with the team
Develop tools and contribute to open source wherever possible
Required experience
Qualifications
You have previously built data pipelines ingesting and transforming a large number of events per minute and terabytes of data per day.
You have worked with Spark and Kafka, have experimented with or at least heard of Flink/Druid/Ignite/Presto/Athena, and understand when to use one over another.
Preferably, you have worked with a cloud-based big data processing platform such as AWS EMR or Google Cloud Dataproc.
You are passionate about producing clean, maintainable, and testable code as part of a real-time data pipeline.
You understand how microservices work.
You can connect different services and processes together, even ones you have not worked with before, and follow the flow of data through various pipelines to debug data issues.
You understand the issues involved in ingesting data from applications in multiple data centres across geographies, both on-premises and in the cloud, and will find a way to solve them.
Proficient in Java/Scala/Python/Spark
Insurance
Health insurance, employees' pension insurance, employment insurance, workers' compensation insurance
Benefits
401K
Language Learning support
Translation/Interpretation support
VISA sponsor + Relocation support
Holidays
Sundays, Saturdays, national holidays
Salary
Annual salary: ¥9,000,000 – ¥12,000,000
Bonus
1,000,000