Data ingestion into Foundry from external data sources and legacy systems using Agents/Magritte connectors and Data Connection; working with raw files.
Excellent proficiency in data-processing scripting languages, including but not limited to Python, PySpark, and SQL.
Design, create, and maintain an optimal data pipeline architecture in Foundry.
Ability to create and optimize data pipelines using PySpark for the back end and TypeScript for the front end; publishing and using shared libraries in Code Repositories.
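As a rough illustration of the kind of back-end transform such a pipeline step performs, here is a minimal sketch in plain Python standing in for a PySpark transform (in PySpark this would be `DataFrame.filter()` and `.withColumn()`); the column names `amount`/`amount_usd` and the exchange rate are hypothetical:

```python
# Illustrative stand-in for a PySpark pipeline step: drop invalid records,
# then derive a new column. Plain dicts are used so the sketch runs anywhere;
# this is not Foundry's transforms API.

EXCHANGE_RATE = 0.012  # hypothetical INR -> USD rate, for illustration only

def transform(rows):
    """Drop rows with missing or non-positive amounts, then derive amount_usd."""
    valid = [r for r in rows if r.get("amount") is not None and r["amount"] > 0]
    return [{**r, "amount_usd": round(r["amount"] * EXCHANGE_RATE, 2)}
            for r in valid]

result = transform([{"amount": 100}, {"amount": -5}, {"amount": None}])
# result -> [{"amount": 100, "amount_usd": 1.2}]
```

In Foundry, a step like this would typically live in a Code Repository so the filtering and derivation logic can be published and reused as a shared library across pipelines.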
Assemble large, complex data sets in Foundry that meet functional and non-functional business requirements.
Identify, design, and implement internal process improvements in Foundry: automating manual processes, optimizing data delivery, redesigning infrastructure for greater scalability, etc.
Schedule pipeline jobs in Palantir; monitor data pipeline health and configure health checks and alerts (data expectations).
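The health-check/data-expectations idea can be sketched outside Foundry with a few illustrative assertions. This is plain Python, not the Foundry data expectations API; the function names (`expect_no_nulls`, `expect_unique`) are hypothetical stand-ins showing the kind of constraint a pipeline health check enforces:

```python
# Illustrative data-expectation-style checks: each returns (ok, message),
# and a runner collects failures so an alert can be raised on any FAIL.

def expect_no_nulls(rows, column):
    """Fail if any row has a null value in `column`."""
    bad = [r for r in rows if r.get(column) is None]
    return (len(bad) == 0, f"{len(bad)} null(s) in '{column}'")

def expect_unique(rows, column):
    """Fail if `column` contains duplicate values (a primary-key check)."""
    values = [r[column] for r in rows]
    return (len(values) == len(set(values)), f"duplicate values in '{column}'")

def run_checks(rows, checks):
    """Run every check and collect failure messages for alerting."""
    failures = []
    for check in checks:
        ok, msg = check(rows)
        if not ok:
            failures.append(msg)
    return failures

rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}, {"id": 2, "name": "b"}]
failures = run_checks(rows, [
    lambda r: expect_no_nulls(r, "name"),
    lambda r: expect_unique(r, "id"),
])
# failures -> ["1 null(s) in 'name'", "duplicate values in 'id'"]
```

In Foundry, equivalent checks are attached to a dataset's build so a failing expectation can block downstream consumers and trigger the configured alerts.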
Build analytics tools using Contour, Quiver, Workshop, and Slate that utilize the data pipeline to provide actionable insights into KPIs such as customer acquisition, operational efficiency, and other key business performance metrics.
Good understanding of and working knowledge of Foundry tools: Ontology, Contour, Object Explorer, Ontology Manager, Object Editor (using Actions/TypeScript), Code Workbook, Code Repositories, and Foundry ML.
Primary Skills
Candidates must have 5+ years of experience in a Data Engineer role and should have experience using the following software/tools:
Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including Postgres and Cassandra/MongoDB.
Experience with stream-processing systems: Storm, Spark Streaming, etc.
Experience with object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
Advanced working SQL knowledge and the ability to quickly envision a technical solution based on functional requirements: 4+ years in SQL.
Experience building and optimizing 'big data' pipelines, architectures, and data sets: 5+ years in PySpark/Python.
Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
Secondary Skills
Strong analytical skills related to working with large datasets.
Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
Experience supporting and working with cross-functional teams in a dynamic environment.
Ref: 1785457
Posted on: May 22, 2024
Experience level: Experienced
Contract type: Permanent
Location: Bangalore, KA, IN
Department: Big Data & Analytics