Principal Applied Scientist

2 weeks ago


Hollabrunn, Austria | Neurons Lab | Full-time

The Principal Solutions Architect (PSA) at Neurons Lab is the central figure of every project. They bridge customer goals and requirements, the technologies that best fulfill them, and the experts who practice those technologies.

With 7+ years of commercial, hands-on experience across many domains (Data Analytics, Engineering, DevOps, Machine Learning) and verticals (GIS, HealthTech, EnergyTech, RetailTech, FinTech), they keep learning and stay curious about new offerings from AWS, the open-source community, and beyond, all to offer their clients the best experience. In their work, they apply AWS best practices (Well-Architected Framework, Cloud Adoption Framework, ...) and act as true leaders, seeking not just to meet short-term contract terms but to give customers a long-lasting stream of value.

**About the project**

The customer is developing a comprehensive, first-in-class Geospatial Data Warehouse SaaS. Geospatial data is traditionally hard to work with and requires years of experience with highly specialized tools. These tools are highly fragmented: each solves only a small piece of the puzzle, uses different (often incompatible) data formats, and exposes data through interfaces that standard analytical tools do not support. The customer aims to unite this fractured world and give its own customers a powerful, convenient instrument to collect, process, store, analyze, and visualize their geospatial data. The goal is to help them uncover the extra value hidden in their data without the undifferentiated heavy lifting of running their own infrastructure or studying rocket science.

**Challenges & Responsibilities**
- Architect, build, and maintain cost-efficient, scalable cloud environments for customers
- Understand the customer's business objectives and create cloud-based solutions that support those objectives
- Conduct Well-Architected Reviews and audit customer AWS accounts
- Participate in the Statement of Work (SOW) creation process along with the Sales team, Delivery Manager, and engineers
- Conduct customer-facing architecture assessment meetings to gather business and technical requirements, align them with cloud best practices, architect a solution, and write Epics and User Stories in collaboration with the team
- Participate in internal and external project stand-ups as a Product Owner from the project's inception to completion
- Support quality development practices and pursue new and better ways to build and deploy software and ML/AI models

**Skills**

**Foundational / must-have skills**:

- **DataOps**:
  - data management: AWS Lambda for Python / Amazon EventBridge / Apache Airflow / AWS Step Functions
  - data pipelines: Amazon EMR, Dask-Yarn
  - data lakes: AWS Lake Formation, Glue Data Catalog / Apache Hive Metastore; Amazon S3, Athena; Apache Iceberg, Parquet
  - data warehouses: Amazon Redshift, Aurora for PostgreSQL
  - data sources and destinations: Amazon RDS / Aurora for PostgreSQL
- **DevOps**:
  - AWS CDK
  - AWS Marketplace, Application Cost Profiler
  - AWS Cost Explorer, Savings Plans & Reserved Instances
  - AWS Control Tower, Organizations, CloudTrail, Config; Amazon CloudWatch, SNS
  - AWS Transit Gateway, PrivateLink; Amazon VPC, API Gateway; REST APIs
  - Amazon CloudFront, Route 53
  - Git; AWS CodeBuild, CodeArtifact, CodeDeploy, CodePipeline
  - AWS IAM, IAM Identity Center (aka SSO), IAM Access Analyzer, Security Hub, Resilience Hub, KMS, Secrets Manager; Amazon Cognito, Cognito External Identity Providers, GuardDuty, Detective
- **Data Analytics**:
  - Python: Pandas, NumPy; AWS SDK for Python (boto3); AWS SDK for pandas (awswrangler)
  - SQL: Trino / Presto / Amazon Athena SQL; Amazon Redshift SQL; PostgreSQL

**Advanced / nice-to-have skills**:

- **DataOps**:
  - data management: Astronomer
  - data lakes: Apache Atlas / GeoParquet, Apache Arrow, GeoArrow
  - data lake houses: Amazon Redshift Spectrum
  - data sources and destinations: Amazon RDS / Aurora for MySQL, DynamoDB, DocumentDB; non-AWS PostgreSQL, non-AWS MySQL, OGC 2.0 (WFS / Web Feature Service), OpenGIS (WMS / Web Map Service), OpenGIS (WMTS / Web Map Tile Service), non-AWS MongoDB, Esri Geodatabase / ArcGIS
  - data formats: GeoJSON, GeoTIFF, Shapefile, netCDF, HDF5 / GML, GPX, KML, WKT, WKB, IMG, CSV, or other GDAL-supported vector and raster formats
- **DevOps**:
  - AWS CDK for Python
  - AWS Budgets, Cost & Usage Reports, Cost Anomaly Detection
  - AWS Backup, License Manager
  - AWS X-Ray, Lambda Powertools; Amazon OpenSearch Service; Sentry, Datadog
  - GitLab, NPM, Webpack, Babel, JavaScript CI/CD, TypeScript CI/CD, ReactJS CI/CD, NextJS CI/CD, Figma CI/CD & Figma Tokens; AWS Amplify
  - AWS Web Application Firewall (WAF), Global Accelerator, Shield; Amazon S3 Website Hosting
- **Data Analytics**:
  - Python: GeoPandas, dask-geopandas; GDAL / Fiona, GEOS / Shapely; Xarray; Rasterio, rioxarray; SciPy; Numba; Pandera; jsonschema; great_expectations
  - SQL: PartiQL; Amazon Redshift UDFs in Python and AWS Lambda; PostGIS for PostgreSQL
  - NoSQL: MongoDB (MongoDB Atlas on AWS)