Oliver Meyn

Hadoop & Big Data Consultant

ABOUT ME

I have been working primarily in Hadoop and Big Data since 2009, with Java and SQL development experience extending back to 1999. I have learned some hard-won lessons about the complexity of distributed system development and management, and I am happy to help others benefit from that experience.
 
My next availability is October 1, 2018.

SERVICES

HADOOP & BIG DATA

  • Cluster setup, from hardware purchasing advice, through OS install, to running Hadoop cluster (Cloudera, Hortonworks)
  • Getting data in and out of the cluster (NiFi, Kafka, Sqoop from/to SQL stores, CSVs into HDFS)
  • Building applications, tools, and workflows to work with the cluster (primarily Java)
  • Designing/reviewing architecture to make optimal use of the Hadoop ecosystem (streaming, batching, Lambda, Kappa)

TEAM LEAD & MENTOR

  • I have worked for many years as a Team Lead and would be happy to play that role while helping your company move into the Hadoop world
  • Informal training so your team can take over after an initial cluster setup; it helps both of us!

COMMUNICATING

  • I am comfortable speaking to small and large audiences at any technical level about the Hadoop world
  • Blog posts are a great, sometimes informal, way to update your customers on your roadmap. I have written a number of blog posts at GBIF describing various adventures and would be happy to write guest posts or to help your team write their own. My posts have been featured in the HBase reference guide and the Hadoop Weekly newsletter.

PLEASE GET IN TOUCH

Oliver Meyn

oliver@elephant.tech

(416) 524-2240

Toronto, Ontario, Canada

WORK HISTORY

2017

Hadoop & Big Data Consultant

RBC Investor and Treasury Services, Toronto

Contributing to the architecture and design, along with some development, of a pipeline ingesting data from several large legacy systems into the Data Lake, transforming it with various batch and real-time tools, and exposing it through APIs and a web GUI built on microservices.
 
Technology used: Spark, NiFi, Kafka, YARN, Hive, HDFS, HDF 2/3, HDP 2.4/2.6, RedHat 7

2016

Hadoop & Big Data Consultant

T4G, Toronto

Telus
Converted a batch-based system for TV customer experience modelling to near real-time: Hive over HDFS for static data; NiFi, Kafka, and Spark Streaming for near real-time processing; Spark ML for customer modelling. The work was presented at DataWorks Summit 2017 in San Jose, with slides and audio available: Bringing Real Time to the Enterprise with Hortonworks DataFlow. Key parts are also covered in a blog entry: Spark 2.0 streaming from SSL Kafka with HDP 2.4.
 
Technology used: Spark, NiFi, Kafka, YARN, Hive, HDFS, krb5, HDF 2.0, HDP 2.2/2.4, RedHat 6/7

Financial Services Client
HBase optimization for time-series data, Jupyter notebook setup, Spark (PySpark) and HBase for processing, Hive over HDFS for static data. HDP 2.3

2016

Hadoop & Big Data Consultant

EyeReturn Marketing, Inc., Toronto

Consulting on cluster setup and tuning; Hadoop (HDFS, Pig, YARN), HBase, and Spark. CDH 5.5

2010

Senior Software Developer & Scrum Master

GBIF, Copenhagen

Worked as a developer and the Scrum Master for a team of 4-8 developers to support the transition of gbif.org from a batch-oriented, MySQL-based system to a real-time processing system based on Hadoop. The system went live in 2013 and has been running on two in-house Cloudera (CDH) clusters. I was responsible for cluster installation, upgrades, maintenance, and performance tuning, and was part of the team that designed the architecture of the overall system. I wrote the portion of the system that speaks directly to HBase, including architecting the key and column structure, region sizes, etc. Contributed to all aspects of the Hadoop development, including Hive UDFs, Sqoop-ing in and out of the cluster, Oozie workflows, and custom MapReduce jobs in both versions 1 and 2 (YARN), as well as interactions with Zookeeper and Solr. Served as the primary DevOps liaison between System Administrators and the development team.
 
Additionally worked to build RESTful, JSON-based web services in Java to deliver data from both CDH and traditional SQL databases (MySQL, PostgreSQL, PostGIS). These were managed in a continuous build environment using Jenkins, Maven, and Nexus. Helped build the analytics portion of the site in a combination of R and Hadoop.
 
Technology used: Hadoop (HDFS, Hive, Zookeeper, HBase, MapReduce, YARN, Sqoop, Oozie, SolrCloud), CDH 3/4/5, Java 6/7, R, RabbitMQ, Maven, Jenkins, Nexus, Varnish, Puppet, Ansible, Ganglia, Elasticsearch, Kibana, Git, IntelliJ, JIRA

2008

Software Architect & Team Lead

Zerofootprint, Toronto

Responsible for designing and developing services, messaging infrastructure, and web and API clients within an SOA for a suite of enterprise environmental products. Notable among them are the Velo enterprise carbon management package and the TalkingPlug energy management hardware devices and software.
 
Technology used: Java 5/6, SOA, ESB, Mule, JMS, ActiveMQ, Hadoop/HBase, SaaS, Spring MVC, Hibernate, Maven, Eclipse, Web Services, Ruby, JRuby, Rails

2007

Technical Lead & Senior Developer

TSOT, Toronto

Maintained and augmented a private social networking application in the fraternity, sorority, and university market, written in Ruby on Rails. Designed and implemented a staged release process around a Subversion server. Technical Lead for a team of 5 comprising front-end, back-end, and QA. Prioritized and organized work across the team to meet business goals.
 
Technology used: Ruby on Rails (1.2 & 2), MySQL, Mongrel, Subversion, Apache, TextMate, OS X, Linux

2006

Senior Java Developer

Penson Financial Services, Toronto

Built a Trade Order Management System for a fixed income trading platform, delivered as a service in an SOA environment with connections to Bloomberg: incoming trades via BTS and outgoing trades via their Consolidated Message Feed (CMF).
 
Technology used: Java 5, J2EE, SOA, ESB, Mule, Websphere MQ/IBM MQSeries, JBoss, Spring, Hibernate, Maven, Eclipse, Subversion, Drools, XML, REST

EDUCATION HISTORY

1999

M.Eng Civil Engineering (Sustainability)

MCMASTER UNIVERSITY

Thesis: Artificial Neural Networks for the Evaluation of Sustainable Community Design

1997

B.Eng Civil Engineering

MCMASTER UNIVERSITY

1997

B.Sc Computer Science

MCMASTER UNIVERSITY

TECH SKILLS

Hadoop (HDFS, HBase, Hive, MapReduce)

Other Big Data (Spark, NiFi, Kafka)

Java

SQL (including MySQL, PostgreSQL)

DevOps (Linux, scripting, log analysis)

SOFT SKILLS

Communication

Team Lead (Scrum Master)

TESTIMONIALS

TIM ROBERTSON, HEAD OF INFORMATICS / GBIF

Oliver was core to the developments and operations at GBIF. He managed and oversaw the pre-production and production Hadoop clusters: procurement, installation, maintenance, and application deployment. During this time we evolved through CDH 3, 4, and 5, operating Hive, Oozie, MR1, YARN, and HBase. Oliver was involved in the design and development of the stream processing, and was a respected and well-liked team lead for the development group. He will be greatly missed.

ANDREI CENJA, SYS ADMIN / GBIF

I was Oliver’s colleague at the GBIF Secretariat in Copenhagen from 2010 to 2015 where, as system administrator, I provided support for the development group he was a member of. We’ve worked together very often on various matters including server and applications setup and configuration, and being a very fast learner, Oliver has become quite competent at Linux administration. I would consider him a very good DevOps person, if not for the fact that DevOps seems in my view too diminutive a notion to describe Oliver’s experience.

Oliver is one of the sharpest persons I’ve met, very knowledgeable, dedicated and hard working. He’s got very good communication skills and is now a certified Scrum Master.

Working with Oliver has been a pleasure and I’ll miss his camaraderie and sense of humor.
And his Star Wars jokes 🙂

IAN HALL, ARCHITECT / ZEROFOOTPRINT

Oliver is an architect’s architect: analytical, pragmatic, and extremely well versed in current technologies and trends. Oliver works well with a team, patiently encouraging and guiding more junior staff while staying open to suggestions and new approaches. Oliver also has a strong business sense and is comfortable voicing questions and suggestions around not just the ‘how’ of a project but also the ‘why’. The tension this creates between the business requirement and the practical implementation is critical to the creation of great software, and exactly what a good product manager looks for in a technical lead.
I thoroughly enjoyed working with Oliver and strongly recommend him for architect/technical leadership positions.

CERTIFICATIONS AND PARTNERSHIPS

Hortonworks Partnerworks
Hortonworks HDP Certified Administrator (HDPCA)
 
Cloudera Connect
Cloudera Certified Specialist in Apache HBase (CCSHB)
Cloudera Certified Administrator for Apache Hadoop (CCAH)
 
Certified ScrumMaster

DOWNLOAD MY CV

You can download my CV in PDF format. Recruiters! Please do not submit it for jobs without my permission.

DOWNLOAD CV

 

VIEW MY LINKEDIN PROFILE

Visit my LinkedIn profile for even more detail.

MY LINKEDIN PROFILE