Data Analytics and Cloud

Natural Language Processing

Natural language processing (NLP) was used to analyze more than 200,000 crowdsourced text reviews. We developed a medical lingua corpus to extract words and phrases for specific medical conditions, then used contextual search to extract side effects, efficacy and sentiment. The aggregated results fed the machine learning models that predicted the ideal chemical compound combinations for 18 different medical conditions.

 

Results published in a research paper on Scribd, 2017.
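As a rough illustration of the contextual search step, the sketch below scores sentiment words that appear within a few tokens of a condition mention. The word lists, the window size and the function names are hypothetical stand-ins, not the project's actual corpus or code.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons; the real medical corpus is far larger.
CONDITION_TERMS = {"insomnia", "migraine", "anxiety"}
POSITIVE_TERMS = {"helped", "relief", "effective"}
NEGATIVE_TERMS = {"worse", "useless", "nausea"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def analyze_review(text, window=4):
    """Return {condition: score} from sentiment words near each mention.

    A sentiment word counts when it lies within `window` tokens on either
    side of the condition term (an assumed, simplistic notion of context).
    """
    tokens = tokenize(text)
    scores = Counter()
    for i, token in enumerate(tokens):
        if token in CONDITION_TERMS:
            context = tokens[max(0, i - window): i + window + 1]
            positive = sum(word in POSITIVE_TERMS for word in context)
            negative = sum(word in NEGATIVE_TERMS for word in context)
            scores[token] += positive - negative
    return dict(scores)


review = "This really helped my insomnia but made my migraine worse"
print(analyze_review(review))
```

Per-review scores like these could then be summed by condition to produce the aggregated features described above.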

Applied Machine Learning

Applied machine learning to discover the relationship between nutraceuticals and medical conditions. Used unsupervised learning to develop clusters and determine the relationship between chemical profiles and medical conditions.

 

Models included K-Means, Fuzzy Clustering, DBSCAN (density-based), Latent Class and Sparse Representation. Validated results with medical practitioners and developed new models to explore specific compounds.

 

Results published in a research paper available on Scribd, 2017.
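For readers unfamiliar with the clustering step, here is a minimal K-Means sketch (Lloyd's algorithm) in plain Python. The sample points are made-up two-dimensional "chemical profiles", and seeding the centroids with the first k points is a simplification chosen to keep the run deterministic; it is not the configuration used in the study.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points with Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: seed centroids with the first k points (deterministic).
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical profiles: two obvious groups.
profiles = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centroids = kmeans(profiles, k=2)
print(labels)
```

In practice the cluster labels would be compared against the conditions reported for each profile, which is where practitioner validation comes in.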

Machine Learning in the Cloud

Product development for a predictive analytics tool for the entertainment industry that predicts box office revenues and helps schedule new movie releases. Used machine learning on Azure for data wrangling, feature engineering and feature selection. Sourced and merged various datasets to compile the most complete database in the industry; sections of missing data still required evaluation, weighting and imputation. Tools included Trifacta for data wrangling and prep, KNIME and MICE to impute missing data, RStudio for statistical analysis and ggplot2 for visualization. Next steps are time series algorithms for predicting release dates. 2016
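To show the idea behind the imputation step, the sketch below fills missing values with the column median and flags each filled row so it can be down-weighted later. This is a deliberately simpler stand-in for MICE (which models each incomplete column from the others over several rounds), and the field names are hypothetical.

```python
from statistics import median


def impute_median(rows, column):
    """Fill None values in `column` with the median of observed values.

    Adds a "<column>_imputed" flag so downstream models can weight
    imputed rows differently from observed ones.
    """
    observed = [row[column] for row in rows if row[column] is not None]
    fill_value = median(observed)
    result = []
    for row in rows:
        missing = row[column] is None
        result.append({
            **row,
            column: fill_value if missing else row[column],
            column + "_imputed": missing,
        })
    return result


# Hypothetical movie records with one missing budget.
movies = [{"budget": 10.0}, {"budget": None}, {"budget": 30.0}]
print(impute_median(movies, "budget"))
```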

AWS Cloud Migration

Evaluated AWS accounts for cost savings and a new reference architecture. Accounts included commerce applications, data warehouse and business intelligence platforms. Analyzed costs using AWS Trusted Advisor and CloudHealth and identified 55% yearly cost savings ($500k+) from right sizing, reserved instances and storage optimization across cloud, co-location and on-premises data centers. Evaluated the current technology stack and specified a new reference architecture. 2016
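The savings arithmetic can be sketched as follows. The category names and dollar amounts are illustrative placeholders only, not the figures from this engagement.

```python
def savings_report(monthly_costs):
    """Return (annual dollars saved, percent saved).

    `monthly_costs` maps a category to (current, optimized) monthly spend.
    """
    current = sum(cur for cur, _ in monthly_costs.values())
    optimized = sum(opt for _, opt in monthly_costs.values())
    annual_saved = (current - optimized) * 12
    percent_saved = round(100 * (current - optimized) / current, 1)
    return annual_saved, percent_saved


# Hypothetical monthly spend before and after optimization.
example = {
    "compute": (1000, 700),   # e.g. right sizing, reserved instances
    "storage": (500, 350),    # e.g. storage tiering
}
print(savings_report(example))
```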

Data Warehouse and Data Lake

Developed a data pipeline between two enterprises headquartered in LA and NY. The dataset included unstructured data from the data lake (AWS Redshift) and structured data from the data warehouse, which received feeds from the production database (SQL Server). Data was then mapped to Hive tables in the destination database. The shared data was used to create individual customer profiles across both companies for joint marketing. In Phase 1 of the project, data was shared via exports to S3 buckets for analysis and preliminary mapping into Hive tables. In Phase 2, a scheduled data pipeline with an automated feed was specified.
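A simplified view of how records from the two companies might be combined into one customer profile is sketched below. Matching on a normalized email address and keeping the first value seen for each field are assumptions made for illustration; the actual matching rules are not described here.

```python
def build_profiles(company_a, company_b, key="email"):
    """Merge customer records from two sources into one profile per key."""
    profiles = {}
    for source, records in (("a", company_a), ("b", company_b)):
        for record in records:
            # Normalize the join key so "Ann@X.com" and "ann@x.com " match.
            match_key = record[key].strip().lower()
            profile = profiles.setdefault(
                match_key, {key: match_key, "sources": []}
            )
            profile["sources"].append(source)
            for field, value in record.items():
                if field != key:
                    # Assumed rule: the first source to supply a field wins.
                    profile.setdefault(field, value)
    return profiles


# Hypothetical exports from each company.
la_records = [{"email": "Ann@X.com", "city": "LA"}]
ny_records = [{"email": "ann@x.com ", "city": "NY", "plan": "gold"}]
print(build_profiles(la_records, ny_records))
```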

Big Data Strategy

Wore many hats at this start-up division of VentureSoft Global: business development, requirements and product management for predictive analytics products. Developed and managed the website, white papers and social media. Worked with the VP of Analytics and VP of Client Relations to identify and develop new business and marketing campaigns. Developed strategy for outreach and service offerings. Managed Data Engineers, Data Scientists, Researchers and Developers for predictive analytics solutions, including deep learning with TensorFlow and machine learning with Azure. 2016

Predictive Analytics

Data analytics and business intelligence from 30 years of talent assessment instruments. This included integrating a stream of new assessment data from acquired companies. Data was formatted with metadata and attributes so researchers could quickly find what they needed. Data platforms included SAP, SQL, MarkLogic (NoSQL), Amazon DynamoDB (NoSQL) and Salesforce.com. Data was ingested from external platforms and partners using an XML-based API. Basic algorithms were pre-processed in the Data Warehouse, while the more sophisticated statistical algorithms were processed in R. 2013-2015
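As an example of the XML ingestion step, the sketch below parses a partner payload into flat records ready for loading. The element and attribute names (`assessment`, `instrument`, `score`, `id`) are invented for illustration; the real API schema is not shown here.

```python
import xml.etree.ElementTree as ET


def parse_assessments(xml_text):
    """Convert an XML payload into a list of flat dictionaries."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.findall("assessment"):
        records.append({
            "id": node.get("id"),
            "instrument": node.findtext("instrument"),
            "score": float(node.findtext("score")),
        })
    return records


# Hypothetical payload from a partner feed.
payload = """
<assessments>
  <assessment id="a1">
    <instrument>leadership</instrument>
    <score>4.5</score>
  </assessment>
</assessments>
"""
print(parse_assessments(payload))
```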

Data Warehouse

Data Warehouse initiative to integrate 24 databases from different platforms (SQL, Oracle, SAP, DynamoDB) and consolidate 30 years of intellectual property. Used SSIS packages for ETL, then built custom RESTful APIs for importing XML and JSON formatted data. Used SSRS for reporting to a SharePoint repository. Ingested data from acquired companies and third-party companies. Provided an interface for Data Scientists to run analytics and present data science findings to stakeholders. Used TFS for Agile issue tracking and user stories to manage remote teams of Business Analysts, Data Scientists, Database Developers, Architects, IP Subject Matter Experts and Executive Stakeholders. 2014-2015

Master Data Management (MDM)

Data cleansing and data integration for client company data spanning several systems, including SAP, Salesforce and a custom legacy CRM, used for business development, account management and invoicing. This data drives a hierarchy, and incorrect data breaks the hierarchy. Evaluation of data entry points revealed that user workarounds were creating much of the bad data by masking certain critical fields at the preliminary business development stage. Revised application workflows to require users to select options from the SAP source-of-truth database or request creation of a new company. 2015
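The revised workflow rule can be pictured with the sketch below: a typed company name either resolves to a record in the source of truth or becomes a new-company request, so free-text workarounds never reach the hierarchy. The function name, status strings and normalization rule are hypothetical.

```python
def resolve_company(name, master_records):
    """Match a typed name to the master list or raise a creation request.

    Returns ("matched", canonical_name) or ("new_company_request", name).
    """
    def normalize(text):
        # Assumed rule: ignore case and collapse repeated whitespace.
        return " ".join(text.lower().split())

    index = {normalize(company): company for company in master_records}
    key = normalize(name)
    if key in index:
        return ("matched", index[key])
    return ("new_company_request", name.strip())


# Hypothetical master list standing in for the SAP source of truth.
master = ["Acme Corp", "Initech"]
print(resolve_company("  acme   corp ", master))
print(resolve_company("Globex", master))
```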

Cloud Apps for iOS and Web

This cloud-based web application uses an AWS S3 and DynamoDB backend. Data feeds into a data warehouse in real time using custom APIs. The iPad app (iOS front end, AWS S3 and CloudFront backend) delivers product demos for business development in offline mode. Originally a single-page React web application, it was wrapped in a native application to create a version that can be used on an iPad for offline demonstrations, with additional integrated web content and PDFs. Includes executive assessments for candidates and a job profile tool for the client HR team. Candidate reports are generated and matched against best-in-class profiles. Korn Ferry Four Dimensions, 2014-2015

Systems Integration HR and CRM

Systems integration for HR systems and employee portals to provide and track corporate compliance programs at the individual employee level, covering SAP, SQL, Oracle, SharePoint and Microsoft GAL. Systems integration for intellectual property databases including SQL, DynamoDB, Oracle and SAP. Systems integration for CRMs including Salesforce and custom-built CRMs. 2013-2014

Salesforce Integration

Ported an existing mobile dashboard and reporting application for Account Management and CRM to the Salesforce.com mobile platform. Defined the requirements for the mobile application. Managed a phased rollout to the user population, beginning with a subset of features and a small group of power users to gain momentum and build awareness. Trained and prepared the Help Desk to seamlessly support users learning the new platform. 2015
