Clement Gendler

New York, NY · clementgendler98@gmail.com

Hello! I take on challenges to meet business needs with an insightful, data-driven approach. With experience in Data Analytics, Data Science, Healthcare, and AdTech, I'm an innovator who creatively solves complex problems to deliver value for clients. Beyond working with big data, my passions include creative writing, exploring NYC, and listening to classic rock.

Projects

Chess Outcome Prediction


I examine a data set of 20,000 chess games and predict each game's outcome with a neural network, using only the moves played. The moves come in PGN format, so they need preparation before modeling: the data cleaning and preprocessing sections split each game's moves into lists, stack those lists into an array, and tokenize every unique move. Exploratory data analysis builds a deeper understanding of the data and surfaces concepts worth including in the model; examples include the distribution of game outcomes, a histogram of average ratings, and the most popular chess openings among all players and among highly skilled players. I used LSTM and GRU neural networks to predict the outcome, reaching 88% accuracy with my best model.
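
The move-preparation step described above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the sample games, function names, and padding scheme are assumptions, standing in for the split-list, array, and tokenization pipeline.

```python
def tokenize_games(games):
    """Split each game's PGN-style move string into moves, build a
    move -> integer vocabulary, and encode each game as token ids."""
    split = [g.split() for g in games]
    vocab = {}
    for moves in split:
        for m in moves:
            if m not in vocab:
                vocab[m] = len(vocab) + 1  # reserve 0 for padding
    encoded = [[vocab[m] for m in moves] for moves in split]
    return encoded, vocab

def pad(seqs, maxlen):
    """Left-pad each sequence with 0 so every game has equal length,
    the shape a recurrent layer (LSTM/GRU) expects."""
    return [[0] * (maxlen - len(s)) + s[-maxlen:] for s in seqs]
```

The padded integer sequences would then feed an embedding layer followed by the LSTM or GRU layers mentioned above.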

Predicting Future COVID-19 Hotspots


I worked alongside two Data Science Fellows to create a model and map of future COVID-19 hotspots. Our study combined daily data on confirmed COVID-19 cases, population, weather, and COVID-related search activity. The breadth of the data (over six months of daily observations across all 50 US states) introduced considerable complexity. After testing SARIMA and ARIMA models, we tried two more: Facebook Prophet and an IID model. Ultimately, the IID model yielded the most specific predictions, since it let us fit a separate model to each state. Our documentation includes heatmaps that visualize the differences between the models' predictions.
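
The per-state idea behind the IID model can be sketched as follows. This is a simplified, hypothetical illustration, not the project's implementation: it treats each state's recent daily new-case counts as independent draws and forecasts from their empirical mean, with the data and window size invented for the example.

```python
from statistics import mean

def fit_iid_per_state(daily_cases, window=7):
    """daily_cases: dict mapping state -> list of daily new-case counts.
    Fits an independent model per state by averaging the last `window`
    days, returning a dict of state -> next-day forecast."""
    return {state: mean(series[-window:])
            for state, series in daily_cases.items()}
```

Fitting state by state like this is what makes the predictions state-specific, rather than pooling all 50 states into one model.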

Natural Language Processing Reddit Classification


The mission of this project is to scrape 5,000 posts from the anime and manga subreddits and use two models to predict, from the title alone, which subreddit each post came from. First, a function retrieves all the data. The data set is then cleaned manually to maximize accuracy, with care taken around the wide range of post lengths in these subreddits. Exploratory data analysis and preprocessing include a WordCloud visual that highlights prevalent words attributable to one of the two subreddits, along with a plot of the top 10 words and their frequencies. A RegexpTokenizer manually tokenizes the title and post columns, and the tokens are then lemmatized to reduce inflected forms and improve accuracy further. The preprocessing and lemmatizing functions are mapped over the columns, and the tokenized data is joined back into string form. Two models were tested, a basic LogisticRegression and a basic MultinomialNB, each paired with a CountVectorizer to further clean and process the data. To my surprise, LogisticRegression proved more accurate, which I explain later in the project. I close with conclusions and recommendations based on my findings.
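
The tokenize, lemmatize, and rejoin step described above can be sketched with the standard library alone. This is an illustrative stand-in, not the project's code: the regex approximates a RegexpTokenizer pattern, and the tiny lemma map stands in for a real lemmatizer.

```python
import re

# Toy lemma map standing in for a real lemmatizer (e.g. WordNet's).
LEMMAS = {"watching": "watch", "watched": "watch", "reading": "read"}

def preprocess(title):
    """Tokenize a post title with a regex, lemmatize each token,
    then rejoin into a string ready for a CountVectorizer."""
    tokens = re.findall(r"[a-z']+", title.lower())  # keep word-like runs only
    tokens = [LEMMAS.get(t, t) for t in tokens]     # crude lemmatization
    return " ".join(tokens)                         # back to string form
```

Rejoining into strings matters because CountVectorizer expects raw text, not token lists.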


Experience

Data Analyst

DeepIntent

  • Built SQL queries in Snowflake, Google BigQuery, and Spark to combine various data sets for ad hoc and custom analyses.
  • Used Looker to create dashboards and models in LookML, automating reporting and increasing reports sent by 500%.
  • Provided data-driven reports across DeepIntent's product suite and client base, generating $1 million in incremental spend.
  • Audited and identified improvement opportunities in our Patient Modeled Audiences, increasing the overall reach KPI by 10%.
  • Ran A/B tests to measure marketing technique efficacy using Python, SQL, and Excel (PivotTables, VLOOKUP, etc.).
  • Developed innovative methodologies to derive, evaluate, and present performance benchmarks to clients.
  • Troubleshot data mining, integration, and quality issues while owning campaign-related systems and databases.
  • Created and maintained thorough documentation as the point of contact for dashboard report management.

November 2021 - September 2023

Marketing Data Analyst, Crossix Analytics

Veeva Systems

  • Produced and managed the DIFA HCP Site, a platform for informing, measuring, and optimizing healthcare marketing.
  • Used SQL, Python, Excel, Spark, and data visualization tools to produce monthly deliverables tracking web analytics.
  • Analyzed root causes of data anomalies while monitoring and troubleshooting analysis execution.
  • Automated, planned, and improved internal processes in collaboration with the Engineering team.
  • Handled client inquiries about data and DIFA methodology, providing insight to the Client Services team.

January 2021 - November 2021

Data Science Fellow

General Assembly

  • Completed a 480-hour immersive course in Python programming and data science.
  • Solved real-world problems by applying data collection, cleaning, visualization, analysis, modeling, machine learning, and deep learning techniques to large data sets.
  • Predicted a chess game's outcome using only the moves played, with LSTM and GRU neural networks.
  • Analyzed and classified 5,000 Reddit posts using Natural Language Processing.
  • Forecasted future COVID-19 hotspots with Prophet and IID models in collaboration with two Data Science Fellows.

August 2020 - November 2020

Marketing Analyst Intern

Pink Sky LLC - Boutique Marketing Agency

  • Analyzed qualitative research data for a mattress and personal care company via Excel.
  • Developed a summary analysis for a mattress research report, providing companies with insight into consumer choice and an advantage over competitors.
  • Designed a research presentation in PowerPoint, emphasizing visualized summary analysis with a diverse variety of charts to illustrate findings.

July 2019 - August 2019

Summer Analyst

Weild & Co. - Investment Bank

  • Assisted in all areas of the company's services, including strategic advisory and investor relations.
  • Tracked FINRA compliance and placement agent growth to promote business automation.
  • Analyzed, selected, and networked with over 500 FinTech funds to support raising capital.
  • Contributed to weekly executive team meetings, promoting intercompany collaboration.

June 2018 - August 2018

Education

General Assembly

Certificate of Completion
Data Science Immersive

480-hour immersive course in Python programming and data science.

August 2020 - November 2020

New York University

Bachelor of Arts
Majored in Economics
August 2016 - May 2020

General Assembly

Certificate of Completion
Python Programming

Mastered the fundamentals of Python in an intensive 40-hour course.

August 2019 - August 2019

NYU Buenos Aires

Semester Abroad Studies
Buenos Aires, Argentina

Completed a semester abroad through NYU.

August 2017 - December 2017

Skills

Technical
  • Python
  • SQL
  • Data Visualization (Matplotlib, Plotly, Seaborn, Looker, Tableau)
  • Google BigQuery
  • Pandas
  • Snowflake
  • Numpy
  • Scikit-Learn
  • Keras/TensorFlow
  • Natural Language Processing (NLP)
  • Spark
  • HTML/CSS
  • JavaScript
  • Java
  • R

Additional
  • Spanish speaker (professional working proficiency)
  • Microsoft Office (Excel, PowerPoint, Word)

Interests

When I'm not working, I enjoy a wide variety of hobbies. I listen to jazz and classic rock, relax by cooking, baking, or mixing cocktails, and love outdoor activities such as hiking and exploring new areas!

Recently, I've become fascinated with traveling and trying foods from other cultures. During my semester abroad in Buenos Aires, Argentina, I immersed myself as much as possible; trips within Argentina to Ushuaia and Iguazu Falls were incredible experiences, among the most memorable I've ever had.