Courses We Offer
Contact us to learn more about training with Enplus. We're available for small group and company trainings, either in-person or online.
Foundations of Python and Pandas
Pandas is one of the most powerful and popular libraries for working with tabular data in Python. In this course, you'll learn how to use the two most important data structures in Pandas, the Series and the DataFrame, and understand how to avoid common Pandas missteps.
Advanced Python and Pandas
Building on the material in "Foundations of Python and Pandas", this course explores the more advanced features of Pandas. These include working with time series data, performing joins, and reshaping data between wide and long formats.
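To give a flavor of the reshaping and join topics, here is a minimal sketch using pandas. The table contents and the column names (`city`, `year`, `sales`, `region`) are made up for illustration and are not taken from the course material.

```python
import pandas as pd

# Hypothetical wide-format table: one row per city, one column per year.
wide = pd.DataFrame(
    {"city": ["Austin", "Boston"], "2019": [10, 20], "2020": [12, 25]}
)

# Wide -> long: each (city, year) pair becomes its own row.
long = wide.melt(id_vars="city", var_name="year", value_name="sales")

# Long -> wide again: years move back into columns.
back = long.pivot(index="city", columns="year", values="sales")

# A left join against a second (hypothetical) lookup table.
regions = pd.DataFrame({"city": ["Austin", "Boston"], "region": ["South", "Northeast"]})
joined = long.merge(regions, on="city", how="left")
```

`melt` and `pivot` are inverses of each other here, which makes them a convenient pair for moving between a layout that is easy to read and one that is easy to model.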
Introduction to R
R is a free and open-source programming language and environment designed explicitly for statistics and data analysis. In this course, you'll learn the most important data structures in R and how to use these to run statistical procedures and generate publication-quality graphics.
Advanced Topics in R
Namespaces, closures, and lazy evaluation provide powerful capabilities to R but may not be familiar to data scientists coming from more imperative languages like Python or MATLAB. In this course, you'll learn how to use these advanced features of R to simplify your own code and also learn how to debug errors caused by their unintentional use.
Building Packages in R
Reuse and share your code with your team or the world. In this course, you’ll learn how to create R packages, the best way to distribute R code with dependency management, tests, and documentation.
Exploratory Data Analysis and Feature Engineering
What’s the best way to dive into a new dataset? Understanding a new dataset often begins with exploratory data analysis.
Dashboarding and Data Visualization (Python/R)
Python boasts a wide range of powerful data visualization libraries, from ones designed primarily for static plots, like matplotlib and seaborn, to interactive display tools like bokeh, altair, and plotly. In this course, you’ll learn how to work with matplotlib, seaborn, and holoviews, a wrapper on top of bokeh. We’ll also learn how to put these together into lightweight dashboards.
Modeling and Analytics
Introduction to Machine Learning
Methods like gradient boosted trees, random forests, and nearest neighbors search can achieve impressive results on complex datasets with minimal feature engineering. In this course, you’ll learn the theory behind these methods, when to use them, and some of their drawbacks.
Ensemble Methods (Bagging / Boosting, Random Forests)
Two of the most successful modern machine learning methods, random forests and gradient boosted trees, are called ensemble methods because they combine predictions from multiple simpler learners. In this course, we explore popular implementations of both these methods using sklearn and xgboost.
Introduction to Natural Language Processing (NLP)
Despite the proliferation of images and video, most information on the Internet is text-based. In this course, we'll learn the basics of NLP, starting with parsing text and processing it into a more standard format, then converting words into different feature vectors we can use to train machine learning models.
Which version of a web page converts better? To answer this question, you often need to run a randomized experiment. In this lesson, you’ll learn how to determine the size of your experimental group and analyze the results of the experiment.
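As a rough illustration of sizing an experimental group, the sketch below applies the common normal-approximation formula for comparing two conversion rates. The function name, the default significance level and power, and the example rates are all assumptions chosen for illustration; the lesson itself may use a different method.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.8):
    """Approximate users needed in EACH group for a two-sided test
    comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_control - p_variant
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical rates: 10% baseline conversion, hoping to detect a lift to 12%.
n = sample_size_per_group(0.10, 0.12)
```

Note how quickly the required size grows as the expected difference shrinks: the effect appears squared in the denominator, so halving the detectable lift roughly quadruples the group size.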
Supervised and Unsupervised Learning
In practice, many machine learning problems lack labeled data against which a model may be trained. In this course, you’ll learn how to apply unsupervised learning techniques to problems without labels.
Introduction to Data Engineering
Introduction to Cloud Computing on AWS
Amazon Web Services (AWS) is the most popular, and arguably the original, cloud provider. In this course, you'll learn how to set up your own virtual private cloud, deploy virtual machines, and securely connect to them. We'll also cover basic cloud automation using Terraform to automate the process of creating and terminating instances.
Relational Database Design and Development
Relational databases solve the problem of persistently storing data, handling concurrent access, and providing a query language for data retrieval and analysis. In this course, you'll learn how to design your own relational schemas for data science and engineering work and explore how different kinds of relational databases support transactional and analytic workloads.
Introduction to SQL
If there is a universal language of working with data, it's probably SQL. In this course, you'll learn to query existing databases using SQL and see how the declarative nature of the SQL language allows you to describe the problem to solve without having to worry about the underlying implementation details.
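A small sketch of what "declarative" means in practice, using Python's built-in sqlite3 module with an in-memory database. The `orders` table and its rows are invented for illustration.

```python
import sqlite3

# In-memory database with a hypothetical orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ana", 20.0), ("ben", 5.0), ("ana", 15.0)],
)

# The query states WHAT we want (total spend per customer, largest first);
# the database engine decides HOW to scan, group, and sort.
rows = conn.execute(
    "SELECT customer, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY total DESC"
).fetchall()
conn.close()
```

Nothing in the query mentions loops, indexes, or sort algorithms, which is the point: the same statement would work unchanged if the table grew or the engine chose a different execution plan.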
Dask and Python
In this course, you'll learn how to use Dask, a Python library for parallel and distributed computing, to scale compute and memory across multiple cores. Dask provides integrations with Python libraries like pandas, numpy, and scikit-learn so you can scale your computations without having to learn completely new libraries or significantly refactor your code.
Testing in Python
Automated testing techniques verify that your code works as you expect it to. In this course, we'll learn how to use pytest, one of the most popular frameworks for testing Python code. We'll also discuss testing data science and engineering code.
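For a sense of what a pytest-style test looks like, here is a minimal sketch. The helper `normalize_column_name` is a hypothetical function written for this example, not part of any library; pytest collects functions whose names start with `test_` and reports any failing `assert`.

```python
def normalize_column_name(name: str) -> str:
    """Hypothetical data-cleaning helper: trim, lowercase, and snake_case."""
    return name.strip().lower().replace(" ", "_")


def test_strips_surrounding_whitespace():
    assert normalize_column_name("  Price ") == "price"


def test_replaces_inner_spaces_with_underscores():
    assert normalize_column_name("Unit Price") == "unit_price"
```

Saved in a file such as `test_cleaning.py` (a made-up name), these tests would run with the `pytest` command from the same directory.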
Version Control with Git
Modern, high-performing teams track their own work and collaborate using version control systems like git. In this course, you’ll learn to use git on your local computer and with cloud services, and see how it compares to other version control systems.
API Development with FastAPI
In this course, you'll learn how to use FastAPI, a modern Python library that greatly simplifies API development by using recent Python language features like type hints and asyncio. We'll learn how to connect arbitrary machine learning models to your API and review cases where more specialized tools for model serving should be used.
Resources to Get You Started
Not ready for a formal session? Check out one of our upcoming live trainings or recorded video lessons on the O'Reilly platform.
Upcoming Live Trainings:
- April 28, 2020 - Programming with Data: Foundations of Python and Pandas
- May 12, 2020 - Programming with Data: Advanced Python and Pandas