Data Lab Resources
- Data Lab Python Exchange – A wiki-style question and answer platform for the Python community at Tufts University.
- Live Workshops – The Data Lab offers a wide range of live workshops, including several Python-related offerings.
Tufts University Courses
These courses are known to either teach or use Python.
Please see departmental listings for details and SIS for current or future offerings.
- COMP-0010: Computer Science for All
- COMP-0205: Principles of Data Science in Python
- DATA-0201A: Introduction to Python for Data Analysis
- DATA-0201B: Python for Machine Learning
- MATH-0010: Coding Bootcamp
- PSY-0110: Computer Programming for Psychology
- UEP-0293: Urban Analytics and Visualization
- UEP-0294: Data Science for Urban Sustainability
The following resources are not affiliated with or endorsed by Tufts University.
- Comprehensive Guides
- Environment Setup and Management
Comprehensive Guides 🔝
Self-Guided Tutorials 🔝
- Kaggle Micro-Courses – An extensive set of self-guided courses designed to teach you everything you need to know about Python for data science, from basic Python commands to advanced machine learning techniques. The courses include programming exercises with hints, answers, and validation.
- The Python Tutorial – The official Python tutorial. Designed to give you a thorough overview of the built-in capabilities of Python.
- Google’s Python Class – A set of written materials, lecture videos, and numerous coding exercises designed to introduce Python to people with some previous coding experience.
- W3Schools Python Tutorial – While intended for people looking to use Python for web development, this extensive straight-to-the-point tutorial is suitable for everyone looking to get a thorough introduction to Python. Includes a lot of options to run and modify demo code right from your web browser.
- Software Carpentry Lessons – The Software Carpentry Foundation provides open access to the materials used to teach various Software Carpentry workshops. While these materials are intended to be covered in an instructor-led workshop, they are also quite useful for independent learning.
Free Online Books 🔝
- A Whirlwind Tour of Python – A fast-paced introduction to essential features of Python, aimed at researchers and developers with some previous programming experience, intending to use Python for data science and/or scientific programming. Serves as an introduction to the Python Data Science Handbook.
- Python Data Science Handbook – A deep dive into using Python for data science that covers everything from data manipulation and visualization to machine learning. Serves as a sequel to A Whirlwind Tour of Python and requires a solid understanding of Python basics.
- Automate the Boring Stuff – A very practical introduction to Python intended for those who wish to make their computer do things for them. Covers all basic Python features, but focuses heavily on string manipulation, working with files, and automating tedious daily tasks.
- A Byte of Python – A brief beginner’s guide to Python that provides an introduction to the most basic built-in functionalities of Python.
- Python for You and Me – An extensive introduction to Python intended for programmers new to the language with a focus on object-oriented programming and application development.
LinkedIn Learning 🔝
Tufts provides all students, faculty, and staff with free access to LinkedIn Learning, an online learning platform with on-demand video-based courses. The platform contains over 100 courses related to Python, only some of which are outlined here. Browse the full course listing to find courses that focus on the aspects of Python programming that interest you most. To gain free access, log in with your Tufts credentials, after which you will have the option to link your account with your LinkedIn profile.
- Python Essential Training – A very thorough introduction to programming in Python that covers all the basics and even introduces some more advanced concepts related to object-oriented programming and application development.
- Python Quick Start – A brief 90-minute overview of the most essential Python programming fundamentals.
- Python for Students – A quick hour-long introduction to Python intended for high-school students and college undergraduates.
- Python for Data Science Essential Training – An extensive two-part course that covers the most important aspects of using Python for data science. The first part focuses on data analysis and visualization while the second part serves as an introduction to machine learning. Note that this course does not cover the basics of Python programming, so some previous familiarity with Python is highly recommended.
- Data Science Foundations: Python Scientific Stack – A hands-on crash course that introduces Anaconda, Project Jupyter, NumPy, Pandas, Matplotlib, Scikit-Learn, and many other Python libraries essential to data science. The capabilities of Python as a data science tool are demonstrated by analyzing real datasets. A good understanding of Python basics is a prerequisite.
- Using Python for Automation – A quick hour-long introduction to using Python for automation. Requires some knowledge of basic Python programming. Focuses on reading and writing files, web scraping, and using APIs (application programming interfaces).
Environment Setup and Management 🔝
This section covers the setup and management of your Python computing environment, including initial installation and package management using virtual environments.
Command-Line Syntax 🔝
Use of various command-line interfaces is often required for installing Python packages, managing virtual environments, and troubleshooting common issues. Hence, it is highly recommended you familiarize yourself with the basic command-line syntax specific to your system.
- Windows Command Prompt in 15 Minutes – Although originally written as a quick introduction to Windows command-line syntax for a Java course at Princeton University, this resource is useful for anyone looking to get started with the Windows Command Prompt. Ignore the Java-specific aspects where needed.
- Windows Command Line Cheat Sheet [PDF] – A comprehensive pocket reference guide from the SANS Institute that covers the most useful Windows commands.
- Command Prompt Cheat Sheet [PDF] – A very compact cheat sheet covering the basics of Windows Command Prompt syntax.
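Before troubleshooting at the command line, it can help to know which Python-related commands your system can actually find. The sketch below is only an illustration: the helper name `find_commands` and the list of command names are assumptions made for this example, and the paths it reports will differ from one computer to another.

```python
import shutil


def find_commands(names=("python", "pip", "conda")):
    """Map each command name to its full path, or None if it is not on PATH.

    The default names are illustrative; pass any commands you want to check.
    """
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_commands().items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

A command reported as not found is not necessarily missing from your computer. It may simply be installed in a folder that your command-line interface does not search.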
Downloading and Installing Python 🔝
- Anaconda Distribution – If you are planning on using Python for data science, you should definitely install the Anaconda Distribution. It is a comprehensive data science toolkit that includes Python, Project Jupyter, a graphical user interface (GUI), and an easy-to-use package manager. Furthermore, it comes with most data-science-related Python packages preinstalled and supports virtual environments.
- Thonny and Python Bundle – If you are planning on using Python for programming and application development, this bundle might be the best starting point for you. It includes a self-contained Python distribution and an easy-to-use IDE (integrated development environment) called Thonny that is specifically designed for beginners.
- Official “Vanilla” Python – In the world of data science, the official Python distribution is often referred to as vanilla Python. It is just Python without the additional features that data scientists and researchers use most often and that are included by default in Anaconda. Hence, it is not recommended if you are looking to use Python for data science. However, for up-and-coming programmers and application developers, it might be of interest. Because many computer applications rely on Python in the background, you might already have vanilla Python installed on your computer. In that case, it is advisable not to use that installation for your projects, as modifying it might cause errors in other applications on your system.
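If you are unsure which Python installation you are working with, for example a preexisting system Python versus a new Anaconda installation, you can ask the interpreter itself. The following is a minimal sketch; the function name `describe_interpreter` is made up for this example, and the values it prints depend entirely on your system.

```python
import platform
import sys


def describe_interpreter():
    """Return basic facts about the Python interpreter running this code."""
    return {
        "executable": sys.executable,  # full path to the running interpreter
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

The path shown for `executable` tells you where the running interpreter is installed, which is usually enough to tell a system-wide installation apart from one you installed yourself.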
Using Anaconda 🔝
- Getting Started with Anaconda – A comprehensive quickstart guide for users completely new to Anaconda.
- Getting Started with Anaconda Navigator – User guide for the graphical user interface (GUI) included in Anaconda.
- Anaconda Starter Guide [PDF] – A brief brochure to help you get started with Anaconda.
Using Conda (Advanced) 🔝
Conda is the package manager included in Anaconda. You can access most of Conda's functionality through Anaconda Navigator. However, advanced users might prefer to use Conda directly from the command line.
- Getting Started with Conda – A comprehensive quickstart guide to get you comfortable with Conda.
- Conda Cheat Sheet [PDF] – A brief cheat sheet of the most essential Conda commands.
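When working with several environments, it is easy to lose track of which one is active. The sketch below shows one way to check from within Python. It assumes that a Conda environment keeps a `conda-meta` folder at its root and that activating an environment sets the `CONDA_DEFAULT_ENV` variable; the function names are hypothetical helpers written for this example, not part of Conda.

```python
import os
import sys
from pathlib import Path


def in_conda_env():
    """True if the interpreter's install prefix looks like a Conda environment."""
    return (Path(sys.prefix) / "conda-meta").is_dir()


def conda_env_name():
    """Name of the activated Conda environment, or None if none is recorded."""
    return os.environ.get("CONDA_DEFAULT_ENV")


def in_virtualenv():
    """True if running inside a virtual environment created with venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Conda environment:", in_conda_env())
    print("Conda environment name:", conda_env_name())
    print("venv environment:", in_virtualenv())
```

These checks are heuristics rather than guarantees. For instance, the environment variable reflects what was activated in your command-line session, which may not match the interpreter actually running the script.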
Installing Packages 🔝
There are two major Python package managers available – Pip and Conda. Pip is the default Python package manager available for all Python distributions. Conda is a more advanced package manager included in the Anaconda Distribution. Whenever possible, novice users are advised to use Conda because it has several advantages over Pip. Primarily, Conda automatically checks for and attempts to resolve dependency issues. Furthermore, Conda packages are distributed as prebuilt binaries, so they do not require compilation from source code.
- Understanding Conda and Pip – A must-read explaining the core differences between Conda and Pip.
- Managing Packages with Anaconda Navigator – Recommended for novice users just starting up with Python and Anaconda.
- Installing Packages with Conda – Recommended for intermediate users with some command-line experience.
- Installing Packages with Pip – Not recommended, but required for some packages and for those not using the Anaconda Distribution.
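After installing packages with either manager, you may want to confirm what is actually available in your current environment. Here is a minimal sketch that uses the standard library's `importlib.metadata` module (Python 3.8 or newer). The helper name `installed_version` and the package names in the loop are illustrative assumptions, and the check relies on packages shipping standard metadata, which Pip-installed packages do and Conda-installed Python packages generally do as well.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not found."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Example package names only; replace them with the ones you need.
    for name in ("numpy", "pandas", "matplotlib"):
        version = installed_version(name)
        print(f"{name}: {version if version else 'not installed'}")
```

Run this with the same interpreter you use for your projects. A package installed in one environment will not show up when the script is run from another.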