David L. answered 10/26/22
Expert, Easy-to-Understand Python Tutoring (No Pandas or Data Science)
It seems many teachers do not tell their students about the Python big picture. So I will. Python is a general-purpose programming language and one of the most widely used, alongside C++, Java, and SQL.
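As with many other languages, there are a number of different ways to install Python. They are commonly called "distributions" (strictly speaking, CPython and PyPy are implementations of the language, while Anaconda is a distribution built on CPython). Some of them are: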
- CPython (the most widely used distribution)
- PyPy (an alternative with a just-in-time compiler, aimed at faster execution), and
- Anaconda (used for Data Science and related topics).
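If you are not sure which of these you are running, Python can tell you. The following is a minimal sketch using only the standard library; the function name describe_interpreter is my own, chosen for illustration.

```python
import platform
import sys


def describe_interpreter() -> str:
    """Return the implementation name and version of the running Python."""
    implementation = platform.python_implementation()  # e.g. "CPython" or "PyPy"
    version = platform.python_version()                # e.g. "3.11.4"
    return f"{implementation} {version}"


print(describe_interpreter())
# The path of the interpreter often hints at how Python was installed.
print(sys.executable)
```

Note that an Anaconda installation will typically report CPython here, because Anaconda bundles the CPython interpreter together with extra packages.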
A programming language has a core language, on top of which you can add utilities that are written in that language. These utilities come in packages, also called libraries. In Python, a package is made up of a number of units, called modules. Some of the most basic Python modules are:
- math, which contains common mathematical functions such as the square root, the natural logarithm, and trigonometric functions.
- csv, which contains tools for reading and writing text files in the comma-separated-value (CSV) format.
- re, for using regular expressions.
- os.path, for common file and directory path manipulations.
These 4 modules are among the roughly 200 modules (the Python Standard Library) that are installed with the CPython distribution. I am familiar with about two to three dozen of these modules.
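To make this concrete, here is a short sketch that uses each of the four modules mentioned above. The sample text, file names, and variable names are made up for illustration; nothing needs to be installed beyond Python itself.

```python
import csv
import io
import math
import os.path
import re

# math: common mathematical functions
root = math.sqrt(16)          # square root
natural_log = math.log(10)    # natural logarithm
sine = math.sin(math.pi / 2)  # trigonometric function (argument in radians)

# csv: parse comma-separated text (an in-memory string stands in for a file)
sample = "name,score\nAda,90\nLin,85\n"
rows = list(csv.reader(io.StringIO(sample)))

# re: find every run of digits in a string
numbers = re.findall(r"\d+", "3 apples and 12 pears")

# os.path: build and take apart a file path
path = os.path.join("data", "scores.csv")
filename = os.path.basename(path)
stem, extension = os.path.splitext(filename)

print(root, natural_log, sine)
print(rows)
print(numbers)
print(filename, stem, extension)
```

Every import in this example comes from the Python Standard Library, which is why it runs on a fresh CPython installation.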
The pandas, sklearn (installed under the name scikit-learn), numpy, and matplotlib packages are commonly used in data science and related topics. These packages are installed with the Anaconda distribution. They are not installed as part of CPython, but they can be added to a CPython installation using pip (the CPython counterpart of Anaconda's conda).
FYI, pip can install any Python package available at https://pypi.org, the "Python Package Index" (PyPI). At the time of writing, there are 409,781 packages available there!
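If you want to check whether a given package is already available in your installation, one possible approach is sketched below. The helper name is_installed is hypothetical; importlib.util.find_spec is part of the standard library and looks a module up without actually importing it.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


# "math" ships with Python; the others must be installed separately,
# for example with:  python -m pip install numpy
for name in ("math", "numpy", "pandas", "sklearn", "matplotlib"):
    status = "available" if is_installed(name) else "not installed"
    print(f"{name}: {status}")
```

On a plain CPython installation you would expect only math to show as available, while an Anaconda installation would normally report all five.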
General-purpose programming languages typically have many thousands of popular libraries based upon them. It is not possible for a single person to know how to use more than a tiny fraction of those thousands of libraries.
I teach basic Python, and I use only CPython. I do not know data science or the packages typically used for it (pandas, sklearn, numpy, and matplotlib). If you want help with any of those, I recommend finding a tutor in the Data Science topic, since they will be familiar with these packages.