Machine Learning X
A free, open-source Machine Learning & Data Science curriculum that’s currently under heavy development. This is a project by intechgration.
Make sure to install the following software on your machine:
Here are some of the most commonly used data formats. They are used to save information in a way that is readable both by humans and machines. The structured data stored in these file formats can be exchanged between systems and processed by programs.
CSV (Comma-Separated Values) is a simple file format used for storing and exchanging structured data, where each line represents a record or entry, and fields or columns within each record are separated by commas.
Understanding CSV Files (⏱️ 6min)
file mentioned in the video here.
In short, CSV is a lightweight data format, where:
delimiter character
All spreadsheet apps (MS Excel, Google Sheets, Numbers, etc.) can read and write CSV.
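As a rough illustration of the CSV ideas above (one record per line, fields split by a delimiter), here is a minimal sketch using Python's standard `csv` module. The sample data and the names `csv_text`, `rows`, and `semicolon_text` are made up for this example; a real file would normally be opened with `open("some_file.csv", newline="")` instead of an in-memory string.

```python
import csv
import io

# Hypothetical sample data: a header line followed by two records.
csv_text = "name,age,city\nAda,36,London\nGrace,45,New York\n"

# DictReader treats the first line as the header and yields one dict per record.
rows = list(csv.DictReader(io.StringIO(csv_text)))
for row in rows:
    # Every value is read as a string; convert explicitly where needed.
    print(row["name"], int(row["age"]), row["city"])

# The delimiter does not have to be a comma. Here the same kind of data
# is written back out using a semicolon instead.
out = io.StringIO()
writer = csv.writer(out, delimiter=";")
writer.writerow(["name", "age"])
writer.writerow(["Ada", 36])
semicolon_text = out.getvalue()
```

Note that the `csv` module does not infer types: numbers come back as text, which is one reason data-science libraries usually add their own CSV readers on top.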
Watch The Basics of YAML in Under 5 Minutes (⏱️ 4min)
Feel free to explore and become familiar with other related data formats and their variations, such as TSV (Tab Separated Values), TOML (Tom’s Obvious, Minimal Language) and others.
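To make the TSV variation concrete: it can be treated as the same idea as CSV with a tab as the delimiter. The sketch below assumes that simple view and uses invented sample data (`tsv_text`); it relies only on Python's standard `csv` module.

```python
import csv
import io

# Hypothetical sample data: fields are separated by tab characters.
tsv_text = "name\tage\nAda\t36\n"

# The same reader used for CSV works once the delimiter is set to a tab.
tsv_rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
print(tsv_rows)
```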
SQL is one of the most widely used languages in Data Science.
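For a first taste of what an SQL query looks like, here is a small sketch that runs one from Python using the built-in `sqlite3` module and an in-memory database. The table `people`, its columns, and the sample rows are invented for this example and do not come from the courses listed here.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create a hypothetical table and insert a few sample rows.
conn.execute("CREATE TABLE people (name TEXT, age INTEGER, city TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [("Ada", 36, "London"), ("Grace", 45, "New York"), ("Alan", 41, "London")],
)

# Count the people in each city and compute their average age.
query = """
    SELECT city, COUNT(*) AS n, AVG(age) AS avg_age
    FROM people
    GROUP BY city
    ORDER BY city
"""
result = conn.execute(query).fetchall()
conn.close()

for city, n, avg_age in result:
    print(city, n, avg_age)
```

The `?` placeholders let the database driver insert the values safely, which is generally preferable to building SQL strings by hand.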
Watch and practice with Spreadsheets & SQL for Beginners (⏱️ 21min)
Take the Database Murder Mystery Challenge
Watch the Harvard CS50 Introduction to Databases with SQL course (⏱️ 11h):
“Python is the primary high-level language for Machine Learning and Data Science.”
You can practice Python online (without the need to install anything on your machine) through the PythonFiddle website.
Extra Resources:
Pick one of the following courses for a primer on Statistics (or watch both of them for an even better understanding of the topic):