We recently realized that we hadn’t brought you any data science cheatsheets in a while. And it’s not for lack of availability; data science cheatsheets are everywhere, ranging from the introductory to the advanced, covering topics from algorithms, to statistics, to interview tips, and beyond.
But what makes a good cheatsheet? What makes a cheatsheet worthy of being singled out as a good one? It’s difficult to put your finger on precisely what makes a good cheatsheet, but one that conveys essential information concisely, whether that information is of a specific or general nature, is definitely a good start. And that is what makes our candidates today noteworthy. So read on for four curated, complementary cheatsheets to assist you in your data science learning or review.
First up is Aaron Wang’s Data Science Cheatsheet 2.0, a four-page compilation of statistical abstractions, fundamental machine learning algorithms, and deep learning topics and concepts. It is not meant to be exhaustive, but instead a quick reference for situations such as interview preparation and exam reviews, and anything else requiring a similar level of review depth. The author notes that while those with a basic understanding of statistics and linear algebra would find this resource of most benefit, beginners should be able to glean useful information from its content as well.
Screenshot from Aaron Wang’s Data Science Cheatsheet 2.0
Our next cheatsheet offering today is the one Aaron Wang’s resource is based on, Maverick Lin’s Data Science Cheatsheet (Wang’s reference to his own as 2.0 is a direct nod to Lin’s “original”). We can think of Lin’s cheatsheet as more in-depth than Wang’s (though Wang’s decision to make his less in-depth seems intentional and a useful alternative), covering more fundamental data science concepts such as data cleaning, the idea of modeling, doing “big data” with Hadoop, SQL, and even the basics of Python.
Clearly this will appeal to those who are more firmly in the “beginner” camp, and it does a good job of whetting appetites and making readers aware of the broad field of data science and many of the varied concepts it encompasses. This is definitely another solid resource, especially if the reader is a newcomer to data science.
Screenshot from Maverick Lin’s Data Science Cheatsheet
As we move further back in time, in search of the inspiration for Lin’s cheatsheet, we come across William Chen’s Probability Cheatsheet 2.0. Chen’s cheatsheet has garnered much attention and praise over the years, and so you may have come across it at some point. Clearly with a different focus (given its name), Chen’s cheatsheet is a crash course on, or deep-dive review of, probability concepts, including a variety of distributions, covariance and transformations, conditional expectation, Markov chains, various formulas of importance, and much more.
At 10 pages, you should be able to imagine the breadth of probability topics covered herein. But don’t let that deter you; Chen’s ability to boil concepts down to their essential bullet points and explain them in plain English without sacrificing essentials is noteworthy. It is also rich in explanatory visualizations, something quite useful when space is limited and the desire to be concise is strong.
Chen’s compilation is a quality one and worthy of your time. If you are a beginner or someone interested in a full review, I would suggest working in the reverse order of how these resources were presented: from Chen’s cheatsheet, to Lin’s, and finally to Wang’s, building on top of concepts as you go.
Screenshot from William Chen’s Probability Cheatsheet 2.0
One final resource I’m including here, though not technically a cheatsheet, is Rishabh Anand’s Machine Learning Bites. The site bills itself as “[a]n interview guide on common Machine Learning concepts, best practices, definitions, and theory,” and Anand has compiled a wide-ranging collection of knowledge “bites,” the usefulness of which definitely transcends the originally intended interview preparation. Topics covered within include:
- Model Scoring Metrics
- Parameter Sharing
- k-Fold Cross Validation
- Python Data Types
- Improving Model Performance
- Computer Vision Models
- Attention and its Variants
- Handling Class Imbalance
- Computer Vision Glossary
- Vanilla Backpropagation
- Regularization
- References
Screenshot from Machine Learning Bites
While machine learning “concepts, best practices, definitions, and theory” are touched on, as promised in the resource’s description of itself, these “bites” are definitely geared toward the practical, which makes the site complementary to much of the material covered in the three previously mentioned cheatsheets. If I were looking to cover all of the material in all four of the resources in this post, I would certainly look at this one after the other three.
So there you have four cheatsheets (or three cheatsheets and one cheatsheet-adjacent resource) to use for your learning or review. Hopefully something here is useful for you, and I invite anyone to share the cheatsheets they have found useful in the comments below.