See also: DS Commandline tools, Machine Learning, R, Statistics, Bookmarks on pinboard.in, Visualizations
A Subjective and Anecdotal FAQ on Becoming a Data Scientist | tdhopper.com
- Example Machine Learning Notebook.ipynb – a Jupyter notebook. See also Zeppelin, a notebook for interactive data analytics.
- Continually updated data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe), scikit-learn, Kaggle, Spark, Hadoop MapReduce, HDFS, matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Data Science for Doofuses: What Toolbox to Use | CyberSmashup: a little primer on what tools to use.
2018 BE/Bi 103 home: Data Analysis in the Biological Sciences, Caltech, Fall term, 2018. Uses Python, Jupyter, etc. Looks good!
Introduction to data analysis (stat405) by Hadley Wickham of ggplot2 fame. Fall 2012, Rice University: lecture notes, R code, data sets.
- Data Science Courses | Harvard Extension Online and On-Campus Courses
Stat645: Data visualisation, at Rice by Hadley Wickham: reading papers; code-heavy assignments and projects; heavy use of ggplot2.
hon322f: Escape from flatland, at Rice by Hadley Wickham: explore data visually; think in more than 3 dimensions; interactive graphics software.
Applied Statistical computing, at Rice by Hadley Wickham: how to deal with complex, messy, real data; use graphics to explore and understand data; gain familiarity with basic data collection, storage and manipulation; fluently reshape data into the most convenient form for analysis or reporting. Uses Excel, R and SAS.
… and a lot of short courses by Hadley Wickham
- Presenting Data and Information workshops by Tufte [PAID]
- Data Analytics for Beginners: Parts 1, 2 and 3, using the Titanic dataset from Kaggle.
- Emacs for Data Science
GraphLab: interactive Python console with pandas, NumPy and the GraphLab engine in a hosted environment. The GraphLab engine implements clustering, CV, graphical models, graph analysis, etc.
- Deedle: Exploratory data library for .NET
- Seaborn: Improved matplotlib for statistical data visualization (like ggplot2)
- Building data science teams – DJ Patil
- Some ideas on communicating risks to the general public
- My Amazon wishlist of data science books
- Tidy data (pdf): how to create tidy datasets and how to deal with untidy ones.
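
As a rough illustration of the tidy-data idea in the last item (one row per observation, one column per variable), the sketch below reshapes a small wide table into long form using only the Python standard library. The table contents, the column names and the `melt` helper are hypothetical, made up for this example; they are not taken from the paper.

```python
# Hypothetical untidy ("wide") table: one column per year.
wide_rows = [
    {"country": "A", "1999": 10, "2000": 12},
    {"country": "B", "1999": 7, "2000": 9},
]


def melt(rows, id_col, var_name, value_name):
    """Turn each non-id column into its own (id, variable, value) row."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_col:
                continue  # the id column is repeated on every output row
            tidy.append({id_col: row[id_col], var_name: key, value_name: value})
    return tidy


# Long ("tidy") form: one row per country-year observation.
tidy_rows = melt(wide_rows, id_col="country", var_name="year", value_name="cases")
for r in tidy_rows:
    print(r)
```

In the long form, "year" becomes a variable in its own right, which is usually what plotting and modelling tools expect.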
List of related things
- The Unofficial Google Data Science Blog
- correlation - What happens if the explanatory and response variables are sorted independently before regression? - Cross Validated
- Sean McClure’s answer to Why are there so many fake data scientists and machine learning engineers? - Quora
- Chris Albon - Data Science, Machine Learning, and Artificial Intelligence
- Towards Data Science
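
The Cross Validated question listed above (what happens if the explanatory and response variables are sorted independently before regression) can be tried out with a small simulation. This is only a sketch using the standard library; the sample size, the seed and the `pearson` helper are assumptions chosen for illustration and are not part of the linked discussion.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
x = [rng.gauss(0, 1) for _ in range(500)]
y = [rng.gauss(0, 1) for _ in range(500)]  # generated independently of x

r_paired = pearson(x, y)                  # original pairing kept
r_sorted = pearson(sorted(x), sorted(y))  # each variable sorted on its own

print(f"original pairing:     r = {r_paired:.3f}")
print(f"sorted independently: r = {r_sorted:.3f}")
```

Sorting each variable on its own destroys the pairing between observations and forces both sequences to increase together, so the second correlation comes out close to 1 even though the variables were generated independently.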