A weekly newsletter with the latest developments in Data Science, Machine Learning, and Artificial Intelligence.
Welcome to the 147th edition of the Sunday Briefing.
This week we remain on hiatus from blogging, but you can catch up with our recent posts. On the Visualization for Science Substack we have “3D Surface Plot: The US population distribution”. We have also recently published “Epidemic Models: the role of degree correlations” on the Graphs for Science Substack, while on Medium we have a recap of the Top 10 Books we read in 2021.
In our regularly scheduled content, we take a look at the Tweet Downloader from Twitter, an Introduction to K-Means Clustering, Python Design Patterns, and how a researcher used a 379-year-old algorithm to crack crypto keys found in the wild.
From the Ivory Tower, we consider how machine learning and phone data can improve targeting of humanitarian aid and how the dynamic importance of network nodes is poorly predicted by static structural features. We also review the Mathematics of Artificial Intelligence and the dynamics on higher-order networks.
This week’s Data Science Book highlight is “Causality” by J. Pearl. As always, you can find all the previous book recommendations on our website. In the video of the week, we have a lecture on Node centrality and ranking on networks.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, just go ahead and forward this email to them. This will help us spread the word!
The D4S Team
The latest post in the CoVID-19 series, ‘How to model the effects of vaccination’, takes a look at how simple modifications of the SIR model can help us better understand how vaccines work. As usual, all the code is available on GitHub: http://github.com/DataForScience/Epidemiology101
The latest post in the Causality series covers section ‘3.7 — Mediation’, a recipe to calculate the controlled direct effect. The code for each blog post in this series is hosted in a dedicated GitHub repository: https://github.com/DataForScience/Causality
This week’s Data Science Book is “Causality” by J. Pearl. Causal Inference is a lively and fast-developing area in Data Science that we believe has the potential to be truly revolutionary in the coming years (you can get a quick overview of the main ideas in our Causal Inference series over at Medium). Judea Pearl is one of the most prominent founding fathers of this field, which he introduces masterfully in this textbook. While the approach Pearl chooses is mathematically rigorous, his rich use of toy examples makes the key ideas and concepts easy to grasp and adapt to real-world datasets. Causal Inference is a powerful arrow in any Data Scientist’s quiver, and this is the ideal starting point if you’re interested in taking the first steps in this exciting area.
Tutorials and blog posts that came across our desk this week.
- How I Discovered Thousands of Open Databases on AWS [infosecwriteups.com]
- Introduction to K-Means Clustering [pinecone.io]
- Vectorization, dependencies and outer loop vectorization: if you can’t beat them, join them [johnysswlab.com]
- Python Design Patterns [python-patterns.guide]
- Bayes Rules! An Introduction to Applied Bayesian Modeling [bayesrulesbook.com]
- Researcher uses 379-year-old algorithm to crack crypto keys found in the wild [arstechnica.com]
- Official Tweet Downloader [developer.twitter.com]
Some of the most interesting academic papers published recently.
- Machine learning and phone data can improve targeting of humanitarian aid (Emily Aiken, Suzanne Bellue, Dean Karlan, Chris Udry, J. E. Blumenstock)
- Group interactions modulate critical mass dynamics in social convention (I. Iacopini, G. Petri, A. Baronchelli, A. Barrat)
- The effect of anti-money laundering policies: an empirical network analysis (P. Gerbrands, B. Unger, M. Getzner, J. Ferwerda)
- Reconstructing social mixing patterns via weighted contact matrices from online and representative surveys (J. Koltai, O. Vásárhelyi, G. Röst, M. Karsai)
- Dynamic importance of network nodes is poorly predicted by static structural features (C. van Elteren, R. Quax, P. Sloot)
- The Mathematics of Artificial Intelligence (G. Kutyniok)
- Dynamics on higher-order networks: A review (S. Majhi, M. Perc, D. Ghosh)
Interesting discussions, ideas or tutorials that came across our desk.
Node centrality and ranking on networks
All the videos of the week are now available in our YouTube playlist.
Opportunities to learn from us:
- Apr 20, 2022 — Natural Language Processing (NLP) for Everyone [Register]
- Apr 27, 2022 — NLP with Deep Learning for Everyone [Register]
- May 06, 2022 — Applied Probability Theory for Everyone [Register] 🆕
- May 20, 2022 — Transforming Excel Analysis into Python and pandas Data Models [Register] 🆕
Long form tutorials:
- Natural Language Processing, 5.5h, covering basic and advanced techniques using NLTK and Keras
- Time Series Analysis for Everyone, 6h, covering data pre-processing, visualization, ARIMA, ARCH and Deep Learning models
Thank you for subscribing to our weekly newsletter with a quick overview of the world of Data Science and Machine Learning. Please share with your contacts to help us grow!
Publishes on Sunday.