A weekly newsletter with the latest developments in Data Science, Machine Learning, and Artificial Intelligence.
Welcome to the March 13th edition of the Sunday Briefing.
This week we’re on hiatus from blogging, but you can catch up with our recent posts. On the Visualization for Science Substack we have “3D Surface Plot: The US population distribution”. We also recently published Epidemic Models: the role of degree correlations on the Graphs for Science Substack, while on Medium we have a recap of the Top 10 Books we read in 2021.
We’re also proud to announce two new webinars coming up in May. On May 6th, we’ll have Applied Probability Theory for Everyone and on May 20th we’ll dive into Transforming Excel Analysis into Python and pandas Data Models.
In our regularly scheduled content we take a look at Facebook Libra: the inside story of how the company’s cryptocurrency dream died, The Broken-Stick Model for Principal Component Selection, and Shopify’s Data Science & Engineering Foundations.
From the Ivory Tower we explore Covid-19 Vaccine Effectiveness against the Omicron (B.1.1.529) Variant, whether or not It is high time we let go of the Mersenne Twister, and Bandit Sampling for Multiplex Networks.
This week’s Data Science Book highlight is “Causality” by J. Pearl. As always, you can find all the previous book recommendations on our website. In the video of the week we have a tutorial on Dask.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, just go ahead and forward this email to them. This will help us spread the word!
The D4S Team
The latest post in the CoVID-19 series, ‘How to model the effects of vaccination’, takes a look at how simple modifications of the SIR model can help us better understand how vaccines work. As usual, all the code is available on GitHub: http://github.com/DataForScience/Epidemiology101
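To give a flavor of what such a modification can look like, here is a minimal sketch. It is not the code from the post or the repository: it assumes one common variant in which susceptibles are vaccinated at a constant per-capita rate `nu` and moved directly into the removed compartment, and it uses a simple Euler step. The function name, parameter names and default values are all hypothetical, chosen only for illustration.

```python
def simulate_sir_vaccination(beta=0.3, gamma=0.1, nu=0.01,
                             s0=0.99, i0=0.01, days=200, dt=0.1):
    """Euler integration of an SIR model with vaccination (illustrative sketch).

    Assumed dynamics, with S, I, R as fractions of the population:
        dS/dt = -beta*S*I - nu*S
        dI/dt =  beta*S*I - gamma*I
        dR/dt =  gamma*I  + nu*S
    Returns the final (S, I, R) and the peak infected fraction.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    peak_i = i
    steps = int(days / dt)
    for _ in range(steps):
        new_infections = beta * s * i * dt   # S -> I
        recoveries = gamma * i * dt          # I -> R
        vaccinations = nu * s * dt           # S -> R (assumed perfect vaccine)
        s -= new_infections + vaccinations
        i += new_infections - recoveries
        r += recoveries + vaccinations
        peak_i = max(peak_i, i)
    return s, i, r, peak_i


if __name__ == "__main__":
    # Compare the epidemic peak with and without vaccination.
    *_, peak_with = simulate_sir_vaccination(nu=0.01)
    *_, peak_without = simulate_sir_vaccination(nu=0.0)
    print(f"peak infected with vaccination:    {peak_with:.3f}")
    print(f"peak infected without vaccination: {peak_without:.3f}")
```

Setting `nu=0.0` recovers the plain SIR model, so the two runs can be compared directly. A more realistic version would account for imperfect vaccine efficacy and waning immunity.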
The latest post in the Causality series covers section ‘3.7 — Mediation’, a recipe to calculate the controlled direct effect. The code for each blog post in this series is hosted in a dedicated GitHub repository: https://github.com/DataForScience/Causality
This week’s Data Science Book is “Causality” by J. Pearl. Causal Inference is a lively and fast-developing area in Data Science that we believe has the potential to be truly revolutionary in the coming years (you can get a quick overview of the main ideas in our Causal Inference series over at Medium). Judea Pearl is one of the most prominent founding fathers of this field, which he introduces masterfully in this textbook. While the approach Pearl chooses is mathematically rigorous, his rich use of toy examples makes the key ideas and concepts easy to grasp and adapt to real-world datasets. Causal Inference is a powerful arrow in any Data Scientist’s quiver, and this is the ideal starting point if you’re interested in taking the first steps in this exciting area.
Tutorials and blog posts that came across our desk this week.
- Facebook Libra: the inside story of how the company’s cryptocurrency dream died [ft.com]
- Principal Component Selection: The Broken-Stick Model [mohanwugupta.com]
- Shopify’s Data Science & Engineering Foundations [shopify.engineering]
- Why Graph Computing is STELLAR [juliustech.co]
- How to use undocumented web APIs [jvns.ca]
- Damn Cool Algorithms: Levenshtein Automata [blog.notdot.net]
- 5 Python Libraries That Will Help Automate Your Life [medium.com/geekculture]
- What You Must Know about Memory, Caches, and Shared Memory [eidos.ic.i.u-tokyo.ac.jp/~tau]
Some of the most interesting academic papers published recently
- Covid-19 Vaccine Effectiveness against the Omicron (B.1.1.529) Variant (N. Andrews, J. Stowe, F. Kirsebom, S. Toffa, T. Rickeard, E. Gallagher, C. Gower, M. Kall et al)
- Homophily in Voting Behavior: Evidence from Preferential Voting (L. Coufalová, Š. Mikula, M. Ševčík)
- Modeling Communicable Diseases, Human Mobility, and Epidemics: A Review (D. Soriano-Paños, W. Cota, S. C. Ferreira, G. Ghoshal, A. Arenas, J. Gómez-Gardeñes)
- Changes in social contacts in England during the COVID-19 pandemic between March 2020 and March 2021 as measured by the CoMix survey: A repeated cross-sectional study (A. Gimma, J. D. Munday, K. L. M. Wong, P. Coletti, K. van Zandvoort, K. Prem, CMMID COVID-19 working group, P. Klepac, G. James Rubin, S. Funk, W. J. Edmunds, C. I. Jarvis)
- No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds (A. I. Humayun, R. Balestriero, A. Kyrillidis, R. Baraniuk)
- It is high time we let go of the Mersenne Twister (S. Vigna)
- Bandit Sampling for Multiplex Networks (C. Baykal, V. K. Potluru, S. Shah, M. M. Veloso)
- The mathematics of adversarial attacks in AI — Why deep learning is unstable despite the existence of stable neural networks (A. Bastounis, A. C. Hansen, V. Vlačić)
Interesting discussions, ideas or tutorials that came across our desk.
All the videos of the week are now available in our Youtube playlist.
Opportunities to learn from us:
- Apr 20, 2022 — Natural Language Processing (NLP) for Everyone [Register]
- Apr 27, 2022 — NLP with Deep Learning for Everyone [Register]
- May 06, 2022 — Applied Probability Theory for Everyone [Register] 🆕
- May 20, 2022 — Transforming Excel Analysis into Python and pandas Data Models [Register] 🆕
Long form tutorials:
- Natural Language Processing 5.5h, covering basic and advanced techniques using NLTK and Keras
- Time Series Analysis for Everyone 6h, covering data pre-processing, visualization, ARIMA, ARCH and Deep Learning models
Thank you for subscribing to our weekly newsletter with a quick overview of the world of Data Science and Machine Learning. Please share with your contacts to help us grow!
Publishes on Sunday.