Let’s Build!
Resources
Bleeding-edge resources
Tutorial on MNIST Digit Classification Using ClearML
Introduction: If you are a Data Scientist or MLOps Engineer, you have probably faced problems tracking code, data, and models for different versions of the same task while collaborating with fellow team members. To reduce the complexity revolving around...
Data Science on AWS
Data Science Book of the Week: This week’s Data Science Book is “Data Science on AWS” by Chris Fregly and Antje Barth. We’ve all had the experience of running out of memory or compute power while performing our data analysis or training our models. In this...
The Ultimate Guide To Developing A Winning Trading Strategy With NLP Sentiment Analysis
With all the hype about generative AI generated by the rapid adoption of ChatGPT, it’s no surprise that AI is starting to play a major role outside the tech sector. In order to stay ahead of the times, it’s crucial not only to understand the inner workings of...
Happy Hour Diaries: Through employees’ lens at Data Science Dojo | Data Science Dojo
Who says you cannot have a happy hour while working from home? At Data Science Dojo, we have cracked the code on how to chat, laugh, and connect – virtually! No more worrying about awkward small talk with the boss’s boss – we are all on the same virtual playing...
Multi-label NLP: An Analysis of Class Imbalance and Loss Function Approaches – KDnuggets
Multi-label NLP refers to the task of assigning multiple labels to a given text input, rather than just one label. In traditional NLP tasks, such as text classification or sentiment analysis, each input is typically assigned a single label based on its content....
Science Teacher to Data Analyst: Kevin Johnson’s Dataquest Success Story – Dataquest
Kevin Johnson is a data analyst with the SDG Group, a role he landed in 2022 after completing the Data Scientist track at Dataquest. But Johnson didn't start in data science. In fact, he started in what you might call...
A Guide to Using RAID for Database Servers – Programmathically
RAID, or Redundant Array of Independent Disks, is a technology that helps protect against the loss of data in case one disk fails by providing redundancy through multiple disks. While there are many different types of RAID configurations available, it’s...
How to Build a CI/CD MLOps Pipeline [Case Study] – neptune.ai
Based on the McKinsey survey, 56% of orgs today are using machine learning in at least one business function. It’s clear that the need for efficient and effective MLOps and CI/CD practices is becoming increasingly vital. This article is a real-life study of...
How to convert images to text? : 3 ways to extract text from images
Images are everywhere. Images are used to share memories and information across conversations. However, there are times when you might need to convert images into text. Think about your employees sending receipt images to your accounting department or your...
Level up with Databricks at Game Developers Conference
We're thrilled to announce that Databricks will be sponsoring Game Developers Conference (GDC), the world's largest professional game industry event, taking place March 20th-24th in San Francisco. At Databricks, we specialize in helping game studios build better...
Accelerate time to insight with Amazon SageMaker Data Wrangler and the power of Apache Hive | Amazon Web Services
Amazon SageMaker Data Wrangler reduces the time it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio. Data Wrangler enables you to access data from a wide variety of popular sources (Amazon S3, Amazon...
How to Overcome Position Bias in Recommendation and Search?
Introduction: People click on top items in search and recommendations more often because they are on top, not because of their relevancy. If you order your search results with an ML model, they may eventually degrade in quality because of such a positive...