1 Introduction & Basic Ideas of Language Models
Embeddings are numerical representations of machine learning features used as input to deep learning models. One-hot encoding, TF-IDF, and PCA were early ways of compressing large amounts of textual data. Word2Vec was the first step in moving beyond simple statistical representations toward capturing the semantic meaning of words. Transformers, transfer learning, generative methods, and related advances have all contributed to the explosion in the use of embeddings, establishing them as a foundational ML data structure.
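To make the contrast concrete, here is a minimal sketch in numpy; the toy vocabulary, the 4-dimensional embedding size, and the random table are all invented for illustration (in practice the table is learned):

```python
import numpy as np

# A toy vocabulary; words and sizes here are placeholders.
vocab = {"cat": 0, "dog": 1, "car": 2}

# One-hot: a sparse vector as wide as the vocabulary. Every pair of
# words is equally far apart, so there is no notion of meaning.
def one_hot(word):
    vec = np.zeros(len(vocab))
    vec[vocab[word]] = 1.0
    return vec

# Dense embedding: a lookup table mapping each word to a small dense
# vector. Trained properly (e.g. Word2Vec), similar words end up nearby.
embedding_table = np.random.randn(len(vocab), 4)  # random here, learned in practice

def embed(word):
    return embedding_table[vocab[word]]

print(one_hot("cat"))  # [1. 0. 0.] -- sparse, no similarity structure
a, b = embed("cat"), embed("dog")
# Cosine similarity is only meaningful once the table has been learned.
print(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```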
In our usual desktop, file-based development environments, accessing secret keys just means storing them in config or .txt files and reading them in directly. But a Google Colab notebook has no persistent file system of its own to refer to, unless you explicitly set one up.
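For instance, two common patterns are sketched below; the Drive path and the secret name are placeholders, and this only runs inside a Colab notebook:

```python
# Option 1: mount Google Drive so the notebook has a file system to read from.
from google.colab import drive

drive.mount('/content/drive')
# Hypothetical path -- wherever you chose to keep the key file in Drive.
with open('/content/drive/MyDrive/secrets/api_key.txt') as f:
    api_key = f.read().strip()

# Option 2: Colab's built-in Secrets panel (the key icon in the sidebar),
# which exposes stored values through google.colab.userdata.
from google.colab import userdata

api_key = userdata.get('MY_API_KEY')  # 'MY_API_KEY' is whatever name you saved
```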
First, go to the repo you're interested in. I am going to create a new repo on GitHub just to try this out.
I did not know this was possible. Most of my dev work is split between writing Python in VS Code and SQL in Snowflake’s Snowsight UI. It turns out you can connect to your Snowflake warehouse using the Snowflake Extension for VS Code.
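The extension itself is configured through the VS Code UI rather than code, but as a point of comparison, here is a minimal sketch of querying a warehouse from Python with the snowflake-connector-python package (a different route than the extension; every connection value below is a placeholder):

```python
import snowflake.connector  # pip install snowflake-connector-python

# All connection values here are placeholders for your own account details.
conn = snowflake.connector.connect(
    account="my_account",      # e.g. "abc12345.us-east-1"
    user="my_user",
    password="my_password",
    warehouse="my_warehouse",
    database="my_database",
    schema="my_schema",
)

try:
    cur = conn.cursor()
    cur.execute("SELECT CURRENT_VERSION()")  # any SQL you would run in Snowsight
    print(cur.fetchone())
finally:
    conn.close()
```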
Why track metrics?
Independent variables - the variables on which the predictions are calculated
Predictions - the results of the model
Label - the data that we’re trying to predict, such as “dog” or “cat”
Architecture - the template of the model that we’re trying to fit; the actual mathematical function that we’re passing the input data and parameters to
Weights - the parameters of the model
Model - the combination of the architecture with a particular set of parameters
Fit - update the parameters of the mod...
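The jargon maps directly onto code. Here is a minimal sketch, assuming a one-parameter linear architecture fit by gradient descent on invented data, with each term from the glossary labeled:

```python
import numpy as np

# Independent variables: the inputs the predictions are calculated from.
x = np.array([1.0, 2.0, 3.0, 4.0])
# Labels: the data we're trying to predict (here y = 2x, invented).
y = np.array([2.0, 4.0, 6.0, 8.0])

# Architecture: the mathematical function we pass inputs and parameters to.
def architecture(x, w):
    return w * x

# Weights: the parameters of the model.
w = 0.0

# Fit: update the parameters so the predictions match the labels.
lr = 0.01
for _ in range(100):
    predictions = architecture(x, w)           # Predictions: results of the model
    grad = 2 * np.mean((predictions - y) * x)  # gradient of mean squared error
    w -= lr * grad

# Model: the architecture combined with this particular set of parameters.
print(w)  # converges toward 2.0
```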
I was reading Vicki Boykis’ blog post Both Pyramids Are White, on how groups collectively think. The entire post revolves around a 1971 Russian video that is difficult for non-Russian speakers to understand.