My initial aim was to build a document processing model, but the idea was far-fetched for my skill level at the time. So I settled for building a toy version of a visual language model (VLM) to understand VLMs better. I’m documenting my intuition for the benefit of myself and others. My model will receive an image as input and return its caption as output. Luckily, I found a dataset of images and their captions to train on. | Poonai's space
My Journey to Building Agentic Apps: My childhood dream of having a personal J.A.R.V.I.S has come true. Recent advances in LLMs (Large Language Models) made me take a closer look at them, like everyone else. As an industry, we are all figuring out how to build agentic apps “the right way.”
I’m a mediocre engineer who does systems work and has never had experience in the typical user-facing software space. I’ve contributed to software that scales but never really had a chance to experience the vibe of serving millions of users. Recently, a friend explained to me the kind of events they track at their startup. I felt sick after hearing about them, since they are typically very personal to the user. Companies collect data ranging from the user’s...