Engrain - Beefing Up

It was now time to scale this project a bit. Static websites were working, but I needed something that could handle a large frontend codebase and its complexities (different UI modes and so on). I went ahead with Next.js. Next.js lets you split your code into component folders, which makes it an efficient and effective way to manage your code.


Next.js offers two routing systems: the Pages Router and the App Router. The App Router has better performance characteristics: its components render on the server by default, giving faster load times, whereas Pages Router components render in the client's browser. The App Router also supports streaming; the Pages Router does not.


I then moved from local storage to Supabase for storing all my books, chapters, highlights, and logs. It has a free tier, is easy to set up, and provides various authentication methods. It has PostgreSQL under the hood, and it supports the pgvector extension for storing vectors in the database (I will demonstrate the need for this in a bit).
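As a sketch of what storing a highlight with its vector looks like via the supabase-py client: note that the table name `highlights` and its columns here are illustrative assumptions, not necessarily my actual schema.

```python
# Sketch: insert a highlight plus its embedding into a Supabase table
# backed by pgvector. Table/column names are assumptions for illustration.
import os


def highlight_row(book_id: str, content: str, embedding: list[float]) -> dict:
    """Build the row dict that would be inserted into the highlights table."""
    return {"book_id": book_id, "content": content, "embedding": embedding}


def save_highlight(book_id: str, content: str, embedding: list[float]):
    # Imported lazily so the sketch can be read without the package installed.
    from supabase import create_client

    client = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])
    return client.table("highlights").insert(
        highlight_row(book_id, content, embedding)
    ).execute()
```

On the database side, the `embedding` column would be declared with pgvector's `vector` type so similarity queries can run inside Postgres itself.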


Now it was time for the most basic feature: searching the highlights. Since a user could eventually have hundreds of highlights in the database, searching them by keyword alone would be difficult. This is where RAG (Retrieval-Augmented Generation) comes in. An embedding model vectorizes each highlight, capturing its keywords and semantics, which makes retrieving query-related highlights easy: the query is vectorized the same way, and the top-k most similar vectors (points in the embedding space) give back the matching highlights. There are many embedding models, both paid and open source. For the initial version I needed an open-source one, so I went ahead with the BAAI/bge-m3 model. It has strong performance, is multilingual, and offers three retrieval functionalities: dense, sparse, and multi-vector. There are many ways to measure vector similarity; I chose cosine similarity because it is simple, effective, and works well regardless of vector magnitude.
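The retrieval step above can be sketched in a few lines. The vectors here are toy values; in the real app they come from BAAI/bge-m3, and the similarity search would typically run inside Postgres via pgvector rather than in application code.

```python
# Minimal sketch of cosine-similarity retrieval: score every stored
# highlight vector against the query vector and return the top-k indices.
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product divided by the product of magnitudes; range [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def top_k(query_vec: list[float], highlight_vecs: list[list[float]], k: int = 2) -> list[int]:
    """Indices of the k highlights most similar to the query."""
    ranked = sorted(
        range(len(highlight_vecs)),
        key=lambda i: cosine_similarity(query_vec, highlight_vecs[i]),
        reverse=True,
    )
    return ranked[:k]
```

Because cosine similarity divides out the magnitudes, a long highlight and a short one with the same semantic direction score the same, which is exactly the magnitude-independence mentioned above.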


When all of this came together, I needed to verify that the backend, with its multiple endpoints, actually worked. For this, I created a Python test file that hits each endpoint with test inputs and prints the responses, acting as a quick sanity check that the entire backend (Summarize, Brainstorm, Socratic, and Query, covering ingestion, embedding, and retrieval) works end to end.
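A sketch of that sanity-check script, assuming the backend exposes one POST route per feature; the paths and payload shapes here are illustrative guesses, not the real API.

```python
# Sketch of the end-to-end sanity check: POST a test input to each backend
# endpoint and print the response. Routes and payloads are assumptions.
ENDPOINTS = {
    "summarize": {"text": "Sample highlight to summarize."},
    "brainstorm": {"text": "Sample highlight to brainstorm on."},
    "socratic": {"text": "Sample highlight for Socratic questioning."},
    "query": {"question": "What did the author say about habits?"},
}


def run_checks(base_url: str) -> None:
    import requests  # assumed HTTP client; any would do

    for name, payload in ENDPOINTS.items():
        resp = requests.post(f"{base_url}/{name}", json=payload, timeout=30)
        print(name, resp.status_code, resp.json())


# Usage (with the backend running locally):
# run_checks("http://localhost:8000")
```

It is deliberately not a real test suite, just a loop that surfaces any endpoint returning an error before I wire the frontend to it.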
